/g/ - Technology


I Love LDG Edition

Discussion and Development of Local Image and Video Models

Previous: >>108652848

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
>mfw Resource news

04/21/2026

>MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping
https://jeoyal.github.io/MegaStyle

>UDM-GRPO: Stable and Efficient Group Relative Policy Optimization for Uniform Discrete Diffusion Models
https://github.com/Yovecent/UDM-GRPO

>Noise-Adaptive Diffusion Sampling for Inverse Problems Without Task-Specific Tuning
https://github.com/NA-HMC/NA-HMC

>Evolutionary Negative Module Pruning for Better LoRA Merging
https://github.com/CaoAnda/ENMP-LoRAMerging

>DuQuant++: Fine-grained Rotation Enhances Microscaling FP4 Quantization
https://github.com/Hsu1023/DuQuant

>Generalizable Face Forgery Detection via Separable Prompt Learning
https://github.com/OUC-YER/SePL-DeepfakeDetection

>Adaptive receptive field-based spatial-frequency feature reconstruction network for few-shot fine-grained image classification
https://github.com/ICL-SUST/ARF-SFR-Net.git

>ComfyUI-DiffAid-Patches: Inference-time Diff-Aid-inspired text-conditioning patches for ComfyUI
https://github.com/xmarre/ComfyUI-DiffAid-Patches

>modl: Train LoRAs and generate images on your own GPU. Web UI + CLI
https://github.com/modl-org/modl

>ComfyUI-KleinRefGrid: Turns reference images into reference_latents
https://github.com/xb1n0ry/ComfyUI-KleinRefGrid

>node-banana: Free and open node based generative workflows
https://github.com/shrimbly/node-banana

>Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation
https://github.com/AMAP-ML/EMF

04/20/2026

>Elucidating the SNR-t Bias of Diffusion Probabilistic Models
https://github.com/AMAP-ML/DCW

>(1D) Ordered Tokens Enable Efficient Test-Time Search
https://soto.epfl.ch

>Frequency-Aware Flow Matching for High-Quality Image Generation
https://github.com/OliverRensu/FreqFlow

>From Zero to Detail: A Progressive Spectral Decoupling Paradigm for UHD Image Restoration with New Benchmark
https://github.com/NJU-PCALab/ERR
>>
>mfw Research news

04/21/2026

>DreamShot: Personalized Storyboard Synthesis with Video Diffusion Prior
https://arxiv.org/abs/2604.17195

>Speculative Decoding for Autoregressive Video Generation
https://arxiv.org/abs/2604.17397

>LIVE: Leveraging Image Manipulation Priors for Instruction-based Video Editing
https://arxiv.org/abs/2604.17021

>AdaCluster: Adaptive Query-Key Clustering for Sparse Attention in Video Generation
https://arxiv.org/abs/2604.18348

>Coevolving Representations in Joint Image-Feature Diffusion
https://arxiv.org/abs/2604.17492

>Reward Score Matching: Unifying Reward-based Fine-tuning for Flow and Diffusion Models
https://arxiv.org/abs/2604.17415

>UniGeo: Unifying Geometric Guidance for Camera-Controllable Image Editing via Video Models
https://arxiv.org/abs/2604.17565

>FlowC2S: Flowing from Current to Succeeding Frames for Fast and Memory-Efficient Video Continuation
https://arxiv.org/abs/2604.17625

>ReCap: Lightweight Referential Grounding for Coherent Story Visualization
https://arxiv.org/abs/2604.18575

>UniCSG: Unified High-Fidelity Content-Constrained Style-Driven Generation via Staged Semantic and Frequency Disentanglement
https://arxiv.org/abs/2604.17850

>Towards Robust Text-to-Image Person Retrieval: Multi-View Reformulation for Semantic Compensation
https://arxiv.org/abs/2604.18376

>mEOL: Training-Free Instruction-Guided Multimodal Embedder for Vector Graphics and Image Retrieval
https://scene-the-ella.github.io/meol

>Depth Adaptive Efficient Visual Autoregressive Modeling
https://arxiv.org/abs/2604.17286

>When Text Hijacks Vision: Benchmarking and Mitigating Text Overlay-Induced Hallucination in Vision Language Models
https://arxiv.org/abs/2604.17375

>Cross-Modal Attention Analysis and Optimization in Vision-Language Models: A Study on Visual Reliability
https://arxiv.org/abs/2604.17217

>Spatiotemporal Sycophancy: Negation-Based Gaslighting in Video Large Language Models
https://arxiv.org/abs/2604.17873
>>
>>108655785
>>108655793
stop deblessing /ldg/ thread schizo
>>
File: 1776241068093602.jpg (1.15 MB, 3840x2160)
mogged
>>
>>108655803
There's a picture of my penis in there. How the fuck?
>>
>>108655803
here's the context lol
>>108654985
>>108655069
>>
File: roastie.png (9 KB, 287x147)
Why are people so naive or ignorant when it comes to NSFW AI-generated content? I always get asked "what AI you use bro?" when I share something on reddit or X. Do they think I just write a prompt on a site and then AI does all the magic?
>>
>>108656011
Almost every human is an absolute retard, that should have been clear by now.
>>
File: FluxKrea_Output_7272727.png (3.01 MB, 1176x1752)
>>
File: roastie2.png (8 KB, 295x107)
>>108656011
For context, this OF whore wrote me a DM asking me how I generate my videos (I use wan, VACE, LTX, post processing, etc, etc), I tell her that I have a local setup and I use several workflows, that is not that simple but I'm happy to collab (for money ofc) and then she writes me that crap >>108656011
>>
>>108656011
>they think I just write a prompt on a site and then AI does all the magic?
yes? the future is AI models using tools to correct themselves and build all the pieces together (like here, GPT Image-2 makes an image -> looks at it -> notices the issues -> fixes those issues with an image edit process) >>108655670
>>
>>108656084
generally when dealing with retards you charge a retard tax, I always give a stupid big quote to retards and sometimes they take it and it's worth the headache
>>
>>108656085
Have you not learned anything from the past few years? That's not how it works, especially with SaaS (tools get nerfed, rugpulled by the big corpos) and even more with NSFW content and we're talking about today, not the future you idiot
>>
>>108656105
>not the future
definitely the future, the /lmg/ fags are incorporating tools on gemma 4, as usual /ldg/ is completly clueless about the news and how to move forwards, it's filled with retards like you
>>
Why do anons lurk and post here if they think every other anon here is a retard? Surely they'd find some other place to post...
>>
>>108656122
AI doesn't think, so any workflow based on AI "thinking" to itself will just end in piles of aesthetic trash. Even the best models still can't handle 2000 lines of code without going schizo on the task, and that's code that is relatively simple. Recursion destroys most models, same with complex A() -> F() -> C() -> E() -> D() relationships
>>
>>108656122
gemma4 is local thoever? anons here use it to caption and write prompts all the time desu desu.
>>
>>108656147
>thoever
saar?
>>
>>108656122
>/lmg/ fags are incorporating tools on gemma 4
no?
>>
>>108656139
localchads live rent free in their minds.
>>
I saw GPT-Image 2 released. Can anon share some gens?
>>
File: HGc7c9zaEAAwbwW.jfif.jpg (227 KB, 1620x1622)
It's over for local
>>
File: ComfyUI_temp_puiqp_00053_.png (3.83 MB, 1792x1312)
>>108656122
I can tell you that any serious genner/trainer (myself included) is using gemma4. You're just ignoring my original point, which was that people overlook the process of generating NSFW AI content, especially images and videos. LLM and text/code based crap is easy as shit, that's why /lmg/ threads are filled with happy people and /ldg/ is filled with frustrated anons that can't generate what a few other anons can
>>
File: 1750641152265431.png (2.18 MB, 1402x1122)
>>108656165
yes, cute cat
prompt:
>generate cat image
>>
>>108656174
How many rugpulls until you learn that they will take away the good model after the good press dies down? We've done this like 6 times now.
>>
>>108656011
>>108656084
Total OF whore death
>>
>>108656165
>Can anon share some gens?
no, there's a thread for that and the images are already shared here >>108653190
>>
>>108656189
its always the same cycle of these "groundbreaking" models: they get hyped, users start generating viral stuff, copyright holders get mad, tools get nerfed, the userbase gets mad and the tools get dropped
>>
File: 1768660288641324.png (795 KB, 768x789)
>>108656227
>its always the same cycle
there's even a name for that
https://en.wikipedia.org/wiki/
>>
>>108656227
No, it's the costs. They are expensive to run, so the first thing they do is cut compute, and that's separate from the safety stuff. Nano Banana is so bad because they do bullshit like switching you to fast mode even though it's completely shit. Also the core model just got worse slowly but surely.
>>
>>108656174
They killed Sora for this.
>>
File: sydney sweeney gptimage2.png (2.37 MB, 1920x1072)
Tested GPT-Image in the API.
Can do celebs by attaching picture (edit mode).
No NSFW as expected though, maybe allows limited artistic stuff but didn't bother pushing too much.
Was able to give a woman cleavage by cloth swapping, but that's the extent of what I got.
Dogshit for style transfers, hallucinates what's style, what's content, omits or makes up details. 4 years into this and still not a single good model on this front.
Can make detailed infographics with text without slopping the text, for whatever that's worth.
>>
File: 616728553997233.png (3.28 MB, 2016x1152)
>>
>>108656284
>>108656199
>>
>>108656243
nice wiki page.
>>
>>108656320
lmao
https://en.wikipedia.org/wiki/Enshittification
>>
I want to make an anime style LoRA. I checked the official repository, but it seems like the program is only compatible with Linux. Is there another workaround?
>>108656284
How about copyrighted anime characters?
>>
>>108656153
>no?
yes >>108656365
>>
>>108656376
*Anima
I want to make an Anima LoRA but it seems it is only Linux compatible
>>
File: gpt image 2.png (2.59 MB, 1536x1024)
>>108656227
nope, its different now
google finally has a worthy image gen competitor, so they wont be able to fuck around with their users anymore.
here's whats going to happen. very soon, google will release nbp 2, which will be better than gpt image 2
it will look similar to the vibecoding war between claude and codex. if either of them starts to enshittify their model, then users will just jump ship.
apichads won, and as a consequence, localcucks will benefit since the chinese models will train off the api outputs
you're welcome
>>
>>108656383
https://github.com/gazingstars123/Anima-Standalone-Trainer
>>
>>108656383
https://github.com/gazingstars123/Anima-Standalone-Trainer

I'm using this on Linux but it shows Binbows support. No idea if it's any good on Winblows but try it I guess.
>>
>>108656389
Indians should be banned from the internet
>>
File: 1763794983641924.jpg (459 KB, 1250x1566)
>>108656389
>localcuks benefit since chinese models will train off the api outputs
fuck that shit dude, China must stop eating the shit of API western models and do the Z-image turbo way (the kino way)
>>
>>108656389
NBP is still better than 2 at some things. But it's good that OpenAI has something that isn't complete dogshit.
>>
File: Flux2-Klein_00589_.png (1.03 MB, 800x1280)
>>108656389
>words words words
APIcucks cant meme
>>
>>108656398
>>108656391
Thanks. And can you replicate the same settings tdrussell shows in his official Anima LoRA?
>>
>>108656084
think she'd let a humble localchad suck on her toes or something? how big are her tits
>>
>>108656423
stupid sexy gooks
>>
>>108656389
>so they wont be able to fuck around with their users anymore.
LOL
O
L
>>
>>108656376
>I want to make an anime style LoRA. I checked the official repository, but it seems like the program is only compatible with Linux. Is there another workaround?
sd-scripts supports it. I think it works on Windows, not sure.
>How about copyrighted anime characters?
I am done testing for today but I wouldn't expect it to be too pissy about it. If anything you are more likely to run into issues with Disney, Sony, Nintendo etc. characters.
>>
>>108656426
I don't know who that is or what you're talking about. Sorry.
>>
What is the status of copyrighted anime characters with GPT Image 2? We won?
>>
>>108656165
A plain prompt "Warhammer 40k crossover with one piece"
>>
>>108656426
Well I can't replicate the exact lora as he (understandably) does not provide the images used but the settings he provides seem like sane defaults. No "catastrophic" forgetting or anything I've heard claimed about training loras on it.
>>
>>108656466
What I mean is that tdrussell shared some LoRA training settings to use in his Linux-only workflow. My question is whether I can select the same settings here >>108656391
>>
>>108656466
He shared the training dataset for his rutkowski lora.
>>
>>108656458
why does it look like someone injected extra noise into the last steps of the diffusion process
>>
>>108656466
Sorry, I understood, thanks
>>
>>108656486
There's nothing wildly out of the scope of the average trainer that I can tell. So you should be able to use the same settings just fine.
>>
>>108656486
Training settings do not depend on OS so the answer is yes if that tool has implemented every relevant feature.
>>
File: 1767822737003517.jpg (569 KB, 1663x1247)
>>108656165
>>
>>108656513
that's crazy good. too bad no porn so it's worthless.
>>
>>108656458
howd they manage to keep the ugly sepia poison
>>
>>108656513
wtf this is next level, holy fuck...
>>
reminder that if you want to talk about that model you have to go here, this is a fucking local thread in case you forgot
>>108653190
>>108653190
>>108653190
>>
>>108656513
the hands are still fucked though
>>
>>108656513
hands are on another level
>>
>>108656389
Midjourney still mogs both NB2 and GPT Image-2 in terms of aesthetics.
>>
>>108656552
emoboy4ever had an accident. leave his hands out of this
>>
>>108656513
No fucking way...
>>
>>108656513
I didn't realize how much better an AI image gets when the text is correct, this makes the difference


