/g/ - Technology
File: 1.jpg (1.88 MB, 3684x2282)
Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107710110

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2485296
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg
>>
is the spam filter down? can we have a thread at last?
>>
What light lora combo are people using with Wan 2.2 SVI? Everything I try gives me the dreaded slow-motion effect, even a combo that works flawlessly without SVI.

Also, haven't posted in a while and what the FUCK are these new aids-tier captchas?
>>
>>107718348
>What light lora combo are people using with Wan 2.2 SVI?
frame editing in actual software
>>
File: ComfyUI_00021_.png (1.91 MB, 1080x1920)
Only wan2.2 low noise can output this kind of anatomy perfection... and it's a video model FFS u_u
>>
HOLY BLURRY OVER EXPOSED SLOPPY DOPPY
>>
>>107718306
thanks for the cozy bread anon
>>
>>107718390
what loras tho?
I can get ok-ish stuff with the wan remix workflow, but nowhere near that detailed
>>
File: 1763688331719309.png (10 KB, 301x230)
which one of these is the bitcoin miner
>>
>>107718446
comfyui
>>
>>107718451
this
>>
File: ComfyUI_00004_.png (1.1 MB, 1152x832)
>>107718446
Check your RAM and disk usage when genning, especially during the initial load, it's probably spilling onto your pagefile. Beware that it's not only painfully slow, it also rapes your SSD with constant writes
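If you want to confirm the spill without staring at Task Manager, here's a minimal Linux-only sketch (it reads /proc/meminfo; on Windows you'd poll the equivalent performance counters instead) that reports free RAM and swap/pagefile usage:

```python
import os

def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: kB}."""
    out = {}
    for line in text.splitlines():
        if ":" in line:
            key, rest = line.split(":", 1)
            out[key.strip()] = int(rest.strip().split()[0])
    return out

def pressure(info):
    """Return (available_gb, swap_used_gb); rising swap_used means you're spilling."""
    avail = info.get("MemAvailable", 0) / 1024 / 1024
    swap_used = (info.get("SwapTotal", 0) - info.get("SwapFree", 0)) / 1024 / 1024
    return round(avail, 2), round(swap_used, 2)

if os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        print(pressure(parse_meminfo(f.read())))
```

Run it in a loop while a gen starts: if swap_used climbs during the initial model load, that's the pagefile (and your SSD) taking the hit.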
>>
>>107718461
peak comfyui gen
>>
>>107718461
>it also rapes your SSD with constant writes
great software comfy. what a great feature
>>
File: 1756435294175495.png (54 KB, 1100x523)
>lol just ask on discord
>>
>>107718513
the guy that writes that shit doesn't even know how to use comfy in the first place. the whole org is a shit show
>>
File: ComfyUI_00007_.png (1.11 MB, 1152x832)
>>107718471
That's just how memory management in Bimbows works: it'll start offloading to the pagefile long before you hit the max memory limit. The reason Comfy is a hack is that he didn't bother implementing partial splits between available CUDA devices, especially if you have multiple GPUs. It'll just try to load the whole thing into RAM and then move everything (or most of it) into VRAM. Or maybe it's a PyTorch limitation, idk, I'm not too tech savvy
>>
>>107718571
what's up with comfy users? why do they generate kids so often?
>>
>>107717857
Hey, I was at work, thanks for answering. Try looking into Wan Animate then
>>
>>107718390
>le flux chin
disgusting
>>
this place is dead as fuck, hoooly.
anyway, qwen just curb stomped slam dunked z-img it seems. will find out though, gguf still downloading.
https://x.com/Alibaba_Qwen/status/2006294325240668255
>>
>>107718988
>leaving to celebrate new year in 10min

At least it's not base.
>>
>>107718988
DFloat11 when?
>>
>>107718988
>curb stomped
They're on the same team lmao
>>107719022
>DFloat11 when?
If you have the memory to run it, you almost certainly have the memory to quantise it to DF11 yourself
>>
Low key just waiting for LTX 2 next month.
>>
>>107719076
>They're on the same team
there are no teams when it's chinese bloodsports!
>>
okay yeah this kinda blows compared to z-img. and that's a TURBO tarded not even fully trained model. that said, it's nice. better than flux at least.
>>
File: file.png (1.67 MB, 720x1280)
>>107718988
> woman lying on grass
dont bother downloading
>>
>>107719143
i mean it could just be the lightning lora isn't fully compatible
but it ain't lookin' good chief.
https://huggingface.co/lightx2v/Qwen-Image-Lightning/tree/main
>>
>>107719022
>At batch size = 1, inference is approximately 2× slower than the original BF16 model, but the performance gap narrows significantly with larger batches.
DFloat11 is snake oil. If you have low vram, you're trading a 30% reduction in weight size for 2x slower inference; offloading would be faster. If you have the vram to run larger batches, you don't need a 30% reduction in weight size. There's no valid use case.
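The quoted batch-size claim is easy to sanity-check with a toy model: treat DF11 decompression as a fixed per-forward cost that amortizes over the batch. The numbers below are made up (calibrated so batch 1 comes out exactly 2x the BF16 baseline), not measurements:

```python
# Toy model: DF11 adds a fixed decode cost per forward pass; per-item
# compute is the BF16 baseline. Both set to 1.0 s purely for illustration.

def step_time(batch, decode_s=1.0, compute_s_per_item=1.0):
    """Seconds for one forward pass at a given batch size."""
    return decode_s + batch * compute_s_per_item

for batch in (1, 4, 16):
    t = step_time(batch)
    print(f"batch={batch:>2}: {t / batch:.2f} s/item, "
          f"decode overhead {100 * 1.0 / t:.0f}%")
```

At batch 1 you pay the full 2x; by batch 16 the overhead share drops to single digits, which is the "gap narrows" effect — and which does nothing for the latency-bound, single-image local use case.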
>>
>meanwhile a month ago in 4 steps on z-img turbo

>>107719162
also this, in general, just offload. i've been copemaxxing with low q quants or fp8 when this entire time i could've just offloaded a few niggerbytes on a full q8 model or fp16 if it's reasonably sized. i'm running the new qwen at q8.
>>
that DoF looks horrific. oil pastel turbocharged
>>
File: ComfyUI_00062_.png (1.3 MB, 720x1280)
>>107719152
>>107719143 was without lightning lora
picrel with
>>
>>107719183
yikes buddy. well time to free up that 20 gigs. (and the lightning loras)
>>
File: ComfyUI_05184.png (2.42 MB, 2160x1168)
>>107719076
>>107719162
I was only pretending to be retarded! Q8/BF16 fits in my 4090 just fine. 2512 froze around 30 steps though and went from ~3.4 it/s to 44+ s/it for the rest of the 40 steps. Took 568s for this image. Way slower than I remember the edit models being.
>>
>>107718390
catbox?
>>
>>107719079
>just waiting for LTX 2 next month.
Same. I tried wan 2.6 on their website wan.video (they're kinda trying to be like sora) and it was disappointing. LTX2 is also looking a little disappointing but once something is out people will finetune it to be better

>>107719162
>>107719170
The valid use case for DF11 is large models run at a large batch size by corps because the inference hit is negated at a higher batch size iirc
So if you're serving an LLM, or you're Google serving a massive autoregressive edit model it makes sense

Most people should use q8_0 for most tasks on most models (vae excluded, text encoder preferably excluded). In fact, even if you can fit bf16 in memory you should still use the q8_0 if you value your time
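For anyone wondering why q8_0 is close to free in quality: it stores one scale per small block of weights plus int8 codes, so the worst-case error per weight is half the block scale. A pure-Python sketch (block size 32 as in llama.cpp's q8_0; the on-disk layout details are simplified away):

```python
def q8_0_roundtrip(weights, block=32):
    """Quantize to per-block absmax int8 (q8_0-style) and dequantize."""
    out = []
    for i in range(0, len(weights), block):
        chunk = weights[i:i + block]
        amax = max(abs(w) for w in chunk) or 1.0
        scale = amax / 127.0            # one scale per block of 32 weights
        out.extend(round(w / scale) * scale for w in chunk)
    return out

weights = [0.013 * ((-1) ** k) * (k % 7) for k in range(64)]
deq = q8_0_roundtrip(weights)
max_err = max(abs(a - b) for a, b in zip(weights, deq))
print(f"max abs error: {max_err:.6f}")
```

Worst case the error is about scale/2, i.e. under 0.4% of the largest weight in the block, which is why bf16 vs q8_0 is rarely distinguishable in gens.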
>>
File: ComfyUI_00212_.png (1.09 MB, 752x1392)
>>107719311
>because the inference hit is negated at a higher batch size iirc

aaahh i see.
>>
>>107719183
>mfw i lay on the grass and realize the back of my head is now full of shit stains
>>
File: 00008-2054810216.png (2.4 MB, 1824x1248)
>>107719079
i have mixed feelings toward LTX-2 now that multiple Chinese models have come out this month with combined audio generation and 10-15 second prompt adherence. Even Seedance 1.5 has combined video+audio generation, and it's uncensored. LTX-2 is very pricey when it comes to using it online with Lightricks' own credit system, and it's censored, with the Ken/Barbie doll bodies. Hopefully LTX-2 isn't too vram hungry and will be able to run on 24-32gb of vram. This is make-or-break for Lightricks to get their shit together. The chinks are competing for 3rd and 4th place in the ai video generation market with Kling 2.6, MiniMax Hailuo 2.3, Wan 2.6 and Seedance 1.5. There are also Runway, Luma, Vidu and Pika competing in the ai video space. LTX-2 needs to be really good from the start to overthrow the dominance wan 2.2 has over the open source scene.
here are some LTX-2 videos if anyone's curious
https://files.catbox.moe/gz8lao.mp4
https://files.catbox.moe/hn9uyw.mp4
https://files.catbox.moe/7vweqv.mp4
https://files.catbox.moe/clx4i3.mp4
https://files.catbox.moe/abb430.mp4

here are some Seedance 1.5 videos (closed source)
https://files.catbox.moe/b71h5f.mp4
https://files.catbox.moe/adkp98.mp4
>>
>>107719433
Is wan not coomer friendly / too resource intensive? Is that why I barely see stuff?
>>
this dude gets off to asmr and you know it
>>
>>107719311
Still not convinced. There's a point where pushing the batch size higher yields diminishing returns due to compute constraints, and overall response time becomes a factor: what's the point of squeezing an extra few responses into each batch if it makes all the responses take longer than users will tolerate?
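That tension is easy to put numbers on: with a fixed per-step cost plus per-item compute, throughput saturates while every request's latency keeps growing linearly, so a latency budget caps the useful batch size. All figures below are illustrative, not benchmarks:

```python
# Toy serving model: latency grows linearly with batch, throughput saturates.
# fixed_s / per_item_s / the 2 s budget are made-up illustrative numbers.

def serve(batch, fixed_s=0.5, per_item_s=0.125, latency_budget_s=2.0):
    latency = fixed_s + batch * per_item_s   # every request in the batch waits this long
    throughput = batch / latency             # requests completed per second
    return latency, throughput, latency <= latency_budget_s

for b in (1, 8, 12, 32):
    lat, thr, ok = serve(b)
    print(f"batch={b:>2}: latency={lat:.2f}s throughput={thr:.2f}/s "
          f"{'within budget' if ok else 'over budget'}")
```

Going from batch 12 to 32 here more than doubles latency for under a 20% throughput gain, which is exactly the diminishing-returns point.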
>>
>>107719447
holy esl
you're too late, the peak wan era was this summer

>>107719433
yeah I have nothing to look forward to with the audio in these two models. Sad, but a sora competitor will exist sometime in 2026 for sure

>>107719575
>Still not convinced.
Just taking what I read from the orange sneddit discussion on DFloat11 https://news.ycombinator.com/item?id=43796935

But there's also people in that thread that say shit like
>Blackwell GPUs support dynamic FP4 quantization with group size 16. At that group size it's close to lossless (in terms of accuracy metrics).
which sounds too retarded to believe. I'll check that myself using something like t5-small in a few hours
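If anyone wants to pre-empt that check without a GPU, here's a rough pure-Python round-trip using the standard E2M1 FP4 magnitude grid with per-group (size 16) absmax scaling. This is my guess at what "dynamic FP4 quantization with group size 16" means, not the actual Blackwell implementation, and it measures raw weight error rather than accuracy metrics:

```python
import random

# E2M1 FP4 representable magnitudes (sign handled separately).
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def fp4_roundtrip(weights, group=16):
    """Per-group absmax scaling onto the FP4 grid, then dequantize."""
    out = []
    for i in range(0, len(weights), group):
        chunk = weights[i:i + group]
        amax = max(abs(w) for w in chunk) or 1.0
        scale = amax / 6.0  # map the group's absmax onto the largest code
        for w in chunk:
            mag = min(FP4_GRID, key=lambda g: abs(abs(w) / scale - g))
            out.append(mag * scale if w >= 0 else -mag * scale)
    return out

random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(4096)]
deq = fp4_roundtrip(weights)
rel_mse = (sum((a - b) ** 2 for a, b in zip(weights, deq))
           / sum(a * a for a in weights))
print(f"relative MSE: {rel_mse:.4f}")
```

On Gaussian-ish weights this gives a clearly nonzero raw error, so "close to lossless" can only be a claim about end-to-end accuracy metrics, which is the part that actually needs testing.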
>>
>>107719433
why dont you post this on reddit, fucking subhuman retard?
>>
>>107718348
>and what the FUCK are these new aids tier captchas?
read what it's asking you to do.

I use this btw

https://civitai.com/models/2190659/dasiwa-wan-22-i2v-14b-tastysin-v8-or-lightspeed-or-gguf?modelVersionId=2466604

It's the best I've used; it has the speed-up LoRAs merged in, as well as a bunch of NSFW LoRAs and other stuff.
>>
>>107719962
does it show assholes?
>>
>>107720148
Only if you prompt it to show Piers Morgan
>>
File: img_00048_.jpg (595 KB, 1288x1656)
>>
File: img_00051_.jpg (548 KB, 1288x1656)
>>
>>107719962
why can't any of these retards document their shit properly? the workflow he provides requires patreon models, and the info on the left side fails to answer how to actually use this thing. and it's a gguf, which is like a different breed of species. the link sends you to a high-noise version, but apparently you need the low-noise one too? pure retardation. my normal wan 2.2 i2v just works.
>>
>>107718348
>Wan 2.2 SVI
It, like base wan, does not handle scene cuts very well; it will change the character's appearance too much imo. It handles moments when their face is hidden between clips, but an actual "|" cut still changes the person's face somewhat. Probably something to do with wan itself; I suspect wan goes into T2V mode when doing a cut scene, so it's best to avoid them.

all actions should be performed one at a time and use simple prompts but you probably already know that. At least that is the case for I2V.

I don't know the right combo of light LoRAs, but let's just say I went back to using base wan with light LoRAs to see if they would be any better, and it completely fucking mangles shit, so I don't know how or what this guy did.

https://civitai.com/models/2190659/dasiwa-wan-22-i2v-14b-tastysin-v8-or-lightspeed-or-gguf?modelVersionId=2466604

so that is why I recommend it: it just works so well.
>>
>>107720379
hey, I'll help you out later in the day. I'm also working on a perfectly looping workflow for SVI, as in no long chaining of prompts and samplers. I'm not getting the quality loss others are reporting, despite using the same seed between clips, so I'm on to something good here. The only issue I'm getting is micro jumps between clips when I merge them using an external ffmpeg script (because that is faster)... But I see in the SVI workflow provided here

https://www.reddit.com/r/NeuralCinema/comments/1pyeoci/svi_20_pro_wan_22_84step_infinite_video_workflow/

it's using some KJ node to combine clips with some kind of 5-frame overlap that I don't understand, so maybe it's doing some sort of interpolation I need to figure out. The jumps are very subtle but still noticeable, like the jump cuts people do in youtube videos.
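I don't know what that KJ node actually does, but a 5-frame overlap merge is usually some flavor of cross-fade rather than interpolation: the last N frames of clip A get alpha-blended into the first N frames of clip B instead of hard-cutting. A toy sketch (frames reduced to single scalars; `crossfade_join` is a hypothetical helper, not the node's real logic):

```python
def crossfade_join(clip_a, clip_b, overlap=5):
    """Join two frame lists, linearly blending the overlapping region."""
    assert overlap <= len(clip_a) and overlap <= len(clip_b)
    out = list(clip_a[:-overlap])                 # A's frames before the overlap
    for i in range(overlap):
        alpha = (i + 1) / (overlap + 1)           # ramps from mostly-A to mostly-B
        a = clip_a[len(clip_a) - overlap + i]
        b = clip_b[i]
        out.append((1 - alpha) * a + alpha * b)   # blended overlap frame
    out.extend(clip_b[overlap:])                  # B's frames after the overlap
    return out

clip_a = [0.0] * 20   # clip A ends on "black"
clip_b = [1.0] * 20   # clip B starts on "white"
joined = crossfade_join(clip_a, clip_b)
print(len(joined), [round(f, 2) for f in joined[14:21]])
# → 35 [0.0, 0.17, 0.33, 0.5, 0.67, 0.83, 1.0]
```

ffmpeg can also do this directly with the `xfade` filter, which might dodge the micro-jump without going through a node at all.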



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.