/g/ - Technology





Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107284812

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe
https://github.com/ostris/ai-toolkit

>WanX
https://rentry.org/wan22ldgguide
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd
https://gumgum10.github.io/gumgum.github.io/
https://huggingface.co/neta-art/Neta-Lumina

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
File: 1733841816588819.png (625 KB, 1033x1305)
>>107294986
>Calling it now: Flux2-Small, the only version they will release the weights for. Will probably be pretty good actually
https://xcancel.com/bdsqlsz/status/1992248711603454072#m
it might be Flux 2 small, he said that it won't be a chinese model
>>
>>107294079
>why does it look bad?????
There are wan 2.1 workflows that are simpler than and look better than mine too


I was also asking because I'm not sure what the point of the second sampler in anon's workflow even is, beyond an upscale step to get high resolution, and I can't replicate it on a basic ksampler

Actually I was looking for the civitai workflow to show you, but I found like 3 other ones, so honestly I'll just do my own research tonight and find a really good text-to-image workflow that uses as few custom nodes as possible
>>
File: 1743762478731903.png (1.79 MB, 1120x1440)
>>
>>107295046
>SOTA in terms of portrait photography
>posts the sloppiest base SDXL-esque picture ever

Are these guys just slop blind? That literally looks like a 3D render.
>>
>>107295185
>Are these guys just slop blind?
they probably are, in china women have so much makeup they think plastic shit is the default skin or something
>>
>>107295208
>in china
>>107295046
>he said that it won't be a chinese model
zero reading comprehension award
>>
>>107295224
the guy that says it looks realistic is chinese, are you fucking retarded?
>>
>twitter screencap
>>
>>107295235
no u
>>
>>107295245
>X (the everything app) screencap
FTFY
>>
Do you think we might get local Sora in ~5 years? The porn (and electricity bills) would be unreal.
>>
>>107295267
This depends entirely on the GPU market.
>>
>>107295208
It's pretty bizarre. Maybe he thinks more detail means SOTA in realism rather than just a proper photoreal look, or he could be talking purely about aesthetics (think MJ-type aesthetics). MJ's "realism", while up there in details and aesthetics, isn't strictly realistic either.
>>
>>107295267
>Do you think we might get local Sora in ~5 years?
Sora has IP characters and IP styles in there, no company is gonna give that to localkeks so, never
>>
is it still itoddler only kek
>>
is hyvid2 good as a t2i model at least?
>>
>>107295185
>one tiny image in a screencap from an unreleased model
>entire model is slop trash

please, please, touch grass
>>
>>107295158
That's pretty great, mind sharing prompt?
>>
>>107295389
Often a single image is all one needs to see. Like those guys calling Seedream 4 photorealistic but then they're posting slop that looks like SDXL.
>>
File: comfuier.png (107 KB, 1514x921)
>installed comfyui
>put this safetensor file GonzaLomo XL/Flux/Pony in checkpoint folder and tried unet folder
>it doesn't show up in comfy so I drag the file in
>install the custom nodes
>now this

WTF. Any help?
>>
New Hunyuan is just 2-second long???
>>
>>107295612
no, 5 seconds takes an age but 2 is reasonable
>>
>>107295527
Click on the selector list and pick it again. Also try putting it in the checkpoints folder rather than unet.
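For reference, this is the stock ComfyUI model layout; a sketch where COMFY and the filename are placeholders for your own install and download (the node names are the current ComfyUI ones, double-check against your version):

```shell
# Stock ComfyUI model directory layout (adjust COMFY to your clone location).
COMFY="${COMFY:-$HOME/ComfyUI}"
mkdir -p "$COMFY/models/checkpoints" "$COMFY/models/unet"
touch GonzaLomo.safetensors          # stand-in for the downloaded file
# Full checkpoints (SDXL/Pony merges) go in models/checkpoints and load with
# the "Load Checkpoint" node; UNet-only releases go in models/unet and load
# with "Load Diffusion Model" instead. Refresh (R) after moving a file.
mv GonzaLomo.safetensors "$COMFY/models/checkpoints/"
```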
>>
>>107295288
>>107295185
SD1.5 as a video?
>>
How does hunyuan1.5 compare to wan2.2?
how does it perform with anime?
>>
>>107295906
notwithstanding the myriad of snake oil speedups wanx has: wan is better at sequential actions, a broader range of single actions, likeness conservation, and clarity. Hyvid is 24fps and... vramlet friendly
>>
>>107295906
>How does hunyuan1.5 compare to wan2.2?
it's bad, stick with wan
>>
>>107295906
the current model is incredibly slow. we need a 4steps version. the good news is that the t2v model is uncensored. i don't know about i2v
>>
File: 1734615573946135.png (2.9 MB, 1920x1088)
The secret to good text-to-image with WAN is literally just using karras scheduler instead of simple

picrel 40 steps cfg 2.0 euler/karras low noise only
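For anyone wondering what the karras option actually changes: it only swaps how the noise levels (sigmas) are spaced across the steps. A self-contained sketch of the Karras et al. (2022) schedule that ComfyUI's `karras` scheduler implements (sigma_min/sigma_max/rho here are generic illustrative defaults, not WAN's exact values):

```python
import numpy as np

def karras_sigmas(n, sigma_min=0.03, sigma_max=14.6, rho=7.0):
    """Karras et al. (2022) noise schedule. Interpolates linearly in
    sigma^(1/rho) space, which packs more steps near sigma_min than a
    plain linear ('simple') spacing does."""
    ramp = np.linspace(0, 1, n)
    inv_rho = 1.0 / rho
    sigmas = (sigma_max**inv_rho + ramp * (sigma_min**inv_rho - sigma_max**inv_rho)) ** rho
    return np.append(sigmas, 0.0)  # terminal sigma for the final denoise

sigmas = karras_sigmas(10)  # strictly decreasing, dense near the low end
```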
>>
>>107295612
Yep, what this anon >>107295626
said, 2 secs takes only 2 mins on my 3090, good for testing lots of different prompts
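The 2-vs-5 second difference maps directly to frame count; a quick sketch (the 4n+1 frame constraint comes from the video VAEs' 4x temporal compression, and 16 fps for Wan / 24 fps for Hunyuan are the published defaults, so treat the exact numbers as assumptions):

```python
def frame_count(seconds, fps):
    """These video models generate 4n+1 frames: the temporal VAE packs
    4 pixel frames per latent frame, plus one leading frame."""
    return 4 * round(seconds * fps / 4) + 1

frame_count(5, 16)  # Wan 2.2's usual 5s at 16 fps -> 81 frames
frame_count(2, 24)  # 2s at Hunyuan's 24 fps -> 49 frames
```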
>>
Here is 1080p imagegen with the lightning Seko rank64 lora. 37 seconds 10 steps euler/karras

>>107296057
I'm getting flashbacks to December 2024 with this garbage
this shit is dead on arrival. The only reason I'd even bother downloading these weights is that they mention using both FlashAttention and SageAttention in Hunyuan, which is retarded since SageAttention is a replacement for FlashAttention, so there may be a free lunch there. but who even cares when wan 2.2 still mogs
>>
>>107295923
The clarity in Wan may be there but in T2V the slop is more noticeable and the video's motions are slowed. A decent NSFW tune and Hunyuan comes out on top due to being easier to adopt.
>>
With the original 2.2 lightning t2v lora
>>
and finally the original wan 2.1 lightx2v lora

I think I am now a Seko-recommender over just using the 2.1 lightning lora for text-to-x purposes

>>107296106
>in T2V the slop is more noticeable and the video's motions are slowed
video motion was mostly fixed for me with the release of the Seko T2V lightning loras and I haven't really seen anything but improvements using it compared to previous lightx2v releases, especially the wan 2.2 ones
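For context on what a "rank64" lora file actually contains: a pair of low-rank matrices per patched layer, applied as a weight delta scaled by the strength slider. A minimal numpy sketch (the layer shapes and strength are illustrative, not the Seko file's real dimensions):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, rank = 512, 512, 64           # rank 64, as in the Seko release
W = rng.standard_normal((d_out, d_in))     # base model weight
A = rng.standard_normal((rank, d_in)) * 0.01   # LoRA "down" matrix
B = rng.standard_normal((d_out, rank)) * 0.01  # LoRA "up" matrix
strength = 1.0                             # the LoRA strength slider
W_patched = W + strength * (B @ A)         # delta has rank at most 64
```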
>>
>>107295985
>we need a 4steps version.

No need to slop it further. That's the issue with Wan: it takes up to 20 mins for a simple smooth non-jagged video because the community has accepted color-shifting speed LoRAs as the standard.

>>107296092
Looks pretty good on some prompts desu. Way better solution than lightx2v Wan.
>>
>>107296142
>tested one prompt
>draws definite conclusion
>>
>>107296178
>color shifting speed LoRAs
never had the problem with this even for older loras, this is a quantlet/wf issue
>>
>>107294974
Is the "1girl and Beyond" rentry guide and its accompanying workflows still worth using in the current year?
Would it be worth exploring different workflows for WAI? Any worthwhile benefits?
>>
>>107295267
>Do you think we might get local Sora in ~5 years?
we will definitely get some form of a video + audio model with lipsync in 2026. all the technology has been invented, now it's just up to someone to collect the data and train it

at worst it's going to be wan 2.2 + an audio addon, but at best it'll be a fully new model. and of course it will need no less than 24gb of vram to run, probably 32gb minimum at Q6_K, so mentally prepare for that because memory is not getting cheaper or more available any time soon
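That 24-32 GB guess is easy to sanity-check with back-of-envelope math (the 27B parameter count is a hypothetical, and ~6.56 bits/weight for Q6_K is borrowed from llama.cpp's packing, so these are assumptions rather than measured figures):

```python
def weight_vram_gib(params_billion, bits_per_weight, overhead=1.15):
    """Rough weight-memory estimate: params * bits / 8 bytes, plus ~15%
    slack for embeddings and runtime buffers. Back-of-envelope only."""
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30 * overhead

weight_vram_gib(27, 16)    # hypothetical 27B model in bf16: ~58 GiB
weight_vram_gib(27, 6.56)  # same model at Q6_K: ~24 GiB
```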

>>107296178
>Looks pretty good on some prompts desu. Way better solution than lightx2v Wan.
seko is just the newest t2v release by lightx2v
seko v2 actually, i guess v1 sucked since i never heard about it
https://huggingface.co/lightx2v/Wan2.2-Lightning/tree/main/Wan2.2-T2V-A14B-4steps-lora-rank64-Seko-V2.0

>>107296186
you can't demotivate me deebster, I already got within 10% of the workflow with shark-autism nodes in this gen

>>107296198
it's hard to take opinions seriously because people may have misconfigured something. i spent 11 hours 3d printing something and thought the model was the problem, and I was super pissed at all the retards that gave it 5 stars until I realized I made a mistake
>>
File: ComfyUI_00030_.png (1.2 MB, 1024x1024)
>>
>>107296142
is this fucking AI ?
>>
>>107296142
>video motion was mostly fixed for me with the release of the Seko T2V lightning loras

It still looks pretty poor in some cases
https://huggingface.co/lightx2v/Wan2.2-Lightning/discussions/64

And they all have something in common: slow motion and slop (though even the base model has the plastic skin, so that can't be fixed easily).

Still waiting on updated I2V LoRA though, maybe it's currently better than I think it is for that.


