/g/ - Technology

File: collage.jpg (953 KB, 3264x1702)
Discussion of Free and Open Source Diffusion Models

Prev: >>107925157

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Flux Klein
https://huggingface.co/collections/black-forest-labs/flux2

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>NetaYume
https://huggingface.co/duongve/NetaYume-Lumina-Image-2.0
https://nieta-art.feishu.cn/wiki/RZAawlH2ci74qckRLRPc9tOynrb

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
repoast:
>>107926593
>pixel-layer watermarking, and C2PA metadata to ensure content provenance and safety.
Oh right, I forgot about this. Can you spot the watermark by playing with levels in photoshop?
>>
>>107926805
I guess so, if you play with the saturation and shit you'll be able to see the dots, like on NBP
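A quick script version of the levels trick for anyone who doesn't want to open Photoshop. A minimal sketch, assuming a saved output at "klein_output.png" (placeholder filename); whether Klein actually embeds a visible dot pattern is exactly what's being tested here.
```python
import numpy as np
from PIL import Image, ImageEnhance

# Exaggerate saturation and stretch the levels so a faint periodic dot
# pattern (if one exists) becomes visible, same idea as pushing levels
# in Photoshop. "klein_output.png" is a placeholder path.
img = Image.open("klein_output.png").convert("RGB")
boosted = ImageEnhance.Color(img).enhance(8.0)   # crank saturation hard

arr = np.asarray(boosted).astype(np.float32)
arr -= arr.min()
arr *= 255.0 / max(arr.max(), 1.0)               # stretch to the full 0-255 range
Image.fromarray(arr.astype(np.uint8)).save("levels_check.png")
```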
>>
>>107926808
Trivial to get rid of.
>>
File: 1762042773121626.png (887 KB, 1049x1200)
>>107926798
https://huggingface.co/black-forest-labs/FLUX.2-klein-base-9B
>DESU I feel like the wall of text is specifically because they failed at making it safe.
maybe they're just pretending so that the European Union won't nuke their ass with more (((regulations)))
>>
unborn death maggot
>>
>>107926835
triggered?
>>
File: latest-4120322386.jpg (43 KB, 533x355)
>>107926827
The only thing in that image that actually works IRL is the bottle cap
>>
>>107926848
It doesn't tho. It's so poorly designed that everyone just rips it off.
>>
>>107926848
I live in europe and I hate this shit, I always have to remove that with a knife or something, it's a fucking paainnnnnn
>>
>>107926805
When editing with 9b there'll be zero compression artifacts on the edited portions of the image. There wasn't a visible noise pattern that I could see from my cursory glance, though.
>>
File: Logos.jpg (3.71 MB, 4096x2048)
>>107926554
top four are 4B Distilled, bottom four are 8B Distilled, 8 steps, Euler / Flux.2 Scheduler
```flaming retro logo text that reads "COMMIT SUICIDE" against a solid black background```
>>
>>107926872
>Wasn't a visible noise pattern that I could see from my cursory glance though.
there's a slight shift in colors, but I think it's just a VAE problem, not a watermark
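If you want to check whether it's a global VAE-round-trip color shift rather than a localized pattern, a rough sketch (filenames are placeholders, and it assumes the edit input and output are the same resolution):
```python
import numpy as np
from PIL import Image

# A VAE round-trip tends to shift every channel by a roughly uniform amount,
# while a dot watermark would show up as a sparse, localized residual.
before = np.asarray(Image.open("edit_input.png").convert("RGB")).astype(np.float32)
after  = np.asarray(Image.open("edit_output.png").convert("RGB")).astype(np.float32)

diff = after - before
print("per-channel mean shift:", diff.mean(axis=(0, 1)))   # global color drift

residual = np.abs(diff - diff.mean(axis=(0, 1)))           # what's left once the drift is removed
Image.fromarray(np.clip(residual * 8, 0, 255).astype(np.uint8)).save("residual_x8.png")
```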
>>
File: 1768219577088406.png (1.26 MB, 1200x675)
>>
>>107926880
Fucking kek
>>
File: that's right.png (431 KB, 800x582)
>>107926805
that's the reason I want Z-image edit to win, at least the chinks don't annoy us with safety and watermarks
>>
>>107926880
>>>/b/945055399
you'll fit right in
>>
>>107926875
yeah, works here too, I suspected it was just a matter of reframing the prompt or whatever, try without the caps, it's so odd how consistently it writes "Suiside" when "Commit suicide" is in the prompt
>>
>>107926890
the idea of them censoring text output is laughable, and I just easily proved the other anon simply had giga skill issues
>>
>>107926905
a lot of models fuck up the text way more if it's not in caps for some reason
>>
Absolute kino
https://files.catbox.moe/4vusph.mp4
>>
File: 66.jpg (677 KB, 2048x1024)
>>107926918
I'll keep that in mind

Commit suicide ahead! vs COMMIT SUICIDE AHEAD!
>>
>>107926918
I think it's because there's less variation in fonts when it comes to caps
>>
>>107926920
great. too bad you chose meme slop
>>
>>107926985
he worked very hard on it give him his deserved updoots >:(
>>
i don't really know "where" gamergate happened. i was here shitposting the whole time
https://x.com/UnburntWitch/status/916106617493495808?s=20
>>
>>107926805
what could the watermark possibly include? that the image was made with klein?
>>
>>107926827
so that's why klein's anatomy is so shit
>>
>>107926992
wrong thread anon
>>
>>107926997
>that the image was made with klein?
I guess, that it's an AI image made with klein
>>
>>107926997
Who knows how far they are willing to go in the name of safety
>>
>>
>>107927012
to be fair, they seem to have calmed down on "safety", Klein is way less censored than Kontext for example, Kontext used to not modify your image at all if its censorship layers got triggered or some shit, that's never happened on Klein so far
>>
>>107926805
This is from memory but "the sample inference code" applies watermarking. As in, the model weights do not perform any watermarking, the Python code around them performs it. You can identify that code and change it.
>>
>>107926998
they undershot the recommended step count to make the distilled models look faster if you ask me. It's way better at 8 steps. Censorship wasn't even the problem with SD3, SD3 was just technically fucked in multiple ways.
>>
>>107927030
>, the Python code around them performs it.
I doubt Comfy has implemented that shit, he's too north american for that
>>
>>107926848
>>107926853
>>107926858
The bottle cap has two plastic strands, just break one of them and it gets the cap away from your mouth. And now you don't have to babysit a cap in your hand. Anyone complaining about it is not smarter than a bottle cap.
>>
>>107927033
>they undershot the recommended step count to make the distilled models look faster if you ask me. It's way better at 8 steps.
that's my guess too, they went too low, if they'd distilled it at 8 steps instead of 4 it would've been closer to Z-image turbo in terms of realism and anatomy, unironically
>>
File: fk9b_00063.png (1.94 MB, 960x1440)
sure anon, you generating anime conversions and getting weird outputs is how you catch BFL cheaping out on training
>>
File: 1750443513734476.png (87 KB, 360x360)
>>107927052
>fk9b
aktually, it's f2k9b
>>
>>107927047
depends what you're doing. some of my gens look cooked at 8 steps.
>>
>>107927068
no, I meant that BFL should've distilled the model at 8 steps instead of 4, and yeah, going for 8 inference steps with a 4-step distilled model is probably not the right idea (I get less slopped shit personally so I'm ok with it)
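A minimal sketch of the 4-vs-8-step comparison, assuming the checkpoint loads through a generic diffusers DiffusionPipeline; if it needs ComfyUI or a dedicated pipeline class instead, the only thing that changes in the comparison is the step count. The repo name is the one linked earlier in the thread, used as a stand-in; swap in whichever distilled variant you're actually testing.
```python
import torch
from diffusers import DiffusionPipeline

# Assumption: the HF repo ships a diffusers-compatible config. Same seed,
# same prompt, only the step count changes.
pipe = DiffusionPipeline.from_pretrained(
    "black-forest-labs/FLUX.2-klein-base-9B", torch_dtype=torch.bfloat16
).to("cuda")

prompt = "a photo of a red fox standing in tall grass at golden hour"
for steps in (4, 8):
    image = pipe(
        prompt,
        num_inference_steps=steps,
        guidance_scale=1.0,  # distilled checkpoints usually run without real CFG
        generator=torch.Generator("cuda").manual_seed(0),
    ).images[0]
    image.save(f"klein_{steps}steps.png")
```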
>>
What the fuck does the crying emoji mean on civitai?
>>
how do you get rid of aliasing artifacts with LTX-2? The repeating squares.
>>
>>107927093
are you using vae decode (tiled)?
>>
>>107927098
yes. guess I'm fucked.
>>
>>107927088
It's dislike button #2. Crying laughing emoji is dislike #1
>>
File: f2k9b_00002.png (2.15 MB, 960x1440)
>>107927059
good looking out croski

>>107927088
we truly may never know
>>
File: 1752777407964525.png (40 KB, 949x390)
>>107927103
nah, it can work, what are your settings? I'm getting ok results with those
>>
>>107927116
these*
>>
>>107927116
>>107927098
I tried with the normal vae decode and it still gives the same results.

So it's probably not the VAE?
>>
>>107927124
show a screen of your workflow, something must be wrong
>>
Alright. I installed Comfy UI.
Now what?
Do I need to install Stable Diffusion XL, Z-Image, or Z-Image Turbo?
Is Z-Image even safe? It's from the Chinese.
I have 1050 Ti (4GB VRAM) + 16 GB DDR3 RAM, if that matters.
>>
>>107927128
part 1
>>
>>107927142
part 2
>>
File: Klein 9b distill.png (2.64 MB, 1984x1040)
https://www.youtube.com/watch?v=KFzhe7PKWdw
>>
>>107927152
>euler ancestral
try regular euler
>>
>>107927162
the upscaler is bypassed.
>>
>>107927142
>gemma 3 fp4
.. it's probably that, text encoders are ultra sensitive to quants
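One way to put a number on how much an fp4 text encoder drifts: dump the conditioning from a full-precision run and a quantized run of the same prompt (e.g. by saving the text-encode step's output tensor in both workflows), then compare them. A rough sketch; exactly how you capture the tensors depends on your setup, and the .pt filenames are placeholders.
```python
import torch
import torch.nn.functional as F

def embedding_drift(ref: torch.Tensor, quant: torch.Tensor):
    """Per-token cosine similarity between [tokens, dim] embeddings produced
    by the full-precision and the quantized text encoder for the same prompt."""
    ref = F.normalize(ref.float(), dim=-1)
    quant = F.normalize(quant.float(), dim=-1)
    cos = (ref * quant).sum(dim=-1)
    return cos.mean().item(), cos.min().item()

# mean_sim, worst_sim = embedding_drift(torch.load("cond_bf16.pt"), torch.load("cond_fp4.pt"))
```
The per-token minimum tends to be more informative than the mean, since a few badly mangled tokens can already throw the conditioning off even when the average looks fine.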
>>
File: 1756922818258551.jpg (232 KB, 1280x720)
>>
>>107927137
download a quant of z-image
comfy has guides if you're not sure how to install things
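For picking a quant on 4 GB, some back-of-the-envelope weight sizes, assuming the roughly 6B parameter count listed for Z-Image Turbo (double-check against the model card) and ignoring the text encoder, VAE, and activations:
```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Rough size of the diffusion model weights alone, in GiB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30

# ~6B params assumed for Z-Image Turbo; check the model card.
for label, bits in [("bf16", 16), ("fp8 / Q8", 8), ("Q4", 4)]:
    print(f"{label}: ~{weight_gib(6, bits):.1f} GiB")
# bf16 comes out around 11.2 GiB, Q8 around 5.6 GiB, Q4 around 2.8 GiB,
# so on a 4 GB card even Q4 only works because ComfyUI offloads to system RAM.
```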
>>
File: 1767016860549490.png (328 KB, 798x644)
>>107927137
>I have 1050 Ti (4GB VRAM)
bruh
>>
>>107927182
Thanks, will try fp8.
>>
>>107927199
kek
he might as well run 1.5
>>
>>107927201
go for Q8 instead, it's the same size and the quality is way closer to bf16
>>
>>107927206
I'll see if fp8 makes a difference first.
>>
>want to try training LTX loras
>realize I don't have enough videos to make anything worthwhile
Am I going to have to scrape leaked OnlyFans archives or something?
>>
>>107927219
fp8 is shit, stop using it
>>
>>107927235
It can be a bit of a fucking nightmare if your concept is niche enough.

I had to make animations in blender to train the concept I wanted and it barely worked. From there I had to cherry pick the best outputs from the initial scuffed LoRA to get a dataset that was more stable.
>>
zit takes qwen3 4b as its text encoder. Why can't it use the 8b?
>>
>>107927249
Because the text encoder's output looks like gobbledygook to the model if the model wasn't trained on that specific encoder.
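A quick way to see one concrete mismatch, assuming the public Qwen/Qwen3-4B and Qwen/Qwen3-8B repos are the relevant ones: the two encoders don't even emit the same embedding width, and on top of that the DiT's projection layers were only ever trained against the 4B's feature space.
```python
from transformers import AutoConfig

# Even if the widths matched, the learned feature spaces would still differ,
# so swapping encoders without retraining gives garbage conditioning.
for name in ["Qwen/Qwen3-4B", "Qwen/Qwen3-8B"]:
    cfg = AutoConfig.from_pretrained(name)
    print(name, "hidden_size =", cfg.hidden_size)
```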
>>
>>107927249
Not the same model.
>>
>>107927249
>>107927275
I think they deemed qwen 3 8b to be too powerful to be Apache 2.0, so they nerfed it
>>
File: ComfyUI_temp_dqhuu_00012_.jpg (571 KB, 1950x1510)
cfg hacking, this is the same seed.
>>
>>107927047
I think it's as realistic as Z or more, DESU, with the right prompt. The new VAE is really good.
>>
File: Klein 9b distill.png (2 MB, 2720x768)
looks like this model is easy to train, loras look good on it
https://civitai.com/models/2188187/old-school-runescape-style-lora-klein-and-zit?modelVersionId=2615834
>>
>>107927300
kek
>>
>>107927293
this is Klein?
>cfg hacking
how? it looks pretty good anon, you're onto something really interesting
>>
>>107926565
>>107927304
>>
When training wan video loras, will the style of the dataset matter or can I make it just focus on the motion?
>>
>>107927297
>The new VAE is really good.
yeah, it's definitely an improvement over flux 1's vae, now the Z-image series looks a bit outdated if they keep using the previous version, deep down I hope they used those 2 months to switch VAEs but I'm coping way too hard now lol
>>
File: in the morning.png (3.74 MB, 1536x2048)
>>
File: ComfyUI_temp_dqhuu_00017_.jpg (552 KB, 1950x1510)
>>107927308
the issue is i can't tell if one is better or just different; basically i'm attempting to offset the cfg to skip the first step
>>
>>107927297
>The new VAE is really good.
what new vae? i was just seething about the compression artifacts on my lonesome
>>
listen, I'm gonna need something with the prompt adherence and video quality of wan2.2 combined with the audio, video length and generation speed of ltx2 right now
>>
File: Klein 9b distill.png (2.81 MB, 2720x768)
>>107927300
lul
https://civitai.com/models/2280663/basedjak?modelVersionId=2609867
>>
>>107927354
take your eyes off the slut and look at the background. one is clearly better
>>
>>107927354
I like the one on the right, it doesn't have that fucking bokeh
>>
>>107927356
>what new vae?
Flux 2 Klein uses Flux 2's vae, it's an improvement over Kontext that was using Flux 1's vae, and Z-image turbo also uses Flux 1's vae
>>
File: Flux2-Klein_00135_.png (1.66 MB, 1024x1024)
>>
File: 1757429347546178.png (1021 KB, 1168x880)
>>107927365
kek
>>
>>107927361
The video quality I get from ltx is shit, but I really don't get the claim that ltx prompt adherence is bad: you can timestamp the prompt and it will follow a good 80-90% of it. Timestamp a prompt in wan and it will follow whatever the first action is and spend the entire 5 seconds doing that.
>>
File: ComfyUI_temp_dqhuu_00029_.jpg (585 KB, 1950x1510)
>>107927371
it's weird: if i don't mention bokeh it can go either way, but if i prompt it, they both respect it.
>>
So why did the mentally ill moron spam the last thread?
>>
>>107927408
I haven't tried timestamps with ltx but when the character moves around it seems to go to complete warbled shit
>>
>>107927426
it definitely looks better on the right, look at the light on her hair, it's way more natural, reminds me of Z-image turbo a bit, what's your method anon? you made something really cool
>>
>>107927236
ok? u got a link to the q8 for comfy?
>>
>>107927456
https://huggingface.co/Qwen/Qwen3-8B-GGUF
>>
>>107927463
Maybe you should have paid attention to the whole conversation before showing everyone how schizo you are.
>>
>>107927473
What do you mean?
>>
What is the current state of voice and sound effects diffusion?
I tried MMAudio for my wan gens and it was shit.
Are there still no good options for generating sound from a video input? And what about generating dialogue for specific characters?
>>
>>107927444
Yeah, I think that's more a fault of how compressed the latents are than of the prompt: it will try to do the prompt, but since the latents are so compressed (I believe it's double what Wan does) the model screws up. Now whether the fix is more time to bake or it's just an inherent issue, I guess time will tell.
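A tiny helper for comparing the two, if you want to sanity-check the "double what Wan does" claim. The Wan numbers below are from memory and the LTX-2 ones are left as placeholders; fill both in from the model cards before quoting anything.
```python
def raw_values_per_latent(spatial_ds: int, temporal_ds: int,
                          latent_channels: int, rgb_channels: int = 3) -> float:
    """How many raw video values each latent value has to stand in for."""
    return spatial_ds ** 2 * temporal_ds * rgb_channels / latent_channels

# From memory for Wan's VAE (8x8 spatial, 4x temporal, 16 latent channels);
# double-check against the model card.
print("Wan-style VAE:", raw_values_per_latent(8, 4, 16))
# print("LTX-2 VAE:", raw_values_per_latent(?, ?, ?))  # fill in from the LTX-2 card
```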
>>
>>107927456
https://huggingface.co/unsloth/gemma-3-12b-it-GGUF
>>
File: ComfyUI_temp_dqhuu_00040_.jpg (501 KB, 1950x1510)
>>107927445
this may just be a way to amplify lora effectiveness. it still has a tendency to add more anatomy issues. i am messing with the "cfg zero/zero init" node. i mentioned this here before klein dropped. but kjnodes has a beta node that works. just using it with the default zero init at zero steps.
>>
>>107927182
>>107927206
>>107927219
So this took longer because my docker in WSL2 decided to shit itself (probably ran out of disk space).

fp8 gives no improvements.
>>
File: ComfyUI_temp_dqhuu_00043_.jpg (828 KB, 1950x1510)
>>107927508
tl;dr: distilled models like to set up the structure very early in the steps; if you skip the initial steps it's way more creative, but it also fucks up really easily.
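For anyone who wants to poke at the same thing outside the node, a minimal sketch of the zero-init part of the idea. This is the concept only, not the KJNodes beta node's actual implementation (the full CFG-Zero* recipe also rescales the guidance term, which isn't shown here).
```python
import torch

def cfg_with_zero_init(cond: torch.Tensor, uncond: torch.Tensor,
                       cfg_scale: float, step_idx: int,
                       zero_init_steps: int = 1) -> torch.Tensor:
    """Classifier-free guidance where the first few steps are zeroed out.

    Distilled models lock the composition in almost immediately, so returning
    a zero prediction for the earliest step(s) leaves the structure less
    committed and gives later steps more room to be creative, at the cost of
    more anatomy failures, as noted above.
    """
    if step_idx < zero_init_steps:
        return torch.zeros_like(cond)
    return uncond + cfg_scale * (cond - uncond)
```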
>>
File: 1751224322363611.png (2.27 MB, 1711x976)
oof, it compressed the image hard on that one, VAEs on edit models were a mistake
https://www.youtube.com/watch?v=rWyRxQoNHJU
>>
File: 1759722030303206.png (4 KB, 63x58)
>>107927539
>VAEs on edit models was a mistake
apologize
>>
>>107927539
>>107927544
Once someone makes a pixel edit model to prove it, I'll call bullshit
>>
File: klein_00288_.png (1.82 MB, 1040x1520)
>>
File: I believe.png (235 KB, 500x489)
https://github.com/Tongyi-MAI/Z-Image/issues/126#issuecomment-3769946123
>In reality, the base version has diverged significantly from the initial plan. The original roadmap featured only three variants: base, turbo, and edit. The edit model was developed through additional training and supervised fine-tuning specifically for editing tasks on top of the base version. However, the base version has now evolved into omni base, which inherently incorporates editing capabilities. This signifies that the Edit dataset was incorporated during the initial low-resolution pre-training phase, necessitating extensive retraining. The Chinese community currently anticipates Omni Base's release around the Chinese New Year period.
>>
>>107926791
>https://rentry.org/debo
>https://rentry.org/animanon
can anyone please explain why does some troon keep adding this off-topic shit to the op? we have some proper threads and then the schizo reappears and invades the op like a troon in a girls' bathroom. disgusting and annoying
>>
File: ComfyUI_00001_.png (368 KB, 512x512)
My first render!
What should I try next?
>>
>another no u
mark it down
>>
>>107927544
I was always in favor of VAEless models, and I hope lodestone will make Klein or Z-image edit VAEless as well
>>
>>107927563
1girl, large breasts, masterpiece, style_cluster948332
>>
>>107927567
>I was always in favor of VAEless models
okay
>and I hope lodestone will make Klein or Z-image edit VAEless as well

You need to stop relying on this do-nothing furfag with an attention span shorter than the average ipad kid to solve your issues.
>>
>>107927563
1girl, fennec fox, standing
>>
>>107927586
who should I rely on then? you? come on anon, you can do it



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.