Discussion of Free and Open Source Diffusion Models

Prev: >>107925157

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Flux Klein
https://huggingface.co/collections/black-forest-labs/flux2

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>NetaYume
https://huggingface.co/duongve/NetaYume-Lumina-Image-2.0
https://nieta-art.feishu.cn/wiki/RZAawlH2ci74qckRLRPc9tOynrb

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
repoast:>>107926593>pixel-layer watermarking, and C2PA metadata to ensure content provenance and safety.Oh right, I forgot about this. Can you spot the watermark by playing with levels in photoshop?
>>107926805I guess so, if you play with the saturation and shit you'll be able to see the dots, like on NBP
>>107926808Trivial to get rid of.
>>107926798https://huggingface.co/black-forest-labs/FLUX.2-klein-base-9B>DESU I feel like the wall of text is specifically because they failed at making it safe.maybe they're just pretending so that the European Union won't nuke their ass with more (((regulations)))
unborn death maggot
>>107926835triggered?
>>107926827The only thing in that image that actually works IRL is the bottle cap
>>107926848It doesn't tho. It's so poorly designed that everyone just rips it off.
>>107926848I live in europe and I hate this shit, I always have to remove that with a knife or something, it's a fucking paainnnnnn
>>107926805When editing with 9b there'll be zero compression artifacts on the edited portions of the image. Wasn't a visible noise pattern that I could see from my cursory glance though.
>>107926554top four are 4B Distilled, bottom four are 8B Distilled, 8 steps, Euler / Flux.2 Scheduler```flaming retro logo text that reads "COMMIT SUICIDE" against a solid black background```
>>107926872>Wasn't a visible noise pattern that I could see from my cursory glance though.there's a slight shift in colors, but I think it's just a VAE's problem, not a watermark
>>107926880Fucking kek
>>107926805that's the reason I want Z-image edit to win, at least the chinks don't annoy us with safety and watermarks
>>107926880>>>/b/945055399you'll fit right in
>>107926875yeah works here too, I suspected it was just a matter of reframing the prompt or whatever, try without the caps, its so odd how consistently it writes Suiside with "Commit suicide" on prompt
>>107926890the idea of them censoring text output is laughable, and I just proved easily the other anon simply had giga skill issues
>>107926905a lot of models fuck up the text way more if it's not in caps for some reason
Absolute kinohttps://files.catbox.moe/4vusph.mp4
>>107926918I'll keep that in mindCommit suicide ahead! vs COMMIT SUICIDE AHEAD!
>>107926918I think it's because there's less variation in fonts when it comes to caps
>>107926920great. too bad you choose a meme slop
>>107926985he worked very hard on it give him his deserved updoots >:(
i don't really know "where" gamergate happened. i was here shitposting the whole timehttps://x.com/UnburntWitch/status/916106617493495808?s=20
>>107926805what could the watermark possibly include? that the image was made with klein?
>>107926827so that's why klein's anatomy is so shit
>>107926992wrong thread anon
>>107926997>that the image was made with klein?I guess, that it's an AI image made with klein
>>107926997Who knows how much they are willing to go in the name of safety
>>107927012to be fair, they seem to have calmed down on "safety", Klein is way less censored than Kontext for example, Kontext used to not modify your image at all if it had its censorship layers triggered or some shit, never happened on Klein so far
>>107926805This is from memory but "the sample inference code" applies watermarking. As in, the model weights do not perform any watermarking, the Python code around them performs it. You can identify that code and change it.
>>107926998they undershot the recommended step count to make the distilled models look faster if you ask me. It's way better at 8 steps. Censorship wasn't even the problem with SD3, SD3 was just technically fucked in multiple ways.
>>107927030>, the Python code around them performs it.I doubt Comfy has implemented that shit, he's too north american for that
>>107926848>>107926853>>107926858The bottle cap has two plastic strands, just break one of them and it gets the cap away from your mouth. And now you don't have to babysit a cap in your hand. Anyone complaining about it is not smarter than a bottle cap.
>>107927033>they undershot the recommended step count to make the distilled models look faster if you ask me. It's way better at 8 steps.that's my guess too, they went for too low, if they distilled it at 8 steps instead of 4 it would've been closer to Z-image turbo in terms of realism and anatomy, unironically
sure anon, you generating anime conversions and getting weird outputs is how you catch BFL cheaping out on training
>>107927052>fk9baktually, it's f2k9b
>>107927047depends what you're doing. some of my gens look cooked at 8 steps.
>>107927068no, I meant that BFL should've distilled the model at 8 steps instead of 4, and yeah, going for 8 inference steps for a 4-steps distilled model is probably not the right idea (I get less slopped shit personally so I'm ok with it)
What the fuck does the crying emoji mean on civitai?
how do you get rid of aliasing artifacts with LTX-2? The repeating squares.
>>107927093are you using vae decode (tiled)?
>>107927098yes. guess I'm fucked.
>>107927088It's dislike button #2. Crying laughing emoji is dislike #1
>>107927059good looking out croski>>107927088we truly may never know
>>107927103nah, it can work, what are your settings? I'm getting ok results with those
>>107927116these*
>>107927116>>107927098I tried with the normal vae decode and it still gives the same results.So it's probably not the VAE?
>>107927124show a screen of your workflow, something must be wrong
Alright. I installed Comfy UI.Now what?Do I need to install Stable Diffusion XL, Z-Image, or Z-Image Turbo?Is Z-Image even safe? It's from the Chinese.I have 1050 Ti (4GB VRAM) + 16 GB DDR3 RAM, if that matters.
>>107927128part 1
>>107927142part 2
https://www.youtube.com/watch?v=KFzhe7PKWdw
>>107927152>euler ancestraltry regular euler
>>107927162the upscaler is bypassed.
>>107927142>gemma 3 fp4.. it's probably that, text encoders are ultra sensitive to quants
>>107927137download a quant of z-image. comfy has guides if you're not sure how to install things
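For a sense of why a quant is the suggestion on a 4 GB card, here is a rough weight-only size estimate. It is a sketch under assumptions: the 6e9 parameter count is a placeholder (check the model card), and the text encoder, VAE and activations come on top of whatever this prints.

```python
# Rough weight-only memory estimate per storage format.
# ASSUMPTION: n_params = 6e9 is a placeholder, not a confirmed figure.
BITS_PER_WEIGHT = {
    "bf16": 16.0,
    "fp8": 8.0,
    "Q8_0": 8.5,  # GGUF block: 32 int8 weights + one fp16 scale
    "Q4_0": 4.5,  # GGUF block: 32 4-bit weights + one fp16 scale
}


def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Size of the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    n_params = 6e9  # hypothetical parameter count
    for name, bits in BITS_PER_WEIGHT.items():
        print(f"{name:>5}: {weights_gib(n_params, bits):5.1f} GiB")
```

Anything that does not fit in VRAM has to be offloaded to system RAM, which is where the slowdown comes from.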
>>107927137>I have 1050 Ti (4GB VRAM)bruh
>>107927182Thanks, will try fp8.
>>107927199kekhe might as well run 1.5
>>107927201go for Q8 instead, it's the same size and the quality is way closer to bf16
>>107927206I'll see if fp8 makes a difference first.
>want to try training LTX loras>realize I don't have enough videos to make anything worthwhileAm I going to have to scrape leaked OnlyFans archives or something?
>>107927219fp8 is shit, stop using it
>>107927235It can be a bit of a fucking nightmare if your concept is niche enough. I had to make animations in blender to train the concept I wanted and it barely worked. From there I had to cherry pick the best outputs from the initial scuffed LoRA to get a dataset that was more stable.
zit takes qwen3 4b as its text encoder. Why can't it use the 8b?
>>107927249Because to the model the output looks like gobbledygook if it wasn't trained on that specific text encoder.
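A toy illustration of the "gobbledygook" point, under heavy assumptions: the two "encoders" are just random linear maps, and the "model" is a least-squares readout fitted on encoder A's features. None of this is Qwen or Z-Image code. It only shows that something trained on one feature space gets nonsense from another, even when the shapes happen to match.

```python
import numpy as np


def fit_readout(features, targets):
    """Least-squares readout mapping features back to targets."""
    w, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return w


def rel_error(pred, target):
    return float(np.linalg.norm(pred - target) / np.linalg.norm(target))


def demo(seed=0, n=256, d_in=16, d_feat=32):
    rng = np.random.default_rng(seed)
    prompts = rng.normal(size=(n, d_in))      # stand-in for prompt content
    enc_a = rng.normal(size=(d_in, d_feat))   # hypothetical encoder A
    enc_b = rng.normal(size=(d_in, d_feat))   # hypothetical encoder B
    feats_a = prompts @ enc_a
    feats_b = prompts @ enc_b
    readout = fit_readout(feats_a, prompts)   # "trained" on encoder A only
    err_a = rel_error(feats_a @ readout, prompts)
    err_b = rel_error(feats_b @ readout, prompts)
    return err_a, err_b


if __name__ == "__main__":
    err_a, err_b = demo()
    print(f"same encoder: {err_a:.2e}   swapped encoder: {err_b:.2f}")
```

With real text encoders the hidden sizes usually differ as well, so a swap would typically fail on shapes before it ever got this far.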
>>107927249Not the same model.
>>107927249>>107927275I think they deemed qwen 3 8b to be too powerful to be Apache 2.0, so they nerfed it
cfg hacking, this is the same seed.
>>107927047I think it's as realistic or more than Z DESU, with the right prompt. The new VAE is really good.
looks like this model is easy to train, loras look good on ithttps://civitai.com/models/2188187/old-school-runescape-style-lora-klein-and-zit?modelVersionId=2615834
>>107927300kek
>>107927293this is Klein?>cfg hackinghow? it looks pretty good anon, you're onto something really interesting
>>107926565>>107927304
When training wan video loras, will the style of the dataset matter or can I make it just focus on the motion?
>>107927297>The new VAE is really good.yeah it's definitely an improvement over flux 1's vae, now the Z-image series look a bit outdated if they keep using the previous version, deep down I hope they used those 2 months to switch VAEs but I'm coping way too hard now lol
>>107927308i can't tell if one is better or just different is the issue, basically i'm attempting to offset the cfg to skip the first step
>>107927297>The new VAE is really good.what new vae? i was just seething about the compression artifacts in my lonesome
listen, I'm gonna need something with the prompt adherence and video quality of wan2.2 combined with the audio, video length and generation speed of ltx2 right now
>>107927300lulhttps://civitai.com/models/2280663/basedjak?modelVersionId=2609867
>>107927354take you eyes off the slut and look at the background. one is clearly better
>>107927354I like the one on the right it doesn't have that fucking bokeh
>>107927356>what new vae?Flux 2 Klein uses Flux 2's vae, it's an improvement over Kontext that was using Flux 1's vae, and Z-image turbo also uses Flux 1's vae
>>107927365kek
>>107927361I get that ltx's video quality is shit, but I really don't get the claim that ltx prompt adherence is bad. You can timestamp the prompt and it will follow a good 80-90% of it. Timestamp a prompt in wan and it will follow only the first action and take the entire 5 seconds doing that.
>>107927371it's weird that if i don't mention bokeh, it can go either way, but if i prompt for it they both respect it.
So why did the mentally ill moron spam the last thread?
>>107927408I haven't tried timestamps with ltx but when the character moves around it seems to go to complete warbled shit
>>107927426it definitely looks better on the right, look at the light on her hair it's way more natural, reminds me of Z-image turbo a bit, what's your method anon? you made something really cool
>>107927236ok? u got a link to the q8 for comfy?
>>107927456https://huggingface.co/Qwen/Qwen3-8B-GGUF
>>107927463Maybe you should have paid attention to the whole conversation before showing everyone how schizo you are.
>>107927473What do you mean?
What is the current state of voice and sound effects diffusion? I tried MMAudio for my wan gens and it was shit. Are there still no good options for generating sound from a video input? And what about generating dialogue for specific characters?
>>107927444Yeah I think that's more of a fault of how compressed the latents are rather than the prompt. It will try to do the prompt, but since it's so compressed (I believe it's double what Wan does) the model screws up. Now whether the fix for this is more time to bake or it's just an inherent issue, I guess time will tell.
>>107927456https://huggingface.co/unsloth/gemma-3-12b-it-GGUF
>>107927445this may just be a way to amplify lora effectiveness. it still has a tendency to add more anatomy issues. i am messing with the "cfg zero/zero init" node. i mentioned this here before klein dropped. but kjnodes has a beta node that works. just using it with the default zero init at zero steps.
>>107927182>>107927206>>107927219So took longer because my docker in WSL2 decided to shit itself (probably ran out of disk space).fp8 gives no improvements.
>>107927508tldr; distilled models like to set up the structure very early in steps, if you skip the initial steps it is way more creative, but also fucks up really easy.
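A minimal sketch of the "skip the initial steps" idea as I understand zero init: for the first few sampler steps the predicted velocity is replaced with zeros, so the latent is left untouched there and the model only starts steering later. The sampler, `toy_model`, and the `zero_init_steps` parameter are all hypothetical stand-ins here, not the KJNodes implementation.

```python
import numpy as np


def cfg(cond, uncond, scale):
    """Standard classifier-free guidance mix."""
    return uncond + scale * (cond - uncond)


def euler_sample(x, sigmas, model, scale=4.0, zero_init_steps=0):
    """Plain Euler loop; the first `zero_init_steps` updates are zeroed out."""
    for i in range(len(sigmas) - 1):
        if i < zero_init_steps:
            v = np.zeros_like(x)  # skipped step: latent stays as it is
        else:
            v = cfg(model(x, sigmas[i], True), model(x, sigmas[i], False), scale)
        x = x + (sigmas[i + 1] - sigmas[i]) * v
    return x


def toy_model(x, sigma, conditional):
    """Hypothetical velocity predictor pulling toward a fixed target."""
    target = np.ones_like(x) if conditional else np.zeros_like(x)
    return x - target


if __name__ == "__main__":
    x0 = np.full(4, 2.0)
    sigmas = np.linspace(1.0, 0.0, 5)
    print("normal:   ", euler_sample(x0, sigmas, toy_model))
    print("zero init:", euler_sample(x0, sigmas, toy_model, zero_init_steps=1))
```

Since a distilled model commits to structure in those early steps, zeroing them changes the composition a lot, which fits both the "more creative" and the "fucks up really easy" observations.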
oof, it compressed the image hard on that one, VAEs on edit models was a mistakehttps://www.youtube.com/watch?v=rWyRxQoNHJU
>>107927539>VAEs on edit models was a mistakeapologize
>>107927539>>107927544Once someone makes a pixel edit model to prove it, I'll call bullshit
https://github.com/Tongyi-MAI/Z-Image/issues/126#issuecomment-3769946123>In reality, the base version has diverged significantly from the initial plan. The original roadmap featured only three variants: base, turbo, and edit. The edit model was developed through additional training and supervised fine-tuning specifically for editing tasks on top of the base version. However, the base version has now evolved into omni base, which inherently incorporates editing capabilities. This signifies that the Edit dataset was incorporated during the initial low-resolution pre-training phase, necessitating extensive retraining. The Chinese community currently anticipates Omni Base's release around the Chinese New Year period.
>>107926791>https://rentry.org/debo>https://rentry.org/animanoncan anyone please explain why does some troon keep adding this off-topic shit to the op? we have some proper threads and then the schizo reappears and invades the op like a troon in a girls' bathroom. disgusting and annoying
My first render!What should I try next?
>another no umark it down
>>107927544I was always in favor of VAEless models, and I hope lodestone will make Klein or Z-image edit VAEless as well
>>1079275631girl, large breasts, masterpiece, style_cluster948332
>>107927567>I was always in favor of VAEless modelsokay>and I hope lodestone will make Klein or Z-image edit VAEless as wellYou need to stop relying on this do-nothing furfag with an attention span shorter than the average ipad kid to solve your issues.
>>1079275631girl, fennec fox, standing
>>107927586who should I rely on then? you? come on anon, you can do it
this time they'll believe i'm just a random anon..!
>>107927356Flux.2 VAE, Klein uses it also.https://bfl.ai/research/representation-comparison
When will LTX-2 be capable of nsfw audio?
>>107927600>https://bfl.ai/research/representation-comparison>Stay tuned for FLUX.3 - coming soon ™.lul, I'm looking forward to it, they made a good Klein model so they're not completely useless after all
>>107927607check civitai. it can.
>>107927598Didn't you just spend your whole day spamming a thread? Get a life you sad freak.
>>107927597>who should I rely on then?Nobody. Just stop giving that nobody the deference he doesn't deserve.
>>107927615he was known for fluffyrock before he ever did Chroma though, that thing was in a ton of SD 1.5 merges that didn't even have anything to do with furry stuff
>>107927614who do you think i am lol? you think it's only one person you stole gens from?
>>107927576>>107927592LOLFirst took 37 seconds, second took 27 seconds.1050 Ti (4 GB VRAM)
>>107927614>>107927627I bet there's at least ONE newfren believes you. But only one.
Maybe my model is cooked?
>>107927634>First took 37 seconds, Man. this brings me back to my 1080 days...Cherish this moment anon.
>>107927634based >1050 Ti (4 GB VRAM)ouch :/
>>107927612Are you talking about the furfag lora?
>>107927641nta but it can do people toohttps://civitai.com/images/117716335
This confirms my suspicion, ACEStep 1.5 is already on par if not better than Suno v5 sound quality wise.
Kill ani
>>107927635after seeing all that tran has done to ldg, i dont think there's anyone left who believes their lies. her power lies solely in her boyfriend faggot mod who bans everyone who tries to bring the truth
>>107927645only udio is worth comparing to, suno is not the best music model
>>107927645I just dont buy it. Even though it has clearly massively improved recently
https://civitai.com/models/2322631/klein-pp-uncut-flaccid-penisThis dick lora is a good example of the Flux.2 VAE being noticeably better I think
>>107927653nta, but I tried Udio yesterday and it was trash compared to suno. Like I was actually shocked.
>>107927667yeah, udio ain't what they used to, but there was a time when udio was actually amazinghttps://www.udio.com/songs/wwRF2Bs6fQgbvqchqU6kAehttps://www.udio.com/songs/cnnJ166HGBKhTeHGkxgCtq
>>107927652its sad that he tried so hard to get anon to use his wrapper but it failed because it has no features. and now he just anon posts in the third person. very sad
>>107927544he did a great job with radiance given the means he had, but it's not an edit model?
So what would be an "allstar" version of comfy?>torch 2.8>monkypatched model support>gui from half a year agoAnyone tried to stitch this shit together?
>>107927700>monkypatched model supportwhat's that?
>>107927700>torch 2.8Why 2.8 in particular? TensorRT is one thing I can think of that broke down with 2.9.>monkypatchedQue?
>>107927707You've never run a patch that exists in a single unmerged pr?
Can anyone share an LTX-2 workflow that makes use of Kijai's .gguf?The ComfyUI template workflow uses ckpt_name for its nodes and it's a pain to adjust.
>>107927780Just install this node and use the loader from ithttps://github.com/city96/ComfyUI-GGUF
>>107927780try growing a brain cell?
>>107927780Just delete the model loader nodes and replace them with these (you will need KJ nodes and city96 gguf nodes) If that is too difficult for you, I don't know what to tell you.
>>107927812iirc the vae loading has been fixed also in mainline, so no need for kj nodes anymore there
>>107927812what is the lora you are using there? and why is it set to -.20?
>>107927843That's the distill LoRA. I am using it with the distill model but setting it to -.20 because I want less distill in my distill model. It's the same as using the undistilled weights with the distill LoRA set at a strength of .8
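The arithmetic behind that equivalence, as a sketch. It assumes the distilled checkpoint is exactly the base weights plus the LoRA delta at strength 1.0; with a real extracted LoRA the delta is a low-rank approximation, so the match would only be approximate. `apply_lora` and the matrix sizes are made up for illustration.

```python
import numpy as np


def apply_lora(weight, down, up, strength):
    """Merge a LoRA delta (up @ down) into a weight matrix."""
    return weight + strength * (up @ down)


def demo(seed=0, d_out=8, d_in=8, rank=2):
    rng = np.random.default_rng(seed)
    base = rng.normal(size=(d_out, d_in))   # undistilled weights
    down = rng.normal(size=(rank, d_in))    # LoRA A matrix
    up = rng.normal(size=(d_out, rank))     # LoRA B matrix

    # ASSUMPTION: distilled == base + 1.0 * delta
    distilled = apply_lora(base, down, up, 1.0)

    distilled_minus = apply_lora(distilled, down, up, -0.2)
    base_plus = apply_lora(base, down, up, 0.8)
    return distilled_minus, base_plus


if __name__ == "__main__":
    a, b = demo()
    print("equivalent:", np.allclose(a, b))
```

In other words (base + 1.0·Δ) − 0.2·Δ = base + 0.8·Δ, which is why a negative strength on the distilled model acts like a weaker distill LoRA on the base model.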
>>107927812I already tried all that, and get this error. Using the exact same nodes.
>>107927854Post your nodes so I can see them.
>>107927672Udio at its peak is absolutely great, and what's great about it is its composition ability
https://www.udio.com/songs/nfdtmJRUC7niZfhseaHdNk
https://www.udio.com/songs/7zrLreMnwCYrdBqQkGtEXM
https://www.udio.com/songs/hoCg4BmayTYXcJfjo4jvbT
Especially its insane adherence to lyrics, ACEStep 1.5 still has nothing on it, but there's pic rel that can bridge the gap. He has posted examples on Discord and it will absolutely sound insane for fixing up existing songs. There's also an "extend" feature. Sound quality wise, Udio is noticeably worse on many songs, you can clearly hear this issue with a good pair of headphones (I use HD 600/bookshelf speakers so I know what I mean). You can really notice the compression on Udio songs when you turn up the volume. That's either largely because Udio compresses their quality since they don't want plebs like you using the best of the best (voice quality is noticeably superior to everything else though), or perhaps their model just isn't as focused on that as Suno/ACEStep 1.5.
Composition wise, ACEStep 1.5 is almost there. You be the judge, but if on a good seed it's Udio tier, that means v2 is going to surpass Udio.
Regular anime/romantic stuff: https://files.catbox.moe/2t4h82.mp3
8bit mix: https://files.catbox.moe/7pqlbx.mp3
Glitched out synth music: https://files.catbox.moe/klw8a6.mp3
Anyways, just be glad local is finally gonna be eating good.
Fuck me. all I had to do was decode the "denoised output."
>>107927916>v2If I go by the discord and trust what they say (I do, they seem reasonable and are upfront about their plans) version 2 will be open source as well.They did say that if they got a model more powerful than suno they would API it though.
>>107927561>>107927614>>107927652uh oh, melty!
>>107927812>>107927864Forget the vae error (found the obvious problem).Now it's pic related.
>>107927937lmao
>>107927943turn off preview, I think
>>107927943POST A FUCKING SCREENSHOT OF YOUR MODEL LOADER NODES OR I CANNOT FIGURE OUT WHAT YOU'VE DONE WRONG.
i have yet to see a single good character lora for f2kits only good at styles
>>107927943how can we help you if you don't show your workflow anon?
>>107927949Calm your autism, sperg. I already posted them before deleting after 5 minutes when I noticed the first problem, I just assumed it would've been seen.>>107927960https://litter.catbox.moe/6epmp5nvjes21xvf.json
>>107927974sup wanschizo, why have you not been active recently?
>>107927946yeah, haven't seen troonjak melt this hard over her boogeyman scapegoat since yesterday kek
>>107927930The development goal for v2 is to surpass Suno/Udio, but I don't think they're going commercial, at least not fully, it's possible they will just have a proprietary license though. 1.5 will definitely wake up some companies and give them competition (E.G. Alibaba), so them going fully closed is not a real concern.
>>107927977>>107927974wait, isn't that the dude that was pretending to have errors on his workflow so that he can troll everyone with it?
>>107927974Your clip loader doesn't have the connector, but people seem to think you're a schizo so...
can anyone explain this mental illness to me?: >>107927977 >>107927984I really don't get it.
>>107927997We have a consistent cabal of Comfy shills plaguing this general. They pretend like it's impossible to have any issues with Comfy and that anyone posting about them is a troll.
>>107927997ranfaggot is trying to throw off the scent after showing her hand so that the topic of which schizo is spamming the thread isn't her
>>107927997lol come on dude i'm just asking how you're doing, are you still playing those mind games? suit yourself lol
>>107927994What clip loader? DualClipLoader is connected to the Clip Text Encode just like the previous screencap, and it's the only connector clip node connection in the ltx-2 distill template.>but people seem to think you're a schizo so...by all means keep trying to fit in with the mentally ill.
>>107928014Just for the record, here is the unedited default template for ltx-2.
>>107928003I had a problem with comfy yesterday, a python thing. Everyone called me a retard. They were right. I learned how to pin python versions and fixed it in 15 minutes and now I'm genning faster than ever
>>107928014Do you not have something like this in your text encoder folder? As far as I can tell, you just have two gemma models loaded into the dual clip loader.
>>107928020based self learner, don't depend on others to improve on your stuff
>>107928020this so much. python and comfy are a great thing, we should be thankful for them existing. every problem is a user problem, just apply those 15000 bandaids to make it usable bro
>samefag ancient meme pass holder schizo
>>107928023Oh, you're right. I was confused cause this screencap >>107927812 named it "other connector" and the string node in the template said nothing about it.
https://github.com/Comfy-Org/ComfyUI/commit/e755268e7b7843695f52b87595afcb09c1e9fd87>Config for Qwen 3 0.6B model.what model uses Qwen 3 0.6b as a text encoder?
>>107928082Zit you buffoon
>>107928044How many schizos we got up in this shit?
>>107928085ZiT uses qwen 3 4b, are you retarded or something?
>>107928085>>107928093Everybody calm the fuck down!!