Discussion of Free and Open Source Diffusion Models

Prev: >>107851707

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>NetaYume
https://huggingface.co/duongve/NetaYume-Lumina-Image-2.0
https://nieta-art.feishu.cn/wiki/RZAawlH2ci74qckRLRPc9tOynrb

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg
>>107855134
Whoaa that first pic is super duper realistic o_0

>>107855134
>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
there is no qwen inpaint, right? you have to use mask and crop/stitch?
>>107855138
They're still around it seems
>https://github.com/nunchaku-ai/ComfyUI-nunchaku
>v1.2.0 Released! Enjoy a 20–30% Z-Image performance boost, seamless LoRA support with native ComfyUI nodes, and INT4 support for 20-series GPUs!
Safe to say wan is officially abandoned

>>107855189
Yes. I just used that lol.
But they are not training anything right now.
No pull requests or discussion of anything being in the works somewhere.
ZiT's PR was open for a while before they merged and released it.
Sorry if this is a dumb question, but I'm looking to do realistic nsfw gens with loras. I have plenty of loras for flux; do these work with chroma? What's the best chroma checkpoint? Should I be using something better than chroma? Should I retrain the loras on chroma somehow? Been kind of out of the loop for a bit.
>>107855268
Did you not like the answer anon?
>>107855181

Beijing time tracker anon here
it's 7:30AM there, soon they will wake up and be preparing to drop the GLM Image model on us

>>107855277
ah sorry, i didnt see it, thought it got swept up before the new thread. so SDXL is STILL the best for nsfw?? im kinda shocked, i got much better results with flux overall.

>>107854980
i've been downloading basically everything since XL was released so i'm good on that. I have practically every lora for SDXL, ZIT, Qwen, Chroma, Wan2.1 & Wan2.2.
LTX2 is amazing. source image here:
Spongebob Squarepants grabs a rifle and says "Hey Patrick, my memory costs five thousand dollars, lets take over the data center!". Patrick says "okay, Spongebob!".
https://files.catbox.moe/r8v8q1.mp4
what do we do when comfyui goes ipo?
>>107855372
Short it.

>>107855372
straight into the S&P500, your retirement fund will be 20% comfyui stocks
https://files.catbox.moe/w8qizx.mp4
>>107855372
sell immediately before comfy is allowed to
>tranibake
>>107855444
can you guys agree to a truce this week? two models might drop and it'd be a shame if the discussion is drowned out by nonsense

>>107855453
we're not getting base so it really doesn't matter.
unblessed thread
>>107855453
>might
Unsloth seems to have released higher quality quants for some image models a few days ago. Certain layers are kept at slightly higher quants like Q_5 and Q_6 to boost quality. I think it does a good job for a Q4 quant of a 6B model. Shame the quantization implementation for diffusion models sucks: it runs slower than bf16 and q8, so the use case for this seems very niche. It still runs similarly slowly at high resolutions too, so it can't be used for that either.

Much better quality than nunchaku, though. I almost wonder if they fucked up the nunchaku implementation somehow? The 32-rank one is a completely fucking raped, blurry, unusable mess. The 128- and 256-rank ones are better, but still blurry, and Q_4_K_M is noticeably closer to the original image. Although nunchaku has the advantage of running three times faster, it's not worth it with the current quality imo. I wonder if a schizo 1024-rank model would perform well. It should still run faster than normal quants and use less memory than q8, if my assumptions are correct.

Tested with a 3060 12gb. And no, I am fine with bf16 for ZiT too; tested out of curiosity and to see if it will be useful for the base.

Thanks for reading my blog.

>>107855453
>if
It WILL get drowned out by that.
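For rough intuition on why a Q4 of a 6B model fits so comfortably, the bits-per-weight math is simple. A back-of-envelope sketch (the bpw figures are approximate averages, not exact numbers for any specific quant file; K-quants mix block scales, and Unsloth-style quants keep some layers at Q5/Q6, so real files land a bit higher):

```python
# Rough file/VRAM footprint of a 6B-parameter model at various precisions.
# Bits-per-weight values are approximate community averages, not exact.
def model_size_gb(params: float, bits_per_weight: float) -> float:
    return params * bits_per_weight / 8 / 1e9

PARAMS = 6e9
for name, bpw in [("bf16", 16.0), ("q8_0", 8.5), ("q4_k_m", 4.85)]:
    print(f"{name}: {model_size_gb(PARAMS, bpw):.1f} GB")
```

On a 12GB card that puts bf16 right at the edge and the quants comfortably in memory, which is why the speed regression is the only real blocker.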
>>107855347
rich fucker
>>107855302
Flux has higher quality, but it doesn't know nsfw. Mixing multiple flux loras together to do nsfw is very iffy.
Chroma knows NSFW and has a far higher quality ceiling than SDXL, but it's several times slower and commonly shits out anatomy worse than SDXL.
As I said: if you want low quality but fast and reliable gens, go with SDXL; if you want to play seed lottery with slow gen times but occasionally get great gens, go with Chroma.
That's more or less all there is to it.
>>107855134
Why is AniStudio not in OP?

>>107855478
yeah ZiT is so small that i'm surprised people bother with quants
the video models are a different story. what's interesting is some anon in the wsg thread found that the Q8 quants have a big difference from bf16 outputs, which usually isn't the case

>>107855498
it's less than 20TB. you can buy a single 24TB HDD for $500 and fit nearly all of it. now if you wanted to keep all loras on ssds, then yeah, be prepared to spend thousands

>>107855506
I meant to type worse than SD 1.5
>>107855523
In the current thread? Curious about precisely how.

>>107855536
>In the current thread? Curious about precisely how.
yeah it's the current thread, and i forgot to mention i was talking about LTX. anon did a few comparison videos between different quants
>>>/wsg/6069549
I almost miss working with SDXL
The better the models get, the more it's about seed lottery and "prompt engineering", but I am tired of doing so many iterations it's cluttering my SSD

>>107855564
what was sdxl about unc?

>>107855353
Spongebob speaks the truth

>>107855574
fast gens, mixing loras, masking and photoshop edits, because the prompt comprehension was atrocious
newer models like Flux 2 and Qwen2511 have superior comprehension, but you need to do a lot of prompt engineering to unlock their potential

>>107855564
for me? it's the wild variation unets offered. all dit models take everything too literally sometimes and leave little room for a more exciting result

>>107855564
wasn't the meta with sdxl to gen a boatload and pick the best one from the monstrosities lol
so since they cancelled z base, what is next on the horizon?
>>107855599
ltx 2.1

>>107855552
Possibly a bug?
The model is new + major new backend stuff (comfy-kitchen) has been merged recently.
It's possible that something is implemented wrongly and handles the quantized data incorrectly, rather than the quant sucking that much.
A Q8 of a 19b model differing that much from the baseline warrants some search for a decent explanation.

What's the best model for pixel art?
I mean both sprites and portrait stuff. I need placeholder art for my game, so prototyping is the closest thing to the final stuff.
I remember seeing really good sprites uploaded here by an anon long ago.

>>107855615
lmao >>>/wsg/6071832

>>107855510
see >>107855175
What's the difference between a text encoder and a llm?
>>107855506
so qwen and zimage aren't contenders for nsfw anymore? i'm just so confused by the discourse, it seems like it should be clear what the best image generators are in each area. what about inpainting and such? i just tried the newest qwen image edit 2511 and it was still terrible, and flux inpainting still seems to be the best for nsfw? i just wish there was somewhere i could get up to date info on this stuff

>>107855652
Nothing, same thing.

>>107855652
because there is generally only one LLM trained against a given diffusion model, it effectively is that model's text encoder
it might output a strange space of concepts rather than text too
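The distinction can be shown with a toy numpy sketch (all shapes and weights here are made up): both roles consume the same per-token hidden states. An LLM projects them to next-token logits and decodes text; a "text encoder" just hands the raw states to the diffusion model's cross-attention as conditioning.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, hidden, vocab = 8, 64, 1000

# Per-token hidden states a transformer text model would produce.
hidden_states = rng.standard_normal((seq_len, hidden))

# LLM use: project the last state to vocab logits and pick a token.
lm_head = rng.standard_normal((hidden, vocab))
next_token = int(np.argmax(hidden_states[-1] @ lm_head))

# Text-encoder use: feed the raw states to cross-attention as
# conditioning; nothing is ever decoded back into text.
conditioning = hidden_states
print(conditioning.shape, next_token)
```

Same network, two different consumers of its output, which is also why a finetuned-for-diffusion LLM can drift into a "strange space of concepts" that no longer decodes to sensible text.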
>>107855660
if you want nsfw use SDXL. Illustrious finetunes are the most coherent SDXL models. My personal favorite finetune is UncannyValley.
I started sending my ai slop to some of my friends and I'm starting to worry because most of them don't realize it is AI
>>107855353
also for fun, a qwen edit 2511 edit
give them ak-47s:

>>107855660
you can do a limited amount of nsfw with qwen, zimage, hyimage, whatever.
but illustrious/noob and chroma are the models that are more broadly nsfw trained; they understand vastly more in that regard

>>107855617
not saying this is the best, but you can try a few of the loras i've published:
https://civitai.com/user/n1eze

>>107855660
Neither Qwen nor ZiT knows NSFW out of the box. NSFW loras either barely exist (Qwen) or are of shit quality and don't mix with other loras (Z-Image).
There is no expectation that anyone will make a major NSFW finetune of a big model like Qwen, but Z-Image will likely have people attempt to beat NSFW into it properly once the base version releases (the current version is distilled and sucks for finetuning). We will see if it works out, but for now Chroma and SDXL are your two options for nsfw.
I don't know jack shit about inpainting and I am tired of typing paragraphs.

>>107855693
qwen can do nsfw fine if you use an abliterated text encoder
>>107855706
can you make a basedjack lora

>>107855706
some other anon already made one for Z image and published it, i don't have a dataset for that

>>107855709
benchod

>>107855714
??? idk what that means

>>107855690
ice cream:

>>107855699
It still doesn't know what genitalia or sex acts look like.
Don't bother with this.
>>107855727
sex is when penis goes into vagina

>>107855696
ZIT loras are abysmal. Almost every one changes the subject too much or just gives poor results.

>>107855749
Or poophole
What do you use to load the new LTX-2 vae by itself? Using picrel standard vae loader just gives me a black screen.
>>107855189
literally making nunchaku where it's the least needed

the porn loras I've seen being trained for ltx2 look like the most basic stuff, without any dataset filtering to get some nice girls out of it. do you think the model will be able to generalize beyond "40-45 yo heavy smoker milf"?

>>107855775
you need to use the kijai vae loader
anyone know a good inpainting comfyui workflow for nsfw? so far the best one i've found is still flux kontext. there has to be something better at this point, isn't there? i've been fairly frustrated with this so far; if anyone can help i would really appreciate it

>>107855808
flux kontext is not inpainting. inpainting is when you use a masking node, and it's usually done with SDXL. attaching controlnets helps

>>107855808
>>107855825
Flux Fill, I believe, is the dedicated inpainting variant.
You can use kontext with masks, though I am not sure about the quality.

>>107855798
That worked, thanks.

>>107855882
any time

>>107855825
i use it for inpainting and it seems to do better than flux fill, especially for nsfw
>>107855839
the kontext inpainting is the best i've seen so far. it seems to outperform qwen, but i do need to test a qwen inpainting nsfw lora i found
anyone have anything better?
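For the mask-then-crop/stitch style of inpainting discussed here, the only fiddly part is the bounding-box math: pad the masked region for context, clamp it to the image, and round the crop up to a size the model accepts before stitching the result back. A minimal sketch (`pad` and `multiple` are typical defaults, not values from any particular node):

```python
def crop_region(mask_bbox, image_size, pad=64, multiple=8):
    """Expand a mask bbox by `pad` px of context, clamp to the image,
    and round the crop size up to a multiple the model accepts.
    mask_bbox = (x0, y0, x1, y1), image_size = (width, height)."""
    x0, y0, x1, y1 = mask_bbox
    w, h = image_size
    x0, y0 = max(0, x0 - pad), max(0, y0 - pad)
    x1, y1 = min(w, x1 + pad), min(h, y1 + pad)
    # ceil the crop dimensions to `multiple`, clamped to the image edge
    cw = min(w - x0, -((x0 - x1) // multiple) * multiple)
    ch = min(h - y0, -((y0 - y1) // multiple) * multiple)
    return (x0, y0, x0 + cw, y0 + ch)

# A 100x100 mask in a 512x512 image becomes a 232x232 crop with context.
print(crop_region((100, 100, 200, 200), (512, 512)))  # -> (36, 36, 268, 268)
```

The inpainted crop then gets pasted back at (x0, y0), usually through a feathered mask so the seam doesn't show.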
>>107854466
nice

>>107855723
kek

imagine contributing to software because you believe in foss and then the guy sells out lmao

At what strength do you use the detailer lora in ltx-2 i2v?

saars
https://huggingface.co/zai-org/GLM-Image

>>107856008
I'll wait for GoFuckem to test it first and iron out all the kinks.

>>107855928
I took some kpin ok I wasnt paying attention

>>107856008
Seems promising, if the example images aren't heavily cherry picked.
I feel like I should wait until proper cumfart support + workflows + quants appear though.

>>107856008
this is some unholy qwen distill, isn't it
downloadin... wish me luck
an autoregressive model just flew over my house

>>107856051
Should have zoomed in and looked for more than two seconds, feeling embarrassed rn.
Sloppy look.
It's doa unless it runs very fast and/or trains extremely well.
>already in damage control mode
my senile uncle keeps falling for picrel type AI slop. anyone know which services people are using to make these vids? i want to make a video of him saying ridiculous shit so that maybe he'll believe me that this type of shit isn't real.
plastic? check
brown tint? check
generic showcase of outdated boring prompts? check
slower yet worse? check
chinkshit? check
it's culture time

https://files.catbox.moe/firn4v.mp4
https://files.catbox.moe/ue5iur.mp4
https://files.catbox.moe/swlmta.mp4

>>107856051
>>107856070
Adding on: the fact that they never advertised images, plus it not being available on an API like FAL right now, means they had no confidence in it.
Probably just wanted to tell the investors "we made an image model" in the next quarterly.
Will still give it a shot once the workflows are out.
>>107856081
Not local models, doesn't belong in this thread.
But to answer your question, such videos are primarily made with veo or sora.
how to get less terrible sound with ltx2?
>>107856081
this was generated with comfyui so it belongs in this thread
Take the giraffe for example. You would want an autoregressive model that "reasons and iterates on your prompt" to add coherent fancy details, like deciding on a nice cover for its book and quirky background posters that fit the anthropomorphic-animal-in-daily-life theme. And yet it's all gibberish sloppa. Its outfit isn't even wet, despite being out in the rain.

>>107855302
yes, sdxl is the best for nsfw stuff.

>>107856143
Either continue from previous audio or wait for zeeb to drop the fixes

>>107856143
Reroll the seed, gen at higher quality, or upscale. Honestly it will always be bad until a fix, just less bad at higher res.

>>107855785
>tfw we got cultured with wanchaku before chinese culture
holy shit, i dont know whether to kek or cry
>do you like our open source model?
>>107856179
>>107856186
damn, ok

ltx2 update (minor) if you got the kijai distilled q8 or whatever:

13th of January 2026 update !!IMPORTANT!!
Turns out the video VAE in the initial distilled checkpoints has been the wrong one all this time, which (of course) was the one I initially extracted. It has now been replaced with the correct one, which should provide much higher detail. At this moment this requires using the updated KJNodes VAELoader to work correctly.

https://huggingface.co/Kijai/LTXV2_comfy
https://huggingface.co/Kijai/LTXV2_comfy/blob/main/VAE/LTX2_video_vae_bf16.safetensors
>>107856116
Today's creations, not many were worth posting.
Do people use this? It's not used in the default workflow.
>Because the inference optimizations for this architecture are currently limited, the runtime cost is still relatively high. It requires either a single GPU with more than 80GB of memory, or a multi-GPU setup.
How would it even work with multiple GPUs?
>The target image resolution must be divisible by 32. Otherwise, it will throw an error.
So a 1024x compressed latent space? No wonder the images look shit.
>Guidance scale rather than CFG in the example inference code
Looks like the anon who guessed Qwen distill might be right
>>107856053
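If you're scripting gens against it, snapping the resolution yourself avoids that error entirely. A trivial helper (the function name is made up; 32 is the divisibility requirement quoted from the model card):

```python
def snap_resolution(x: int, multiple: int = 32) -> int:
    # Round a requested dimension to the nearest accepted multiple,
    # never going below one full multiple.
    return max(multiple, round(x / multiple) * multiple)

print(snap_resolution(1000), snap_resolution(1024))  # -> 992 1024
```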
>bro just autoregressively generate 4096 tokens and only then do you get to feed it into a diffusion model to actually make your image
Have I missed something, or is this thing destined to be insanely slow? What is the tokens per second of a 9b LLM on a 5090, like 50 at best? It's gonna be well over a minute just for those tokens to gen, and then you have a whole-ass 7b diffusion model on top of that.
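The arithmetic in that complaint checks out. Assuming the ~50 tok/s and 4096-token figures from the post (actual throughput depends on GPU, quant, and batching):

```python
# Time spent in the autoregressive stage alone, before any diffusion.
def ar_stage_seconds(num_tokens: int, tokens_per_second: float) -> float:
    return num_tokens / tokens_per_second

print(ar_stage_seconds(4096, 50))  # -> 81.92, well over a minute
```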
>>107856239
i downloaded this and keep running OOM because i did NOT read the manual lel
It requires 80gb because all of these retarded inference scripts load everything at once and keep it loaded during every stage.
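The point is that peak VRAM should be the largest single stage, not the sum of all of them. A toy sketch of the accounting (stage sizes are invented; in diffusers, `pipe.enable_model_cpu_offload()` automates this swap-in/swap-out pattern for real pipelines):

```python
# Hypothetical per-stage weight sizes for a big text-to-image pipeline.
STAGE_GB = {"text_encoder": 19.0, "transformer": 14.0, "vae": 1.0}

def peak_vram_naive(stages):
    # Everything resident at once, like the reference inference scripts.
    return sum(stages.values())

def peak_vram_staged(stages):
    # Load each stage only while it runs, freeing it afterwards.
    return max(stages.values())

print(peak_vram_naive(STAGE_GB), peak_vram_staged(STAGE_GB))  # -> 34.0 19.0
```

Same model, roughly half the peak requirement, at the cost of shuffling weights between system RAM and the GPU every stage.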
i was able to gen a glm image locally with a 3090 24GB and cpu offloading (i have 128GB).
took about 2 minutes for an image though.

>>107856353
I also have 128GB, what did you use? Comfy?

>>107856353
Step 2 sounds like something pol would say.

>>107856353
did you write all the text in the final image?

attempt two is, naturally, 1girl
30 steps, 768x768. it takes a long time before starting to gen, just beating the shit out of my ram, but this one only took 40 secs to gen after that. it clearly does not like the smaller resolution.
>>107856355
i used the inference script from the model page. attempting to drop the resolution and steps to see what kind of timing i get, but it'll probably look like ass.
>>107856364
the prompt is from the official page

>>107856378
OK thanks anon, I wonder if it'll be able to do llamacpp style offloading to ram in comfy.

>>107856417
dropping the res also made it eat shit; going back to defaults and starting again. all the offloading stuff is already present in the gen pipeline, so i don't see why it wouldn't be in comfy, it's all python. it's a big as fuck offload though.
At this point I think western closed models will win.
China (except for the Z Image team) doesn't understand that good benchmark scores don't equal what people want.
There's no way that 2512 and GLM Image are on par with closed source, but the benchmarks say so.

>>107856462
Benchmarks were a cancer the moment they were created. They've become a target to hit rather than a measure of model quality. There's a term for the phenomenon, but the point is it's shit.

>>107855782
neat training sample

>>107856476
It wouldn't be THAT bad if they were hitting those targets in good faith.
They just deliberately train/finetune on benchmark data and re-run tests many times until they hit the scores they want.

>>107856219
You can play them all with videoswarm btw. Best slop player out there

>>107856495
Yummy
>>107856476
>>107856509
The term you are looking for is Goodhart's law btw

>>107856529
yessssssss

>>107856529
Sirs, I'm redeeming her robo vagen.
1girl hours
>>107856378
Ask it to gen Taylor Swift nude at 1024p.
Let's see if it's uncensored at least.

take all of this with a grain of salt, i barely have the compute to run this. it knows miku though. i just checked and i am running this thing at the wrong resolution, so i'll try some more.
https://files.catbox.moe/su82v8.png

>>107856580
>Chinese model knows miku
:O
Forgive me Father for it has been one month since I last trained a LoRA.
>>107856283
Please do one where the pov hand's thumb is in her mouth u_u
lol!
>>107856590
i dont feel envious of glm image tho

>>107856590
it's completely garbage >>>/wsg/6072234

>>107856218
tried with the new video VAE (was using q8 from that repo), it does work better. just make sure you use the kjnodes vae node (it's updated); quality is better with the models there.
https://files.catbox.moe/pezs9g.mp4

>>107856586
>>107856613
coomers really gotta calm tf down lol

Keep coping, I am making a collage, GLM will dominate
>>107856353
respect for testing it
First GLM Image result. Not quite what I was hoping for, but I will test more.Consumed 74GB of VRAM and took 26 seconds (50 steps).
if it wasn't so huge it would be pretty fun! i like it so far
>>107856675
>3 arms
>completely slopped skin
why? Z-image turbo can do Taylor Swift just fine

>>107856675
>>107856674
These are trash.

>>107856675
Get a brain medication

>>107856674
what device are you using?
>>107856687
guys, this is also an edit model

>>107856235
Not worth it, just gen at 48 fps, it does the same thing

>>107856675
are you fucking serious? even SDXL can do better realism and hands

>>107856710
if it's artsy enough jannie wont mind DESU

>>107856719
ive been sent on vacation for posting butt cheeks before, nty

>>107856695
RTX 6000

>>107856675
Can it do her NUDE though?
Like as in:
a) does the te or some other censorship mechanism throw a fit about genning someone, especially a real person, nude?
b) does it know what's under the clothes?
Is there a way to use LTX2 V2V and enhance the details of the original video? So far I'm not seeing much improvement despite using the ltx-2-19b-ic-lora-detailer
>1girl
>>107856742
Are you using the fixed vae?

i think i forget that many people here just want someone to build them a model (at great cost) that will make pixel-perfect celebs nude, and no other innovation whatsoever.

>>107856762
if that's the new model, it has nice DoF at least

>>107856765
I think the fact that it's a new model has completely blinded you to how objectively garbage the outputs it's giving you are. I don't care about celebrity nudes. Shit is awful.

>>107856763
>Are you using the fixed vae?
Yes, and the KJ vae loader
I tried a few basic manga artists and GLM Image doesn't seem to substantially recognize any of them.
>>107856765
it looks worse than a SD1.5 model. how can you praise garbage like that in the year of our lord 2026? you're probably a zlm employee or something

>>107856765
please contact your optometrist

>>107856822
Time to delete that shit and invoice Zai for your time.

>>107856831
all i said was i haven't figured out the settings yet and it's fun so far. you can draw your own conclusions if you want though; i still want to test out more/fewer steps, resolutions, i2i, etc.

What Lightning lora is used nowadays?

>>107856765
buy an ad, no one is gonna praise this shit model

>>107856870
My dude, you blew it. The examples were so bad and showed so little promise that it's going to take a miracle to undo the damage.

>>107856697
Takes twice the time.
letta nigga test a model out FFS
there is no such thing as prompt engineering outside sd1.4, dall-e 3, and midjourney. modern models have such limited stylistic capabilities and such low seed variance that everything looks the same. loras are the only way to 'engineer' a unique output. brownsaars have now started coping with 'workflow design' because the models are so stale and boring. a good model will let you just type artists, styles, and moods and get completely different yet coherent results. now we just get boring brown sludge

>>107856765
If it's able to use that huge LLM to do cool complicated prompt understanding, I'm all for it.
For basic stuff, any model since SDXL is sufficient.

>>107856896
yeah, not sure what all the anger is about

>>107856630
spongebob explains RAM prices:
https://files.catbox.moe/q4ilqt.mp4

>>107856896
>>107856952
nothing wrong with testing a model, but there's something wrong with saying it's good and that you're wrong for having functional eyes
https://files.catbox.moe/z92vn1.mp4
>>107856906
Okay
can ltx do shrek?
>>107856906
interesting way to admit you can't into noodles

>>107856952
new model out = people will whine
can ltx do hat?
can ltx do shrok?
I haven't been able to get anything impressive from GLM so far. The upside is that it's fast.
>>107857059
looks like a SD2.0 image lol

>>107857059
Did you make this one?