Discussion of Free and Open Source Diffusion Models
Prev: >>107851707

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>NetaYume
https://huggingface.co/duongve/NetaYume-Lumina-Image-2.0
https://nieta-art.feishu.cn/wiki/RZAawlH2ci74qckRLRPc9tOynrb

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg
>>107855134
Whoaa that first pic is super duper realistic o_0
>>107855134
>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
there is no qwen inpaint, right? you have to use mask and crop/stitch?
>>107855138
They're still around it seems
>https://github.com/nunchaku-ai/ComfyUI-nunchaku
>v1.2.0 Released! Enjoy a 20–30% Z-Image performance boost, seamless LoRA support with native ComfyUI nodes, and INT4 support for 20-series GPUs!
Safe to say wan is officially abandoned
>>107855189
Yes. I just used that lol.
But they are not training anything right now.
No pull requests or discussion of anything being in the works somewhere.
ZiT's PR was open for a while before they merged and released it.
Sorry if this is a dumb question but I’m looking to do realistic nsfw gens with loras. I have plenty of loras for flux, do these work with chroma? What’s the best chroma checkpoint? Should I be using something better than chroma? Should I retrain the loras for chroma somehow? Been kind of out of the loop for a bit.
>>107855268
Did you not like the answer anon?
>>107855181
Beijing time tracker anon here
it's 7:30AM there, soon they will wake up and be preparing to drop the GLM Image model on us
>>107855277
ah sorry, i didnt see it, thought it got swept up before new thread. so SDXL is STILL the best for nsfw?? im kinda shocked, i got much better results with flux overall.
>>107854980
i've been downloading basically everything since XL was released so i'm good on that. I have practically every lora for SDXL, ZIT, Qwen, Chroma, Wan2.1 & Wan2.2.
LTX2 is amazing. source image here:
Spongebob Squarepants grabs a rifle and says "Hey Patrick, my memory costs five thousand dollars, lets take over the data center!". Patrick says "okay, Spongebob!".
https://files.catbox.moe/r8v8q1.mp4
what do we do when comfyui goes ipo?
>>107855372
Short it.
>>107855372
straight into the S&P500, your retirement fund will be 20% comfyui stocks
https://files.catbox.moe/w8qizx.mp4
>>107855372
sell immediately before comfy is allowed to
>tranibake
>>107855444
can you guys agree to a truce this week? two models might drop and it'd be a shame if the discussion is drowned out by nonsense
>>107855453
we're not getting base so it really doesn't matter.
unblessed thread
>>107855453
>might
Unsloth seems to have released higher quality quants for some image models a few days ago. Certain layers are kept at slightly higher quants like Q_5 and Q_6 to boost quality. I think it does a good job for a Q4 quant of a 6B model. Shame the quantization implementation for diffusion models sucks: it runs slower than bf16 and q8, so the use case seems very niche. It still runs similarly slower at high resolutions too, so it can't be used for that either.
Much better quality than nunchaku though. I almost wonder if they fucked up the nunchaku implementation somehow? The 32 rank one is completely fucking raped, a blurry, unusable mess. The 128 and 256 rank ones are better, but still blurry, and Q_4_K_M is noticeably closer to the original image. Although nunchaku has the advantage of running three times faster, it's not worth it with the current quality imo. I wonder if a schizo 1024 rank model would perform well. It should still run faster than normal quants and use less memory than q8, if my assumptions are correct.
Tested with a 3060 12GB. And no, I am fine with bf16 for ZiT too; tested out of curiosity and to see if it will be useful for the base.
Thanks for reading my blog.
>>107855453
>if
It WILL get drowned out by that.
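The intuition behind mixed bit-width quants (keeping a few sensitive layers at Q5/Q6 while the rest sit at Q4) can be shown with a toy group-wise quantizer in plain numpy. This is a hand-rolled illustration, nothing to do with Unsloth's or llama.cpp's actual k-quant code; the group size and function name are made up:

```python
import numpy as np

def quantize_groupwise(x, bits, group=32):
    """Toy symmetric group-wise quantizer: one fp scale per group of weights,
    then round-trip through the integer grid to measure the error."""
    qmax = 2 ** (bits - 1) - 1
    x = x.reshape(-1, group)
    scale = np.abs(x).max(axis=1, keepdims=True) / qmax
    scale[scale == 0] = 1.0                      # avoid div-by-zero on all-zero groups
    q = np.round(x / scale).clip(-qmax - 1, qmax)
    return (q * scale).reshape(-1)               # dequantized weights

rng = np.random.default_rng(0)
w = rng.normal(size=4096).astype(np.float32)     # stand-in for one weight tensor

err4 = np.abs(quantize_groupwise(w, 4) - w).mean()
err6 = np.abs(quantize_groupwise(w, 6) - w).mean()
err8 = np.abs(quantize_groupwise(w, 8) - w).mean()

# error shrinks ~4x for every 2 extra bits, which is why bumping only the
# sensitive layers from 4 to 5/6 bits buys quality for very little extra size
assert err4 > err6 > err8
```

The same round-trip trick is a quick way to sanity-check any quant scheme: dequantize and compare against the original tensor rather than trusting end-to-end image outputs.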
>>107855347
rich fucker
>>107855302
Flux has higher quality, but it doesn't know nsfw.
Mixing multiple flux loras together to do nsfw is very iffy.
Chroma knows NSFW and has far higher maximum quality than SDXL, but is several times slower and commonly likes to shit out worse than SDXL anatomy.
As I said, if you want low quality but fast and reliable gens go with SDXL; if you want to play seed lottery with slow gen times but occasionally get great gens, go with Chroma.
That's more or less all there is to it.
>>107855134
Why is AniStudio not in OP?
>>107855478
yeah ZiT is so small that i'm surprised people bother with quants
the video models are a different story. what's interesting is some anon in the wsg thread found that the Q8 quants have a big difference to bf16 outputs which usually isn't the case
>>107855498
it's less than 20TB. you can buy a single 24TB HDD for $500 and fit nearly all of it. now if you wanted to keep all loras on SSDs, then yeah be prepared to spend thousands
>>107855506
I meant to type worse than SD 1.5
>>107855523
In the current thread? Curious about precisely how.
>>107855536
>In the current thread? Curious about precisely how.
yeah it's the current thread and i forgot to somehow mention i was talking about LTX. anon did a few comparison videos between different quants
>>>/wsg/6069549
I almost miss working with SDXL
The better the models get, the more it's about seed lottery and "prompt engineering", but I am tired of doing so many iterations it's cluttering my SSD
>>107855564
what was sdxl about unc?
>>107855353
Spongebob speaks the truth
>>107855574
fast gens, mixing loras, masking and photoshop edits, because the prompt comprehension was atrocious
newer models like Flux 2, Qwen2511 have superior comprehension but you need to do a lot of prompt engineering to unlock their potential
>>107855564
for me? it's the wild variation unets offered. all dit models take everything too literally sometimes and leave little room for a more exciting result
>>107855564
Wasn't the meta with sdxl to gen a boatload and pick the best one from the monstrosities lol
so since they cancelled z base, what is next on the horizon?
>>107855599
ltx 2.1
>>107855552
Possibly a bug?
The model is new + new major backend stuff (comfy-kitchen) has been merged recently.
It's possible that something is implemented wrongly, handling the quantized data incorrectly, rather than the quant sucking so much.
A Q8 of a 19b model differing that much from the baseline warrants some search for a decent explanation.
What's the best model for pixel art?
I mean both sprites and portrait stuff, I need placeholder art for my game so prototyping is the closest thing to the final stuff.
I remember seeing really good sprites uploaded here by an anon a long time ago.
>>107855615
lmao >>>/wsg/6071832
>>107855510
see
>>107855175
What's the difference between a text encoder and a llm?
>>107855506
so qwen and zimage arent contenders for nsfw anymore? im just so confused at the discourse, it seems like it should be clear what the best image generators are in each area. what about for inpainting and such? i just tried the newest qwen image edit 2511 and it was still terrible, and flux inpainting still seems to be the best for nsfw? i just wish there was somewhere i could get up to date info on this stuff
>>107855652
Nothing, same thing.
>>107855652
since generally only one LLM gets trained against a given diffusion model, it ends up being that model's text encoder
it might also output a strange space of concepts (embeddings) rather than text
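The distinction can be made concrete: an LLM projects its last hidden state onto the vocabulary and samples a token, while a text encoder stops one step earlier and hands the per-token hidden states to the diffusion model as conditioning. A toy numpy sketch where the "backbone" is just a tanh and every shape is made up:

```python
import numpy as np

rng = np.random.default_rng(1)
vocab, d_model = 100, 16
embed = rng.normal(size=(vocab, d_model))   # toy embedding table
unembed = embed.T                            # tied output head, (d_model, vocab)

def backbone(h):
    # stand-in for the transformer stack: any seq -> seq map works for the demo
    return np.tanh(h)

tokens = np.array([3, 14, 15, 92])
hidden = backbone(embed[tokens])             # (seq_len, d_model)

# used as a text ENCODER: the hidden states themselves are the output,
# fed to the diffusion model via cross-attention / conditioning
text_embedding = hidden

# used as an LLM: project the last hidden state to vocab logits, pick a token
logits = hidden[-1] @ unembed
next_token = int(np.argmax(logits))

assert text_embedding.shape == (len(tokens), d_model)
assert 0 <= next_token < vocab
```

Same weights either way, which is why a single LLM shipped with a diffusion model simply gets called "the text encoder".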
>>107855660
if you want nsfw use SDXL. Illustrious finetunes are the most coherent SDXL models. My personal favorite finetune is UncannyValley.
I started sending my ai slop to some of my friends and I'm starting to worry because most of them don't realize it is AI
>>107855353
also for fun, a qwen edit 2511 edit
give them ak-47s:
>>107855660
you can do a limited amount of nsfw with qwen, zimage, hyimage, whatever.
but illustrious/noob and chroma are the models that are more broadly nsfw trained, they understand vastly more in that regard
>>107855617
not saying this is the best but you can try a few of the loras i've published:
https://civitai.com/user/n1eze
>>107855660
Neither Qwen nor ZiT know NSFW out of the box. NSFW loras either barely exist (Qwen) or are shit quality and don't mix with other loras (Z-Image).
There is no expectation that anyone will make a major NSFW finetune of a big model like Qwen, but Z-Image will likely have people attempt to beat NSFW into it properly once the base version releases (the current version is distilled and sucks for finetuning). We will see if it works out, but currently Chroma and SDXL are your two options for nsfw.
I don't know jack shit about inpainting and I am tired of typing paragraphs.
>>107855693
qwen can do nsfw fine if you use an abliterated text encoder
>>107855694
can you make a basedjack lora
>>107855706
some other anon already made one for Z image and published it, i don't have a dataset for that
>>107855709
benchod
>>107855714
??? idk what that means
>>107855690
ice cream:
>>107855699
It still doesn't know what genitalia or sex actions look like.
Don't bother with this.
>>107855727
sex is when penis goes into vagina
>>107855696
ZIT loras are abysmal. Almost every one changes the subject too much or just gives poor results.
>>107855749
Or poophole
What do you use to load the new LTX-2 vae by itself? Using picrel standard vae loader just gives me a black screen.
>>107855189
literally making nunchaku where it's the least needed
the porn loras I've seen trainings on for ltx2 look like the most basic stuff without any dataset filtering to get some nice girls out of it, do you think the model will be able to generalize beyond "40-45 yo heavy smoker milf"?
>>107855775
you need to use kijai vae loader
anyone know a good inpainting comfyui workflow for nsfw? so far the best one i've found is still flux kontext, there has to be something better at this point isnt there? i've been fairly frustrated with this so far, if anyone can help i would really appreciate it
>>107855808
flux kontext is not inpainting, inpainting is when you use a masking node and it's usually done with SDXL. attaching controlnets helps
>>107855808
>>107855825
Flux fill I believe is the dedicated inpainting variant.
You can use kontext with masks, though I am not sure about the quality.
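The mask + crop/stitch flow asked about upthread is simple enough to sketch in numpy: take a padded bounding box around the mask, snap it to a size the model accepts, gen only that region, and paste it back. This is a hand-rolled approximation of what ComfyUI crop/stitch nodes do, not their actual code; function names are made up:

```python
import numpy as np

def crop_around_mask(image, mask, pad=32, multiple=32):
    """Crop a padded bounding box around the masked region, grown so the
    box sides are divisible by `multiple` (most models require this)."""
    ys, xs = np.nonzero(mask)
    y0 = max(int(ys.min()) - pad, 0)
    x0 = max(int(xs.min()) - pad, 0)
    y1 = min(int(ys.max()) + pad, image.shape[0])
    x1 = min(int(xs.max()) + pad, image.shape[1])
    # round the box height/width up to the next multiple
    y1 = min(y0 + ((y1 - y0 + multiple - 1) // multiple) * multiple, image.shape[0])
    x1 = min(x0 + ((x1 - x0 + multiple - 1) // multiple) * multiple, image.shape[1])
    return image[y0:y1, x0:x1], (y0, y1, x0, x1)

def stitch_back(image, edited_crop, box):
    """Paste the re-genned crop back into the full image."""
    y0, y1, x0, x1 = box
    out = image.copy()
    out[y0:y1, x0:x1] = edited_crop
    return out

img = np.zeros((512, 512, 3), dtype=np.float32)
mask = np.zeros((512, 512), dtype=bool)
mask[100:150, 200:260] = True                 # pretend this is the inpaint mask

crop, box = crop_around_mask(img, mask)
result = stitch_back(img, crop + 1.0, box)    # "crop + 1.0" stands in for the edit
```

Genning only the crop is also why this pattern tends to beat whole-image inpainting for detail work: the model spends its full resolution budget on the masked area.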
>>107855798
That worked, thanks.
>>107855882
any time
>>107855825
i use it for inpainting and it seems to do better than flux fill, especially for nsfw
>>107855839
the kontext inpainting is the best i've seen so far, seems to outperform qwen but i do need to test a qwen inpainting nsfw lora i found
anyone have anything better?
>>107854466
nice
>>107855723
kek
imagine contributing to software because you believe in foss and the guy sells out lmao
At what strength do you use the detailer lora in ltx-2 i2v?
saars
https://huggingface.co/zai-org/GLM-Image
>>107856008
I'll wait for GoFuckem to test it first and iron out all the kinks.
>>107855928
I took some kpin ok I wasnt paying attention
>>107856008
Seems promising if the example images aren't heavily cherry picked.
I feel like I should wait until proper cumfart support + workflows + quants appear though.
>>107856008
this is some unholy qwen distill isnt it
downloadin... wish me luck
an autoregressive model just flew over my house
>>107856051
Should have zoomed in and looked for more than two seconds, feeling embarrassed rn.
Sloppy look.
It's doa unless it runs very fast and/or trains extremely well.
>already in damage control mode
my senile uncle keeps falling for picrel type AI slop. anyone know which services people are using to make these vids? i want to make a video of him saying ridiculous shit so that maybe he'll believe me that this type of shit isn't real.
plastic? check
brown tint? check
generic showcase of outdated boring prompts? check
slower yet worse? check
chinkshit? check
it's culture time
https://files.catbox.moe/firn4v.mp4
https://files.catbox.moe/ue5iur.mp4
https://files.catbox.moe/swlmta.mp4
>>107856051
>>107856070
Adding on: the fact that they never advertised images, plus the fact that it's not available on an API like FAL right now, means they had no confidence in it.
Probably just wanted to be able to tell investors "we made an image model" in the next quarterly.
Will still give it a shot once the workflows are out.
>>107856081
Not local models, doesn't belong in this thread.
But to answer your question, such videos are primarily made with veo or sora.
how to get less terrible sound with ltx2?
>>107856081
this was generated with comfyui so it belongs in this thread
Take the giraffe for example. You would want an autoregressive model that "reasons and iterates on your prompt" to add coherent fancy details, like deciding on a nice cover for its book and quirky background posters that fit the anthropomorphic animal in daily life theme. And yet it's all gibberish sloppa. Its outfit isn't wet despite being out in the rain either.
>>107855302
yes sdxl is the best for nsfw stuff.
>>107856143
Either continue from previous audio or wait for zeeb to drop the fixes
>>107856143
Seed, gen at higher quality or upscale. Honestly it will always be bad until a fix, just less bad at higher res.
>>107855785
>tfw we got cultured with wanchaku before chinese culture
holy shit, i dont know whether to kek or cry
>do you like our open source model?
>>107856179
>>107856186
damn, ok
ltx2 update (minor) if you got the kijai distilled q8 or whatever:
13th of January 2026 update !!IMPORTANT!!
Turns out the video VAE in the initial distilled checkpoints has been the wrong one all this time, which (of course) was the one I initially extracted. It has now been replaced with the correct one, which should provide much higher detail.
At this moment this requires using the updated KJNodes VAELoader to work correctly.
https://huggingface.co/Kijai/LTXV2_comfy
https://huggingface.co/Kijai/LTXV2_comfy/blob/main/VAE/LTX2_video_vae_bf16.safetensors
>>107856116
Today's creations, not many were worth posting.
Do people use this? It's not used in the default workflow.
>Because the inference optimizations for this architecture are currently limited, the runtime cost is still relatively high. It requires either a single GPU with more than 80GB of memory, or a multi-GPU setup.
How would it even work with multiple GPUs?
>The target image resolution must be divisible by 32. Otherwise, it will throw an error.
So a 32x32 = 1024 times compressed latent space? No wonder the images look shit.
>Guidance scale rather than CFG in the example inference code
Looks like the anon who guessed Qwen distill might be right
>>107856053
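Rather than letting the pipeline throw on the divisible-by-32 constraint, it's trivial to snap a requested resolution down to something the model accepts. A hypothetical helper, not part of the GLM-Image repo:

```python
def snap_resolution(width, height, multiple=32):
    """Round a requested resolution DOWN to the nearest multiple the model
    accepts (hypothetical convenience helper, not from the GLM-Image code)."""
    return (width // multiple) * multiple, (height // multiple) * multiple

print(snap_resolution(1000, 600))    # -> (992, 576)
print(snap_resolution(1024, 1024))   # -> (1024, 1024), already valid
```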
>bro just autoregressively generate 4096 tokens and only then do you get to feed it into a diffusion model to actually make your image
Have I missed something or is this thing destined to be insanely slow? What is the tokens per second of a 9b LLM on a 5090, like 50 or something at best? It's gonna be well over a minute just for those tokens to gen, then you have a whole ass 7b diffusion model on top of that.
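The back-of-envelope for that complaint, using the post's own guessed numbers (neither figure is a measurement):

```python
prompt_tokens = 4096   # AR tokens the post says must be emitted before diffusion
tok_per_s = 50         # the post's guessed decode speed for a ~9B LLM on a 5090
ar_seconds = prompt_tokens / tok_per_s
print(ar_seconds)      # -> 81.92, so "well over a minute" before diffusion even starts
```

That cost also scales linearly with token count, unlike diffusion steps, which are fixed per image.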
>>107856239
i downloaded this and keep running OOM because i did NOT read the manual lel
It requires 80GB because all of these retarded inference scripts load everything at once and keep it loaded during every stage.
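The alternative the post implies can be sketched: run one stage at a time, freeing each before loading the next. Toy stand-in functions, not GLM's actual pipeline; real offloading also has to shuttle weights between CPU and GPU:

```python
import gc

def run_pipeline(stage_loaders, x):
    """Run a multi-stage pipeline loading one stage at a time and freeing it
    before the next, instead of keeping text encoder, AR model, diffusion
    model and VAE all resident at once (which is what forces the 80GB figure)."""
    for load in stage_loaders:
        model = load()        # materialize only this stage's weights
        x = model(x)
        del model             # drop the reference...
        gc.collect()          # ...and reclaim the memory before the next load
    return x

# dummy stages standing in for encode -> denoise -> decode
stages = [lambda: (lambda v: v + 1),
          lambda: (lambda v: v * 2),
          lambda: (lambda v: v - 3)]
result = run_pipeline(stages, 1)
print(result)  # -> 1, i.e. ((1 + 1) * 2) - 3
```

Peak memory then becomes the largest single stage instead of the sum of all of them, at the cost of reload time between stages.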
i was able to gen a glm image locally with a 3090 24GB and cpu offloading (i have 128GB).
took about 2 minutes for an image though.
>>107856353
I also have 128GB, what did you use? Comfy?
>>107856353
Step 2 sounds like something pol would say.
>>107856353
did you write all the text in the final image?
attempt two is, naturally, 1girl
30 steps, 768x768. it takes a long time before starting to gen, just beating the shit out of my ram. this one only took 40 secs to gen after though. it clearly does not like the smaller resolution.
>>107856355
i used the inference script from the model page. attempting to drop the resolution and steps to see what kind of timing i get but it'll probably look like ass.
>>107856364
the prompt is from the official page
>>107856378
OK thanks anon, I wonder if it'll be able to do llamacpp style offloading to ram in comfy.
>>107856417
dropping the res also made it eat shit, going back to defaults and starting again. all the offloading stuff is already present in the gen pipeline so i don't see why it wouldn't be in comfy, it's all python. it's a big as fuck offload though.