Discussion of Free and Open Source Diffusion Models

Prev: >>107846749

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>NetaYume
https://huggingface.co/duongve/NetaYume-Lumina-Image-2.0
https://nieta-art.feishu.cn/wiki/RZAawlH2ci74qckRLRPc9tOynrb

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
Blessed thread of frenship
>>107849302
>I tried a lora and it destroyed the quality of zit
don't worry dude, once we get z-image base we'll be able to do actual loras
https://files.catbox.moe/x20yb0.mp4
>>107850075
why does ltx2 have the tendency to do some MewMaxxing mode on humans, as if it was only trained on gigachad or something lol
>>107850075>once we get z-image base
>>107850112
time to be disappointed by another mid image model, autoregressive models have always been really shit
>>107850102cutie
>>107850112>model support gets added>model gets never releasedchinese culture
>>107850119b-but, random twitter chinese man said it's gonna be released this week!1!!11
>https://github.com/bytedance/ATI
>wan ati
use case?
>>107850112I have github commit fatigue
>>107850126he did not say that blackie, he said soon, learn to read
>>107850144>he said soonhe said "next week" last week browniehttps://xcancel.com/bdsqlsz/status/2009911175019168215#m
>>107850146>he did not in fact specify which modelhm
>>107850089kek >>>/wsg/6071667
>>107850148I love chinese culture
>>107850112is there any image output made with that model? we don't even know what it's capable of
>>107850177
>>107850075Chinese culture
>>107850195some speculation from the previous thread:>>107847181>>107847267
>>107850112
>>107850117
>Autoregressive models have always been really shit
the glm team is far from being a mid company, their LLMs are really really good. if they can compete against Alibaba (Qwen) on that, I think they also can on image models
>>107850212Anything on the size, is it gonna be trainable without a supercomputer or some abomination that even a 5090 can only run quantized?
>>107850102fuck skirts women should always go like that.
>>107850267
it's a 9b model
https://github.com/huggingface/diffusers/blob/6cfc83b4abc5b083fef56a18ec4700f48ba3aaba/docs/source/en/api/pipelines/glm_image.md
>Autoregressive generator: a 9B-parameter model initialized from GLM-4-9B-0414,
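For a rough feel of what "9B" means for VRAM, here is a back-of-the-envelope sketch. It only counts the weights of the 9B autoregressive generator mentioned in the quote. Activations, optimizer state, and any other pipeline components are not included, and the bytes-per-parameter figures for the quantized rows are approximations I'm assuming, not numbers from the linked doc.

```python
# Weight-only memory estimate for a model with a given parameter count.
# Assumption: nothing but the raw weights is counted (no activations,
# no optimizer state, no text encoder / decoder / VAE).

GIB = 1024 ** 3

# Approximate bytes per parameter. The quantized entries ignore per-block
# scale overhead, so real checkpoint files come out somewhat larger.
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "bf16": 2.0,
    "8-bit": 1.0,
    "4-bit": 0.5,
}


def weight_gib(n_params: float, precision: str) -> float:
    """Return the size of the weights in GiB at the given precision."""
    return n_params * BYTES_PER_PARAM[precision] / GIB


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        print(f"9B @ {precision}: {weight_gib(9e9, precision):.1f} GiB")
```

By this estimate the bf16 weights of the generator alone are around 17 GiB, so inference on a 32 GB card looks plausible on paper. Training is a different story, since gradients and optimizer state multiply that figure several times over.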
>>107850135git commit -m 'suicide'
>train influencer lora of girl who has somewhat crooked or unique teeth
>picks up everything else about the likeness but the teeth are the normal perfect ZIT teeth.
I didn't caption it, thinking it'd simply pick it up along with other things. Do I have to caption for it? Or is it too small a detail to be picked up? It learned other things that even showed up in just one image, like hair styles and such.
>>107850330
don't caption it. add a few close-up images with teeth visible, train at rank 64, and maybe train longer until you get it
>>107850332>Redditors not understanding Chinese Culturenot surprising
>>107850399
why is this retard lodestones training Z-Chroma on base ZIT and not the dedistilled version
>>107850474be respectful
>>107850474The man has no forward thinking. Someone slips him a script and he runs it.
>>107850370can someone explain this aspect of chinese culture to me, there's no point in keeping Z SAAS because while good compared to local models the other SAAS models are much better and it wouldn't be a competition. So why not release it, after all ZIT is the proof the base model exists, there's no point in keeping it private outside of blueballing randos on the internet, I don't get it
>>107850492it takes time to train models and training a model to be good at editing is much different and harder than finetuning a model to be good at realism
>https://github.com/Tencent-Hunyuan/HY-WorldPlay
has anyone tried this?
>>107850506>training a model to be good at editingZ-Base is not an edit model
>>107850513Omni can do edit
>>107850516
Where did they say that? I remember them saying they are separate.
>>107850492What this >>107850506 user doesn't understand is that the base model was done long before zit was even released. What we are seeing now is an aspect of Chinese culture I suggest you all get familiar with. That in particular being the fact they cannot release the model even though they probably intended to and now are incrementally rolling back expectations to save face. That's why we get some inference code updates periodically so nobody outright confronts them on their bullshit. But this is classic Chinese face saving behavior.If you're familiar with izzat, it's like that but less... destructive and more trouble avoidance.
>>107850532what's the point of their team posting things like "Your patience will be rewarded" then, that's the opposite of rolling back expectations
(-`ω´- )人 wafu
>>107850549moron
https://huggingface.co/Kijai/LTXV2_comfy/tree/main/VAE
Heads up. LTX shipped a bad vae with ltx distilled and kijai uploaded the good one here.
>>107850491which model is that?
>>107850541>what's the point of their team posting things like "Your patience will be rewarded" then, that's the opposite of rolling back expectationsThere is no point. It's just buying time until they can clear their hands of the burden altogether. What part of saving face don't you understand?
>>107850554is that link the best place for quants?
>>107850554it's the same, they're just separated. makes for better memory management though. and you have to load the audio one with kijai's VAELoader KJ node
>>107850532
Yeah, I mean the reason why they aren't releasing the base, not the face saving shenanigans. It obviously exists and is finished, so why not release it? Did the CCP forbid them or what?
>>107850578
dude they merged a commit on diffusers and modelscope. if they wanted to say no they wouldn't have put in all this effort. they didn't pretend anything when they ended up not releasing Wan 2.5, they just went on with their lives
>>107850578yu don nied mowe than Qwen Image gwello!
>>107850568Either I'm misunderstanding something or you are, but the file was just uploaded 20 minutes ago and the size is different to the previous separated vae.
>>107850585are they fucking serious? lmaoo
>>107850578All I can say to you is Chinese culture.
>>107850585
oh shit you're right dawg, he switched out the file with the newer one
sorry i jumped to conclusions because i was already using his separated VAE but the older one
>>107850589you're courting death
>>107850492>he other SAAS models are much betterz-image turbo being a distilled model limits it in terms of finetuning and loras, however not in usage, at least on the level that you can use SAAS models. It maybe doesn't know some concepts but on SAAS models you can't prompt celebrities and you get censor slapped into your face when you try to generate smut. That's even true for Grok who was pretty lewd but is now getting censored more and more every day since normies found on twitter that you can undress thods with it right on their profile and went batshit crazy about it.
>>107850585left is the new version of the vae and right is the old one
>>107850600Actually huge.
>>107850556ZImageTurbo
>>107850600kek, how did they not notice this
>>107850585
https://www.reddit.com/r/StableDiffusion/comments/1qbq4mz/updated_ltx2_video_vae_higher_quality_more_details/
>EDIT : You will need to update KJNodes to use it (with VAE Loader KJ) , as it hasn't been updated in the Native Comfy VAE loader at the time of writing this
with this node then? if it's not compatible with native comfyui I guess they changed the architecture a bit?
>>107850600right is so much better
reposting

I have spent hours trying to get a WAI controlnet looking right, but got very few usable results because loras and tags for each character end up conflicting and creating messy results.

Just to experiment, I tried using Qwen3 Image Edit, giving it reference images for each character and a natural language description of what I want. It produced some very reasonable results, except the original artstyles get lost and replaced with generic anime baked into the model.

But then I took one of those outputs, passed it into a WAI image2image workflow using an artstyle lora, along with all the tags that would be associated with this base image. Results actually came out really well, obviously with some detailing work that needed to be done. Much more consistent than the controlnet workflow.

I was under the impression that image2image wasn't recommended for multi-character compositions, but for me it worked.
>>107850576>Did the CCP forbid them or what?Porn is illegal in China.
>>>/wsg/6071705
>hunyuan video 1.5verdict?
>>107850639It was already considered dead on release.
>>107850629
>I was under the impression that image2image wasn't recommended for multi-character compositions
That's just incorrect.
>loras and tags for each character end up conflicting and creating messy results.
You have to use regional prompting and lora clip.
>>107850584love this meme. Now give Yuu a CCP uniform and write "Chinese culture" on it to make it perfect.
>>107850585
>>107850600
https://files.catbox.moe/oi2b95.mp4
it definitely looks more detailed and less slopped, and btw, you can run this new vae with comfyui's native vae loader, it still works fine
remember to pull KJNodes if you want to use the new vae, there was a commit 45min ago that you need otherwise you will get a terrible result
>>107850630But it can't generate porn, and chinese teams also released other image or video models, so why ban this one
>>107850645
>You have to use regional prompting and lora clip.
I was.

Tags and loras applied to each region still conflict with each other. One lora used for one region ends up heavily affecting the artstyle and quality of both regions (and the entire image). Also tags applied to one region sometimes get applied to characters in a different region. For example, character 1 would be wearing gloves, and character 2 would not be wearing gloves, even though only character 2 had the gloves tag.

This is all using the rentry guide workflow, except with "Load Lora" integrated.
>>107850639
it was slightly worse than Wan 2.2, so it's completely useless
>>107850657https://files.catbox.moe/men7yy.mp4I'll never doubt you again.
>>107850658>But it can't generate pornIt can. Or can be made to.>chinese teams also released other image or video Chinese fuck things up. Are you new to this planet? Why would you think that they announce the release of a model and then don't. Chinese culture.
>>107850666
>One lora used for one region ends up heavily affecting the artstyle and quality of both regions
Ideally properly made SDXL loras shouldn't do that if using lora CLIP, but you're right, this is an issue. Regional lora usage actually exists but I've never needed it.
https://blog.comfy.org/p/masking-and-scheduling-lora-and-model-weights
>tags applied to one region sometimes get applied to characters in a different region. For example, character 1 would be wearing gloves, and character 2 would not be wearing gloves, even though only character 2 had the gloves tag.
I don't have that issue. Try using non-overlapping regions, overlap is a common cause of this kind of bleed I think.
>This is all using the rentry guide workflow, except with "Load Lora" integrated.
I don't know what you mean exactly but "Load Lora" does not use Lora clip and therefore causes bleed (fully applies without need for keyword)
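If you want to check the "non-overlapping regions" part instead of eyeballing it, here's a minimal sketch. Masks are plain 2D lists of 0/1 for illustration, and `overlap_pixels` and `remove_overlap` are made-up helper names, not anything from ComfyUI or the rentry workflow.

```python
# Toy region masks as 2D lists: 1 = pixel belongs to the region, 0 = it doesn't.
# Real workflows would use image tensors; this only shows the logic.

def overlap_pixels(mask_a, mask_b):
    """Count the pixels that both regions claim."""
    return sum(
        1
        for row_a, row_b in zip(mask_a, mask_b)
        for a, b in zip(row_a, row_b)
        if a and b
    )


def remove_overlap(front, back):
    """Return a copy of `back` with every pixel claimed by `front` cleared."""
    return [
        [0 if f else b for f, b in zip(row_f, row_b)]
        for row_f, row_b in zip(front, back)
    ]


char1 = [[1, 1, 0, 0],
         [1, 1, 1, 0]]
char2 = [[0, 1, 1, 1],
         [0, 0, 1, 1]]

print(overlap_pixels(char1, char2))        # pixels claimed by both regions
char2_clean = remove_overlap(char1, char2)  # char1 wins wherever they collide
print(overlap_pixels(char1, char2_clean))
```

Which region "wins" the contested pixels is a judgment call. Here the first mask keeps them, on the assumption that it's the character in front.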
>>107850666That's because the loras apply to the whole latent space and not just the regions, satan.
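A toy way to picture this, with made-up numbers and names: a LoRA adds a low-rank delta to a layer's weights, so every position that passes through the layer sees the changed weights. Limiting it to a region means blending patched and unpatched outputs by a mask. This is a sketch of the idea only, not how ComfyUI implements regional LoRA.

```python
# Toy linear layer applied to two "latent positions".
# W' = W + scale * (B @ A) is the usual LoRA weight update.

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def matvec(w, x):
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def lora_weight(w, b, a, scale):
    """Return W + scale * (B @ A)."""
    delta = matmul(b, a)
    return [
        [wi + scale * di for wi, di in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


W = [[1.0, 0.0], [0.0, 1.0]]   # base weights (identity, for readability)
B = [[1.0], [0.0]]             # out x rank, rank 1
A = [[0.0, 1.0]]               # rank x in
W_lora = lora_weight(W, B, A, scale=0.5)

positions = [[1.0, 2.0], [3.0, 4.0]]
mask = [1.0, 0.0]              # we only want the LoRA at position 0

base = [matvec(W, x) for x in positions]
patched = [matvec(W_lora, x) for x in positions]   # BOTH positions change

# Regional application: blend patched and base outputs per position.
regional = [
    [m * p + (1.0 - m) * q for p, q in zip(p_vec, q_vec)]
    for m, p_vec, q_vec in zip(mask, patched, base)
]

print(base, patched, regional)
```

Even with a blend like this, attention layers mix information between positions, so some style bleed across regions is still to be expected.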
>>107850685>Why would you think that they announce the release of a model and then don't.Black Forest Labs did that with Flux VideoG E R M A N C U L T U R E
>>107850689 (me)>"Load Lora" does not use Lora clipI was tripping there, disregard this part it's incorrect.
Media Assets panel keeps shitting itself. It stops showing new gens after a while.
>>107850491>>107850608
>>107850711
>>107850692
>Flux Video
It's still in development. They didn't say "next week" or something. They also didn't release a distilled demo or something. Instead they went with Flux 2.
Spot the difference?
>>107850711most of the plastic skin is gone, that's good
>>107850711fuck you for removing the text after the first few frames.
>>107850723>It's still in development.you're still coping? it's been almost 2 years you know
>>107850732>anon can't memorize 4 spotslol
>>107850689
I was actually asking about overlap yesterday but didn't get an answer at the time. If two characters' bodies are overlapping in a reference image, should I only mask the visible parts of the character being overlapped? If two arms are overlapping directly onto each other and only a thin section of character 2's arm is visible, should I be chiseling a thin mask around it in the mask editor and be careful not to mask the arm in front (character 1)?

I was under the impression masking in a controlnet isn't required to be precise and that I should just focus on the general region, but I suspect it's not that clear for more complex poses with a lot of crossover and overlap.
>>107850711why is she smiling on Q6 and not Q8 at the end? did you prompt her to do that?
>>107850744Nope, all the same prompt:"The girl sucks juice out of the cup using a straw"
>>107850733So? They don't have 1.5 billion Chinks in Germany.
>>107850744Same seed too obviously
>>107850740fuck you for making me to
SDXL until Earth gets swallowed by the sun growing into a red giant
>>107850740anon OOM's in real life, cursed gtx 1060 gene
>have a 4090 and 64gb lying around as backup
>5090 in current pc
>a new system would only be 2k usd
I am cockblocked when I gen stuff for work, but if I get a second pc, I'd just use that to gen as well.. Conundrum.
>>107850748Israel has less people than Germany yet they made LTX2 kek, I guess the aryans are ultimately the inferior race compared to the kikes, kek
>>107850741
Yeah, for something like pic related you'll have a lot of trouble giving a glove to the character you want. Normally you'd be making a region out of this whole arm + the hand underneath and just tag what you want the pov arm to be, + "holding hands". If you have a region for the girl on her back then the model in most cases should be smart enough to understand that e.g. "tan skin" on the pov arm region would not apply to the girl's hand, but something like a glove, that can be trouble.

You can try making more exact masks (in an image editor perhaps, transparent pixel = mask), but when you do that, you're limited to using higher ControlNet strengths, otherwise it's kinda useless.
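The "transparent pixel = mask" trick can be sketched like this. Pixels are plain RGBA tuples so the example runs without any image library, and `alpha_to_mask` is a hypothetical helper. The inversion (fully transparent gives mask value 1.0) follows the convention described in the post, so check how your own loader treats alpha before relying on it.

```python
# Turn the alpha channel of an RGBA image into a float mask.
# Assumed convention: transparent (alpha 0) -> 1.0 (masked),
# opaque (alpha 255) -> 0.0 (not masked).

def alpha_to_mask(rgba_rows):
    """rgba_rows: rows of (r, g, b, a) tuples with 8-bit channels."""
    return [[1.0 - px[3] / 255 for px in row] for row in rgba_rows]


image = [
    [(255, 0, 0, 255), (0, 0, 0, 0), (0, 0, 0, 128)],
]

mask = alpha_to_mask(image)
print(mask)  # opaque pixel is unmasked, erased pixel is masked, half alpha lands near 0.5
```

Erasing the exact silhouette in an image editor and exporting with transparency then gives a pixel-accurate region, at the cost the post mentions: tight masks need higher ControlNet strength to be worth it.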
>>107850740
Context too small sry