Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>106978567

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://civitai.com/models/1790792?modelVersionId=2298660
https://neta-lumina-style.tz03.xyz/
https://huggingface.co/neta-art/Neta-Lumina

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>sneething so much he only put his shitty gens in the OP
real bake when?
>>106981016kino collage
Blessed thread of frenship
>>106980892
>>106981043it's a shame wan fucked up the lettering but nice otherwise
https://www.youtube.com/watch?v=HKn8UWvEraE
I dedicate this thread to total SaaS victory
>>106981145
Like Albert Camus said:
>"One must imagine SIMPhius happy"
>>106981016
>>106981145lol
is there some way to generate additional information on top of an image, similar to the way a detailer works? i'm not talking about masking with denoise to redraw parts of the image. i mean more like drawing on top of a layer without changing what's underneath, but still using the image underneath as context. A completely random example would be, say a face, where i want to put white liquid on top of it. Masking with a detailer will require me to put a high denoise, which will redraw the face underneath. This is just a completely random example, and this specific problem probably has a specific solution, like combining a face lora with a white liquid lora, but i'm looking for something broader that can cover a ton of different cases. Maybe stuff like flex does this? But i would prefer not to put an additional unet into my workflow.
>>106981309
The closest thing is edit models, for which you would need to train a lora. Second option is roughly painting it in yourself and doing multiple inpaint/img2img passes until it looks decent, then layering the result on top of the old image and selectively erasing to restore the face details.
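A rough sketch of that "layer it on top and selectively erase" step, if you'd rather script the compositing than do it by hand in GIMP/Krita (filenames and the blur radius are made up, use whatever mask covers the region you regenerated):
[code]
from PIL import Image, ImageFilter

# Hypothetical filenames: the untouched original, the full inpaint/img2img
# result, and a mask that is white over the area you want to keep from it.
original = Image.open("original.png").convert("RGB")
inpainted = Image.open("inpainted.png").convert("RGB")  # same size as original
mask = Image.open("mask.png").convert("L")

# Soften the mask edge so the pasted region blends instead of hard-edging.
mask = mask.filter(ImageFilter.GaussianBlur(radius=8))

# Pixels come from `inpainted` where the mask is white and from `original`
# everywhere else, so the face underneath stays untouched.
Image.composite(inpainted, original, mask).save("composited.png")
[/code]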
>>106981309You could maybe work something with the Krita extension since it uses layers.
>>106981336
nta but i could never get the piece of shit to work properly according to the creator's videos, even following everything exactly. It seemed very limited compared with just copy pasting the image into gimp, making the edit, then copy pasting back into comfy without having to fuck around with some tool. maybe i just installed it when it was bugged or something.
>>106981509majority of local copium (ipadapter, regional prompter, rescale cfg) is complete snakeoil trash.
>>106981509
>>106981336
i mean like it wasn't showing the image in comfyui and vice versa, its nodes weren't doing what they should according to the demo video. i followed the instructions exactly and used the provided workflow. I got fed up with it and just gave up. It would be really fucking cool though to have something that just works as an interface between comfy and something like gimp.
>>106981515
>rescale cfg
I've had great results with this. It's just not something you want always on
>>106981515
PLASMA LATENTS DUDE! it was proven that rescale cfg, cfg++ and other garbage are worse than normal cfg (there was a paper about this), but since they produce DIFFERENT results they think they hit it big, the AUTEURS that we have here LMAO, fucking copers
>>106981617
>there was a paper about this
link?
>>106981617>>106981638Yeah bro don't trust your eyes, trust the paper
2girls when?
Generated image attached to snarky comment when?
Generated image next to offhand reply to snarky comment when?
So few people have this hobby, why even care. I don't get it.
https://www.reddit.com/r/aivideo/comments/1odt02a/wonderland_wow/
When will local be on this level?
>>106981756Sad and desperate
>>106981572that's really cool
>>106981756they just stay in the same position, it's not natural at all
Any local alternatives to ElevenLabs?
>>106981867You can rent indians locally on fiverr
>>106981867Vibevoice, Chatterbox
>>106981870keek
What are everyone's favorite checkpoints these days?
>>106981922Seedream 4
>>106981922NoobAI v-pred and Chroma variants
>>106981922Sora 2
>>106981922Chroma is the only answer.
how much faster is 5090? thinking about upgrading from 5070ti
>>106981984you'll be more happy about the fact it's a 32gb card, with that amount of space you won't have to cope anymore with wan and shit
what version do anons use for chroma now? v50?
>>106981998retard
>>106981724>>106981778Nice style, what model and lora is it?
>>106981922
Checkpoints have different uses for me. When I make an image, I can use up to 4-5 different ones. Upscaling and inpainting is where it's at.
>>106981998
HD version and various mixes
>>106982013
Chroma-DC-2K-T2-SL4 + Digital brushes lora I trained
>>106982012elaborate inb4 no reply
>>106982035no need. if you can't understand what version is the most used you don't need to know.
>>106981932
>>106981960
Post something impressive that makes me buy into chroma hype.
>>106981931
What's so great about this? Looks like typical sdxl slop I see everywhere.
>>106982024Nice, it actually doesn't look like Flux.
>>106981638
man, it was about a new CFG replacement method, it was posted here
it compared CFG, CFG++, RescaleCFG and their method. In all the benchmarks (FID/CLIP/etc...) CFG was still ranked higher than the other meme cfgs.
oh wait, found it:
https://arxiv.org/html/2508.12880v2
>>106982070
let's not forget that they always choose only the benchmarks that show good results, so maybe that's the case with this one too
>>106982095
maybe you're right, and the evaluations are still synthetic at the end of the day (which means they can be benchmaxxed to death). This is interesting because only their stochastic guidance was benchmaxxed for their tests, so we can assume that all the other guidances we see are not touched. in the pdf they also have comparisons with pictures. I've just found this really interesting, and at the end of the day I truly think that messing with guidance is mostly snakeoil
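For anyone wondering what the rescale variant actually computes, here's a rough sketch (assuming the RescaleCFG formulation from the "Common Diffusion Noise Schedules..." paper; `phi` and the 4D NCHW latent shape are just illustrative assumptions):
[code]
import torch

def rescale_cfg(cond: torch.Tensor, uncond: torch.Tensor,
                cfg: float, phi: float = 0.7) -> torch.Tensor:
    # Plain CFG step: push away from the negative-prompt prediction.
    cfg_out = uncond + cfg * (cond - uncond)
    # Rescale so the guided prediction keeps the same per-sample std as the
    # positive-only prediction (high CFG tends to inflate it and burn images).
    std_cond = cond.std(dim=(1, 2, 3), keepdim=True)
    std_cfg = cfg_out.std(dim=(1, 2, 3), keepdim=True)
    rescaled = cfg_out * (std_cond / std_cfg)
    # phi blends between the rescaled result and plain CFG.
    return phi * rescaled + (1.0 - phi) * cfg_out
[/code]
So it only re-normalizes the magnitude of the guided prediction; whether that actually helps or is snakeoil is exactly what the benchmarks above are arguing about.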
can wan do proper real life cartoon crossover stuff? just had a random thought of making a lora using roger rabbit, space jam, etc
>>106981984enjoy your housefire
>>106982123
>I truly think that messing with guidance is mostly snakeoil
which is really surprising to me, what are the odds that CFG is the optimal guidance? it was literally the first one that existed lmao
>>106982123Pony v7 is pretty good at that.
babe wake up, bytedance released a finetune of wan
https://huggingface.co/ByteDance/Video-As-Prompt-CogVideoX-5B
>Video-As-Prompt
that's... interesting...
>>106982166This looks like shit. I'm going back to bed.
>>106982166
https://huggingface.co/ByteDance/Video-As-Prompt-Wan2.1-14B
Wan 2.1 again though
>>106982166
>>106982174
https://youtu.be/S3zpLIMOU4c?t=12
so it's like wan animate but with additional effects?
>>106982166what's the point when Sora2 does this infinitely better?
>>106982166
>cog
I forgot about that one, need to add it to my failbake list alongside hidream
>>106982196kek
>>106982166why did they make the reference videos semi transparent? are they retarded?
>another bytedance scrapslop release that nobody will use
Looks like another batch of interns failed to make it onto the seed team
>>106982239this
>>106982239Let them shill here. They don't have anywhere else to advertise it.
>>106981043Good job keeping the asses on-model
>>106982288there will probably at least be a “comfy wen???” reddit post about it before it gets completely forgotten next week
>>106982166comfy wen???
>>106982070I hate going through these papers. Sure, you can make even a small dick look huge on paper if you measure from the asshole.
>>106981043lora?
Is it possible to have a lora and checkpoint gallery in comfy, like you can with forge?
>>106982520there might be some nodes and I think swarmui does something like that
Anime 1gril obsession = Male delayed development Doll Collecting.
>>106979255so which is the lora that will stop color shift?
>>106982560
Any of them supposedly, depending on the type of video you gen. Some are trained for one long video, some are trained for different transitions etc.
>post locally genned video
>janny warns me
i guess it was too realistic?
>>106980758
>https://github.com/stduhpf/ComfyUI-WanMoeKSampler
I used that for a while but it makes a nonsensical high/low step count if you change the shift value too much, so I went back to bongmath nodes.
>https://github.com/ChenDarYen/ComfyUI-NAG
Basic stuff with wan at cfg 1.
>https://github.com/Extraltodeus/Skimmed_CFG
I'm testing that. In the first step in high noise I gen without the lightx2v lora, which lets me go basically very high cfg; I seem to not have any slow motion issues thanks to that. Then after that I go NAG with cfg 1 + lightx2v.
>>106982647stop posting scat in a blue general
>been genning all day
Bros, it's 28°C in my office with window open and it's 12°C outside..
>>106982697Powerlimit to 75%. Open case.
>>106982715Where's the girl?
>>106982697
>window open and it's 12c outside..
what gpu are you genning with? 4x5090?
just opening the windows when it's cold should be enough
>>106982727Lost for now
>>106982793cute freckles
>>106982809Agreed
Good morning anons, I have a couple of questions, if you would indulge me. What is NAG? Another thing: CFG on low noise in Wan2.2, I've heard 1.0 ignores the negative prompt, but 1.1 doesn't. I've been using 1.1, but I don't know if it's optimal.
>>106982192
>Why doing open and verifiable research at all? Closed SaaS labs mogs anyway bro
Retard
>>106982879
NAG is a way to guide your positive prompt away from the negative when using CFG=1. Normally CFG computes model(negative) + cfg * (model(positive) - model(negative)). When CFG=1 the negative terms cancel and it leaves you with just model(positive), so it's faster to only inference the positive instead of a negative too. NAG tries to give you >1-CFG-style control without incurring that extra inference cost.. but I don't think it works very well desu
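To make the cost argument concrete, a toy sketch of the guidance step (not ComfyUI's actual sampler code; `model`, `pos`, `neg` are placeholder names for a noise-prediction callable and the two conditionings):
[code]
import torch

def guided_eps(model, x, t, pos, neg, cfg: float) -> torch.Tensor:
    eps_pos = model(x, t, pos)
    if cfg == 1.0:
        # The negative term cancels out, so the negative-prompt forward
        # pass can be skipped entirely: one model call per step, not two.
        return eps_pos
    eps_neg = model(x, t, neg)
    # Standard classifier-free guidance.
    return eps_neg + cfg * (eps_pos - eps_neg)
[/code]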
>>106982715existential angst in a hostile universe.
>>106982697check your memory temps and shit yourself
>>106982879
>I've heard 1.0 ignores negative prompt, but 1.1 don't
yes but it also doubles gen time because it has to generate a negative video. you might as well increase cfg if you're gonna use it because 2.0 or 3.0 vs 1.1 is not gonna make a difference in gen time but setting it to 1.0 will
>CFG on low noise in Wan2.2
using cfg on high noise would make more sense because high noise is responsible for establishing the base motion of the video. low noise just fills in the details
blessed anons, could someone share one of those vace workflows that connects videos seamlessly by using ending frames from the first video as reference?
>>106983085
>yes but it also doubles gen time because it has to generate a negative video.
What the fuck, why have I never heard this before?
>>106983099i would but i'm a cursed anon
>>106983099> seamlesslyimpossible
>>106983140workflows from cursed anons are accepted as well
>>106981922Wanmaxxing when a new LocalOptima lora drops for a couple of days before going back to Chroma HD and NoobAI to wait for next videogen model drop.
Diddtech Industries Presents: N-Droid
https://youtu.be/PUtqe_nEovY?si=VT_QgKgD8J-fq3UV
>>106983186i fuck with the beat
>>106983186
diddy is truly the most significant epstein-adjacent person to zoomers, it's so overused
>>106983225oh yeah 6 fingered grim reaper, my favorite
>>106982070this got me thinking... is CFGZero worth it? looking at the paper, it looks pretty good tbqh. does it work everywhere?
>sweet, LTX2 is coming
https://website.ltx.video/blog/introducing-ltx-2
>10 seconds
i weep
getting better with Chroma concept bleed
is there any way to get rid of this error that keeps popping up? it doesn't seem to do anything, it's just annoying
>>106983520Ask your favorite LLM
>>106983186>the yakub appearancekek'd
>>106983434how did you make multi chars for chroma?
how do i stop 10% of gens failing with "moov atom not found"?
Instant video gen on quantum GPUs when?
>>106983605
>>106983593
>how did you make multi chars for chroma?
Tagged characters by their name and look. That's all really
>>106983593
>multi chars for chroma
Use Unicode.
>>106983173post a loli gen and i'll post the wf
>>106983099
it ain't vace it's just wan, i made it myself though so it might be kinda tarded
https://files.catbox.moe/b37nks.png
>>106983099
Try https://github.com/bbaudio-2025/ComfyUI-SuperUltimateVaceTools/tree/main/workflows and grab the long video workflow. it's vace 2.1 but seems to be the best I've found
>https://x.com/OpenAI/status/1758192957386342435
>Sora can create videos of up to 60 seconds
>revealed Feb 15 2024, probably in development since 2023, two fucking years ago
Why are we still stuck with 6-second videos?
>>106983837
>browse to /ldg/
>complain about sora
why
>>106983854Because we want local control. Not Silly Valley toll booths.
>we
>>106983837Can only imagine they have more [spoiler]$$$[/spoiler] to make 60 second models.
>>106983125
It doesn't, it computes a backward step that deliberately makes the entire video slightly worse in all the ways described in the negative prompt, then says "okay, now do the exact opposite of that" and adds (CFG - 1) multiples of that exact-opposite step on to the usual make-it-better step. At least, that's how I'm reading >>106982947
>>106981055nope its correct
>>106983837maybe when tpus become a staple in local machines we'd get shit like that
https://x.com/ltx_model/status/1981346235194683497
new ltx model, 5, 10 and 15 sec gens, audio is better than wan 2.5, 1080p, 2k and 4k res, says it's still fast, weights late nov, we are saved
Neta does non-anime 2D stuff quite well too
>>106983822>>106983831i'll give them a shot, thanks anons
>>106983593Even SDXL can do this if the characters look absolutely nothing alike
>>106983085I knew that increasing CFG in High helps with more motion. Would increasing CFG in Low help with details? So if I'm willing to take the hit on gen time going >1.0 might as well put low at 2 to improve details for example?
>t2v wan loras
is there any use case for image trained loras for i2v? training for t2v makes sense to control the generated characters/style but starting with an image i can only think of use cases where you would want trained motions
>>106984222Can it do tiddies?
man I need the 1girls laughing at me, WHERE ARE THEY???
Now that we got updated 2.2 i2v light loras I might finally be able to abandon smoothmix.
It's still very good for nsfw: https://files.catbox.moe/wjtus1.mp4 (nsfw)
>>106984419
>man I hope it's not the fucking frieren futa faggot posting his gens again
>open link
what did I expect.
ngl your gens are good but FUCK MAN change the subject, fucking frieren turbonormie SLOP
>>106983299
>6 fingered grim reaper, my favorite
Could go up to 7 if you ask nicely
>>106984457
Lul. for i2v futa is easiest and most popular. I've tried other girls, but always come back to frieren, I have the most fun with them.
I did a tentacle one before this, but it's not worth the effort, no support from loras.
>>106984225
same joke. different post.
cooking some gens, I feel I'm gonna make the next collage bros!!!
>>106984419smoothmix can do that without other loras?
>>106984697but its shit for loras
>>106983434What? Chroma can do this style? Perhaps a further anime tune can save it after all. What is the prompt?
>>106984677Halloween stuff?
>>106984697
You mean I got the results without other loras? Yeah. But since it was too complex of a pose I had to rely on normal prompting, like she was chewing on the nipple rather than sucking.
It's not t2v.
if i switch from cfg 1 to say 3.5 or 5, do i have to increase step count too, or is the extra time the only downside?
do the light loras ruin what you'd gain from the increase in cfg?
https://www.reddit.com/r/StableDiffusion/comments/1ody8v0/no_update_since_flux_dev_are_blackforestlabs_no/
FLUX.2 is never happening
BFL confirmed sellout
Comfy soon to follow (what's stopping them?)
If we had an actual FLUX.2 you'd bet it'd be a huge upgrade to original Flux, no more T5, API tier prompt understanding, proper understanding of styles, etc...
Shame, really.
>>106984900Flux is a deadend model just like SD. The only exciting thing going for it is Chroma. We now rely entirely on Chinks to get our next gen models, and they are not that good at training models to be frank. Look at the disaster that was Hunyuan Image 3. It's over.
>>106984746
this has to be two different prompts causing that harsh ass jump
>>106984810
the whole point is it's faster to go with 1 cfg, the loras enable you to get good enough output without a higher cfg. you can raise cfg and leave the steps the same or raise the steps but you're gonna get more unpredictable results. you'll have to experiment with lora strength as well.
>>106984727
>Chroma can do this style?
Yeah with lora
>>106981145lmao nice
>>106985046
>tiny hand
>breast way too far from the armpit
>swollen shoulder
so this is the power of chroma huh?
>>106985143
>tiny hand
big girl
>breast way too far from the armpit
she's ((posing))
>swollen shoulder
what you don't realize is that's her JO arm. She's a tough girl.
>>106985161
>cope
>cope
>cope
I accept your concession
>>106985161Perspective on the gun is also fucked up and it shouldn't be a glock anyways, i guess chroma is only good for slop
>>106985166>>106985173I guess Chroma girls are just built different.
hmmm 1girl bros??????
post your 1girl (hint: you wont)
>>106985204oh yeah bro hit me with the abstract grainy chroma slop I fucking love it my man
>>106985214still better and more creative than 99% of the plastic shit posted here, only the lumina 2 model looks better
>>106985214Love negative space in your text my man. The lack of an image next to it makes it really symbolic
Hello, what's the verdict on Ovi?
>>106985300useless trash
just saw this on its page, a list of artist styles netayume knows https://gumgum10.github.io/gumgum.github.io/
>>106985300I did a couple gens. It's not great.
>>106985362
>https://gumgum10.github.io/gumgum.github.io/gumgum.github.io/
doesn't work
>>106985362
>>106985378
>https://gumgum10.github.io/gumgum.github.io/
nta, worked for me
>>106985378it works
>>106985362Add to OP >>106981016
is the 2.1 lightning lora still best for 2.2?
If one would want to start getting into QWEN or Chroma, where would one start?
>>106985362
>for v2 not v3.5
dang. not the end of the world tho
>>106985508
3.5 is a bigger dataset / trained more
>>106985304>>106985373Pity. Thanks.
>>106985511
the author seems to say it's not DPO'd as well, which is good
>>106985214This but unironically. Chroma grain is so aesthetic.
>>106985635I prefer anime 1girls desu
>>106983249Very pretty water
>>106985507The very beginning? https://www.comfy.org/download
>>106985635
yes but what if anime 1girls that have chroma grain abstract backgrounds that both show off how futuristic they are and also emphasize the size of their boobs
the future is in model bashing
>>106985664i like my anime girls petite
Bros, I've been out of the loop for a year. Is it possible to run a wan 2.2 quant on a poorfag card with 8GB of vram? And, can I stick to my boomer UI (forge neo)?
>>106985776no, you can only be poor OR stupid, not both
>>106985657
I'm already well versed in Cumfy.
I'm asking more about which base workflow would be a good start to learn how it works, so I can understand the process and fine tune it for me. Do the models fit in 12gb, or do they have gguf/quant ones?
kijai-sama delivers
https://huggingface.co/Kijai/WanVideo_comfy/tree/main/LoRAs/Stable-Video-Infinity
>>106985845
it's still not fully implemented, it needs temporal masking