Discussion of Free and Open Source Text-to-Image/Video Models
Prev: >>107344153
https://rentry.org/ldg-lazy-getting-started-guide
>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP
>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows
>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe
https://github.com/ostris/ai-toolkit
>Z
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
https://huggingface.co/Comfy-Org/z_image_turbo
>WanX
https://rentry.org/wan22ldgguide
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd
>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46
>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/
>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage
>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg
>Local Text
>>>/g/lmg
>Maintain Thread Quality
https://rentry.org/debo
For those who weren't there on the previous thread, increase your shift to get rid of the noise.
god imma cum>oooh husbant you want pray street fighta 2 for tha supa nintendo??
Kijai's Torch Compile node can now handle zimage.
Is there a node where I can input one dimension, pick a ratio (2:3, 9:16, etc.), and have the result rounded to a multiple of x (16/32/64), instead of typing everything by hand?
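For anyone who wants the arithmetic outside of a node: a minimal Python sketch of "one dimension + ratio + round to a multiple". The function names `snap` and `dims_from_ratio` are made up for illustration and are not part of ComfyUI or any custom node; the ratio is assumed to be given as width/height.

```python
from fractions import Fraction


def snap(value, multiple: int) -> int:
    """Round value to the nearest multiple, never going below one multiple."""
    return max(multiple, int(round(value / multiple)) * multiple)


def dims_from_ratio(width: int, ratio: str, multiple: int = 16) -> tuple[int, int]:
    """Return (width, height) for a width/height ratio string such as '2/3' or '9/16'.

    Both sides are snapped to `multiple`, so the final ratio is only approximate.
    """
    r = Fraction(ratio)           # Fraction accepts strings like '9/16'
    w = snap(width, multiple)     # snap the side the user typed
    h = snap(w / r, multiple)     # derive the other side, then snap it too
    return w, h


if __name__ == "__main__":
    print(dims_from_ratio(1024, "2/3", 16))
    print(dims_from_ratio(720, "9/16", 64))
```

Because both sides get rounded, a large multiple like 64 can pull the result noticeably off the requested ratio, so it is worth checking the output before queueing a batch.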
left is Z-image and right is Flux 2, I mean, come on... what the fuck did the bfl fags do to make it so bad?
>>107345792
No, but you can use scale latent if you already have the ratio dialed in right and you just want to make it bigger/smaller.
Can someone explain to me how they got the turbo model before the base model? I thought they made the turbo model from the base model (with that teacher distillation shit)
>>107345792
Wait until SwarmUI implements native support for Z, it has an excellent dimension value tool
>>107345765
Thanks, made a big difference.
Anyone else fucking around with the img2img? what're your go to settings so far?
Bloatmaxxers, have we been blown the fuck out?
>>107345809Safety
>>107345765>b-but Z-image has muhh jpg artif-ACKkek
Why does Anon love cheese pizza so much?
also thank you >>107345607
>>107345868IT'S TASTY AND RIGHT-OUT-OF-THE-OVEN
>>107345815our based chink overlords released the turbo model first, they're still sitting on the base model until you get your social credit score higher
>>107345898the twitter chink said that the base model is "bad", so if turbo got distilled from this "bad" model and we got this kino, does it means we'll get the final boss later? lmao
>>107345898i mean he wanted an explanation on how it works in the nitty gritty, but lol true our social credit scores are in the shitterespecially the simpsons anon since he turned into a pedo THROUGH z-image. funniest shit i've ever seen in these threads.
>>107345910prompt? I like that anime style
>>107345765
I'm not sure, but it might be better to keep the noisy version and then run a 2nd pass KSampler on it for like 2 steps.
I do something like that with Illustrious, where I intentionally create a shitty version (with RescaleCFG) then clean it up on a 2nd pass, because the shitty version has more intelligent composition.
"the video is shot like a first person view and the viewer is the camera-man.the demon woman crawls up to the camera man with seductive motions as the foliage on the ground gives away as she moves on it her hair swaying with the movements with realistic physics and the camera follows her as she crawls closer to the camera and places her hands on the camera mans thighs as she smiles seductively and looks up to the viewer as the camera now looks down on her face and the forest ground as the background, point of view style."How do I prompt for the camera being an actual pov, first person view?
>>10734591680s and 90s retro anime style illustration
>>107345918
there's a node for that, it lets you use RescaleCFG for a certain amount of time
https://github.com/BigStationW/ComfyUi-RescaleCFGAdvanced
>>107345909Nobody is born deviant.
>>107345909Is that z-image?How did you make her thick but not fat?
>>107345928I used to use that one but I made a better version of it
God damn it does small detail so well when other models completely shit themselves.
>>107345939img2img my nigga
>>107345909I like this.>>107345939But this... is perfection
>>107345956lame
>>107345954you'd think they used a godlike tier vae but it's still the regular flux vae lmao
>>107345961>the virgin "husband let's go running in the park!" vs the ascended "ooo husbant my chowestoral levelrs arr too high.."
>>107345954it's humiliating for the rest of the field to be this much pwned by a 6b model desu lmao
>specify working out in the gym>adds visible sweat beadsS O U L
>>107345961I see...a fellow architect
Why is wan2gp slow as shit vs comfy? I downloaded it specifically to avoid dealing with comfy's shit, tried to generate a 5 second video, and it claimed it'd take over 2 hours. Comfy would have had it done in ~10 minutes on my shitware.
>>107345921
>>107346018
>Why is wan2gp slow as shit vs comfy?
Comfy has the best back end of all of them, not even close
>>107345964IMAGINE IF THEY... FINETUNE IT.THE POSSIBILITIES OF EVEN MORE DETAILS... IMMMMMMM CUUUUUUMMIIIIIIIIIIIIIIING I'M CUMMING I'M CUMMING I'M CUMMIIIIIIIIIIIIIIIIIIIIIIIIIIIIINGGGGGGGGGGG
It deeefinitely doesn't have much knowledge on nudity. But it doesn't seem censored though. >>107346044wwoooaaahh
*insert pigeon gooner with lots of spilled milk around*
top row is Grok
bottom row is Zimage
Zissies, how are we coping with this one?
>>107346002can you catbox that so I can see what kind of prompts you're using? I have no idea how to prompt this shit.
>>107346080Oh wow, didn't know Grok improved so much
>>107346080nice
>>107346095
>photo of character
>brief desc of what the character is doing in what setting
and that's it basically. Should probably feed that shit into gemini or something to improve it, make it way more detailed. But it works.
different variations of photorealistic tags work, like amateur photo or "Polaroid SX-70 manipulation photograph" like one anon was using in his txt2img.
>>107345964>>107346044someone tell lax about this so he can tell the chinks to finetune it so we can have even better detailshave you ever seen nai v4.5's details? well they finetuned flux's vae and looks better
>accidentally gen a 1080p 16:9 wan video at 81 frames>only half of my 32gb vram is used???
>>107346080Backgrounds have better architecture in all three Z-Image.Women are more beautiful/cute in all three Z-Image.Western image generators only make sluts and avant-garde (LGBT-style).
detailmaxxing
zimage being viable in the long term will all depend on how trainable it is
>>107346098The skin of the girls looks noisyThe rest of the pic doesn't
>>107346080>Huge cloud-only SAAS model is slightly better than a 6b local modelYeah, how on earth will I cope...
>>107346095How are people getting it to do characters? I asked it for mercy from overwatch as a photorealistic girl and it gave me totally unrelated slop that looked like it was a 3D rendering of a chink mobile knockoff game of a knockoff.
>>107346124you guys should stop complaining about this, it's much better that an AI model is slightly too noisy instead of the opposite. there are 50 different ways to reduce noise.
>>107346126>6b local model that's a turbo distill of an unfinished checkpoint>running on steps as low as 4-8honestly its like apples to slightly better apples running on datacenter PC's ran by jeets and musk. all things considered we're eating pretty fucking good.>>107346129we're img2img maxxing now son.
>>107346080
comfyui nodes 2.0 verdict?
hows the gooning for z-image so far? i have yet to see one
>>107346113Keanu Reeves doesn't look like he knows his way around a piano
>>107346101
Glitch in the matrix
>>107346149look at the catboxes of the previous threads
>>107346157
It's a distilled model, so not so much from the get go.
>>107346138I fucking hate comfy and don't know how to do this shit like I would be able to in forge. Help a retard out with a workflow?
>>107346135Don't bother responding to retards, these are the kind of morons who want shitty encodes where they remove all film grain making everything uncanny.
>>107346113
Instead of using Qwen3 as text encoder can I just use Gemma or something else? Already have tons of other models but not Qwen.
>>107346179
They will supposedly release the non-distilled base model weights too.
>>107346146
Haven't tried them yet, hopefully they add more of the basic functionality that you currently need third party nodes for.
>>107346160I'll help out my nigga retard, no problem.https://files.catbox.moe/ec22tr.pngadjust denoising to 0.3 or lower depending on the image, i'm not sure what exact settings i was using for this image but set it to 8 steps euler simple to start with if it isn't already.https://files.catbox.moe/ec22tr.png
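For reference, a rough sketch of what "8 steps at 0.3 denoise" does to the schedule. This assumes the way ComfyUI's KSampler is commonly understood to handle denoise (build a longer schedule of int(steps / denoise) steps and keep only its tail); that behaviour is an assumption here, not something confirmed in the thread. The linear schedule and the function names are placeholders for illustration, real schedulers like "simple" produce different values.

```python
def linear_sigmas(steps: int) -> list[float]:
    """Placeholder schedule: steps + 1 noise levels going linearly from 1.0 to 0.0."""
    return [1.0 - i / steps for i in range(steps + 1)]


def img2img_sigmas(steps: int, denoise: float) -> list[float]:
    """Noise levels actually sampled for an img2img pass (assumed behaviour).

    With denoise < 1 the schedule is built for int(steps / denoise) steps and
    only the last `steps` steps are run, so sampling starts from a partially
    noised version of the input image instead of pure noise.
    """
    if denoise <= 0.0:
        return []
    if denoise >= 1.0:
        return linear_sigmas(steps)
    total_steps = int(steps / denoise)
    return linear_sigmas(total_steps)[-(steps + 1):]


if __name__ == "__main__":
    sigmas = img2img_sigmas(steps=8, denoise=0.3)
    print(len(sigmas) - 1, "steps, starting at noise level", round(sigmas[0], 3))
```

Under these assumptions, lowering denoise lowers the starting noise level, which is why small values keep the input composition and larger values let the model repaint more of it.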
>>107346181that coffee can kick in any day, preferably right now so i don't do silly things like re link the same catbox>>107346179god i love her face here, very cute.
>>107346094very nice
>>107346149Decent, but of course the true potential will only be tapped once you can train loras.
Hey hey Anon, Anon here.
Some more from the mill.
I'm still sorting and checking, there's a lot of things that work absolutely great or decent for these prompts.
res_2m, euler, sa_solver
simple, ddim_uniform
Those seem to be favorites so far. Fucking model is a beast, most of it is at least decent, while some of them are pretty mind blowing for a small turbo model.
China, man. Fucking China.
Full sized of all prompts:
Frilly titties: https://files.catbox.moe/ue2243.jpg
Trash girl: https://files.catbox.moe/3u1zwe.jpg (Resized version picrel)
Rat girl: https://files.catbox.moe/cpz3uy.jpg
Mixed media: https://files.catbox.moe/jtfu1h.jpg
Oil painting: https://files.catbox.moe/6kvktz.jpg
>>107346144kek
>>107346194>Rat girl>click>actually humanMost disappointing thing that happened today.
>>107346181>>107346185So what is the second image? The workflow is kind of moot until we know what the input was.
>>107346207If it stays like that it seems like your day will be great.
>>107346103
A worthy cope
>>107346129
You can try "cosplay of mercy (overwatch) ..."
But I don't know if it knows mercy natively
>>107346179I fully expect something bad to happen because only gay and retarded things are allowed to happen in this world. Euther they won't release the weights or the nondistilled model will suck, there is always a catch
>>107345765
I made this change and it created way more blocky artifacts in my image. For reference, what sampler/scheduler/step count do you use anon?
>>107346156>>107346188yeah i checked on civit ai, it doesn't know what pussy or dick looks like but atleast its uncensored, ill just have to wait for some legend to properly train it
>>107346224
the regular euler + simple, 15 steps
>>107346221You already posted this.
>>107346227>i checked on civit aican you show one such link I can't find them
>>107346227why not both?https://files.catbox.moe/vdwyr0.jpg
>>107346237I will keep doomposting until proven otherwise
>>107346221>I fully expect something bad to happen because only gay and retarded things are allowed to happen in this world.nothing ever happens
>>107346230
Thank you, will experiment more. Maybe it was because I was generating 2D art instead of photo style, or maybe I just wasn't using enough steps.
>>107346153Definitely. I guess it's because there's not much detail, just flat 2d.
>>107346256me in the pic
>>107346238
>https://civitai.com/models/2168935/z-image?modelVersionId=2442439
or you can filter by z-image, on the images tab
>>107346243
bleach now.
anyone been writing down actor/celebrity knowledge for Z?
>>107346256GOD he's literally me
>>107346256lmaooooo
A note from a VRAMlet: loading the full model in Comfy with one of the fp8 weight modes seems to make the outputs notably more artifact riddled than loading q8 GGUF or even q6.
GGUFs are a little slower for me, but everything is slow anyway since my GPU is old so whatever, if I'm gonna wait anyway might as well get something good.
how does shift work in how it scales up/down? is there a point where i SHOULD expect it to start breaking down from being too high? Like 10?
because i think 10 might be better than 7 kek
oopsies
>>107346261https://civitai.com/images/111748988damn there's some good images in there
>>107346282if 10 works for you, then what the fuck is the point of your question? just experiment until you find what works for you, fuck you.
>>107346296this is the type of shit you let your therapist know about bud. autism shouldn't be this hostile.
>>107346282
>is there a point were i SHOULD expect it to start breaking down from being too high?
AI models are black boxes so the only answer comes from experimentation, see for yourself
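On the "how does shift scale" part: a small sketch of the time shift used by SD3-style flow-matching samplers, where each noise level sigma in [0, 1] is remapped to shift * sigma / (1 + (shift - 1) * sigma). Whether Z-Image's sampling path uses exactly this formula is an assumption here, and the linear base schedule and function names are placeholders for illustration.

```python
def shift_sigma(sigma: float, shift: float) -> float:
    """Remap a noise level in [0, 1]; shift = 1.0 leaves it unchanged."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def shifted_schedule(steps: int, shift: float) -> list[float]:
    """Placeholder linear schedule (1.0 down to 0.0) with the shift applied."""
    return [shift_sigma(1.0 - i / steps, shift) for i in range(steps + 1)]


if __name__ == "__main__":
    # Compare where the midpoint of the schedule lands for a few shift values.
    for shift in (1.0, 3.0, 7.0, 10.0):
        print(shift, round(shift_sigma(0.5, shift), 3))
```

Under this formula the endpoints stay at 1 and 0 and only the middle moves: a higher shift keeps more of the steps at high noise levels and leaves fewer for low-noise detail work. There is no hard cutoff in the math, so where 10 starts looking worse than 7 still has to be found by testing.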
>>107346213i tried widowmaker and it produced a regular brunette with no ass
Reminder that BFL tried to make a video model.
How big do you think that would have been? 500B? 1T?
>>107346300hostility towards people of lesser intellect is always valid
>>107346315he's smarter than you though
https://civitai.com/images/111700885kek
base model really needs to be released so I can train a big ass lora
>>107346322me on the right
>>107346221>nondistilled model will suckthis one, they delayed the release of the base model because of how bad it was, and are trying to unfuck it, see previous threads
Reminder that generating CSAM is illegal and is punishable with jail time
>>107346332THE DONKS... IT WILL BE A GOOD FUCKING ERA
how bad is the q3 quant?t. gigavramlet
>>107346321i doubt it
>>107346332>>107346364>genning fat doinks in Z-image>>107346360w-would..
>>107346339>not 2B's skull with a blindfold and white hairtch
>>107345765sovl vs sovless
>>107346368what is wrong with you people? go for bigger quants and offload... >>107345353
>>107346332if it actually releases and it trains well ill become a member of the CCP
>>107346344only in cucked countriesand how will they know
>1male, male focus>get a vagina on the manepic
>>107346245>nothing happens>gay and retarded thing happens >good thing happensorder of probability
>>107346398based zimage showing you what you are for genning men.
>>107346398Pooners are men chud
>>107346332Just how big are we talking?
>>107346398>prompting model trained with natural language with tags
>>107345792this?
>>107346421BIIGGGEERRR
>>107346421100 epochs minimum
Starting some photo style tests.
Euler/Simple, 9 Steps, shift of 3.
Prompt prefixes used:
An image taken on a 1990s analog disposable camera of
An amateur candid photograph taken on an iphone of
An amateur candid photograph taken on Sony Cybershot from Flickr of
Anaesthetic 5, a still from a movie scene of
A professional 35mm film photography of
A 1960 technicolor film still of
A classic cinema film still of
So far, Chroma outshines it, but there's a lot of prompts to go through. Pardon the spam for a bit.
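If anyone wants to reproduce this kind of prefix test, a minimal sketch for expanding prefixes into full prompts. Only a few of the prefixes above are included, the subject string is a hypothetical placeholder, and how the prompts get fed to a sampler (fixed seed, Euler/Simple, 9 steps, shift 3 as in the post) is left to whatever frontend is in use.

```python
# A few of the prefixes from the post; each one ends with "of".
PREFIXES = [
    "An image taken on a 1990s analog disposable camera of",
    "An amateur candid photograph taken on an iphone of",
    "A professional 35mm film photography of",
    "A 1960 technicolor film still of",
]


def build_prompts(prefixes: list[str], subject: str) -> list[str]:
    """Join every style prefix with the same subject so only the style varies."""
    return [f"{prefix.strip()} {subject.strip()}" for prefix in prefixes]


if __name__ == "__main__":
    # Hypothetical subject, not taken from the thread.
    for prompt in build_prompts(PREFIXES, "a woman reading in a cafe"):
        print(prompt)
```

Keeping the seed and subject fixed while only the prefix changes is what makes the resulting grid comparable across styles.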
this is goodhttps://civitai.com/images/111711391
>>107346445>So far, Chroma outshines it,really? I don't get that much difference on chroma when prompting for those kind of styles
>>107346344>decided to wan2 some images with lolis without any anime lore>mfw
>>107346445thank you for your service once again, and don't apologize, this stuff is super helpful.
>>107346435That is pretty big. I'm very much looking forward to the tunes.
To be fair, you have to have a very high IQ to understand danbooru tags. The precision is extremely subtle, and without a solid grasp of of the english language most of the tags will go over a typical prompter's head.The prompters understand this stuff; they have the intellectual capacity to truly appreciate the depths of these tags, to realize that they're not just jeets describing a woman- they tag something deep about 1girl. As a consequence people who dislike booru tagging truly ARE idiots- of course they wouldn't appreciate, for instance, diffraction_spikes, ben-day_dots, or tenshi_kaiwai. I'm smirking right now just imagining one of those "natural language" simpletons scratching their heads in confusion as they desperately google what the name of the light patterns that appear at the bottom of a pool is called. What fools... how I pity them. And yes by the way, I DO have a danbooru platinum account. And no, you cannot see it. It's for the ladies' eyes only- And even they have to demonstrate that they have a positive feedback score within 5 points of my own (preferably lower) beforehand.
>>107346382oh shit it knows 2b, nice
>>107346462Yeah, I think so. Picrel is a plot of the same thing on Chroma V46 I did back in July.
>>107346445I'm getting really sick of your "chroma is better" posts bro, no one is going to use chroma so shut the fuck up. It has no future.seriously, do you really see a future for it? no.
>>107346386
You don't even need that node, ComfyUI offloads the required amount by default if you load a model that won't fit in VRAM alone. I tested it and it was not any faster to manually specify a GB amount with the DisTorch2 loader compared to just using the built in UNET loader. I guess it might depend on setup. But in any case yeah, just offload some.
>>107346477this but almost fully unironically
>>107346483why not? it's completely normgroided
>>107346445Chroma is definitely better when it comes to style variety, but it's also like 5 times slower.
>>107346487the 3 images are the same thing (looks like a seed difference)
for an ancient vramlet (2070 super 8gb) chroma is fucking slow. z-image is also fucking slow. this is pain
>>107346445
Z-Image is smaller and not slow as molasses, that's the reason why it already BTFO chroma. It doesn't need to be better, it just needs to be good enough, a significant improvement over XL, and fast.
>>107346491
>ComfyUI offloads the required amount by default if you load a model that won't fit in VRAM alone.
it doesn't work well for me, it doesn't put enough in RAM, and when it starts to run it OOMs
>>107346515How slow are we talking
>>107346504And twice as large. If Z-Image responds well to training it will own image generation
>>107346519>It doesn’t need to be betterbut the funny part is that it is better, it doesn't have that oversaturated color and the details are much better (better anatomy too)
>>107346488
I'm not shilling for Chroma, just saying that in terms of photography styles, of all the models I have tested with it (Krea, Chroma, Wan, Qwen, Hunyuan), it has been the only one so far that got a lot of styles at least partially correct, while others consistently shit the bed.
The Z images are still leagues beyond what Krea/Qwen/Wan/Hunyuan managed to do.
>>107346508
Seed is the same on all gens. But yeah, you're correct. Chroma was still very much in my mind as the one model that actually got close to what you'd be going for.
>>107346533
9 s/it on z-image
>>107346244
i will be happyposting. im happy and content with what we got. immense research paper as well. 640,000$ for a model of this magnitude, imagine how much xAI/grok spent for their model, which is only slightly better than a 6b model
>>107346550SAME. This is a GOOD VIBES general for the REST OF THE YEAR. We're EATING GOOD.
>>107346281have you tried the bf16 weights? i get images in only 25-30s (text encode included in 30s) on a 3060 12gb
>>107346550
>i will be happyposting.
based, it's rare that we get good news, so I'm gonna make the most of this moment
open source keeps winning lads
>OpenAI has been hacked
>ChatGPT maker OpenAI confirms major data breach, exposing user's names, email addresses, and more
>https://huggingface.co/ChenkinNoob/ChenkinNoob-XL-V0.1
looks like there's a new NoobAI based finetune
>>107346545Could be worse... could be 20 s/it!
>>107346592
More good news? And on Thanksgiving day of all days? I could not be happier.
>>107346445
Here's a silly one testing camera types and brands.
As always, not too interesting, but it's kinda hilarious how accurately it captured the camera models themselves when it put them in the images.
A Polaroid instant photograph from the 1970s of
A Kodak Brownie box camera photograph from 1950 of
A Hasselblad medium format photograph of
A Leica M6 rangefinder photograph of
A Canon AE-1 35mm photograph of
A Nikon FM2 photograph from the 1980s of
A Pentax K1000 student photography of
A Minolta X-700 photograph of
An Olympus Trip 35 vacation photo of
A Fujifilm X100 street photography of
A Lomo LC-A photograph of
A Diana F+ toy camera photograph of
A Holga 120N photograph of
A Pinhole camera long exposure of
>>107346592
>ChatGPT maker OpenAI confirms major data breach, exposing user's names, email addresses, and more
bruh, how is that good news, a lot of people are using ChatGPT, even people here
>>107346160
>>107346181
Slightly cleaner img2img workflow with resolution control if you'd like.
https://files.catbox.moe/ix9cek.json
ive been gone 12 hours.how big did Z image blow up?
>>107346594Yeah, and it's a proper noob instead of waislop.
>>107346611>how big did Z image blow up?biggest blowup in the history of /ldg/, and I was here during the blowup of Flux and Wan
>>107346603>Kodak momentThanks anon
>>107346610thanks bud.giving you the biggest thumbs up i possibly can IRL.
>>107346621What do you mean?
>>107346599and how fast are vramchads?
they should've called it zeus image or something for how GODLY and lightning FAST it is
>>107346611
>>107346621
>work with most popular frontend for day 0 support
>small model that literally anyone with any gpu can run fast
>limited in seed variance but still sota for ootb realism, ip knowledge, celeb knowledge, anatomy, camera control, text etc
simple formula
>>107346642I run at around 20 miles per hour.
>>107346603you know it has an effect when it changes the default asian type to caucasian lmao