Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107344153

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe
https://github.com/ostris/ai-toolkit

>Z
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
https://huggingface.co/Comfy-Org/z_image_turbo

>WanX
https://rentry.org/wan22ldgguide
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
For those who weren't there on the previous thread, increase your shift to get rid of the noise.
god imma cum>oooh husbant you want pray street fighta 2 for tha supa nintendo??
Kijai's Torch Compile node can now handle zimage.
Is there a way to input a dimension value, then a ratio (2/3, 9/16, etc.), and have the result rounded to a multiple of x (16/32/64) instead of typing everything?
left is Z-image and right is Flux 2, I mean, come on... what the fuck did the bfl fags do to make it so bad?
>>107345792No but you can use scale latent if you already have the ratio dialed in right and you just want to make it bigger/smaller.
Can someone explain to me how they got the turbo model before the base model? I thought they made the turbo model from the base model (with that teacher distillation shit)
>>107345792Wait until swarmui implements native support for Z, it has excellent dimension value tool
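For anyone who wants the dimension + ratio + rounding math outside of a UI, here is a minimal sketch. It assumes the "dimension value" means the longer side and that both sides should snap to the nearest multiple; the function names are made up for illustration and are not part of ComfyUI or SwarmUI.

```python
def snap(value: float, multiple: int) -> int:
    """Round value to the nearest multiple (halves round up), never below one multiple."""
    return max(multiple, int(value / multiple + 0.5) * multiple)


def dims_from_ratio(long_side: int, ratio_w: int, ratio_h: int, multiple: int = 16) -> tuple[int, int]:
    """Return (width, height): longer side is long_side, both sides snapped to `multiple`."""
    if ratio_w >= ratio_h:
        # Landscape or square: the width is the long side.
        width, height = long_side, long_side * ratio_h / ratio_w
    else:
        # Portrait: the height is the long side.
        width, height = long_side * ratio_w / ratio_h, long_side
    return snap(width, multiple), snap(height, multiple)


print(dims_from_ratio(1024, 2, 3, 16))   # (688, 1024)
print(dims_from_ratio(1280, 9, 16, 64))  # (704, 1280)
```

Note that snapping changes the ratio slightly (688x1024 is not exactly 2:3), which is the usual trade-off when rounding to 16/32/64.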
>>107345765Thanks, made a big difference.
Anyone else fucking around with the img2img? what're your go to settings so far?
Bloatmaxxers, have we been blown the fuck out?
>>107345809Safety
>>107345765>b-but Z-image has muhh jpg artif-ACKkek
Why does Anon love cheese pizza so much?
also thank you >>107345607
>>107345868IT'S TASTY AND RIGHT-OUT-OF-THE-OVEN
>>107345815our based chink overlords released the turbo model first, they're still sitting on the base model until you get your social credit score higher
>>107345898the twitter chink said that the base model is "bad", so if turbo got distilled from this "bad" model and we got this kino, does it means we'll get the final boss later? lmao
>>107345898i mean he wanted an explanation on how it works in the nitty gritty, but lol true our social credit scores are in the shitterespecially the simpsons anon since he turned into a pedo THROUGH z-image. funniest shit i've ever seen in these threads.
>>107345910prompt? I like that anime style
>>107345765I'm not sure but it might be better to keep the noise version then run a 2nd pass ksampler on it for like 2 steps.
I do something like that with Illustrious, where I intentionally create a shitty version (with RescaleCFG) then clean it up on a 2nd pass, because the shitty version has more intelligent composition.
"the video is shot like a first person view and the viewer is the camera-man.the demon woman crawls up to the camera man with seductive motions as the foliage on the ground gives away as she moves on it her hair swaying with the movements with realistic physics and the camera follows her as she crawls closer to the camera and places her hands on the camera mans thighs as she smiles seductively and looks up to the viewer as the camera now looks down on her face and the forest ground as the background, point of view style."How do I prompt for the camera being an actual pov, first person view?
>>10734591680s and 90s retro anime style illustration
>>107345918there's a node for that, it lets you use rescaleCFG for a certain amount of time
https://github.com/BigStationW/ComfyUi-RescaleCFGAdvanced
>>107345909Nobody is born deviant.
>>107345909Is that z-image?How did you make her thick but not fat?
>>107345928I used to use that one but I made a better version of it
God damn it does small detail so well when other models completely shit themselves.
>>107345939img2img my nigga
>>107345909I like this.>>107345939But this... is perfection
>>107345956lame
>>107345954you'd think they used a godlike tier vae but it's still the regular flux vae lmao
>>107345961>the virgin "husband let's go running in the park!" vs the ascended "ooo husbant my chowestoral levelrs arr too high.."
>>107345954it's humiliating for the rest of the field to be this much pwned by a 6b model desu lmao
>specify working out in the gym>adds visible sweat beadsS O U L
>>107345961I see...a fellow architect
Why is wan2gp slow as shit vs comfy? I downloaded it specifically to avoid dealing with comfy's shit, tried to generate a 5 second video, and it claimed it'd take over 2 hours. Comfy would have had it done in ~10 minutes on my shitware.
>>107345921
>>107346018>Why is wan2gp slow as shit vs comfy?Comfy has the best back end of all of them, not even close
>>107345964IMAGINE IF THEY... FINETUNE IT.THE POSSIBILITIES OF EVEN MORE DETAILS... IMMMMMMM CUUUUUUMMIIIIIIIIIIIIIIING I'M CUMMING I'M CUMMING I'M CUMMIIIIIIIIIIIIIIIIIIIIIIIIIIIIINGGGGGGGGGGG
It deeefinitely doesn't have much knowledge on nudity. But it doesn't seem censored though. >>107346044wwoooaaahh
*insert pigeon gooner with lots of spilled milk around*
top row is Grok
bottom row is Zimage
Zissies, how are we coping with this one?
>>107346002can you catbox that so I can see what kind of prompts you're using? I have no idea how to prompt this shit.
>>107346080Oh wow, didn't know Grok improved so much
>>107346080nice
>>107346085
>photo of character
>brief desc of what the character is doing in what setting
and that's it basically. Should probably feed that shit into gemini or something to improve it, make it way more detailed. But it works.
Different variations of photorealistic tags work, like amateur photo or "Polaroid SX-70 manipulation photograph" like one anon was using in his txt2img.
>>107345964>>107346044someone tell lax about this so he can tell the chinks to finetune it so we can have even better detailshave you ever seen nai v4.5's details? well they finetuned flux's vae and looks better
>accidentally gen a 1080p 16:9 wan video at 81 frames>only half of my 32gb vram is used???
>>107346080Backgrounds have better architecture in all three Z-Image.Women are more beautiful/cute in all three Z-Image.Western image generators only make sluts and avant-garde (LGBT-style).
detailmaxxing
zimage being viable in the long term will all depend on how trainable it is
>>107346098The skin of the girls looks noisyThe rest of the pic doesn't
>>107346080>Huge cloud-only SAAS model is slightly better than a 6b local modelYeah, how on earth will I cope...
>>107346095How are people getting it to do characters? I asked it for mercy from overwatch as a photorealistic girl and it gave me totally unrelated slop that looked like it was a 3D rendering of a chink mobile knockoff game of a knockoff.
>>107346124you guys should stop complaining about this, it's much better that an AI model is slightly too noisy instead of the opposite. there are 50 different ways to reduce noise.
>>107346126>6b local model that's a turbo distill of an unfinished checkpoint>running on steps as low as 4-8honestly its like apples to slightly better apples running on datacenter PC's ran by jeets and musk. all things considered we're eating pretty fucking good.>>107346129we're img2img maxxing now son.
>>107346080
comfyui nodes 2.0 verdict?
hows the gooning for z-image so far? i have yet to see one
>>107346113Keanu Reeves doesn't look like he knows his way around a piano
>>107346101Glitch in the matrix
>>107346149look at the catboxes of the previous threads
>>107346112It's a distilled model so not so much from the get go.
>>107346138I fucking hate comfy and don't know how to do this shit like I would be able to in forge. Help a retard out with a workflow?
>>107346135Don't bother responding to retards, these are the kind of morons who want shitty encodes where they remove all film grain making everything uncanny.
>>107346113
Instead of using Qwen3 as text encoder can I just use Gemma or something else? Already have tons of other models but not Qwen.
>>107346157They will supposedly release the non-distilled base model weights too.
>>107346146Haven't tried them yet, hopefully they add more basic functionality which you need third party nodes for now.
>>107346160I'll help out my nigga retard, no problem.https://files.catbox.moe/ec22tr.pngadjust denoising to 0.3 or lower depending on the image, i'm not sure what exact settings i was using for this image but set it to 8 steps euler simple to start with if it isn't already.https://files.catbox.moe/ec22tr.png
>>107346181that coffee can kick in any day, preferably right now so i don't do silly things like re link the same catbox>>107346179god i love her face here, very cute.
>>107346094very nice
>>107346149Decent, but of course the true potential will only be tapped once you can train loras.
Hey hey Anon, Anon here.
Some more from the mill.
I'm still sorting and checking, there's a lot of things that work absolutely great or decent for these prompts.
res_2m, euler, sa_solver
simple, ddim_uniform
Those seem to be favorites so far. Fucking model is a beast, most of it is at least decent, while some of them are pretty mind blowing for a small turbo model.
China, man. Fucking China.
Full sized of all prompts:
Frilly titties: https://files.catbox.moe/ue2243.jpg
Trash girl: https://files.catbox.moe/3u1zwe.jpg (Resized version picrel)
Rat girl: https://files.catbox.moe/cpz3uy.jpg
Mixed media: https://files.catbox.moe/jtfu1h.jpg
Oil painting: https://files.catbox.moe/6kvktz.jpg
>>107346144kek
>>107346194>Rat girl>click>actually humanMost disappointing thing that happened today.
>>107346181>>107346185So what is the second image? The workflow is kind of moot until we know what the input was.
>>107346207If it stays like that it seems like your day will be great.
>>107346103A worthy cope
>>107346129You can try "cosplay of mercy (overwatch) ..."
But I don't know if it knows mercy natively
>>107346179I fully expect something bad to happen because only gay and retarded things are allowed to happen in this world. Euther they won't release the weights or the nondistilled model will suck, there is always a catch
>>107345765I made this change and it created way more blocky artifacts in my image. For reference what sampler/scheduler/step count do you use anon?
>>107346156>>107346188yeah i checked on civit ai, it doesn't know what pussy or dick looks like but atleast its uncensored, ill just have to wait for some legend to properly train it
>>107346224the regular euler + simple, 15 steps
>>107346221You already posted this.
>>107346227>i checked on civit aican you show one such link I can't find them
>>107346227why not both?https://files.catbox.moe/vdwyr0.jpg
>>107346237I will keep doomposting until proven otherwise
>>107346221>I fully expect something bad to happen because only gay and retarded things are allowed to happen in this world.nothing ever happens
>>107346230Thank you, will experiment more. Maybe it was because I was generating 2D art rather than photo style, or maybe I just wasn't using enough steps.
>>107346153Definitely. I guess it's because there's not much detail, just flat 2d.
>>107346256me in the pic
>>107346238>https://civitai.com/models/2168935/z-image?modelVersionId=2442439 or you can filter by z-image, on the images tab >>107346243bleach now.
anyone been writing down actor/celebrity knowledge for Z?
>>107346256GOD he's literally me
>>107346256lmaooooo
A note from a VRAMlet: loading the full model in Comfy with one of the fp8 weight modes seems to make the outputs notably more artifact riddled than loading q8 GGUF or even q6.
GGUFs are a little slower for me, but everything is slow anyway since my GPU is old, so whatever. If I'm gonna wait anyway, might as well get something good.
how does shift work in how it scales up/down? is there a point where i SHOULD expect it to start breaking down from being too high? Like 10?
because i think 10 might be better than 7 kek
oopsies
>>107346261https://civitai.com/images/111748988damn there's some good images in there
>>107346282if 10 works for you, then what the fuck is the point of your question? just experiment until you find what works for you, fuck you.
>>107346296this is the type of shit you let your therapist know about bud. autism shouldn't be this hostile.
>>107346282>is there a point were i SHOULD expect it to start breaking down from being too high?AI models are black boxes so the only answer comes from experimentation, see by yourself
>>107346213i tried widowmaker and it produced a regular brunette with no ass
Reminder that BFL tried to make a video model.
How big do you think that would have been? 500B? 1T?
>>107346300hostility towards people of lesser intellect is always valid
>>107346315he's smarter than you though
https://civitai.com/images/111700885kek
base model really needs to be released so I can train a big ass lora
>>107346322me on the right
>>107346221>nondistilled model will suckthis one, they delayed the release of the base model because of how bad it was, and are trying to unfuck it, see previous threads
Reminder that generating CSAM is illegal and is punishable with jail time
>>107346332THE DONKS... IT WILL BE A GOOD FUCKING ERA
how bad is the q3 quant?t. gigavramlet
>>107346321i doubt it
>>107346332>>107346364>genning fat doinks in Z-image>>107346360w-would..
>>107346339>not 2B's skull with a blindfold and white hairtch
>>107345765sovl vs sovless
>>107346368what is wrong with you people? go for bigger quants and offload... >>107345353
>>107346332if it actually releases and it trains well ill become a member of the CCP
>>107346344only in cucked countriesand how will they know
>1male, male focus>get a vagina on the manepic
>>107346245>nothing happens>gay and retarded thing happens >good thing happensorder of probability
>>107346398based zimage showing you what you are for genning men.
>>107346398Pooners are men chud
>>107346332Just how big are we talking?
>>107346398>prompting model trained with natural language with tags
>>107345792this?
>>107346421BIIGGGEERRR
>>107346421100 epochs minimum
Starting some photo style tests.
Euler/Simple, 9 Steps, shift of 3.
Prompt prefixes used:
An image taken on a 1990s analog disposable camera of
An amateur candid photograph taken on an iphone of
An amateur candid photograph taken on Sony Cybershot from Flickr of
Anaesthetic 5, a still from a movie scene of
A professional 35mm film photography of
A 1960 technicolor film still of
A classic cinema film still of
So far, Chroma outshines it, but there's a lot of prompts to go through. Pardon the spam for a bit.
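A prefix test like this is easy to script. The sketch below only builds the prompt list; the subject string is a made-up placeholder, and how the prompts get sent to a backend is left out since that depends on the UI.

```python
# Three of the prefixes from the post above; extend the list as needed.
PREFIXES = [
    "An image taken on a 1990s analog disposable camera of",
    "An amateur candid photograph taken on an iphone of",
    "A professional 35mm film photography of",
]

# Sampler settings quoted in the post, kept alongside for bookkeeping.
SETTINGS = {"sampler": "euler", "scheduler": "simple", "steps": 9, "shift": 3}


def build_prompts(subject: str, prefixes: list[str] = PREFIXES) -> list[str]:
    """Prepend every style prefix to the same subject so only the style varies."""
    return [f"{prefix} {subject}" for prefix in prefixes]


# Hypothetical subject, not the one used for the plots in the thread.
for prompt in build_prompts("a woman reading in a cafe"):
    print(prompt)
```

Keeping the seed and settings fixed across the whole list is what makes the grid comparable between prefixes.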
this is goodhttps://civitai.com/images/111711391
>>107346445>So far, Chroma outshines it,really? I don't get that much difference on chroma when prompting for those kind of styles
>>107346344>decided to wan2 some images with lolis without any anime lore>mfw
>>107346445thank you for your service once again, and don't apologize, this stuff is super helpful.
>>107346435That is pretty big. I'm very much looking forward to the tunes.
To be fair, you have to have a very high IQ to understand danbooru tags. The precision is extremely subtle, and without a solid grasp of of the english language most of the tags will go over a typical prompter's head.The prompters understand this stuff; they have the intellectual capacity to truly appreciate the depths of these tags, to realize that they're not just jeets describing a woman- they tag something deep about 1girl. As a consequence people who dislike booru tagging truly ARE idiots- of course they wouldn't appreciate, for instance, diffraction_spikes, ben-day_dots, or tenshi_kaiwai. I'm smirking right now just imagining one of those "natural language" simpletons scratching their heads in confusion as they desperately google what the name of the light patterns that appear at the bottom of a pool is called. What fools... how I pity them. And yes by the way, I DO have a danbooru platinum account. And no, you cannot see it. It's for the ladies' eyes only- And even they have to demonstrate that they have a positive feedback score within 5 points of my own (preferably lower) beforehand.
>>107346382oh shit it knows 2b, nice
>>107346462Yeah, I think so. Picrel is a plot of the same thing on Chroma V46 I did back in July.
>>107346445I'm getting really sick of your "chroma is better" posts bro, no one is going to use chroma so shut the fuck up. It has no future.seriously, do you really see a future for it? no.
>>107346386You don't even need that node, ComfyUI offloads the required amount by default if you load a model that won't fit in VRAM alone. I tested it and it was not any faster to manually specify a GB amount with the DisTorch2 loader compared to just using the built in UNET loader. I guess it might depend on setup. But in any case yeah just offload some.
>>107346477this but almost fully unironically
>>107346483why not? it's completely normgroided
>>107346445Chroma is definitely better when it comes to style variety, but it's also like 5 times slower.
>>107346487the 3 images are the same thing (looks like a seed difference)
for an ancient vramlet (2070 super 8gb) chroma is fucking slow. z-image is also fucking slow. this is pain
>>107346445Z-Image is smaller and not slow as molasses, that's the reason why it already BTFO chroma. It doesn’t need to be better, it just needs to be good enough, a significant improvement over XL, and fast.
>>107346491>ComfyUI offloads the required amount by default if you load a model that won't fit in VRAM alone.
it doesn't work well for me: it doesn't offload enough to RAM, and when it starts to run it OOMs
>>107346515How slow are we talking
>>107346504And twice as large. If Z-Image responds well to training it will own image generation
>>107346519>It doesn’t need to be betterbut the funny part is that it is better, it doesn't have that oversaturated color and the details are much better (better anatomy too)
>>107346488I'm not shilling for Chroma, just saying that in terms of photography styles, from all the models I have tested with it (Krea, Chroma, Wan, Qwen, Hunyuan) it has been the only one so far that got a lot of styles at least partially correct, while others consistently shit the bed.
The Z images are still leagues beyond what Krea/Qwen/Wan/Hunyuan managed to do.
>>107346508Seed is the same on all gens. But yeah, you're correct. Chroma was still very much in my mind as the one model that actually got close to what you'd be going for.
>>1073465339 s/it on z-image
>>107346244i will be happyposting. im happy and content with what we got. immense research paper as well. 640,000$ for a model of this magnitude, imagine how much xAI/grok spent for their model, which is only slightly better than >6b model
>>107346550SAME. This is a GOOD VIBES general for the REST OF THE YEAR. We're EATING GOOD.
>>107346281have you tried the bf16 weights? i get images in only 25-30s (text encode included in 30s) on a 3060 12gb
>>107346550>i will be happyposting.based, it's rare we're getting some good news so I'm gonna profit of that moment
open source keeps winning lads>OpenAI has been hacked>ChatGPT maker OpenAI confirms major data breach, exposing user's names, email addresses, and more
>https://huggingface.co/ChenkinNoob/ChenkinNoob-XL-V0.1looks like there's a new NoobAI based finetune
>>107346545Could be worse... could be 20 s/it!
>>107346592More good news? And on this day a thanksgiving day? I could not be happier.
>>107346445Here's a silly one testing camera types and brands.
As always, not too interesting, but it's kinda hilarious how accurately it captured the camera models themselves if it put it in the images.
A Polaroid instant photograph from the 1970s of
A Kodak Brownie box camera photograph from 1950 of
A Hasselblad medium format photograph of
A Leica M6 rangefinder photograph of
A Canon AE-1 35mm photograph of
A Nikon FM2 photograph from the 1980s of
A Pentax K1000 student photography of
A Minolta X-700 photograph of
An Olympus Trip 35 vacation photo of
A Fujifilm X100 street photography of
A Lomo LC-A photograph of
A Diana F+ toy camera photograph of
A Holga 120N photograph of
A Pinhole camera long exposure of
>>107346592>ChatGPT maker OpenAI confirms major data breach, exposing user's names, email addresses, and more
bruh, how is that good news, a lot of people are using ChatGPT, even people here
>>107346160>>107346181Slightly cleaner img2img workflow with resolution control if you'd like.https://files.catbox.moe/ix9cek.json
ive been gone 12 hours.how big did Z image blow up?
>>107346594Yeah, and it's a proper noob instead of waislop.
>>107346611>how big did Z image blow up?biggest blowup in the history of /ldg/, and I was here during the blowup of Flux and Wan
>>107346603>Kodak momentThanks anon
>>107346610thanks bud.giving you the biggest thumbs up i possibly can IRL.
>>107346621What do you mean?
>>107346599and how fast are vramchads?
they should've called it zeus image or something for how GODLY and lightning FAST it is
>>107346611>>107346621
>work with most popular frontend for day 0 support
>small model that literally anyone with any gpu can run fast
>limited in seed variance but still sota for ootb realism, ip knowledge, celeb knowledge, anatomy, camera control, text etc
simple formula
>>107346642I run at around 20 miles per hour.
>>107346603you know it has an effect when it changes the default asian type to caucasian lmao
>>107346642You can roughly calculate the speed, if SDXL has 2B parameters and z-image has 6B it's 3-4 times slower than SDXL. Extrapolate this to the hardware in question.
>>107346604ChatGPT is gayOpenAI is gayPeople who use it are gay
>>107346614Did you test it already? any comparison pics?
>>107346194ty for your cervix
>>107346603Here's a few fun ones.
A 1920s silver gelatin print of
A 1930s depression-era documentary photograph of
A 1940s wartime press photograph of
A 1950s Life magazine photograph of
A 1960s fashion photography of
A 1970s National Geographic photograph of
A 1980s glamour shot of
A 1990s grunge photography of
A 2000s digital camera photograph of
An early 2010s Instagram filter photograph of
And to compare, here's the old Chroma V46 doing the same thing:
https://files.catbox.moe/dq7kqp.jpg
I still think Chroma did a better job, but I imagine with better prompting and explicitly describing the photograph (sepia, torn, whatever) Z will do fine.
>>107346592>>OpenAI has been hackedSo they got sora 2???>major data breach, exposing user's names, email addressesGAY
>>107346657eeeh... sdxl is 15 sec (28 steps). ZIT is 80 sec (8 steps).
>>107346669please stop with the 1girl comparisons already, that's basically personal taste. do something more objective.
>>107346674That's the ballpark.
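Working the reported numbers through, as a sketch: the 2B/6B parameter counts and the 15 s / 28 steps and 80 s / 8 steps timings are the ones quoted in the posts above. Resolution, offloading and text-encoder time are unknown, so this is only arithmetic on what was posted.

```python
def seconds_per_step(total_seconds: float, steps: int) -> float:
    """Average wall-clock time per sampling step."""
    return total_seconds / steps


sdxl_step = seconds_per_step(15, 28)  # SDXL: 15 s for 28 steps
zit_step = seconds_per_step(80, 8)    # Z-Image Turbo: 80 s for 8 steps

param_ratio = 6 / 2                   # 6B vs 2B parameters
total_ratio = 80 / 15                 # whole-image time ratio
per_step_ratio = zit_step / sdxl_step

print(f"parameter ratio: {param_ratio:.1f}x")
print(f"total time ratio: {total_ratio:.1f}x")
print(f"per-step ratio: {per_step_ratio:.1f}x")
```

The whole-image ratio (about 5x) is in the neighborhood of the parameter estimate, but per step the gap is much larger than 3-4x. On an 8 GB card that may point at offloading or the text encoder rather than raw model size, though that is a guess.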
>>107346669if you add characters like "a woman disguised as 2b from nier automata" then the style doesn't work anymore, desu the prompt adherence of this model is not that great, and that's expected it's a turbo model, I'm really hyped by the base model now
>>107346515ggufs are very slow, have you tried bf16 with ram offload?
Hey anons, is a 12 gb 3060 good enough for Z-image turbo? I wanna try it out after a great run with Illustrious
>>107346712just try it out, doofus
>>107346712it uses a bit more than 12gb, so offload >>107345353
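The "a bit more than 12gb" figure lines up with simple arithmetic on the weights alone. A sketch, assuming the 6B parameter count mentioned in the thread and ignoring the text encoder, VAE and activations (which all add on top):

```python
# Bytes per parameter for common weight formats (GGUF quants are left out on purpose).
BYTES_PER_PARAM = {"bf16": 2.0, "fp16": 2.0, "fp8": 1.0}


def weight_gb(params: float, dtype: str) -> float:
    """Size of the model weights alone, in decimal gigabytes."""
    return params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in ("bf16", "fp8"):
    print(f"6B @ {dtype}: {weight_gb(6e9, dtype):.1f} GB")
```

So bf16 weights alone already fill a 12 GB card, which is why some offloading to system RAM is needed there.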
>>107346705yeah, but rtx 20xx can't do bf16. and if you do fp16 dtype compute, it's not faster than ggufs
>>107346712I'm using it on that GPU, it's good enough. About ~3s/it, this image took 30 seconds
>>107346719>if you do fp16 dtype compute, it's not faster than ggufsyeah but fp16 is still the best quality so it's still better to use that
>>107346690Quit treating objectives as 1girls.
I already got bored of its limitations
>>107346712Try chroma before then it will feel lightning fast
>>107346594Tried it, couldn't gen characters with 2k pics in their dataset released this year, so I don't think it's worth it
all we need now is a new UI to make everyone drop cumfart and my wish list this year is complete
Is Zimage inference compute bound or memory bandwidth bound?
fun model
is z-image going to give my computer some kind of chinese zombie virus?
>>107346712yeah, use the default wf, works out of the box
offloads automatically, no need for custom nodes
>>107346778all image/video gen is compute bound
>>107346746try putting pic related on your second monitor
>>107346773Just vibe rawdog it in python. Easy peasy.
>>107346781no, don't be cra-我爱北京天安门,
>>107346781New Comfy update deleted all my files...
>>107346669
A Daguerreotype from 1850 of
A Cyanotype blueprint photograph of
A Sepia-toned albumen print of
A Tintype photograph of
A Wet plate collodion photograph of
An Autochrome color photograph from 1910 of
A Hand-colored photograph from 1890 of
A Glass plate negative photograph of
A Stereoscopic 3D photograph of
It did way better again than any other Chinese or BFL model. Based.
I'll be done in two more after this and then you'll be free again.
>>107346703Haven't played around with the prompt adherence that much, but it does seem that longer, more fleshed out prompts will start to deviate from the style more (and if the subject is especially unfitting).
>>107346690You're supposed to look at how well the model gets the style, not swoon over the girl.
>>107346796we already knew comfy is spyware and malware. the model is fine though
is z turbo just flux 2 but for poors?
>>107346796
rm -r dirname
mkdir -p dirname
It happens.
>>107346817>is z turbo just flux 2 but for poors?
it's way better at realism
>>107346817>flux 2flux 2 is barely better than flux 1 kek
>>107346817flux 2 is just z for retards
What about art gens? Can z-poop do some oil paintings or something?
https://files.catbox.moe/bou48h.pngis nudity on z just cooked until we get loras?
>>107346854it looks pretty good as it is, the loras won't have much issues making the real deal
>>107346854yes sadly. so, we gotta wait ((2 more weeks)) for the noob trained base model.
What are the recommended sizes to use? Can I just use anything?
>>1073468692k seems to be the limit
>>107346748i laughed at this but it hurts that it really takes forever with chroma
>>107346845Yes, but a bit worse than Chroma. Still very impressive.
>>107346817>bigger model=betterThat meme died yesterday
Someone remind me what nudity / explicit nsfw looked like when SDXL had just released.
>>107346879this
>>107346879its been dead a while, multiple large chinese models came out that were all pretty bad
>>107345765Isn't the shift dependent on the resolution? like if you go for 1024x1024, shift 3 is fine but if you go for something bigger you have to increase that shift, I've seen that somewhere...
>>107346895>multiple large chinese models came out that were all pretty badIlya was right
>>107346883what was the exact date?
>>107346896If that's the case, someone really should make a custom node that automatically adjusts the shift based on resolution.
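What such a node would compute could look like the sketch below. It assumes a square-root scaling rule (shift grows with the square root of the pixel-count ratio), a heuristic known from other flow-matching models; whether it is the right rule for Z-Image is untested. The anchor of shift 3 at 1024x1024 is taken from the post above, and the function name is made up.

```python
import math


def shift_for_resolution(width: int, height: int,
                         base_shift: float = 3.0,
                         base_pixels: int = 1024 * 1024) -> float:
    """Scale the shift by sqrt(pixel ratio) relative to a reference resolution (assumed rule)."""
    return base_shift * math.sqrt(width * height / base_pixels)


for w, h in ((1024, 1024), (1536, 1536), (2048, 2048)):
    print(f"{w}x{h}: shift {shift_for_resolution(w, h):.2f}")
```

Treat the output as a starting point for experimentation, not as a recommended setting.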
I told you, you fucks. I told you that you didn't really needed an insane amount of params just to make a good, smart model. Models actually were incredibly inefficient and Z-image proves it. But nooooo>MUH PARAMS MUH +100B PARAMS LOCAL CANNOT COMPETERetards. And I bet you could even reduce those 6B params to 4, 3 or even 2 and still be as good as the 6B one, if done right.
>>107346877Not bad.
the thousands of hours of abuse we have inflicted upon our GPUs for a model to drop that just kinda works. Excited to get back to lora training, but man what a waste
>>107346810Second to last one.
Not too great with TV & Video aesthetics, sadly. But perhaps that needs more prompt wrangling. I would have expected it to get CCTV camera footage, at least.
1980s VHS camcorder footage of
1990s Hi8 video camera footage of
Early 2000s MiniDV footage of
Security camera CCTV footage of
Dashcam footage of
GoPro action camera footage of
A Webcam screenshot from 2003 of
A Public access television still from 1985 of
An MTV music video still from 1999 of
>>107346912July 2023 apparently.
>>107346925Based Nostradanon
>>107346925>And I bet you could even reduce those 6B params to 4, 3 or even 2 and still be as good as the 6B one, if done right.I get your point, but desu a bigger model will always be better than a small model if both are trained the same exact way, those alibaba fucks could release a 18b model (a size equivalent to Qwen Image) and the 24gb vram chads could still run it and the quality would definitely be close to something like Grok or Nano Banana Pro
>>107346931i found a correlation with other models that when it shows the actual camera it doesn't get the reference to a style, not sure if you would agree with that. this one also seems to avoid fisheye compositions
>>107346929True, but some of the knowledge will carry over.
>>10734696118b is still waaaaaaay smaller than a +100B model so my point stands firm.
>>107346932July 26, according to github.
Neat.
>>107346994Nice 6 fingies Z-image!
>>107346925I've been saying similar, for a model to take off locally it must fit comfortably on a current midrange GPU or it's DOA. People were cooming over each new bloatmodel forgetting that these models are pretty much static because no one has the hardware to modify them so as soon as run into limitations of the model it's a dead end
>>107347019I love how Nvidia was like "see? we're helping you run this giant flux 2 model by making it fp8 and adding some offloading!" Uhh what about making cards with a lot of vram affordable so that we can put the whole thing in it in the first place?
>>107346663Well it's kinda hard for me to say right now. It certainly didn't sabotage the artist styles, they are still as prominent as they are on the base noob and there's no waishit vaseline, but the outputs are significantly different compared to the epsilon noob from a year ago.
I still think the Siege of Pale is the best way to see how far an image gen can go and it's pretty decent.
What, isn't the point of a refiner to use a different model?
>>107347042Afaik Noob 1.1 was worse than the original epsilon 1.0 noob. Not sure what this means in this context.
what happened to 4stats charts?>>107346987i remember the sdxl 0.9 beta leaked earlier
it's pretty good at styles if you do some boomer prompting
>>107346931And here's the last one. Sadly, another rather disappointing one.
A Lomography experimental photograph of
A Double exposure artistic photograph of
A Light leak experimental photograph of
A Photogram without a camera of
A Scanner photography of
A Disposable underwater camera photograph of
A Redscale film photograph of
A Solarized photograph of
A Chemigram abstract photograph of
An Expired film photograph with light leaks of
A Cross-processed slide film photograph of
Largely ignored all the quirky kinds of photography styles. Weird.
You're free from me now, Anon. Good bye, may your 1girls be beautiful.
>>107346810I found an old plot I did with Chroma flash of this plot. Here it is:
https://files.catbox.moe/782xla.jpg
While it's of course blown out to shit, the styles are pretty good.
>>107346962Yeah, Qwen did the same thing. It added action cams and CCTV cameras constantly but never got the actual style at all.
>>107347057vs Nano Banana
>>107347104
>>107347057
I'm sure it can be closer to nano banana if we have the base model + reasoning
https://www.modelscope.cn/models/Tongyi-MAI/Z-Image-Turbo/picture
>>107347070
>sdxl 0.9 beta leaked earlier
True, I just don't remember if it could do nsfw or if that only came with tunes / loras or whatever else was around at the time.
>>107347103shank you in the cervix
>>107347057
>>107347104
Well shit, if the sorcerers were more expressive i would prefer Z to nano banana.
https://www.reddit.com/r/StableDiffusion/comments/1p8462z/z_image_tinkering_tread/
>but it's actually works with cfg above 1, despite being a distilled model, but it also requires more steps. As for now I tried cfg 5 with 30 steps and it's looks quite good. As you can see it's a little bit on overexposed side, but still ok.
interesting
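For context on why cfg above 1 matters, here is a minimal sketch of the standard classifier-free guidance mix. This is the generic textbook formula, not code taken from Z-Image or any UI; `cfg_combine` and the toy numbers are made up for illustration.

```python
def cfg_combine(cond, uncond, scale):
    """Standard CFG mix: start at the unconditional (negative) prediction
    and move `scale` times the distance toward the conditional one."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


# Toy model outputs for two latent values (made-up numbers).
cond = [1.0, 2.0]    # prediction with the positive prompt
uncond = [0.5, 0.0]  # prediction with the negative / empty prompt

print(cfg_combine(cond, uncond, 1.0))  # same as cond
print(cfg_combine(cond, uncond, 5.0))  # pushed well past cond
```

At scale 1 the negative branch cancels out entirely, so the negative prompt can't do anything. Above 1 it starts to matter, at the price of a second model evaluation per step in the usual implementation.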
>>107347151
z looks better stylistically but nano adheres to what i wanted more
>>107347019Flux had a lot of research and snake oil released for it though. More than any other previous model and that was right on the edge of consumer hardware.
>>107347196supposed to look like this ultimately
>>107347153
Yes, this model just keeps on giving, negative prompting even
A shame, even if expected, that prompting in chinese gives the best results though
>>107347153
>It's best to use chinese prompt where possible. Gives noticeable boost.
>it's actually fucking true
i can't believe it. huge boost too, in every way including face detail. that said, i've only tested this with an already azin subject so results could vary, doing everything in chinese might make westoid girls turn bug.
>>107347065yeah but if you switch vaes you need to translate your latent
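A minimal sketch of what "translating" a latent usually means: decode to pixels with the first model's VAE, then re-encode with the second model's VAE. The `decode_old` and `encode_new` callables below are toy stand-ins, not real VAEs; in ComfyUI terms this would be a VAE Decode node followed by a VAE Encode node, each wired to the matching VAE.

```python
def translate_latent(latent, decode_old, encode_new):
    """Move a latent between two models with different VAEs
    by round-tripping through pixel space."""
    pixels = decode_old(latent)  # old VAE: latent -> image
    return encode_new(pixels)    # new VAE: image -> latent


# Toy stand-ins: the "old" VAE scales by 2, the "new" one by 4.
decode_old = lambda z: [v * 2.0 for v in z]
encode_new = lambda px: [p / 4.0 for p in px]

print(translate_latent([1.0, 2.0], decode_old, encode_new))
```

The round trip is lossy with real VAEs, so the refiner pass should expect a slightly softened input.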
>>107347228Qwen3 is a Chinese llm model. It understands chink really well. Text encoder is responsible for the initial vectors.
>>107347251>Qwen3 is a Chinese llm model.but it knows a shit ton of languages though
>>107347251Yeah i guess it should be a no brainer but. i don't know. i'm used to my burger language being the standard for everything. not complaining though, i'll take what i'm getting for fucking free. even if it means passing every prompt to gemini to translate to winnie the pooh.
>>107346382you got it boss
Quick test with my huge-ass concept bleeding boomer prompt.
Easily trades punches with Qwen in prompt adherence (some details are of course wrong but still) and the concept bleed between the subjects is basically 0, haven't seen that before on any model.
That's pretty damn impressive.
Prompt:
https://files.catbox.moe/6fmtfx.txt
>>107347271
>not 9s holding 2b's skull during his psychotic hackerman phase
tch
>>107347246
Translate how? I2i works nicely so far to get the pose.
It can do a pencil sketch but sometimes it really wants to not understand my prompt. Tried both "her eyes are obscured by her long hair" and "her hair is covering her eyes" and it just keeps giving me blindfolds lmao.
>>107347278The model knows 2b but struggles with 9S for some reason.
>>107347153
>>107347228
>>107347270
bruh...
>>107347103Aerochrome worked but not redscale, sadly.
>>107347287
>struggles with 9S for some reason
dang.
>>107347298>to xi or not to xi, that is the question, and we have our answer
this model is so based it made everyone realize comfyui is fucking garbage and it deserves a better UI kek
>>107347323
>realize comfyui is fucking garbage
we didn't need a new model to realize this
>>107347262
Chinese and English are its primary languages.
Small models are not that great with smaller languages or even something like German in some cases.
>>107347323meh, I think comfy has a good UI. the backend is what's kinda shit.
FOSS killed local models, all UI's are trash, Comfy, SD, Forge, SwarmUI, even Stability Matrix that encompasses all of them isnt free of the blight of FOSS mental disease. None of them work. Have fun with your models of choice cause you aint getting more.
>>107347282Lovely.