I'd Rather You Start Schizoposting Edition

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107339853

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe
https://github.com/ostris/ai-toolkit

>Z
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
https://huggingface.co/Comfy-Org/z_image_turbo

>WanX
https://rentry.org/wan22ldgguide
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>107341058
First for z-image sucks.
why the fuck is the Nvidia offloading on by default?
total bloatmodel death
>>107341081 how do you turn it off
>>107341081 nobody needs anything other than flux2
>>107341081 i guess for tranny vidya games?
Why is the official ComfyUI page shilling APIshit and not mentioning z-image at all?
>>107341064 I thought Comfy was local-first?
Is this new z image model open to finetunes, including nsfw ones, or is it doa like bfl models outside of flux schnell?
reminder that paid cuckmodel shills are itt right now
>>107341103
>flux schnell
That was also DOA
>>107341081
>teehee your entire computer just slowed to a crawl and you don't know why
>teehee
>>107341101 Maybe because the last piece of news was 2 days ago, before the release of z image?
>>107341101 for comfyorg, flux2 > zimage turbo
>>107341113 I meant license-wise.
..And after the premiere of Von Braun, we also have a new retelling of Schindler's List!
>>107341101 People are on vacation for Thanksgiving, so the blog post might be up Monday.
Z-image's artist knowledge is poor. It only knows the most popular ones, and for those it doesn't know it has the same fallback as Qwen, where it uses the name as a reference for the ethnicity of the subject, or for a style bucket (i.e. Italian name -> generic Renaissance-style approximation)
>>107341107 as a clean feet lover, this is some good shit, especially with a cute girl
>>107341125 shaniqua's list
>>107341125 Can you make an obese guy in a concentration camp with a Judenstern saying "FLUX 2" on his clothing?
>>107341127 why aren't you commenting on all the complaints racking up? why do you still make the absolute shittiest templates?
>>107341129
>cake the image in a layer of ugly noise instead of actual brushwork
new meta just dropped
>>107341146 100% wf issue
>>107341150 is it also good at non-asian girls?
>>107341152 Is this the part where you blame comfy instead of the model?
>>107341162 who cares
>>107341136 sides caved in reading this. thank you.
>>107341172 ded
I did not lose respect for comfy when he started associating himself with avatarfags.
I did not lose respect for comfy when he started enshittifying the UI.
I did not lose respect for comfy when he started adding api nodes.
I lost respect for comfy when someone posted that promo video and we learned that he is a fat fuck irl.
z-image danbooru finetune might be the best thing for /hdg/ troglodytes
>>107341172 Can you get rid of the blur using prompts?
>>107341127
Reminder ComfyUI still:
>doesn't remember the queue if you crash/quit
>can't stop the queue if one gen OOMs; it will go through all the queued gens and OOM on them all instantly, forgetting all sent workflows in case you don't have each saved
>can't be scheduled so it only gens within a particular time window, or at least begins at a set time
>doesn't have a "precompute text encodings of all queued gens and throw the encoder out of VRAM forever" toggle, which speeds up bigger models by double-digit percentages when genning multiple images
These wouldn't be as bad if this weren't trivial to add for anyone who knows the codebase well, at least in a dirty way before implementing it properly. And it wouldn't be that bad if there weren't literal dozens of memory leaks and bad memory-allocation code that OOM you on video gen every once in a while no matter what, despite unloading all models for every single gen and having 24GB VRAM, 128GB RAM, and a dynamically managed pagefile that gets filled to 170+GB sometimes.
Even the basic feature of being able to stop a gen midway through a step, instead of having to wait multiple minutes for it to finish on high-quality video gens, wasn't implemented until a few days ago, and ONLY after some guy created a node to do it first, proving that it's obviously possible. Only after that did comfy write what were essentially two unique lines of code to add this basic feature.
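The "precompute text encodings" complaint above is just two-phase scheduling: encode every queued prompt while the text encoder is loaded once, then free it before any denoising starts. A minimal sketch of the idea, with stand-in stub functions rather than real ComfyUI APIs:

```python
# Sketch of "precompute text encodings, then drop the encoder" batching.
# `encode` and `denoise` are stand-ins, not actual ComfyUI functions.

def encode(prompt):
    # stand-in for a large text encoder that would otherwise sit in VRAM
    return [float(ord(c)) for c in prompt]

def denoise(embedding):
    # stand-in for the diffusion model; it only needs the cached embedding
    return sum(embedding) / len(embedding)

def run_queue(prompts):
    # Phase 1: encode every queued prompt while the encoder is resident once
    embeddings = [encode(p) for p in prompts]
    # Phase 2: the encoder could now be unloaded for good; the denoiser
    # runs against the cached embeddings only, never swapping models
    return [denoise(e) for e in embeddings]

print(run_queue(["1girl", "landscape"]))
```

The point of the ordering is that the encoder and the diffusion model never need to coexist in VRAM, which is where the claimed speedup for bigger models would come from.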
>>107341181 /hdg/ would be too busy irony posting to even care
This is a major upscale for Seedream 5, I just know Bytedance won't let themselves be beat by open sores
>>107341180
I never respected comfy and was a spaghetti hater since day 1
Total spaghetti death
>>107341182
the blur *is* part of the prompt famalam.

A dramatic movie poster for 'Shaniqua's List'. A serious-looking, very fat overweight African American woman with curly hair, wearing a black nazi uniform with a red armband featuring a white circle and black swastika on her left arm, stands prominently in the foreground. In the blurry background, numerous people in striped prisoner uniforms are visible in a somber, industrial setting with dark, muddy ground. The top left corner shows the NETFLIX logo in red, followed by 'Presents' in white. The title 'Shaniqua's List' is at the bottom, in a classic, slightly distressed white serif font. The lighting is dim and dramatic, casting a serious tone.
>>107341168 where's the jpg artifacting in images like >>107341150
>>107341089 Kek, forgot to set it to 2k
>>107341193 oh ok, I thought it was forced
>>107341186
>stop the gen mid way through a step
That issue stems from AI researchers usually being dogshit programmers; pasting reference code as-is into the node is much easier than fixing it.
Reminder that comfyui sends all your data to their cloud for analysis and can format your hard drive anytime they want.
>>107341186 (me)
>Even the basic feature of just being able to stop the gen mid way through a step instead of having to wait multiple minutes for it to finish for high quality video gens wasnt implemented until a few days ago and ONLY after some guy created a node to do it first
I copied this from my previous posting of this complaint, and this part is no longer true; this happened weeks ago by now instead of a few days.
>>107341214 cool, i hope the glowniggers see this one, that's cracking me up
>>107341195 most obvious at the ends of the hair, but the whole picture looks like it has noise reduction at 200%
>1girl, sculpted by michelangelo
WHERE IS THE LORA SUPPORT???
You have to update comfy to use Z, but I remember just a few days ago someone showed that the latest update deliberately made the up worse and hid a bunch of shit. For those of you who bit the bullet and updated, how bad is it?
>>107341242 Hagbros eating good
>>107341237 Made the UI worse*
>>107341237 ComfyUI is terrible, rendering nodes tanks it to 20FPS. And it's not a model loading issue, because if I just move the viewport away from the nodes it spikes up to 240 FPS.
>>107341129 Shame, it does do stylized photos quite well. Flux.2, however, does seem to know every artist style I threw at it, but I have some hopium that this is due to distillation and that when they give us the base model it will be better, since Z knows a few of the ones I tested from Flux.2.
Comfy should be dragged out on the street and shot
>>107341255
when previewing images, resize them down to like 32px
it's some retarded approach to rendering texture arrays.
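The downscale the anon suggests is just a longest-side cap that preserves aspect ratio. A quick sketch of the math (a hypothetical helper, not an actual ComfyUI setting):

```python
def preview_size(w, h, cap=32):
    # Cap the longest side at `cap` px, preserving aspect ratio,
    # so the node canvas never has to upload a full-size texture.
    if max(w, h) <= cap:
        return w, h
    scale = cap / max(w, h)
    return max(1, round(w * scale)), max(1, round(h * scale))

print(preview_size(1024, 768))  # → (32, 24)
```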
Can Z-Image do upside down faces?
Has anyone else noticed that every big SaaS model is starting to look the same? I feel like there was more variety in a batch of Dall-E 3 gens than in Flux 2, Seedream, Nano Banana Pro, any of this shit.
>>107341283 the famous "upside down on grass" test?
>>107341285 They're probably being trained on each other's slop
>>107341301 I think they're just plagiarizing each others' methods
>>107341285 The money being put into AI wants it to be replicating a certain thing instead of doing its own unique thing like what a creative type would want
Hello everyone, how have the threads been?
Is the poop dick schizo still at large?
>>107341303 Why not both
>>107341313 Feet look like rubber, very bonerkilling
Lumina2 + few-step distill + realism lora
chinese revolution
>>107341312 Very unsafe gen
..And we're back with LDG news. Breaking story, PoopDickSchizo is back, and this time he's like a hydra with many unkillable heads.
>>107341186 frontend issues like queue stuff will probably get fixed at some point. If you have a workflow that is OOMing, go make an issue with it on the repo.
>>107341285 The more data samples, the closer the results will look between models, even if the datasets are different.
Similar to how a live poll's results will swing back and forth with the initial votes, but 3,000 votes in, the percentages barely move anymore. They stabilize around a particular result.
>>107341354 I don't buy it.
So will it be possible to train loras for turbo? It was still possible with Flux dev and schnell despite being distilled. Though the loras all wrecked the anatomy
added sdxl, so basically z image is almost twice as big as sdxl
>>107341336 is that z image?
>>107341354 unless they're all training on the same shit, that doesn't make any sense
fucking great model. based alibaba
anyone tried the z image edit model? is it even out?
>>107341395 this chart is something you made?
Man I must be tripping because this model feels like every other chinese model.Like I've already seen these images before.
step right up, step right up
>>107341395 used perplexity because I was annoyed at never remembering what model was what size
>>107341369
thank you based chartautismo
>>107341385
it's fantastic, way better than i could have expected.
>>107341397
might be a flux trauma symptom, many such cases.
>>107341398
kino
>>107341369 Z-image runs fullsize on 12 gigs perfectly fine. 4070S here.
>>107341380 I think he is treating "different data" as different large samples from the same set of data, that set of data being "all the data that exists"
>>107341336
Other ComfyUI QoL things:
>doesn't have a native "fuzzy match model names in loader nodes" feature that automatically resolves paths to model files that were moved since last usage, or, god forbid, just finds those models anywhere they might be by their unique hash
>doesn't have "widget control mode: before" as the default seed-changing behaviour, which is much more intuitive
>goes to the first workflow tab when closing the currently active one, instead of going to the one that was last used, or at least the one next to it, helping people with many active tabs
>doesn't allow you to drag and drop a workflow tab anywhere on the bar to the right; instead you have to drop a workflow on top of another
>doesn't have a quick swap button for height/width of image resolutions on nodes
>can't forward image dimensions from a load image node to nodes that use dimensions
>when loading a workflow and running it, even if you have seed set to randomize, the first run will silently use the seed in the workflow instead of randomizing it
>workflow search doesn't have fuzzy matching or anything similar; if you search "wan lora 1" it won't find your "wan 2.2 lora 1" workflow
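The "fuzzy match model names" item is close to a stdlib one-liner with `difflib`. A sketch of what such a resolver could look like; `resolve_model` and its matching rules are hypothetical, not a ComfyUI feature:

```python
import difflib
from pathlib import PurePath

def resolve_model(stored_path, available):
    # Fuzzy-resolve a model filename that may have been moved.
    # `available` is the list of model paths currently on disk;
    # match on basename, exact first, then closest by similarity.
    wanted = PurePath(stored_path).name
    names = {PurePath(p).name: p for p in available}
    if wanted in names:                       # exact basename match
        return names[wanted]
    close = difflib.get_close_matches(wanted, list(names), n=1, cutoff=0.6)
    return names[close[0]] if close else None

models = ["loras/wan 2.2 lora 1.safetensors", "ckpt/zimage.safetensors"]
print(resolve_model("old/wan lora 1.safetensors", models))
```

The same `get_close_matches` call would also cover the "wan lora 1" vs "wan 2.2 lora 1" workflow-search complaint, since it scores by character overlap rather than exact substring.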
>>107341369
>2x as big as SDXL
>2x as slow as SDXL
>2x the resolution of SDXL
>4x res VAE compared to SDXL
>at least 10x better than base SDXL
this is the scaling that we need. not shit like hidream with 4 text encoders that's 4x bigger and 2x slower for images not even 10% better than flux 1.
>>107341402
would be neat to see with a bunch of models going back to SD1 and before. i wonder how accurate perplexity is with shit like sana or other obscure models
>never remembering what model was what size
same
Drag and shot the comfy
>>107341369 Turns out, license-wise, the only time bfl ever cared was with Schnell.
>>107341412 bahaha.. uh ok.. well that also doesn't make sense but sure
>>107341363
>>107341380
Imagine if you took one trillion completely random photographs in the world for one model, then another trillion random photographs in the world for another model. They would start to look the same; the randomization ends up being less random the more data samples there are.
You can scale up this thought experiment: imagine if you took infinite photographs of the world (essentially building a simulation of earth) for one model, then took infinite photographs of the world for another model. The two models end up being the same.
You can scale down this thought experiment: imagine if you took 1000 photographs of the world for one model and 1000 photographs of the world for another model. The two models would be much more randomized than the trillion version is.
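The thought experiment above is the law of large numbers: two independent samples from the same underlying "world" converge on the same statistics as they grow. A quick simulation of the two-dataset version:

```python
import random

def sample_mean(rng, n):
    # mean of n independent draws from the same underlying distribution
    return sum(rng.random() for _ in range(n)) / n

# two independently seeded "labs" sampling the same world
a, b = random.Random(1), random.Random(2)
for n in (10, 1000, 100_000):
    # the gap between the two labs' estimates shrinks as n grows
    print(n, abs(sample_mean(a, n) - sample_mean(b, n)))
```

Whether that fully explains SaaS models converging (versus them training on each other's outputs, as other anons suggest) is a separate question; this only shows the statistical part of the argument.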
>>107341326 Yeah, unfortunately only a few come out non-slopped enough for my tastes.
a WAI-style nsfw tune of this is gonna go hard.
>>107341425 give me a few and I can try
sd1.5?
>>107341440 top of feet, so nice
BLACK FOREST LABS POOL CLOSED
>>107341430
they never cared. schnell was a pseudo-local release designed to bait 'developers' into thinking the model could be salvaged. meanwhile it was giga-slopped, hyper-distilled, and sabotaged to the point where it wasn't finetuneable. chodestone spent $100k of his budget de-distilling it and rebaking from scratch, only to be left with a mess of melted limbs and nonsense artifacts.
harsh lesson to never attempt serious development on anything BFL releases; it's bloated, dated garbage designed to shill API. flux 1 was only relevant because it was the first natural-language local model, though scholars at the time correctly pointed out that it was miles behind dall-e 3.
This model really changed my mind that bigger models were the only way forward. I am a "size don't matter" believer now
>>107341466 any improvement that happens with a small model will be better on a bigger one; hardware will be the bottleneck for years to come until we get a small model that is able to do everything
>>107341449
>>107341453
A bunch of
Blacks
Farting
Logs
is the best evropa can do now... really makes you think.
>>107341446
1.5, 1.4, VQGAN+CLIP, Pixart Sigma, Lumina 2, both Schnell and Dev Flux, uh.... fuck, i swear there are more
>>107341444
>wai
stop it, slopped
wai is just a shitmerge
now noob finetune on the other hand
>>107341462 I guess you're right, especially considering how hard it was to finetune schnell.
Kind of sad, but it seems they don't really care about the competition and just do their own thing.
Is Zimage compatible with any of the speedtraining snakeoil methods?
>>107341444
>WAI-style nsfw tune
Once again, not a tune.
>>107341466
>I am a "size don't matter" believer now
it's more that there were plenty of optimizations to be had at any size, so this shows the 20B+ models could look even better if they did the same
>>107341127 Hi comfyanon, can you stop bloating up the frontend? It's dropping frames baka
It's supposed to be litegraph, but now it's bloatgraph
>It doesn't know natalie portman
welp, at least it got every detail right anyway kek
>Comfy 3.75
Why does the middle mouse wheel now clone the entire fucking workflow, and how do I un-keybind this shit?
>>107341451 I've got canal paths near me that look just like this. Majestic.
>>107341484 ok
>>107341508 how many mouse wheels you got?
>>107341466 Big models were always memes because big=static; you can't, or can barely, modify them with local hardware. That's why XL is so popular: every retard can play around with the models, meaning lots of interesting things get done. Bloatmodels might be able to do more out of the box, but once you run into a limitation it's basically over, because the model is static for all intents and purposes.
>>107341476
>hardware will be the bottleneck for years to come until we get a small model that will be able to do everything
Not if we're still using attention/transformers. In 4 years the perspective will change and a "small model" will be one that fits onto an entry-level 48gb GPU
>>107341488
Kek, what competition? They are the weights-available leaders in the West
To our new golden age. Cheers fellas.
How good is zimage at following a complex prompt?
Example, the one used for qwen:
A vibrant, warm neon-lit street scene in Hong Kong at the afternoon, with a mix of colorful Chinese and English signs glowing brightly. The atmosphere is lively, cinematic, and rain-washed with reflections on the pavement. The colors are vivid, full of pink, blue, red, and green hues. Crowded buildings with overlapping neon signs. 1980s Hong Kong style. Signs include:
"龍鳳冰室" "金華燒臘" "HAPPY HAIR" "鴻運茶餐廳" "EASY BAR" "永發魚蛋粉" "添記粥麵" "SUNSHINE MOTEL" "美都餐室" "富記糖水" "太平館" "雅芳髮型屋" "STAR KTV" "銀河娛樂城" "百樂門舞廳" "BUBBLE CAFE" "萬豪麻雀館" "CITY LIGHTS BAR" "瑞祥香燭莊" "文記文具" "GOLDEN JADE HOTEL" "LOVELY BEAUTY" "合興百貨" "興旺電器"
And the background is warm yellow street and with all stores' lights on.
>>107341520
>>107341531
What's the surefire way to get rid of the pseudocompression? These look good.
>>107341520
>uuuuohhh husbant...
>>107341528
>Kek what competition? They are the weights-available leaders in the West
Competition in general; it's not like it matters if the weights are Chinese, German or whatever.
>>107341545
>>107341451 (You)
>I've got canal paths near me that look just like this. Majestic.
Lucky. My canal paths have gators
>>107341528
>4 years
>48GB entry level GPU
What are you smoking
https://www.reddit.com/r/LocalLLaMA/comments/1p4urm7/we_are_considering_removing_the_epstein_files/
>>107341543 i have a subway near me that looks just like this!
>>107341421
>2x as slow as SDXL
this is with extreme distillation. Even FLUX1 could run in 3s on a 3090 with distillation.
If this is what z image turbo can do, I can't wait to see what the full model does
https://files.catbox.moe/7nvqib.png
>>107341534 that certainly pushes its limit
were there genuinely retards itt who thought the bloated slow crap like flux/chroma/qwen/hidream/neta would actually take off? Nobody wants to wait 30+ seconds for subpar 1024x gens on a 5090. there is a reason SDXL remained winning for so long. z-image is the first model since SDXL with the actual potential to both dethrone it while being a full upgrade, not a sidegrade.
>>107341560 whoooaaaa... it can do 1girl... standing there?!?!
>>107341573 I see you are new here, but this is a base model. Even sd 1.4 wasn't this uncensored to start
>>107341560 apparently the full model is currently worse
>>107341573 sd3 couldn't kek
>>107341542
>Competition in general
BFL is German; 100% they're trying to get contracts where the fact they're not a Chinese company is relevant here
And even without adjusting the goalposts, I'd say the local competition is all equally in a state of "don't use this for actual work ever unless you have a really good reason like privacy or compliance" right now in terms of generative AI
>>107341550
>What are you smoking
Mostly distillate with terpenes, but I'm thinking of splurging on one of those gimmicky disposable vapes with two strains in one: half live diamonds, half live resin
But this is just a napkin-math guess based on current scaling + china catching up + the fact that 40gb A100s should be cheapish by then. I should have said "entry level AI GPU", because I also think the future of inference is dedicated discrete cards
>>107341573 and its prompt following is insane, if that is what you are after; that has already been shown off, and its skin detail is the best from any base model as well
>>107341067 Wtf Bruce Lee? How could you hurt migu?
>>107341565 Actually better than what I expected; see what Qwen does.
>>107341589 it's not out yet, so what insider knowledge are you claiming?
>>107341285 I think it's because of how they all clean up your prompt with LLMs now.
>>107341569 It was mostly anons flexing that they can run them, I guess; it was obvious they wouldn't take off given their sizes
>>107341589 probably not worse, but not good enough for what they want, as it's still training
the guy who leaks shit on twitter who said that speaks through a language barrier
Seed variety is really bad on some prompts; it locks in like Qwen does.
>>107341592
>the future of inference is dedicated discrete cards
I've been hearing that since 2022.
>>107341615 yea, that is the trade off with strong prompt adherence, but fine tuning for more creativity instead will be easy
>>107341604
So Z Image is done by Alibaba but not by the Qwen team? Why do they have multiple teams on this?
>Prompt enhancer with z-image-turbo might be better. System prompt is on its way!
https://xcancel.com/srameojin/status/1993793896397320193#m
THIS ISN'T EVEN THEIR FINAL FORM
>>107341636 sounds like a bad translation if anything to me
>>107341636 it's a bit weird that the distilled is better
>>107341647 phew. wow. that's Qwen huh? there really isn't a model left out there that Z isn't assraping.
>>107341569 Z and neta run at a similar speed on my system so
>>107341560
>actually proper genitalia
that's already so much better than flux nonsense on this
>>107341543
>Lucky. My canal paths have gators
Exotic.
>>107341633
>fine tuning for more creativity
is that a thing? any finetuning just removes the creativity iirc.
>>107341639 So they can decide which one gets promoted to API-only
Maybe it's because it's Turbo but this model seems very deterministic. Using the same prompt generating 4 images with different seeds results in almost identical images a lot of the time. Even if the prompt is very vague like "A monster under the bed" which could look like fucking anything.
>>107341657 i asked it for a penis and got a disgusting mass of flesh.
>>107341640
>Prompt enhancer with z-image-turbo might be better. System prompt is on its way!
care to explain what these are?
a system prompt is for LLMs, so what does it have to do with zimage? and what's a prompt enhancer?
>>107341647 Qwen FP16, yeah.
>>107341659 No, finetuning is just changing parts of the weights; it's already been shown that this works for qwen in making it less deterministic.
Also there are samplers that inject more noise between steps, which helps too
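The "inject more noise between steps" trick the post mentions is what ancestral samplers do: step toward the denoised estimate, then add fresh noise back in. A sketch of one Euler-ancestral-style update in the spirit of k-diffusion's formulation (standalone scalar math, not the actual ComfyUI sampler code):

```python
import random

def ancestral_step(x, denoised, sigma, sigma_next, rng, eta=1.0):
    # How much fresh noise to re-inject this step (eta=0 -> plain Euler)
    sigma_up = eta * (sigma_next**2 * (sigma**2 - sigma_next**2) / sigma**2) ** 0.5
    # Deterministic target noise level, chosen so total variance matches
    sigma_down = (sigma_next**2 - sigma_up**2) ** 0.5
    d = (x - denoised) / sigma              # derivative estimate toward denoised
    x = x + d * (sigma_down - sigma)        # deterministic part of the step
    return x + rng.gauss(0.0, 1.0) * sigma_up   # stochastic re-noising part

rng = random.Random(0)
print(ancestral_step(1.0, denoised=0.0, sigma=1.0, sigma_next=0.5, rng=rng))
```

With `eta=0` the noise term vanishes and consecutive seeds collapse toward the same trajectory, which is roughly why deterministic samplers on a heavily distilled model produce such similar images per prompt.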
>>107341661 it's an issue for me as well. i believe qwen was like this too
>>107341669
>care to explain what a system prompt would do with an image gen model?
>prompt enhancer?
now you're really not making any sense.
>>107341661 will be fixed by loras when they release the base model. Even just a lora that is mostly noise would do wonders. For now just use a prompt enhancer or add random stuff to the prompt
>>107341661 For quite a while now I've been of the opinion that no one should ever use a base model; loras are ALWAYS essential (that's the human input that makes your outputs look different than someone else's outputs). It's way too easy for any two people's prompts to be the same.
OMG it Migu!The watercolor effect on the background is decent but on Miku herself it's a little iffy. Looks more like normal anime style with some artifacts but could definitely be worse.
>>107341661 You have to manually prompt the variation and be specific.
If you don't want straight-on photos, just tell it you want a photo from the side or from behind, etc.
>>107341439 makes sense to me
>>107341702 im hard now
>>107341514 that monogatari live action looking good
Can Z-Image do Xi Jinping or Winnie the Pooh?
>>107341618
>I've been hearing that since 2022.
You don't need to hear it; just look at the inference speeds you can get on custom hardware with stuff like Groq. Also, both Nvidia and AMD are dedicating more resources to discrete NPUs etc.
Here's another perspective: the gaming GPU is almost dead in favour of the APU at this point. Consumers have developed learned helplessness about not affording high-end GPUs and have always enjoyed consoles. Couple this with the fact that Nvidia and AMD make buckets on data center compared to consumer, the fact that consumer GPUs have already reached that point (e.g. the 5090 is just a VRAM-gimped RTX 6000 Pro Blackwell), and the fact that eventually you have to cave and do unified/soldered memory to get the higher bandwidth you need, like Apple Silicon or DGX clusters do, and I really don't see discrete GPGPU being a thing for very much longer
>>107341709 Yes >>107341620
>>107341709 yeah
>>107341709 you got it boss
https://youtu.be/PGa3xmdVvMM?si=xM9T_UzurJx1nh2y
>>107341722 fuck yeah
>>107341661 I wonder if this will help with making an animation workflow later on.
now we just need LTXV-2 to release so we can be freed from the shackles of WAN
>>107341742
>(that's the human-input that makes your outputs look different than someone else's outputs)
>t. promptlet
>>107341741 kandinsky 20B already did that, but it's fucking fat and comfy still has not merged kijai's implementation because comfyui hates torch compile, which it apparently needs
>>107341741 No, we need ltx2 so alibaba will release wan 2.5 out of spite
>>107341748
>kandinsky 20B
link?
https://files.catbox.moe/m3zlcd.png
Naked Frieren standing in a fantasy forest setting, her breasts and vulva exposed, vulva, pussy, 2d, anime screenshot, masterpiece, high resolution, very aesthetic
>>107341742 It's not really about me; it's about the million other users who only type ONE sentence for their prompt. There are only so many ways a person can type one sentence, and they're all colliding, all producing the same result and feeling like a retard when they see their album cover as a video game asset in someone else's product. This is going to be sadly common in the future unless everyone adopts standards like using loras for the sake of uniqueness.
2 years ago I would not have questioned whether these amateur photos were real. It's so over.
>>107341764
>>107341769 fuckin hot
>It's not X, it's Y
stop letting your chatbots in here
>>107341484 man, sd1.5 was so small
>>107341767
https://huggingface.co/collections/kandinskylab/kandinsky-50-video-pro
https://github.com/kijai/ComfyUI/tree/kandinsky5
20B does porn out of the box btw. But it's big and slow as fuck
>>107341784 nice feminine energy, next time quote me bitch.
the step-by-step inference on Z Image doesn't seem very fast at all to me DESU, so far. 8 steps of Z doesn't seem any faster than 25 steps of NetaYume at the same res
>>107341805 nice OL
>>107341784 You are absolutely right!
>>107341787
>20B does porn out of the box btw. But its big and slow as fuck
Yeah, I won't be running that shit unfortunately. Sad to hear that something like that already exists out of the box but is that big
>>107341774
>It's so over.
Alternatively, intellectual property is dead and the children (think of them!) have been saved
>>107341773
Why do people forget that we already were in a slop, lack-of-creativity, attention-economy culture-war doomspiral before AI
Also, your point about millions of NPCs prompting the same thing is another point for SaaS, because those same NPCs don't delete or turn off conversation-history sharing, so the Service can adjust the prompt slightly based on preferences inferred from your past conversations
>>107341748
>Hunyuan 1.5*
We just need its NSFW tune.
z image does 2k res perfectly btw; you are not limited to 512x512 - 1024x1024. 1440x1920
>>107341823
nah, sorry to say, hunyuan 1.5 is way more censored
this is kandinsky 20B but at super low res / steps so it does not take 2 hours
https://files.catbox.moe/6pdai4.webp
>>107341786 It gets some of the details incorrect. VQGAN+CLIP ran on the CPU and couldn't do videos (unless you count animations of it generating), 1.5 could be run on 4GB cards, same with XL, and there's probably some other stuff that I don't realize because I'm retarded and don't wanna read papers right now. A chart like this with even more models, like Kandinsky and any others anon can think of, would be a really cool resource to have.
>>107341838 how'd you run it?
>>107341844
>>107341787
>>107341844 OFFLOADING
FFLOADING
>>107341838 The clit isn't supposed to be in that place
>>107341838 Any workflows for this model? May fuck around and try to run it
>>107341851 yea, no shit
here is this as well btw
https://huggingface.co/Ada321/Kandinsky-5.0-T2V-Pro-sft-5s-Q8.gguf
https://github.com/Ada123-a/ComfyUI-Kandinsky
>>107341854
>>107341857
16GB vram is the minimum btw; even with offloading, the latent is about 8GB ish
>>107341787 How slow are we talking? Depending on how good it is, I'd be willing to wait up to 10 minutes with a 3090.
Body horror when prompting for certain yoga poses that Chroma HD Flash nails first try.
z image struggles to mix 2d and IRL; qwen beats it at this (for now)
>>107341811
it's not really faster; the distill is just way higher quality than we're used to. With CFG, it's similar in speed to chroma for me too.
The answer is we need to ditch CFG and use something like this: https://github.com/AMAP-ML/S2-Guidance
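For context on why ditching CFG matters for speed: classifier-free guidance needs two model evaluations per step (conditional and unconditional) and combines them by extrapolation, so dropping it roughly halves the compute per step. A generic sketch of the standard combine (not the S2-Guidance method linked above):

```python
def cfg_combine(uncond, cond, scale):
    # Classifier-free guidance: extrapolate from the unconditional
    # prediction toward the conditional one. scale=1 reproduces the
    # conditional prediction; larger values over-emphasize the prompt.
    # Both `uncond` and `cond` each cost a full forward pass per step.
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]

print(cfg_combine([0.0, 1.0], [1.0, 1.0], 3.0))  # → [3.0, 1.0]
```

Distilled models like Z-Image Turbo bake the guidance in and run CFG-free at scale 1, which is why adding CFG back on top erases most of their speed advantage.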
>>107341885 the thing requires a fucking h100 cluster then, lmao, im good. surely we'll get something as uncensored as it that ISN'T fuck-you big
>>107341885 Why the fuck can't we have anything nice? Imagine z-image prompt adherence but for i2v. I'm tired of tard-wrangling the hell out of wan and still not getting anywhere close to what I want.
>>107341912
>>107341916
with a good 4-step distill it would prob get down to like 5 mins
>>107341882
>In my opinion Z.ai have been the most consistent Chinese lab of 2025, they feel like a Chinese version of Anthropic
What else should anon know them from?
>>107341882 It's not Z.ai btw, this is still alibaba
>>107341882 dumbass kid talking out of his ass lol
please care about flux 2
>>107341950would
>>107341950can u fill the box with cum plox
Flux 2 had better artistic potential while the China model is cheap yet functional slop machine spewing out cheap gimmicky memeslop clogging up all the threads, how fitting
>>107341885>an hour for 5 secondsbro..
is this option depracated or just hidden now
>>107341967it's "streamlined"
Kek, doesn't understand fellatio that well unfortunately, or maybe it's skill issue on my part.https://files.catbox.moe/kair61.png
I don't get it. It's just slop. The whole model is a slop machine. It's basically the same shit as flux 1.
>it can't do the part where her body is made out of the pancake syrup but nails literally everything else including perfect anatomy
Flux would just fuck up the anatomy constantly, same with chromasome. wow.
https://files.catbox.moe/9nan6k.png
https://files.catbox.moe/c4mktw.png
>>107341959
I'd probably be willing to wait even that long, so long as it had god-tier prompt adherence, which it probably doesn't. Imagine waiting an hour, excited for your 5-second goon gen, and then something is glitchy or your prompt was almost ignored because you were just slightly too undescriptive. Long gen times make experimentation hard.
>>107341997
Rent an 8xH100 cluster and get those 2 min gen times, buddy.
z-image sucks in the same way SDXL sucks
>>107342018
It feels like SDXL 2 in a lot of ways.
>>107342018
I remember being blown the fuck away when SDXL came out but could barely run it on my 1080 Ti.
>>107341995
Yeah, Chroma has substantially better prompt following, NSFW concept and anatomical knowledge, in addition to higher-res photos, way less slop, and more variety. It's just that this is obviously a 6B Turbo model, and for what it can do out of the box (about 60% of what Chroma can) it's impressive. We'll see how its base model with full prompt following fares; plus tunes (especially a large-scale tune) will bring the best out of it.
>>107342018
No, SDXL was safetyslopped when it first came out.
OK, this works. I just have to start the prompt with the description of the 2D character.
>>107342018
>>107341945
Need a local audio model that can do NSFW shit and dialogue too
>>107342082
that's enough asian beauties for today, anon
>>107342088
Vibe voice not good enough for you?
>>107342100
No, anything outside of dialogue is very gacha.
>>107342088
We need something where you can clone specific plap sounds.
>>107341934
>What else should anon know them from?
Their only other stuff is /lmg/; the GLM series is considered a cheaper but genuine alternative to Claude. We have Claude 3.25 at home because of them.
>>107341937
>It's not Z.ai btw, this is still alibaba
...Okay, in that case Alibaba is the best across all modalities for local: text, image, video. A complete and total moggening.
>>107341945
>dumbass kid talking out of his ass lol
I am probably younger than you, that is true, but I have an idea of what's going on in the industry. The actual "talking out of the ass" comes from the fact that who knows what decisions these companies would make if DRAM and compute weren't so bottlenecked. Pretty much every chip that gets released has some SKU defined in the spec with double the VRAM.
>>107342052
I haven't tested this model yet (I am on a trip), but from the outputs I've seen so far, z-image has fewer mangled-anatomy problems than Chroma, and this is coming from a guy who defended Chroma for a long time. I think the base model with loras will look just as good as Chroma for photorealism, considering the base for z-image is already devoid of the plastic skin and buttchinfest Flux had.
Z Image seems to work with Qwen3-4B 2507 Instruct too. Haven't tried Thinking yet. Possible gains to be had there anyways since real finetunes of Qwen actually exist, unlike for T5-XXL.
>>107342143
i can't fully illustrate how happy i am knowing i don't need to put (cleft chin) in my negatives anymore, thanks to the chinese.
>>107341995
Chroma, especially Spark Chroma, definitely beats it at this prompt; pic is as far as I can get it with proompting. Maybe the prompt enhancer can fix this when they release it.
z image is better than Chroma overall, but won't replace it yet. Ironically, Chroma has better survivability against Z than Qwen or Flux because it has NSFW and more flexibility. If the Spark Chroma guy finishes his work and then we get a Chroma distill with the same quality as Z, Chroma will still be competitive.
>>107342161
Not even remotely. What the fuck is this, kek — the body is supposed to be like picrel.
>>107342143
>less mangled anatomy problems than Chroma
Chroma HD Flash fixes all problems from original Chroma. This model, however, has very little seed variety (similar to Qwen) and can do less than Chroma out of the box. I would know because I've prompted all of these things in Chroma (though some of them are not first try on Chroma, there's still more seed variety). If a model cheats by sacrificing its seed variety, then it's not really better anatomy than Chroma imo.
>>107342145
I need to look into how these instruct models work with the new gen of Chinese models. If you give it too illegal a prompt, won't the instruct model just write "I'm sorry, I can't assist with that" instead of enhancing/encoding your prompt properly?
Fresh >>107342183>>107342183>>107342183
>>107342179
Also, I'm paying very close attention to the prompt following, and it's also not up to par. But this is just the Turbo model with reasoning turned off, after all.
>>107341840Yeah I didn't run a second pass but I probably should to check sources.
>>107342181
No, that's not how it works. None of these models are being used in a way where it's possible for them to "refuse" a response; they only activate the "model understanding" state layers. There's no typical chat context going on at all.
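To make the point above concrete: when an LLM is used as a text encoder, the pipeline runs a single forward pass and takes the hidden states as conditioning — there is no token-by-token generation loop where a refusal could be written. A minimal sketch with a toy transformer standing in for the real Qwen encoder (the real pipeline would load it via transformers, but the mechanism is the same):

```python
import torch
import torch.nn as nn

# Toy stand-in for the LLM used as a text encoder.
class ToyEncoder(nn.Module):
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.block = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)

    def forward(self, token_ids):
        # One forward pass; the per-token hidden states ARE the output.
        return self.block(self.embed(token_ids))  # [batch, seq, dim]

enc = ToyEncoder()
tokens = torch.randint(0, 1000, (1, 12))  # pretend-tokenized prompt
with torch.no_grad():
    conditioning = enc(tokens)  # fed straight to the diffusion model

# No sampling loop ever runs, so there is no step at which the model
# could emit "I'm sorry, I can't assist with that" -- a refusal is
# generated text, and no text is generated here.
print(conditioning.shape)
```

The refusal worry from the earlier post only applies to a separate prompt-*enhancer* stage (where the LLM genuinely generates text), not to the encoding stage itself.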
Chroma still produces monstrosities more often than not... anyone saying otherwise is just a fag.