Discussion of Free and Open Source Text-to-Image/Video Models
Prev: >>107529397
https://rentry.org/ldg-lazy-getting-started-guide
>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP
>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows
>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe
>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
>WanX
https://github.com/Wan-Video/Wan2.2
>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd
>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46
>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/
>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage
>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg
>Local Text
>>>/g/lmg
>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>all these 1girlsWE WON!
Wretched thread of mental illness
Blessed thread of 1girl posting
>>107536415wow, this thread desperately wants to be /adt/ but suck at making kino
https://github.com/Tongyi-MAI/Z-Image?tab=readme-ov-file#-community-works
>SGLang-Diffusion brings SGLang's state-of-the-art performance to accelerate image and video generation for diffusion models, now supporting Z-Image.
another snakeoil or?
>>107536436>page 7why don't go there?
>>107536443 sglang is just an engine, it's pretty popular with LLMs (for enterprises, like vLLM)
>>107536443Chinese culture. Anything but the base model.
>>107536436>kinoIs that a codeword for pedophilic images?
Arousing thread of 1girl gooning
>>107536443a z image gen takes like 10 seconds on modern cards. how fast do you need it to be?
>>107536436>oh no! the niche and hyper specific thread of 3 troons feels threatened when someone posts anime in the god's chosen generalAre you that insecure?
>>107536469 we won't be using turbo forever, lots of steps + cfg (2x slower) will come back on base
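For anyone wondering where the "2x slower" comes from: with classifier-free guidance every sampling step evaluates the model twice (once with the prompt, once without) and blends the two predictions. A minimal sketch of that arithmetic follows. The step counts are hypothetical placeholders, not official Z-Image numbers, and real UIs may batch the two passes, so wall-clock time is not always exactly double.

```python
def cfg_combine(uncond, cond, scale):
    """Classifier-free guidance: start from the unconditional prediction
    and move `scale` times the (cond - uncond) difference toward the prompt."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


def forward_passes(steps, use_cfg):
    """Model evaluations per image: two per step with CFG, one without."""
    return steps * (2 if use_cfg else 1)


# Hypothetical step counts, purely for illustration.
turbo_cost = forward_passes(steps=8, use_cfg=False)
base_cost = forward_passes(steps=30, use_cfg=True)
print(turbo_cost, base_cost, base_cost / turbo_cost)
```

With a scale of 1.0 the blend returns the conditional prediction unchanged, which is why distilled models that run at cfg 1 can skip the second pass entirely.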
>>107536471you say this but this thread was baked by a dramatic troon
Possibly posted here before, I don't check every thread, but they seem to have updated their website
https://tongyi-mai.github.io/Z-Image-blog/
Unified t2i and i2i? Wth does this mean?
I am inclined to believe that they aren't doing a complete rug pull and will release something, but I have no idea what that is shaping up to be.
>>107536436go back there, why are you here seething?
>>107536498I like how the team didn't do free advertisement for cumfart
>>107536498if I understand well, base will be able to do edit while z-image edit will be something finetuned to be really good at that?
>>107536498>the base model will be able to do edit as wellyou guys have no idea how powerful this shit will be, an unslopped model that can make realistic shit and edit, apache 2.0, small, this is literally the dream model, those fucker brought the fire on me, WHEN RELEASE???
>>107536498So, if you finetune such model you have to finetune it on both image and edit pair examples?
>>107536519I am thinking it might mean something like that but it doesn't make sense to beat that into the base model, edit models have a different training loop and why burn money doing that, and degrade finetuning and non-edit use case capabilities of the model, if you are already going to release a dedicated edit model?
>>107536511>107536471 (You)>107536436>Samefag btwToo tired to take a screencap and then you mention that I used console inspect element
>>107536530Hehe. You're about to be a victim of Chinese culture.
>>107536498that's the first time they changed the readme to provide some news about base, it's a big sign it'll be released soon
>>107536537If that's what they are implying then yes.
>>107536538maybe they found a way to not kill the edit's capabilities if you only finetune on imagegen, I know that's naive wishful thinking but I like that approach
Is it really worth it /g/?
>>107536555Is it powered by her piss or shit?
>>107536562yes
>>107536391>>107536426>>107536477>>107536511actual samefag btw
>>107536537
>>107536545
I guess it would be possible to finetune on just images, if you don't mind frying edit capabilities away.
We have no idea what it actually is so it's just a guessing game at this point.
>>107536449>Anything but the base model.not the right time to say that kek >>107536498
>>107536498 that's the first time I've heard that base is actually able to do edit, I thought you put the edit capabilities on a model through finetuning, not during pretraining
>>107536582 I am quietly confident the model will not be coming. You can kekkaroo increasingly nervously as the weeks go by but I just understand the culture better than you.
>he spams reports on different ipsreminds me of that randall (the Jewish snitch) from recess
He could've just posted "the inpaint part is currently missing and will be implemented later"but nooo he had to add his petty opinions too
first time trying 3d. and it made an object good enough for 3d-printing on first try. only one reference image. very impressive
>>107536594>You can kekkaroo increasingly nervously as the weeks go byyou're the one being nervous, a lot of signs point to an actual release, we're so back
>>107536622Mind sharing workflow?
is cumfartorg and this thread some kind of mental asylum that uses shock therapy to make you gay?
>>107536622>>107536633desu, I just use the script from the GitHub. cumfart is too annoying when comfy breaks everything. workflows suck
>>107536498>A foundation model designed for easy fine-tuningThat's Chinese Culture speech to say that you'll only be able to finetune through API, trust the doom.
>>107536612He's probably pissed at having to implement any new models. Just doing it to keep up appearances. Normal ZIT is fucked up too, genning times are all over the place and require a restart after a while.Probably wishes they would all go API, much easier to handle.
>>107536656>He's probably pissed at having to implement any new models.instead of hiring Ui jeets who are obviously doing nothing but pretending to be working by removing stop buttons, he should train them to learn how to implement new models
>>107536498
Actually it seems like they put this up 10 hours ago but no one seems to have noticed kek
https://github.com/Tongyi-MAI/Z-Image-blog/commit/e67bafb673fa19d301f903ac62de26c48b4cc1c4
If you scroll down, they added hints about the difference between the base model and the dedicated edit one? (It has better prompt adherence and it is more creative?)
>>107536498 it's coming, they hadn't touched the blog for 2 weeks, the model is probably finished
>>107536656he just can't understand new models at all anymore. everything recently has been an improper implementation and actively ooms on dumb shit
>>107536612Slower = increased cloud cost = less profitThey need to do the needful and retrain it immediately
I feel so conflicted. I want to believe base will be released. It's like being interested in a girl and getting mixed signals.
>>107536436>post some 1girl, anime in highlights>entire general is now meaningless
>>107536698Embrace the understanding of Chinese culture.
>>107536666
>If you scroll down, they added hints about the difference between the base model and the dedicated edit one? (It has better prompt adherence and it is more creative?)
you can see it on the blog yeah
>>107536703Can you define this for me? I see people saying it all the time but I don't know what they mean.
>>107536498>>107536666Anyway here is the 1girl of celebration.Doomers on their last supply of copium.
>>107536699some lost /adt/ posts are the most prevalent images made itt
>>107536710it's not that deep, he's saying that chinese people are snakes and that "chinese culture" is actually the default way of acting for them (lying, cheating and so on)
>>107536633I just used this template and dragged in the png. just wanted to try it and it worked much better than expected
>>107536720I see. Thank you for the explanation.
>>107536721does it do texture extraction? would be ultrakino
>>107536720Basically, but they will string you on as along as necessary to achieve their own means at the lowest personal cost to themselves. This is why you need to leave no wiggle room to cheat when doing business. If they can go back on a condition of a deal and get away with it, they will. The concept of good faith business is a joke to them.
>>107536498>>107536666>>107536707I want to believe boys...
You can test with the control_refiner_layers for noise_refiner hints and it's only marginally faster, it's just generally slower because of increased control embedder dimensions and adding hints to more layers, also requires more steps. Probably they did an initial experiment with control_refiner_layers for noise_refiner and found it doesn't work as well as just using control_layers twice, then forgot to remove the code, so the untrained weights are in the released checkpoint.
Also you have to concatenate zeros to fill the expected dimensions to make t2i work anyway, so not implementing inpaint doesn't even make sense considering it's just concatenating the init image and mask instead of zeros.
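The zeros-vs-inpaint point can be sketched like this. It is a toy illustration only: the channel layout (control hint, then init image, then mask), the channel counts, and the name build_control_input are all hypothetical and not taken from the actual controlnet code. Each channel is just a flat list of floats here.

```python
def build_control_input(control, init=None, mask=None,
                        init_channels=2, mask_channels=1):
    """Concatenate along the channel axis: control hint + init image + mask.

    When init/mask are missing (plain t2i), zero channels are used as
    padding so the result still has the channel count the model expects.
    Channel counts are hypothetical placeholders.
    """
    width = len(control[0])
    if init is None:
        init = [[0.0] * width for _ in range(init_channels)]
    if mask is None:
        mask = [[0.0] * width for _ in range(mask_channels)]
    return control + init + mask


hint = [[1.0, 1.0], [1.0, 1.0]]  # 2 hint channels, 2 values each

# t2i: the init image and mask slots are filled with zeros.
t2i_input = build_control_input(hint)

# inpaint: same shape, the slots carry the init image and mask instead.
inpaint_input = build_control_input(
    hint, init=[[0.5, 0.5], [0.5, 0.5]], mask=[[1.0, 0.0]]
)
print(len(t2i_input), len(inpaint_input))
```

Both calls produce the same number of channels, which is the whole argument: if the t2i path already has to build the padded tensor, the inpaint path only swaps what goes into the padding.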
>>107536710
>>107536498Do you think they delayed the release because they wanted to make the base model unified? it's the first time they ever said base would be something like that
>>107536718Hope your next post is proof of what you are saying, also, why are you shilling your general here if we don't step in yours?
>>107536744
>>107536756no I am just ashamed of the quality around here. allowing slopstyle is a farce
>>107536588did you forget that flux2 can do that? I don't blame you to be desu
>>107536759>instabitch makes a predictionkek, now there's a 100% chance it'll be released>>107536767flux 2 isn't a base model though, it's a finetuned model, like Qwen Image Edit
>>107536721Thanks, will try some anime girl with white background
>>107536666>https://github.com/Tongyi-MAI/Z-Image-blog/commit/e67bafb673fa19d301f903ac62de26c48b4cc1c4>Single-to-Singleas expected, Z-image edit will only be able to handle one image input, but they could do like what they did on Qwen Image Edit, finetune it further to make it able to edit multiple images
>>107536763And the photorealistic ones aren't also 'slopstyle', why your fixation with anime gens here? Also if anons here post kino anime gens, wouldn't that be taking away the little sense your general has left?
>>107536799>plastic skin isn't slopstyle
>>107536707
I meant it in the sense that they have also changed the edit model's description, that's why I highlighted the commit
>>107536744
Enjoy the hopium!
>>107536754
Likely yeah.
There aren't too many good explanations why you distill an unfinished model otherwise.
>>107536798
Actually single-to-single got deleted
That means they might already be finetuning the edit model to do that.
But that might be too much hopium.
>>107536823
>There aren't too many good explanations why you distill an unfinished model otherwise.
since turbo can't do edit, it means it was distilled from an early version of base that couldn't do edit as well
>>107536812There's an entire thread waiting for you, you can go there.
>>107536833Anything to cope with the fact the base is done and they either cant/won't release it.
>>107536731dont see any texture. took four screenshots of a very complex model on sketchfab and it remade it pretty good. at least good enough as a base mesh or to get proportions right in your modelling
>>107536861Now try an API img 2 3D model and die of hopelessness.
>>107536861Cool, which other anime girls can do this model too? Can I export it to 3D rendering software? I need 4 different angles, right?
>>107536842They literally changed the github to say that it's a base model made for finetune and you still don't think it'll be local?
>>107536861Can do some random SDXL gen >>107536856 to 3d?
>>107536707No photorealistic tag on base? So will it be shit at realism or better at weebshit? Why would they drop this tag
>>107536897
>and you still don't think it'll be local?
Yes. I do not think it will be released.
>>107536908you don't expect a base model to look as good as a finetuned model, it's just not happening, if base models ended up like that we wouldn't need extra finetune in the first place
>>107536889>>107536902you need four angles for best result. find models on sketchfab for instance and just take screenshots. or you can make one image and use the template for more views based on image
>>107536744>I want to believebillions must believe
>107536912i am very english too good morning
who is paying ran overtime for being a dramafaggot 24/7?
>>107536666
The training pipeline and model variants were already described like that in the technical report (https://arxiv.org/abs/2511.22699, section 4.3) from its first version in November. Omni pre-training covered both image generation and editing. Both Z-Image-Edit and Z-Image-Turbo (which is actually called "Z-Image" in some parts of the report) branch off from the base model after that stage. The editing variant had more pre-training specifically for editing (section 4.7).
This means there's a chance LoRAs trained on base will work on the editing model, but it's not guaranteed.
>>107536957esl
I really wish base gets released and the porn-niggas that made biglust and lustify fine-tune it. The slop will be so good. I hope and wish.
>>107536979this. and chinese noob niggas
>>107536951How did you prompt this style?>>107536979For me it is the BigASP guy moving on to Z-Image Base
>>107536979>I really wish base gets releasedI'm more hyped now knowing it can do edit as well, I thought I would have to wait additional months (and the release of z-image edit) before doing that lol
>>107537006>How did you prompt this style?I got that style prompt from the LLM rewriter >>107531631
>>107536958Animanon = DeboRan = DeboIt's all the same faggotNo other pathetic fuck wakes up every day, sees his dead general with the same Avatartroons himself despises but pretends to like them, then seethes and starts shitposting in other generals
I'm mostly happy with how my latest lora turned out
>>107537044Poor guy, you have to understand him, he hit the jackpot with the worst avatarfags from /g/
>>107537044ani is a real person. comfyanon and catpissanon met him irl. I don't think he shits up the thread but mentally unhinged schizos try to make it appear otherwise
>>107537061What is the theme of the lora, retro? SDXL or ZiT? I want to use it!
Is it possible to generate 16bit pngs in comfy for massive dynamic range, or is the gen itself limited in a way?
When will the onetrainer nigga update his shit
>>107536498https://arxiv.org/pdf/2511.22699I have a question though, what version of base will we get? the one on the far left?
>>107537102surrrre
>>107537102Sorry Debo, it must be rough watching time pass and seeing your justice league of avatarfags stay the same, and even worse, they don't improve, like they have some kind of mental illness, right?
>>107537145https://github.com/FizzleDorfhere is ani's GitHub. show us yours
>>107537103
It's based off of 100 illustrations by an artist named systemst91
The more I use it, the more I realize it's pretty flawed, and might need to be remade
I might not have the skill (or, let's be honest, the patience) to make a lora that is actually worth uploading anywhere
>>107537061link?
>>107537188bruh >>107537182
>>107537182What model is this? If it is Noob show me training settings and I might suggest some changes.For SDXL especially, bad anatomy can also stem from "confusing" images in the dataset as well though.
>>107537182You can always share your failed lora gens in the official /Stable DiffsuionTM general/ , I'm sure they'll be above average there. Right? >>107536958
>>107537205I'm probably going to get laughed out of the thread for using a model that is universally regarded as shitty, but...I trained the lora on Illustrious Hassaku
>>107537182if you are using noob vpred it just looks like this. try Wai or plantmilk
Imagine being so far up your own ass that you add an entire thing called "broken" to your code instead of just not using the part you know isn't used
>>107537225>Hassakubased, Ikena is an honest dev
>>107537225>Illustrious Hassakuwhy? you are supposed to train on base illustrious then it's compatible with every other ill model
>>107537225
Training on specific checkpoints can be good if you want to squeeze maximum quality from your lora at the expense of compatibility with other checkpoints. Though training on shitmixes comes with the same caveats as using shitmixes.
You want to train on a base model like Illustrious XL v2, or better, Noob v-pred v1 for SDXL anime.
Though you likely have fucked up some parameters or have too many weird images in the dataset to mangle anatomy that much.
Even if they retrained it, unless they changed something else, you'd have no way of toggling broken
>>107537231Ignore this troll.
Ah, I understand now. I guess my stocking lora sucked cock because i stopped it at under 3,000 steps. This is my personal look for a (realistic) peach at 3,000 steps i let train while i slept.last time i tried this, it failed to learn certain aspects of her attire. This time i went in with a more diverse dataset (still 20 images) and on top of that anon's suggestion of keeping all the training settings at default, it trained 100% of the character and i can change all of her attire. oh and i can strip her nude because of the dataset, and the nudity accuracy is like 99% there. yeah z-img turbo is really as good and trainable as everyone says. Damn. Base and edit are gonna light this scene on fire. 10/10 do recommend giving it a shot.>>107537115onetrainer is fucking DEAD nigga you're gonna have to get ai toolkit.
necroresponse, someone asked about training a lora on 512px, the likeness is not bad (zoey luna). it's kinda weird that i can get results of 750 steps
>>107537166ran doesn't do anything constructive to the thread or society so of course she doesn't have one
>>107537284I was gearing up to train on turbo but now I think I will just wait for base
>>107537145What is this screenshot supposed to be about?
>>107537238cuz I'm a retard and didn't know that
>>107537305I would still wait for base, it's not perfect. Just because it's trainable doesn't mean it's as good as it can be. Would be great if i can get it to not force the style of the dataset too but that may be a flaw of training a distilled model.
>>107537284What training settings?
>>107537234
Why are people hating comfy for his behavior here?
There are way too many fucking grifters that are trying to capitalize on the Z-Image hype at any cost. A different team inside Alibaba itself trained a controlnet for Z-Image, on the distilled model (?), not once, but twice, and released them just a few days apart. The second version has a literal blatant typo that runs part of the model wrong, but of course being an ML model it will adapt to whatever you trained it with even if suboptimal. It is clearly broken and comfy just made the code handle that case explicitly and called it out.
Z-Image has an epidemic of shitty loras, shitty controlnets, shitty half-assed everything that people are rushing out because they want to jump on the hype train.
>>107537295512px isn't terrible just kinda flat on the detail i guess. 40 mins on a 3090 (1500 steps), i might push this to see if i can actually train a likeness lora on 750 steps, 512px in 20mins
>>107537295>the likeness is not badFlux vae preserves details a lot better even at low res, so the model can actually learn the likeness.It still looks desperate or Indian to train at 512p though. (Not that I should judge too much as a vramlet, but that doesn't make it untrue)
>>107537307ani is more respected and talented than tRan which is why she has frequent melties and spitebakes
>>107537324Forgot your avatar image
There's no alternative to seedvr2 is there? tiled upscaling with zit itself?
>>107537267This is helpful to know!So, I'm guessing that a small dataset with very clear images is probably going to produce much better results than a large dataset with a lot of clutter in each image?
>>107537324>Z-Image has an epidemic of shitty loras, shitty controlnetscalm down that model is less than 2 weeks old, let people master this architecture
>>107537324Newbie was a lumina tune and comfy should be labelling his own code broken considering people are still memory leaking zit
>>107537323
defaults as i said.
- open ai toolkit
- change paths and lora settings as needed
- change steps as needed
- start.
>>107537295
>>107537328
yep that was me. Still blown away you could even train a lora at all with that resolution. Picrel was trained at 1024.
>>107537121even the Z-image devs are shilling rewriting your prompts into a boomer prompt with LLMs
>>107537331trvke
>>107537342For SDXL, for style loras, you want a lot of images typically.Quality over quantity route works for character but SDXL struggles to learn style without too much noise from a small dataset. (Can still be done with knowhow and luck but you don't have the former.) Your initial 100 mark is good enough. I guess you can remove some low quality images but don't remove more than a few.
what happened to the vae replacements? how come the new models don't use them?
>>107537423when are you releasing the Yakub ZiT lora?
>>107537382NTA I agree, 100 images is great for IL style loras. However I've had to use datasets that only had like 30 images and managed to get pretty decent results so don't be discouraged if your artist doesn't have a lot of art online or something.
>>107537414
Do you refer to lodestone's claims about pixel space diffusion?
His model is yet to (and not going to) converge into anything worth a damn to convince anyone outside of his discord.
And while not perfect, flux vae is good enough in terms of quality.
cumfartorg is simultaneously toxic positivity culture and toxic corpo culture that reached the boiling point with a garbage ui library they doubled down on. 2026 really is going to be the year it all falls apart
>>107537453Yeah, whatever sad schizo
DRAG AND SHOT
>>107537447no just the papers that came out a while ago. supposedly the replacements are much lighter on vram, faster and reduces noise in the output (higher quality)
>>107537463Inquiry if i may,What if comfynigger drug and shot you?
>>107537465
4h33m51s
proper documentation my furkan
>>107537465I hope this gets memed into reality. furk is a great storyteller I'd love to have my kids listen to for what these times were like
>>107537486get tf outta here with that slopsdg is that way
>>107537472
>papers that came out a while ago
Well there are a lot of these.
Link to which ones you are talking about?
>>107537465>"he made a deal with the Jewish devils">"big mistake"
>>107537339I don't know why I didn't think that tiled upscaling would work. Then just top it off with seedvr2.
>>107537486>>>/g/adt/
>>107537495>>107537508What's wrong with it
>>107537496
there were quite a few but this one came out early
https://arxiv.org/html/2510.15301v1
there are a lot looking into it but I can't link them all
>>107537517Maybe you are a newfag, but there is a specific anime general thread for that. It will be better received there than here.
>me waiting for OneTrainer to implement ZiT
why is he samefagging again
>>107537508
It's not anime though
>>107537518
Let me skim through.
But judging by the 17 Oct 2025 release date I would say it is too new even if correct and worthwhile. It would probably take a few more months until it gets used in any finished model.
>>107537166How does that prove that trAni is not an unhinged faggot and should fuck off forever?
>>107537545it's interesting they tested on sdxl. we might see it get life support in some new ill-like model which I would be fine with. unet is still a great arch that still needs to be explored
>>107537557ani works at contributing and sharing his work with others. you shit your diaper and screech in the thread every day. I wonder which anon people want around?
>>107537583you forgot to mention ran, sonic and ben10 "anon"
>>107537583trvth nvke
>>107537353it's an interesting experiment, for higher concepts i think you need a ton of steps, i think refining something the model already knows is very fast. i couldn't get a new concept in there with the default settings, i had to really change the LR, etc
>>107537583ani jacks off to shota and waits for people to implement basic features in his wrapper more like
NO you don't understand, everyone at alibaba's internal team and the reviewers on their public repo missed the TYPO, no way it was done on purpose, can't be, I'm right and EVERYONE ELSE IS WRONG
>>107537626link to the shota collection?
>>107537626 >>107537583 >>107537557Did you know Ani from AniStudio (Ani from Anime) is a /adt/ regular? You should post and discuss his stuff there instead, it's more relevant to that thread!
>>107537640Kek
>>107537626ani sounds pretty based ngl
How could flux blunder so hard? China nr 1.
>>107537667sorry but animanon is in the OP so it's relevant to the thread
>>107537583basedi wish schizo just stopped harassing anons who actually try to contribute. ani is the good guy here
When will I be able to create 3D characters and 3D clothing/objects from generated images in a simple pipeline, then manage them in a DAZ3D-like editor and use these characters, clothing, backgrounds, and poses to auto create control nets + prompts for Z Image finetune?I mean, can't we replace the writing somehow and make the generation process more playful?
>>107537632are you 12? you seem to have a lot of free time
>>107537694bad idea since the topology is bad for anything that isn't a static object
>>107537694
2 more years. 3D models are the final frontier of this tech to be honest, there are far more complexities going on there than with 2d.
which is why you never hear artists crying and pissing and shidding themselves about the 3d modeler jobs, no one cares about them.
(remember the industry hated them first)
>>107537685Sorry but his UI is focused on Anime, and there is a specific Anime general for that, you have to talk about him there.
>>107537640Comfy's ego is so high he genuinely believes that yeah
>>107537678Baby making sex
bet-a57 is alight
>>107537708
I think you interpreted my thoughts differently than I intended.
Auto-remesh it so that it's a few hundred or a few thousand polygons instead of 1 million and you're done. I don't see how topology would play a role here in any case.
The 3D model is more for visual feedback, like what you currently have in the prompt, and gives you a control net. You are also welcome to separate the character and background.
>>107537771thank you for convincing me that the dpm samplers are overhyped and not alight
>>107537771
I am a dpmpp_2m beta guy.
Though that causes issues for zit sometimes.
So I am using euler ancestral ddim uniform for now.
No idea what I will use when base releases because I hate ancestral samplers.
>>107537794>No idea what I will use when base releases
Way closer than I expected lol
>>107537777the poly count isn't the problem, it's how the topology works with skinning+rigging. generated 3d models are already retopo'd but it's just evenly spaced quads which will just look like clipping garbage when it's used in a rig. some edge cases might be static objects like a hair ornament or a belt buckle but for the character model itself it's terrible other than for 3d printing or maybe a 3d statue in a scene
>>107537794res2m seems to be the go to sampler, with beta/beta57 scheduler.
>>107537803Can you share the whole prompt for the options thingy?
>>107537725>3D models are the final frontierworld models are because it removes the need for 3d altogether
Convenient censorship when you didn't ask for it.
>>107537815Surehttps://files.catbox.moe/nvl4h1.txt
>>107537794
>switched to dpmpp 2m without changing the prompt
>it removed "sde" from the text
bros?...
>all this 1girlsMistress /ldg/, can this worthless anon coom?
>>107537803
Does it run local models or is it hardwired to HF repos? Fuck the cloud nodes.
>>107537834
Can you try i2i upscale to see if the beta scheduler is still blurring it?
>>107537843DON'T CUM!
>>107537853
>Does it run local models is it hardwired to HF repos?
it only runs on local models, you put your gguf in a folder and you're good to go
https://github.com/BigStationW/ComfyUI-Prompt-Manager
>>107537870
Lol
Also the excessive grain you see in the image is the problem I was referring to.
No idea why that happens with zit, it works fine on many other models.
https://emma-umm.github.io/emma/>We did it SAAR we beat Bagel!Really? In front of my end of 2025?
>>107537870>No idea why that happens with zit, it works fine on many other models.since they only trained on real data, there was probably a lot of compressed jpg images in there
i have a text encoder in this format
qwen_2.5_vl_7b_fp8_scaled.safetensors
it's a 9gb file
when i try to find abliterated versions of qwen 3 vl 2b or whatever, i cannot find a single file, instead there is the whole folder with configuration files and whatnot
what is the difference, and how do i make use of the version that has multiple files in a folder in comfy ui? right now i load the qwen_2.5_vl_7b_fp8_scaled.safetensors using the load clip node
i want to use it with qwen image edit
You are using the correct text encoder?Why do you want abliterated or 2b. The latter won't work at all, the former will give worse results, since the model wasn't trained on it.
>>107537976Cute!
>>10753799299% chance a coomer said it'll make better booba
>>107537992Meant to tag >>107537911
If ZIT can figure out that she has to close her eye because the scar is going through her eye, not some random part of her face, it would be great.
>>107537992>>107538007yes better bob and vageneso its not like ill get better results with say qwen 3 4b vl ect instead of qwen_2.5_vl_7b_fp8_scaled.safetensors? im new to local ai
base more like a never-ending maze
>>107538034try prompting it better. like, boomer prompt. >>107537355
>>107537257>>107537423>>107537448>>107537525>>107537632>>107537787agartha needzs your powa frien
>>107537803Is this simply zit? It understands regional prompting like that?
>>107538037No it doesn't give better bob and vagene sir.The text encoder isn't censored, the diffusion model can't draw bob and vagene, because it simply wasn't trained on bob and vagene.Qwen image (edit) isn't really a coom model. There is no coom edit model yet. There are some loras for flux kontext and qwen image edit that you might use (most got jannied so you need to use civarchive) but they don't work well. Might get something usable through seed lottery though.There are some API models that can do okay bob, but that's outside the scope of this general.
so it seems, from my testing, if you can train your character lora with some nudes, absolutely do it. it 100% fixes the lack of titty training in turbo provided you gave it enough steps to work with.down to the color of the nipples trained with that character, even. very nice.oh and pubes.
>>107538001tyytyy
>>107538078it's using a visual LLM to rewrite your prompt and describe your characters from the image input, then with that prompt you put that on ZiT >>107537868
20 minute loras aren't great, but they sure as hell aren't bad, no wonder civit is full of low effort crap
>>107538078 Pretty much any semi-decent text encoder (t5, qwen) understands regional instructions. (Though even they will occasionally blend stuff.) CLIP from the SDXL days and before couldn't, because it is simply too retarded not to blend concepts from different regions together. Flux, chroma, qwen image, etc. can all do this, nothing special to zit.
>>107538111 20 minutes on which GPU? 5090?
>>107538133a 3090?
>>107538108 >>107538122 I'm doing llm for zit already, but was unaware of the regional thing, cool. I'm gonna have fun with that node, thanks.
>>107538139no fucking way you can train a lora in 20min even on a 3090... capppppp
>>107538139 I am curious what shortcuts you used to get it to converge into something halfway usable that fast on a 3090. Mind sharing your training settings?
>>107538167 20min is pretty low but still XL is really fucking slow, more modern models learn way faster
>>107538182wait thats an SDXL lora?
>>107538139bruh WTF my output of the IRL I trained on was so dogwater compared to your lora and gens.
>>107533983use a black image and lower denoise a bit
>>107538167 this is literally the whole point of the experiment, to see how shit my training can be
>>107538174
- ai-toolkit
- z-img default settings with training adapter v2
- rank 64
- 18 images cropped square (various resolutions)
- no captions
- trained at 512
- 750 steps
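For a sense of scale, those settings work out to roughly 42 passes over the dataset. A quick sketch of the arithmetic, assuming batch size 1 and no gradient accumulation (neither is stated in the post; `passes_over_dataset` is just an illustrative helper, not part of ai-toolkit):

```python
def passes_over_dataset(steps: int, num_images: int, batch_size: int = 1) -> float:
    """Approximate number of times each image is seen during training."""
    return steps * batch_size / num_images


# 750 steps and 18 images are from the post; batch_size=1 is an assumption.
print(round(passes_over_dataset(steps=750, num_images=18), 1))
```

A 3k-step run on the same 18 images would be about four times as many passes, which is one way to compare the two step counts mentioned in the thread.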
>>107538188no I meant in comparison. XL is really slow when compared to newer models when it comes to training
>>107537803question, can this be done without hooking up a thinking LLM to my regional prompting workflow? what node do i use instead of that prompt generator?
>>107538092i was able to denude quite a few celebrities with 2509 thoughalthough nothing amazing for coomerbrains, works for me (for now)but i was just wondering what were the differences, theres a billion models out there
>>107538205Interesting. I thought it needed 3k steps or so.
>>107538205
>356W
Just an idea if you want to experiment, but I remember reading some studies that said you could undervolt your card by like 60% and get very minimal impact on inference speed.
>>107538255
>theres a billion models out there
Irrelevant for you. Diffusion models only work with the text encoders they were trained on.
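If you want to try something like this without touching voltage curves, a power cap is the easiest knob to script. It is not the same thing as an undervolt, and reading "by like 60%" as "cap at 60% of the quoted 356 W" is an assumption on my part. A minimal sketch that only builds the `nvidia-smi -pl` command (it does not run it; `power_limit_command` is a made-up helper name):

```python
def power_limit_command(stock_watts: float, fraction: float, gpu_index: int = 0) -> list[str]:
    """Build an nvidia-smi command that caps board power at a fraction of stock_watts."""
    target = round(stock_watts * fraction)
    # -i selects the GPU, -pl sets the power limit in watts.
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", str(target)]


# 356 W is the figure quoted above; 0.6 is one reading of "like 60%".
print(" ".join(power_limit_command(356, 0.6)))
```

Setting the limit needs admin rights, and the driver rejects values outside the range the card supports, so check the allowed range first and benchmark your own it/s before and after.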
>>107538255qwen will give generic parts but you can always feed the image back into a diffusion model for inpainting after.SAM3 can detect tits easily and for a vagina you can say "mouth".
>>107538299>and for a vagina you can say "mouth".sounds about right.
>>107538205interesting. Im trying to make a body type lora, of this ferraira woman, but even at 3k steps, nada. fuckall results.
>>107538299lmao patchworking tits and vagene in
Is there a node that uses Qwen 3 4b to refine the prompt before passing it to z-image? Seeing as you have to load Qwen 3 as the text encoder it seems like the slowdown wouldn't be too bad. I think it would work well for creating cohesive scenes with wildcards.
>>107538197It's either on or off for me. Left is .9 denoise, right is .91.
>>107538343lmao
>>107538343LMAO, just released that the model generated indian men pissing on the background, I didn't prompt that, based Xi, I kneel
>>107538359>>107538343Needs more indians around her. Totally takes me out of the immersion with so few trying to take pictures and demand the bobs and vagene for the pay cards, saar.
>>107537808yeah res2m seems to work pretty damn well in zit .. i usually use uniform_pc for the scheduler with it
>>107537803 this is very very cool
>>107538259 >>107538307 really depends how many images you are using for your dataset and the quality of them, along with your captions. if you're using default settings it shouldn't be absolutely nothing.
>>107537870
>excessive grain
if i understand things right, shift is essentially a control for when zit shifts from low noise to high noise. the "grain" is high noise, try lowering shift. I haven't extensively tested shift values against samplers and schedules but it definitely has an impact on shitty skin texture graininess.
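To see what the shift knob does to a schedule, here is a small sketch. It assumes zit uses the time shift formula commonly used for flow-matching models (SD3-style), which is my assumption and not something confirmed in this thread; `shift_sigma` is an illustrative name:

```python
def shift_sigma(sigma: float, shift: float) -> float:
    """Remap a noise level in [0, 1]; shift > 1 pushes it toward the high-noise end."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


# Five evenly spaced noise levels, remapped with a few shift values.
for shift in (1.0, 3.0, 6.0):
    print(shift, [round(shift_sigma(i / 4, shift), 3) for i in range(5)])
```

Under that assumption, shift 1 leaves the schedule untouched, and larger values spend more of the steps at high noise levels, so lowering shift gives the sampler more steps at the low-noise end where fine texture gets resolved.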
>be trani>see MIT licenced work with over 40 contributors>"wait, comfy became a winner when he made an ui!">lightbulb.png>vibecode a wrapper that barely works and even misses trivial features>"now is my time to shine thehe~">closes shota folder>injects the last dose of hrt juice>slap a commercial licence on top of the MIT licenced stuff >"it's basically as if i build everything myself, i'm such a genius">"they will never make fun of me again thehehe~">release it and spam all threads for months...>no one cares
>>107538421maybe this trani person will stop doing whatever it is you don't like if you don't bring them up out of nowhere
>>107538415 I'll tell u exactly how many and what kind and captions and shit. 40 images, head cropped out. insta size mostly. ie 1350. captioned with body description.
>>107538205
>- rank 64
Isn't that too strong for Zit? You already run loras at sub 0.50 because of the distortions when having 2+
Please halp. Nvme drive loads clip SLOWER than my ssd drive. It takes like 5+ minutes to load wan where on ssd its almost instant. I have the same setup on my regular ssd crucial drive and nvme m.2 drive (which is supposed to be faster), same speed boosts and models. Only difference is, there's more nodes on my nvme drive.
gday fellas
>>107538441 don't caption the body, caption everything except the body. you're basically telling the trainer that it has to learn everything else. ie. if i want to train a body i want to caption the setting, the clothes, the jewelry so it learns only the body and not that stuff. try again with no captions.
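If you already have tag-style captions and want to go the "caption everything except the body" route, something like this could strip the body tags out. It is a rough sketch assuming comma-separated captions; the term list and the example caption are made up for illustration:

```python
def strip_target_tags(caption: str, target_terms: set[str]) -> str:
    """Drop comma-separated tags that mention the concept the lora should learn."""
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    kept = [t for t in tags if not any(term in t.lower() for term in target_terms)]
    return ", ".join(kept)


# Hypothetical term list and caption, not from anyone's actual dataset.
BODY_TERMS = {"body", "hips", "waist", "thighs"}
caption = "woman, wide hips, narrow waist, red dress, gold necklace, beach at sunset"
print(strip_target_tags(caption, BODY_TERMS))
```

Substring matching is crude (it would also drop a tag like "bodysuit"), so eyeball the results before training.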
>>107537803Welp.
>>107538435>maybe this trani person will stop doing whatever it is you don't likeif you mean existing sign me up champ!
>>107538508you have to put the mmproj next to the gguf you tardburglar
>>107538497Hm ok, ill try once more with no captions at all.
Cute wannabe butthurt schizo
>>107538517Pretend I'm retarded.
>>107536612 Well, he is correct. Not surprising since this Z-Image control net is not from the guys who make Z-Image.
>>107538529Now check the console. It must start the same as the model itself and end with "mmproj"
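A quick way to check that rule from outside the UI, taking it literally: the mmproj file sits in the same folder, its name starts with the model's name and ends with "mmproj" before the extension. That reading and the file names below are assumptions; the node's README is the authority on the exact naming:

```python
import tempfile
from pathlib import Path


def find_mmproj(model_path: Path) -> list[Path]:
    """List .gguf files next to the model that start with its name and end in 'mmproj'."""
    stem = model_path.stem
    return sorted(
        p
        for p in model_path.parent.glob("*.gguf")
        if p != model_path and p.stem.startswith(stem) and p.stem.endswith("mmproj")
    )


# Demo on a throwaway folder with made-up file names.
with tempfile.TemporaryDirectory() as d:
    folder = Path(d)
    for name in ("example-vl-4b.gguf", "example-vl-4b-mmproj.gguf", "other-mmproj.gguf"):
        (folder / name).touch()
    print([p.name for p in find_mmproj(folder / "example-vl-4b.gguf")])
```

An empty list means nothing matched under this reading of the rule, so rename the mmproj file or move it next to the model and check the console again.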
>>107538490
>>107538477maybe your nvme is kill
fresh bread>>107538552>>107538552>>107538552
>>107538541damn that's great, gj anon
>>107536897Also they say it's a community model over and over, the Flux 2 shills are desperately trying to pretend this won't be released.What's even more hilarious is that the chinks will release this great undistilled model before BFL can get their shitty small distilled Flux Klein model out.BFL is dead, they peaked at Flux dev which was ok for art styles but 100% slopped humans and super censored. Thanks for playing.
>>107538541 kino gen, prompt?
>>107536908 A Base model that is primarily made to be further trained on should have as little aesthetic and caption bias as possible. It should have strong foundational knowledge of practically every concept, and then people can finetune the model to be extremely good at particular concepts, anime, NSFW, art styles, etc.
>>107538508 you have to read this: https://github.com/BigStationW/ComfyUI-Prompt-Manager?tab=readme-ov-file#image-inputs
>>107538107catbox/prompt for this style please?
>>107538312>patchworking tits and vagene inIt's actually a good technique when you want a specific boob shape. it's really hard to prompt for a slim body with massive tits or a fat body with tiny tits.And for some reason boob tags massively affect the way a face will look.