Did They Just Ban Him? Edition

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107364548

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe
https://github.com/ostris/ai-toolkit

>Z
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
https://comfyanonymous.github.io/ComfyUI_examples/z_image/

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
Z-Image is unsafe and violates Chinese law. The developers need to be more careful for future releases. Perhaps consider making the more powerful models API-only so prompts can be adequately filtered
>>107366162
>>107366162bghira is that you faggot?
>>107366162Perhaps you should polish my balls.
>>107366162this but unironically
>>107366172holy SHIT it's NIGGER bird anon! so glad to have you back.you tell that fucking nigger!
Luckily ComfyOrg offers a solution through API Nodes. Access the full safe power of Z-Image 2 through ComfyCloud's API.
Chinese girls want to fuck NIGGERS they're RICE BUNNIES they want BIG BLACK COCKthis post was sponsored by Chroma™
also to whoever was saying z doesnt understand breast sizes, it does, and it understands types of breast shapes as well. you just need to know some good terms. look up a cheat sheet or something kek. it just works.
I applaud chinamen for obliterating censor trannies
>gen nsfw 1girl w/ chroma>compose w/ flux 2 referencesI thought flux 2 is useless but it's actually quite good for this
what resolutions does chroma support without throwing anatomy demons?currently using 832x1216
>>107366230>without throwing anatomy demons?if you gen at 256x256 you probably can't tell. does that count?
Z image has fixed all my issues I had with WAN not making jungle women ethnic and dirty enough and I am very happy>>107366194i would love to know how to get breasts larger than this. it feels like this is the maximum size for Z
>no prompts>gets a woman lying on graslmao they sure didn't want to miss that one!
>>107366230
It's hard to avoid the anatomy demons. I could be wrong but I follow the same rules as SDXL: multiples of 128 in each dimension, so like 1024x1024, then 896x1152, then 768x1280, and so on. I think generally for best accuracy you want to keep it close to 1 megapixel.
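The rule of thumb above (sides in multiples of 128, total area near 1 megapixel) can be enumerated rather than memorized. This is only a sketch of that heuristic: the function name, the 10% area tolerance, and the 2048 side cap are illustrative assumptions, not values from any model's documentation.

```python
def near_megapixel_sizes(step=128, target_pixels=1024 * 1024,
                         tolerance=0.10, max_side=2048):
    """Return (width, height) pairs whose sides are multiples of `step`
    and whose area is within `tolerance` of `target_pixels`.

    The tolerance and max_side defaults are assumptions for illustration.
    """
    sizes = []
    for width in range(step, max_side + 1, step):
        for height in range(step, max_side + 1, step):
            area = width * height
            if abs(area - target_pixels) <= tolerance * target_pixels:
                sizes.append((width, height))
    return sizes


if __name__ == "__main__":
    for width, height in near_megapixel_sizes():
        print(f"{width}x{height}")
```

With `step=128` the list includes 1024x1024, 896x1152 and 768x1280. It does not include 832x1216, because 832 is not a multiple of 128; that size only shows up if you relax the step to 64.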
>>107366237yeah yeah, the spooky bad hands and feet of chroma>>107366238>Just don't go above 2048alright, i was asking because i got elongated torsos and legs at 1600x1600
What are the best settings to train zimage loras? Or is it similar to training Illustrious or Noob?
>>107366261I deleted the post because I thought you asked about Z. I didn't expect anyone to be using chrome in this day and age.
>>107366265>What are the best settings to train zimage loras?Wait until Base.
is 80 seconds per image normal for chroma? im using a 5090.
im getting extra toes very often on foot gens with Z turbo, both from above and from the soles>>107365456>catbox me boss. you have activated my jungle fever. and fevor.ah shit im sorry i stepped away, the prompt was>A National Geographic photograph from the 1970s, three extremely voluptuous very dark-skinned Brazilian indigenous women with primitive features, dirt-covered bodies now wet and glistening, long matted black hair plastered to skin, gold hoops, extremely plump lips, grass bikinis clinging to massive breasts and huge buttocks, thick thighs, standing under makeshift outdoor shower spray, left woman has mouth wide open with tongue extended and neck craned upward catching water droplets, center woman rubbing water over her dirty skin with hands on her enormous breasts, right woman squatting low with thick thighs spread, eyes closed with expression of intense pleasure, hands behind head exposing armpits and underarms, water cascading over her huge buttocks, candid moment, 1970s documentary film quality
>>107366255>I follow the same rules as sdxlthat makes sense, i'll just use the sdxl table then. thanks.>>107366268>I deleted the post because I thought you asked about Z. I didn't expect anyone to be using chrome in this day and age.chroma's good, i think its the best. i use noob for anime though.
>>107366281thanks king, your fucking gens are jewcyand nah i dont think the tits CAN get bigger, its just turbo's limited training.
>>107366275
Seems a lil slow, depends on how many steps you're using and what workflow. I think I get about 30 seconds with the default workflow, which is 26 steps of euler on a blackwell 6000 pro (which should be about the same speed as a 5090). Maybe make sure you're not running out of VRAM?
I usually use 40-50 steps with uni_pc or res_multistep, which usually takes more like a minute.
>>107366331im doing 30 steps with heun/sgm uniform at 896x1152 with upscaling. i forgot to mention that. im not getting oom errors. ill try the samplers you mentioned.
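A rough way to compare these timings is to count model evaluations instead of steps. The sketch below takes the "about 30 seconds for 26 euler steps" figure from the reply above as its only measurement, and assumes the usual Heun formulation that costs roughly two model evaluations per step. Both the per-evaluation cost and the evaluation counts are assumptions, and upscaling time is not modeled at all.

```python
# Model evaluations per sampler step. Euler needs one; Heun is a
# second-order method and is assumed here to need about two.
EVALS_PER_STEP = {"euler": 1, "heun": 2}

# Derived from the quoted "about 30 seconds for 26 euler steps".
SECONDS_PER_EVAL = 30 / 26


def estimate_seconds(steps, sampler, seconds_per_eval=SECONDS_PER_EVAL):
    """Estimate sampling time as steps * evals-per-step * cost-per-eval."""
    return steps * EVALS_PER_STEP[sampler] * seconds_per_eval


if __name__ == "__main__":
    for steps, sampler in [(26, "euler"), (50, "euler"), (30, "heun")]:
        print(f"{steps} steps of {sampler}: ~{estimate_seconds(steps, sampler):.0f}s")
```

Under these assumptions, 30 steps of heun costs about as much as 60 euler steps, so most of an 80 second gen could be the sampler choice plus the upscale pass rather than a hardware problem.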
Z is very good with cars
>>107366390where's the 1girl in this image, what is this heresy?
this one came out alright if you pretend its a small rock or really big grain of sand>>107366288>chroma's good, i think its the bestno one cares about my opinion and I missed the model war discussion last thread but personally I never found a desire to even download chroma (even when SPARK came out) because the increase in gen time and setting all that up for realism just wasn't worth it to me compared to using WAN, which I could use for videos to at any timeand now that Z image turbo is out, and from last thread it seems that loras train well on the turbo which imply that the base model will be even better I can say with absolutely zero sour grapes that I do not regret skipping Chroma entirelybut again my opinion doesn't matter. i mostly just wrote this out because I'm waiting for my series of gens where the researchers take a couple of tribewomen into society and funny lewd hijinks ensue. thanks for reading this far kek
>>107366402sir please put a 1girl in the image to adhere to /ldg/ guidelines
>>107366308>your fucking gens are jewcyi have more types of brown women to gen first before sexualizing the jews but we'll get there>and nah i dont think the tits CAN get bigger, its just turbo's limited training.yeah I think so too. but I'm sure a macromastia lora will exist by the end of the year>>107366415from what i remember from my cars phase when i was younger, the hot asian 1girl is supposed to be in the passenger seat of the lamborghini
>>107366407>thanks for reading this far keki read the whole thing, you need to lay off the natural language prompting and just said "I just didn't think Chroma was worth it but I think Z-Image is worth it".Me likey Chroma but the gen times are rough. Still prefer it to Z-Image for now.
>>107366432so you liked hot asian women when you were younger and now you like fat aboriginal women as an old man?
>>107366390toyota gt-r supra r34
I liked hot asian women when I was younger and now I like them with cocks too.
>>107366390indeed
>>107366398Of course anon, here's the 1girl
Z handles reflections noticably worse than WAN. one of the advantages of being a video model with a temporal understanding and training data of videos with mirrors I guess. or maybe it's also because of turbo, we'll see on sunday>>107366445>you need to lay off the natural language prompting and just saidwell if you read the whole thing you'd know why I was blogposting lel>the gen times are rough.the gen times are the only reason I'm addicted to Z image right now. It's actually faster than my ancient SDXL workflow since that required a hiresfix while this just does it in a single sampler pass>>107366449I can appreciate pretty much every form of female before menopause except indians, honestlyThe chinese/asian beauty standard is not for me though. I like westernized looking asians more. too bad a lot of them have tattoos nowmaybe that's why I like tribal women, either no tattoos or the tattoos are curvy and sexy not retarded shit like dogpaws and butterflies that are dead (99% of butterfly tattoos have the wings open too much like they're in a biology textbook aka dead, once you learn this you can't unsee it)
>>107366461same, huge veiny cocks with sagging balls or you're a bitch.
What is the best method for NL captioning SFW datasets? TagGUI is outdated shit that still recommends Florence 2. Any good SaaS API that can handle 100 or so images without paying?
>>107366486
yeah, the gen times of zim are extremely nice but it doesnt have the knowledge that chroma has so ill stick with chroma until i see what zim base can do.
I hate z fags so muchthey are basically the former chroma fagsI hate that underdog thinkpad mentality
>>107366470>no huge flaccid dong peeking from under her skirttch, z-image SUCKS
>>107366551I honestly wonder if the model has been overfit on its RLHF data, it does cars WAY too well
>>107366557Didn't they only do that for Turbo and not Base? I don't remember.
>>107366551lmaoooooo
>>107366580do one with e621
>z fagscmon anon, zigger was right there>>107366557>it does cars WAY too wellif it's trained on portraits maybe its trained on girls posing with cars? I don't think it does cars *way* too well btw, just really good
>>107366594my zigger gens always come out blurry or with noise. i know adjusting the shift addresses some of it but they dont come out this clear. did you edit the workflow?
China, the master of AI coom
>the word "MissAV" with "Miss" being in black and "AV" being in pink>>107366586I doubt it knows any logos more obscure than something like PornHub.
>>107366609
>did you edit the workflow?
I didn't change anything other than using TAEF1 instead of the normal VAE, which changes the image less than 0.1%, so that shouldn't be the difference. I'm bypassing the shift node like the original workflow was though
>>107366627i must be messing something up, prob a prompt skill issue.
>>107366625>MissAVthat one is good, but the goat is ThisAV desu
>>107366641the image you referenced still has the typical z compression artifacting
>>107366505joycaption or torigate
i hope zbase is better at pov hands reaching outmaybe i'll be able to reach the endgame of full-on gum inspection videos with squelchy audio by 2028
>>107366653
I find turning off the bypass for aura flow, setting it to 7 and increasing steps to 12 cleans up most of the noise at the expense of being slightly slower.
Currently messing around with switching schedulers halfway, which is giving interesting results.
grab her by the pussy>>107366681>I find turning off the bypass for aura flow, setting it to 7 and increasing steps to 12 cleans up most of the noise at the expense of being slightly slower.show us a comparison?
why is z image so good at controlling the age of a girl?
hands reaching out seems to work better in landscape than portrait, which makes sense if you think about the portrait dataset used to distill and the where most pov hands image involving grabbing ass come from (landscape porn videos)anyways thats enough brown women for now>>107366697>why is z image so good at controlling the age of a girl?same reason WAN is so good at controlling the age of a girl. They got Gemini to guess the age and include it in part of the caption I'm assuming, or maybe the AI really is that smart. Either way I don't think they did something very new for Z image captioning compared to WAN
Look at him go!https://civitai.com/models/2174416/technically-color-z?modelVersionId=2448632
It keeps doing "/ldg." instead of "/ldg/"
>>107366747Is that a supra
>>107366751its a sigma
>>107366725here ya go tribal girl enjoyer grampshttps://www.reddit.com/r/StableDiffusion/comments/1p9f6it/humans_of_zimage_races_cultures_and_geographical/ill have to test this shit out tomorrow morning for sure
>>107366751That one was supposed to be a Chrysler but I'm just realizing I spelled it incorrectly.
>>107366775JK I spelled it correctly nvrmind.
How do I gen wan 2.2 videos longer than 81 frames?
Can I simply increase the number of frames or are there some hoops I have to jump through? I sometimes see vids on Civit that are way longer than 5s, so there must be a way that I'm missing.
Slightly related, how would I go about upscaling the vids during generation?
>>107366194what it really doesnt know is areola size but i wont hold that against it kek
>>107366798
For simple prompts that involve a stationary camera and a character doing repetitive motions, yes, you can just increase the frames. For anything more complex like the camera moving, a character appearing from offscreen or any drastic changes in pose, the video will likely loop, but you can usually get away with 101-113 frames without looping. Other than that you can use painter long video nodes, which will generate multiple videos using different prompts and combine them for you.
>>107366798
Set frames higher. It's 81 for 5 seconds and 113 for 7 seconds. Wan is designed for 5 seconds anyways, so any longer and the video just loops. A good workflow can upscale, grab the last frame and extend the scene.
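The two figures quoted above (81 frames for 5 seconds, 113 for 7) both fit "16 frames per second plus one starting frame". The helper below is a sketch built on that inference only; the 16 fps default is derived from those two numbers, not taken from Wan's documentation, and the function names are made up for illustration.

```python
def frames_for_seconds(seconds, fps=16):
    """Frame count for a clip of `seconds`, assuming fps frames per second
    plus one initial frame (inferred from 81 -> 5s and 113 -> 7s)."""
    return int(round(seconds * fps)) + 1


def seconds_for_frames(frames, fps=16):
    """Inverse of frames_for_seconds under the same assumption."""
    return (frames - 1) / fps


if __name__ == "__main__":
    for seconds in (5, 6, 7):
        print(f"{seconds}s -> {frames_for_seconds(seconds)} frames")
    print(f"101 frames -> {seconds_for_frames(101)}s")
```

By the same arithmetic, the 101 to 113 frame range mentioned in the other reply works out to roughly 6.25 to 7 seconds.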
what can i do with these? i have not genned any images in like 2 years. can i make videos?
>>107366864train an actual finetune
>>107366864You can play Minecraft with raytracing
>>107366864Give them to me.
>>107366851
>>107366892What I find interesting is that it makes the people in the background look really diverse and different, but as soon as it comes to doing portraits they have sameface.
Z Image is fucking incredible holy shit
>>107366931funny you say that
Yeah I'm thinking this Z image model is based.Also what the fuck this thing runs quickly on my 3080 10gb it this actual witchcraft or some shit?
>>107366971>>107367033lol'd
did /ldg/ confirm that prompts in chinese work better than english?
>>107367084Gacha.
man, lumina models (z-image) are fucking horrible on turing gpus. no bf16, sage attention doesn't work properly (it doesn't work with fp32). i'm sad
>>107367084Only one image comparison but yes.
>>107366102Chroma was only possible by breaking Flux's distillation basically. Plus Lodestone discarded Flux's useless parameters. It's a modified Flux Schnell. Allegedly the tune itself cost like $150k, and was only possible due to community funding and sponsors like Pony.
>>107367084It's not 100% but something to try if your prompt isn't working.
>>107367132>Plus Lodestone discarded Flux's useless parametersnow he wants to expand z-image into a 32b MoE
>>107367159can he do something useful?
>>107367019>>107366931Kek, you're right.
>>107367159Visionary
>>107367159>expand z-image into a 32b MoESource? I mean, I remember this frankenstein experiment https://huggingface.co/blog/segmoeIt did have cool results
I have no fucking clue what I am doing. Is it possible you guys could train a retard on this matter? I would prompt slave away for anyone's help for like a month straight. I have comfyui installed and that's about it. I read through the rentry stuff but it's all just going right over my head.Here is my computer.AMD Ryzen 9 9950X3DASUS GeForce RTX 5090 AstralMSI MPG X870E EDGE TI WIFISAMSUNG SSD 9100 PRO 4TBDOMINATOR® TITANIUM RGB 96GB (2x48GB) DDR5 DRAM 6000MT/s CL30
>>107366864You can make 4 720p videos at a time at full precision full steps with WAN 2.2 which should take about 15 minutesI would recommend vibecoding a webui for maximum comfiness assuming you have access to this compute permanently/for a while>>107367084>did /ldg/ confirm that prompts in chinese work better than english?Actually I thought we confirmed that the Chinese prompt enhancer concerts the prompt to English. It's just that qwen is trained really well for both languages (an anon even tested russian and it worked too) so the outputted embeddings are similar enough, but may be different enough to do a concept in one language that it can't in another. So actually, you should keep trying your prompt in different but well-represented languages (maybe even Spanish too) until you either get what you want or rewrite the prompt and try again. >>107366767>here ya go tribal girl enjoyer grampsThanks. Amazonian looks very colonizable but I think qwen basically converged on that facial structure with Brazilian + tribal but I'll definitely try with it too next time as well. Some interesting islanders too. Filipinos are too light skinned and that's probably the portraits they trained on since Asians are more obsessed with being white than white people are (unfortunately)>>107367108>man, lumina models (z-image) is fucking horrible on turing gpus. no bf16, sage attention doesn't work properly (it doesn't work with fp32). i'm sadNo one is using sageattention for Z right now, and Q8_0 will get you 99.97% of the way to bf16 so stop complaining and be happy that solutions for your vramletness even exist
>>107367198What do you want to do, for starters. Gen images? Videos?
>>107367198>I have no fucking clue what I am doing.I am unable to sleep and I am unable to get so I am willing to help you out for about 2 hours First of all you need to tell me what you want to make with AI, since I have no clue what you are doing eitherSecond of all you need to tell me your OS and your level of experience with Python and the Command Prompt
>>107367205Truth be told, right now I just want to make a test video of a vaporeon running around a field of sunflowers.
>>107367159It's crazy that no ones beat scale-at-all-costs. Could you imagine? Zimg being... hah... something small? Like 6B? Absurdity. No way.
>>107366230Which version of it are you using? If you want significantly less limb errors could always switch to Chroma Flash HD.
>>107367211The true aim is to make hyper niche personalized porn for the wife and I to laugh at and get off too. Maybe just make shit to shitpost with too. Windows 11. And I used to be really familiar with both python and command prompt but I also haven't fucked around with either in like 2 years.
>>107367198how are you struggling to understand the basic concepts when you have enough money to buy that pc?just download the smoothmix t2v workflow for smoothbrains from civit and prompt your vaporeon porn
>>107367227im using spark, flash looks like shit whenever i use it and keeps giving me anime randomly, which pisses me off.
>>107367244Well, spark is a meme tune. Fix your Flash settings then.
>>107367239I just fuck and exercise. I gave my brain away long ago. Thank you for the insight though. I greatly appreciate it.
>>107367218
>>107367234
Okay cool, I won't bother with technical stuff since I'm assuming you're intelligent enough to ask Claude for help with python fuckeru
The other anon might help you get directly to video, but I'd want you to learn the basics of ComfyUI first: actually generating a basic image and starting to use the program
So how about you start with Z image, which is the cool new text to image model we're all using right now
Read this
https://github.com/comfyanonymous/ComfyUI_examples/tree/master/z_image
It's very simple. Ask for help if you get stuck. I hope to see a vaporeon image from you in under an hour, assuming you have fast internet
If you can figure this out, honestly I'm sure you can figure out the video example, which is here
https://github.com/comfyanonymous/ComfyUI_examples/tree/master/wan
This is wan 2.1, which is the older text to video model. But it's much easier to set up. This will get you to second-to-state-of-the-art vaporeons
And then you can fuck around with wan 2.2, which requires a second sampler and two models in the same workflow. Which is twice as complicated
https://github.com/comfyanonymous/ComfyUI_examples/tree/master/wan22
This is my advice. Ask for help if you get stuck at any step. Please set up basic image generation first, because there's more to talk about with video models to speed them up so you're not spending 30 minutes per video on a 5090. Good luck anon
>>107367254spark works fine for me, how is it a meme and would you share your flash settings with me?
Why it no werk
>>107367256your brain should still be working if all you do is fuck and exercise. also, download the i2v one too in case you also want to animate a vaporeon image instead of making one with the text2video one.you should be able to figure out the rest, the workflow includes notes as well.
>>107367269Yeah I was also gonna say if you want vaporeon Pokemon stuff you probably want to generate an image with an image model first and then give that to wan 2.2 with image to video. Wan text to video isn't the best or most consistent for anime
lodestone is based for asking for the pretuned versions and aesthetic tuned versions to be released separately but shut the fuck up about pixel space and let them cook
the details on radiance are so fucked everyone knows it grainy as hell
respectfully
>>107367294does pixel space even work or is it more bitnet fake hypeshit that doesn't amount to anything? it reminds me of that "unlimited detail!!!!" thing that claimed games would have unlimited atomically-deep graphics with no performance cost.
>>107367320who knowslode got further than ostris but still not all the way play with the idea for the edit model but leave base alone or at least release base before fucking around with it further
>>107367320
It works but it's inferior to latent space and I don't really understand the reason to pursue it. latent space is more like analog film while pixels are discrete like digital video
The value proposition is that you don't have to encode and decode the latent, but that shit is basically instant with state of the art tiny AEs.
>>107367320It works in that it produces an image. Look at any radiance gen and see how much noise is still there.
>>107367384This is why I am on finasterideDo people actually use computers like this? Even women sit with their laptops on their... lap
just turn z image into a pixel-space x0pred 32b MoE model trained by merging 3 separate 512x512, 768x768, and 1024x1024 bakes without testing any of these ideas individually. all that was missing from chroma was compute, $200k simply wasn't enough!
>>107367370the value isnt speed but rather getting rid of a lossy operation its trying to solve the "we asked AI to NOT change this image over and over again and it turned the cute asian 1girl into a fat brown 1girl" problem >>107367378>Look at any radiance gen and see how much noise is still there.this
>>107366078pretty good but the face looks off, something about the features being out of place
>>107367395
>getting rid of a lossy operation
Huh? How is decoding from latent space lossy? You can't do anything with a latent in terms of consuming the image except decode it. Calling that "lossy" would be like calling the observer effect of using a voltmeter "lossy": yes, you're technically changing the voltage by measuring it, but there's no other way to actually measure the voltage (pixel space and latent space wouldn't deterministically converge to the same thing, so it's not an apples to apples comparison)
Or should I actually start the fast.ai course on stable diffusion because I am missing some fundamental knowledge here
>>107367262It's clearly working but why isn't it put into text/string?
>>107367422Anon no one fucking cares, export your workflow as a json and open your ComfyUI folder in Visual Studio Code and use your free Claude Sonnet 4.5 and give it your JSON and ask it this question
>>107367429That's not very local of you, anon.
>>107367419
you can observe the effect yourself by encoding and decoding the same image multiple times
its more pronounced on earlier models. you might remember the purple splotches that would arise occasionally with 1.5, which was partially solved with external VAEs but not entirely
latent space itself is compressed pixel space
sure modern VAEs have gotten REALLY good at minimizing this but its still there and with enough en/decodes it shows up
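To put a number on "latent space itself is compressed pixel space": the sketch below compares how many RGB values one latent cell has to represent against how many channels it has to store them in. It assumes an 8x spatial downsample per side for every VAE, which is a common configuration but an assumption here, and it says nothing about how well any particular VAE uses that budget. The channel counts are the ones discussed in the thread (4, 16, 32, 64).

```python
def latent_compression_ratio(latent_channels, downsample=8, pixel_channels=3):
    """Ratio of pixel values to latent values for one latent cell.

    Each latent cell covers a downsample x downsample patch of pixels with
    `pixel_channels` values each, and stores `latent_channels` numbers.
    The 8x downsample default is an assumption for illustration.
    """
    pixel_values_per_cell = pixel_channels * downsample * downsample
    return pixel_values_per_cell / latent_channels


if __name__ == "__main__":
    for channels in (4, 16, 32, 64):
        ratio = latent_compression_ratio(channels)
        print(f"{channels:>2} latent channels: {ratio:g}x fewer values than pixels")
```

Under that assumption a 4 channel latent holds 48x fewer values than the pixels it stands for, a 16 channel latent 12x fewer, and a 64 channel latent only 3x fewer, which is consistent with the intuition that round-trip loss matters less as channel count goes up.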
>>107367446Okay connect copilot or your favourite Viscose extension to your local OpenAI compatible or KoboldAI endpoint and use that then.
but again the important part is no one i2i's with modern (non edit) models enough for it to show up so the benefits do not outweigh the cons of its current status which is grainy gensmaybe we will figure it out with zedit but theres simply no reason to do it with base right now
>>107367453Turns out the combination of the two different prompts actually work, but it doesn't show up in the prompt window, the prompt that is in the prompt node gets frozen and displays forever until unhooked.Thanks open source.
>>107367428Prompt?
>>107367452
>sure modern VAEs have gotten REALLY good at minimizing this but its still there and with enough en/decodes it shows up
But we're not doing multiple encodes and decodes so this is basically irrelevant, right? Oh, I guess for hiresfix or other multiple pass workflows this matters. Or maybe it doesn't, since you can just use Upscale Latent
Maybe I will make a test for this (and make another slop rentry for it) because I am interested in how well 16 vs 32 vs 64 channel VAEs compare. I have a feeling this is a negligible concern at 64 channels and probably even 32 though for the average workflow
>>107367484>Thanks open source.You're only meeting open source halfway with an attitude like this, the cool part is that if you found the bug, you can make a pull request to the GitHub repo of that node and get it fixed
>>107367498And in open source I have the freedom to not report it and let others suffer.
>>107367507based spiteful free will enjoyer. fuck everyone if you got yours already
>>107367490zimg - Sabattier effect Photography in the style of Slim Aarons. Anaglyph 3D, calligram, symmetrical composition, ascii art, surreal, abstract, geometric, futuristic neo-brutalism. japanese graphic design aesthetics. the designer's republic
loads of loras up
>>107367507>>107367523I mean sure I think doing it for free is cuck coded too I just don't want to deal with maintaining private forks or upstreams so I'd rather it get merged into master >>107367524Schizo prompting with z base will be so kino bwos
>>107367491
if youre interested, an anon was posting and comparing VAEs around the time flux first released, as that was when you could clearly see how much better a 16ch is than a 4ch for details
>But we're not doing multiple encodes and decodes so this is basically irrelevant right?
for regular imggen you are correct but not for edit models
no one edits a specific gen more than X times because it slowly degrades, which again is due to the VAE
>because I am interested in how well 16 vs 32 vs 64 channel VAEs compare
iirc 16 vs 32 is not worth it due to the additional training time vs the amount of detail retained but i could be misremembering
the big leap was 4 to 16 and either no ones trained on anything higher or if they did it didnt take off
the main issue is it doesnt need to be in base right now
okay thats the last time ill say it i promise
>>107366147Did ani say when anistudio's getting z-image support? i'm not gonna use anything related to comfy. hopefully ani has some free time this week.
whats the best local model for horror, actual scary monsters and atmosphere?
>>107367568seconding that, not really interested in supporting comfy bullshit
>>107367560
>iirc 16 vs 32 is not worth it due to the additional training time vs the amount of detail retained but i could be misremembering
Only reason I brought up 32 and 64 is because Infinity-2B uses a 32 channel VAE by default and it had like 4 different VAE pths in the repo and I was remembering that. I forgot about edit models tbdesu because I'm also the guy who firmly believes (because of my experience fucking with Infinity) that autoregressive/edit models are not suited for local use cases anyways, at least for the time being, due to the high vram requirements and inability to quantize the activations in a trivial manner, so it literally slipped my mind
samefagging this, i only use SD1.5 btw
>>107367568>>107367587he gave up on anistudio after he was caught schizoposting with his trip onhttps://desuarchive.org/g/search/username/Ani/tripcode/0gRLTHrqN2/type/posts/
>>107367523This kind of mindset is actually part of why the west has fallen.
What the fuck is wrong with people who go through the effort of training a fucking lora and upload it to CivitAI only to scrub their goddamn samples of metadata?Fuck you if you do that.
>>107367603ani didn't use secure trip and the schizo was able to mine it. he deliberately falseflagged as ani
>>107367615??? and how would you know that ``anon''
>>107367616it's obvious what happened schizo. your obsession with ani is showing
>>107367608Isn't that civitais fault for censoring shit?
>>107367615>>107367622Are those posts also the schizo falseflagging as Ani, or are you just naturally so stupid you don't see everyone can see you refer to yourself in the third person?
>>107367526Yeah, interesting that you can get such good quality like the technicolor lora out of training the distilled Turbo version.Makes you wonder how good training the base model will be. Then again base model trained loras might not work as well on Turbo, and if you can train on Turbo directly...
>>107367554It feels like diet Qwen with merged in ultrareal lora.
>>107367636not ani. i just like his work and think he deserves more support from us
>>107367646Very organic, anon.
>>107367595>due to the high vram requirements and inability to quantitze the activations in a trivial mannerperhaps zedit will fare better than previous models in this respect considering the nature of their paper >>107367641>merged in ultrareal lora.turbo is aesthetically tuned for maximum 1girl so youre not entirely wrong thankfully it sounds like base isnt and if we're lucky itll be the pretuned version
>>107367646Don’t know who this Ani guy is, still a newfag (been here like a week). What kinda projects does that dude even do?
>>107367568this weekend
>>107367637If the base model lives up to the hype of general knowledge and trainability while still being relatively easy to prompt for then I'm getting a gut feeling that this might be the photoreal model accessible enough to get Anti-AI people to start taking notice I know people said this for flux, and then said this for wan, but this is really really photorealistic really fastOn the other hand, every 2 months normies seethe at Grok for being PedoHitler and then move on to the next current thing>>107367652>considering the nature of their paperWhat is there to consider? Is it not autoregressive?
>>107367658His biggest project is samefagging the thread, pretending anonymous posters are supporting his broken UI when it has only one user (himself) (it fails to build on anyone else's machine), and ritual posting "comfyanon should be shot".
>>107367658 ani was at the forefront of vidgen way before it became popular, hence the name. he was also moving models forward while working at a japanese studio (ikhor labs, he was employed there), and helping comfy with his interface before comfy betrayed him. right now he's developing a c++ ui for imagegen: https://github.com/FizzleDorf/AniStudio much faster than python crap
I am SICK AND TIRED of chinamen spamming low effort ComfyUI workflows with runninghub referrals. Civitai seriously needs to ban that shit.
>ritualpostingYou just made me realize this isn't a blessed thread :(
>>107367258
They're the same they've always been https://files.catbox.moe/rgk5w1.png
>spark works fine for me, how is it a meme
One of the images you posted (ldg sign next to a car) looked very broken compared to what Flash can do.
>>107367603what was the schizopost?
>>107367684oh wow, thank you for the lore dump, anonymous!
>>107367684Based if true. Should check out the interface, I honestly kinda hate python.>>107367673You sound unhinged anon
>>107367684wow. comfy is an asshole
>>107367696For a week or two he was double baking trying to force AniStudio in the /ldg/ OP because /adt/ removed it from theirs, and spamming the legitimate threads trying to redirect people to his own bake. He forgot to remove his trip for one of these posts.
>>107367670There's an NB2 "it's so over" marketing push on X and 4chan rn, that will draw the fire first.
>>107367704he has a version much further than the current release but the schizo has a poopy melty across a week that ruined the threads so he's probably burnt out. if he comes back and adds zit support I'd support him
>>107367670
>What is there to consider?
one could infer that the philosophy behind the entire suite of models is small but SOTA. not saying i comprehend everything in the paper but you get that from even the abstract and introduction
>>107367704>You sound unhinged anonThere's a reason /adt/ removed AniStudio from their OP and no one on /ldg/ uses bakes made with AniStudio in the OP. You're talking to the thread's biggest schizo after debo.
>>107367734debo became pretty much inactive more than a year ago. ani is a much, much bigger schizo and honestly deserves a rentry much more than debo does
>>107367716the only thing I see in your link is him trying to get anons to fill the oldest thread because you baked two more threads with and without anistudio>>107367734>There's a reason /adt/ removed AniStudio from their OPhaha, yeah. some schizo would split the bakes every thread and schizo spam both all day for a week. we had the exact same thing happen here because mods just can't range ban this unhinged nigger
>>107367673>>107367684just checked his project. weird choosing ImGui over anything else lmao. well good luck. since Forge Neo already supports Z-Image i don’t need other UIs anyway kek
>>107367608The loras on Tensor were way worse. Like 90% of the loras had hidden prompts on the sample images and many didn't even tell you the trigger word.
so is comfyorg bad? I get the red hat ick when I look at what they do publicly but betraying friends too? I feel bad for using comfy for so long
>>107367724>he has a version much further than the current releasedo you have a source or did you make it up 'anon'
>>107367757comfy is just a bad person
So, today is Saturday 4:00 PM in China. Where is the model? They seriously pushed into Sunday?
>>107367718Kino
>>107367718>There's an NB2 "it's so over" marketing push on X and 4chan rn, that will draw the fire first.I'm so happy I have absolutely no idea what you're talking about. I love not knowing things going on in the world. I didn't even know who Charlie Kirk was until he got necked and obviously I was still online>>107367726Based on the team's behaviour towards embracing noob dataset (even if they will censor) and kicking out bad actors from their discord, this is probably the most aligned team ever I wouldn't bet money on it, but if I was forced to I'd probably bet on an Ali team being the first to get a local version of video+audio/Sora2 at home And then I'd lose the bet because tencent would pump out some shitty garbage that's still SOTA a month before the Ali team releases their model kek
>>107367760he shared a screenshot in one of the threads but I'm not going through the schizo bakes to find it. it looks a lot easier on the eyes
>>107366248She's a chinese woman, though. Can it do an actual woman?
>>107367773I'm glad you're burned out, bro. Wish you were burned out on schizoposting as well.
>>107366248Hmm, so this is what it was RLHF tuned on I guess.
>>107367684I might try this out, C++ ui is what I wanted for a while
>>107367782>Wish you were burned out on schizoposting as well.>t. schizoanyways, I hope ani comes back. comfy needs a replacement because comfyorg is filled with grift chinks and indians now
>a c++ ui for imagegenc++ and python :]
>>107367775I completely understand that this woman is a 10.in Asian beauty standards but since I don't sexually subscribe to those standards she looks like a 6 trying to be an 8 which is making me uncomfortable Also smoking isn't hot but I think that's either a generational difference or my mouth fetish and desire for a tasty clean mouth to lick
>>107367790Yeah, yeah. You can call me the schizo. But no one ITT used your bakes with AniStudio in them. Strange.
>>107367791if we're being pedantic pytorch is c++ and python too, kinda
>>107367791I think python is only there for conan or running scripts in the app. there isn't anything else suggesting it relies on it at all.
kinda wanna try training lora with actually good nudes + detailer using ostris adapter, but i’m way too fucking lazy to prep the dataset, haeughh
>>107367811vibe code it by talking into your phone microphone
>>107367524>Sabattier effect Photographyneat concept
>>107367788go for it anon, it's worth it
>>107367839the last time I tried I couldn't get it to work from the release but I could build my own binaries. if it still loads the model every gen it's probably not worth using atm.
>>107367775Is that a sigma
>>107367822
>Sabattier effect Photography
I don't think it understands the concept sadly :(
>>107367904You could probably gen some kino playing cards with Z
Kino gen hour
>What the ace of spades playing card would look like if designed by Claude Monet, a winter landscape painted in his signature impressionistic style.
if Zimg gets porn knowledge from the noob dataset, can it use it to gen 3dpd?
>>107367953>>107368044nice
>>107368022You'll be able to gen more than that yes
>>107367695>One of the images you postedthat wasn't me, but thanks for the workflow.
OK, Z is good. But There's a problem. Well, two. Most of the time 1girls come out looking asian even if you specify "white", "caucasian", "western", even "jewish".Also, their crotches are always smooth, and trying to unsmooth them summons horrors.How do we fix this?
>>107368058>How do we fix this?By waiting for the non distilled pretuned base.
>>107368058Add in a random western female name. I haven't tried those ethnicity but Slavic gives me white women.
>>107368058caucasian should work but try something more specific, plenty to choose from
>>107368058>How do we fix this?Loras?
>>107368069 "Jessica" gave me a Chinese woman (my IP range can't post images). I guess I'll try Penelope Featherstonehaugh next.
>>107368058If you mean detailed vag, you can't. Only way is with a lora. Just wait for the base model drop. Tho even when the base comes out they’ll probably gut the hardcore nsfw same as the current turbo ver.
>>107368051i couldnt really get the design/art to cover the entire card, it always seemed to stay mostly in that center section prolly issue with turbo
>>107368096Meet Marilyn Minter
>>107368058That's the cost of westernoids letting chinks win. When you use an empty prompt, most of the gens are faces of asian women, so it's pretty baked in there.
>>107367779What like a Filipino?
>>107368096
Now we're talking https://litter.catbox.moe/4kyesr6oemwqycm3.png
>>107368107
It's not surprising. As a European, I'm pretty used to United States culture being shoved down my throat and I guess it's China's turn now to throat-fuck me. At least I've been learning the language so I can enjoy it undubbed.
>>107367932whats the prompt to gen the roastie?
the distilled lora training doesn't work very well. it seems to have a very hard time capturing the likeness of faces compared to flux. it works well enough up close, but as soon as you zoom out for a full body shot the face becomes almost unrecognizable. the lora was trained for 4000 steps, 3 curated 1024 pixel buckets, around 60 images with plenty of full body photographs, rank 4 and the rest default settings. the same exact dataset made an excellent lora for chroma, so the images themselves should be fine. i hope this is a byproduct of the sketchy distilled training method or because of some new caption rules, and not a sign of model/6b limitations etc.
>>107368107>That's the cost of westernoids letting chinks win.Well, if it was the west winning, the default would be a black personI'm fine with china taking the throne
>>107368192>if it was the US winningFixed that for you.
>>107368170I like the juxtaposition of stylistic elements.
This model is OK https://litter.catbox.moe/wtbt531izi44pq51.png
I can't get z to do facesitting, it's not as easy as 'sitting on head, face between buttocks', it just won't do itI also seem to have great difficulty getting one person to grab the crotch of another person, I can get their hand close with "reaching down" but not straight up grabbing
>>107368211 After experimenting a bit it's clear to me it's a typical distilled model. Very pretty looking images, but shitty once you try to go outside of the few concepts they baked into it. I predict that the base model is not going to be better than flux dev.
>>107368173 It's certainly not perfect but way better than I expected for a distilled model, bodes well for the full one. Wish I could post my gens of it here but it will just enrage the jannies. Settings I used in screenshot (just stolen from someone on r*ddit), only 14 images in the dataset cause I just had some laying around and wanted to see what happened. Took approx 1h 40m to train. As for using it, a strength of 0.85/0.9 seems to work well, honestly results are better than my attempts to make loras for SDXL etc already.
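For anyone wondering what the strength number is doing: a rough sketch, assuming the usual LoRA formulation where a loader adds a scaled low-rank delta to each targeted weight. The function names, the tiny matrices and the alpha value are made up for illustration, and real loaders may differ in the details.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists-of-rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, lora_down, lora_up, alpha, strength):
    """Return weight + strength * (alpha / rank) * (lora_up @ lora_down).

    weight:    d_out x d_in
    lora_down: rank  x d_in   (often called "A")
    lora_up:   d_out x rank   (often called "B")
    strength:  the user-facing slider, e.g. 0.85 or 0.9
    """
    rank = len(lora_down)
    scale = strength * alpha / rank
    delta = matmul(lora_up, lora_down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Hypothetical 2x2 layer with a rank-1 LoRA, applied at strength 0.9.
base = [[1.0, 0.0], [0.0, 1.0]]
down = [[1.0, 2.0]]
up = [[1.0], [3.0]]
patched = apply_lora(base, down, up, alpha=1.0, strength=0.9)
print(patched)
```

Under this assumption, strength 0 gives back the base weights and strength 1 applies the full trained delta, so 0.85/0.9 just dials the LoRA back slightly.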
>>107368231Sorry for being a noob, but what tool is this?
>>107368263ai-toolkit
>>107368231 rank 32 is crazy for 14 images, how big are those safetensor files? and if it's an emma watson lora, you're training it on a person the model already knows, which in theory should make it way easier for it to learn. i guess i'll try sigmoid next, but i'm unsure if i'm going to be bothered making any more distilled loras.
>>107368265Thanks
>>107368231
>Rank 32
Please don't do this
>>107368269Yeah for sure it helps it has a rough idea of who she is already.
>>107368229 Z-Image Turbo is already much better than Flux dev. And as always, the true potential of a model is shown when you start training loras/finetunes, this is the reason SDXL is still so widely used
>>107368269also
Could Z Turbo survive being raped by 13M images though?
>>107368231 Diffusion-Pipe dev is also implementing Z-Image Turbo training, seems everyone is too eager to wait for base, can't blame them. https://github.com/tdrussell/diffusion-pipe/issues/462
god damn it. i thought "lying on her back" generating upside-down women was a FLUX specific problem, but the same exact thing happens with z-image for some reason.
>>107368275To be fair we don't know where training Z-Image Turbo lands in terms of optimal Lora rank, but yes it's hard to imagine having to go over 16 when training a single person.
>>107368292 I'd say turbo is better than schnell, but it suffers from the same narrowness. Maybe it's better than dev, I have not tested it enough. What I'm saying is they're hiding the flaws behind the "portrait-optimized" (1girl generator) distill, and that the base model is going to strike everyone as more mediocre. We'll see. Do you have some good prompts for comparing Flux and Z turbo?
Unless z-image can do good anime/hentai it's worthless
>>107368333It can do anime, but it looks chinese as fuck
anon, your warning about that tranny censor guy was spot on, we're so fucking back. also, that chroma lode dude is in there too, discussing about his Chroma.
>>107368331I suspect the gap between the quality ceiling and floor will be widened which will cause many to think it's mediocre at least initially.
>>107368346nobody knows what youre talking about
>>107368347My dude, that's a cope if I've ever seen one. A preemptive cope no less. The truth is we just don't know yet.
everyone's talking about the base model but if it's released, is going to be a 32b monster or the same size like flux de-distilled? It'll just be a lot slower, requiring more steps, right?
>>107368353prev thread
>>107368359Yes, I am coping hard. Coping is all we have left.
I'm not sure it trains better on subjects it has knowledge of. Just threw these two datasets from my wan loras at ZImage. Top no lora, bottom loras. It obviously has a decent understanding already of Rapunzel but not so much Helen Parr, but it seems to have learned each equally well. Also the prompt was "gang sign" not "peace sign"
does the z stand for zhang
>>107368374Looks like it picked up the characters spot onDid it learn faster compared to Wan ?
How do you correctly set a local path for the model in Ostris? I keep getting the error about the config.json not being found even though I replicated the whole huggingface directory structure for z-image turbo.
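Not an answer on what ai-toolkit itself expects, but a generic sanity check that might help narrow it down: a small sketch that lists where `config.json` and `model_index.json` files sit under a local folder, so the result can be compared against the file listing on the Hugging Face repo page. The function name and the default file names are my own choices, and the path in the usage comment is a placeholder.

```python
from pathlib import Path


def find_files(model_dir, names=("config.json", "model_index.json")):
    """Return paths (relative to model_dir) of every file whose name is in `names`."""
    root = Path(model_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"not a directory: {root}")
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.name in names
    )


# Usage (placeholder path, point it at your own download):
#   for rel in find_files("/path/to/local/Z-Image-Turbo"):
#       print(rel)
```

If the list comes back empty or is missing a subfolder that the repo has, the download or the copied directory structure is the likely culprit rather than the trainer config.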
>>107368403it stands for zlop
>>107368325
>>107368362
>but if it's released
They already stated it will be released
>is going to be a 32b monster or the same size like flux de-distilled?
It's not going to be anywhere near 32b since the whole reason for this model is to prove that you can get great quality without huge scale, at most it will be 12b, but I doubt it will even be that
>>107368406
>Did it learn faster compared to Wan?
Seems so, did 2000 steps in around half the time of wan (1/4 the time if you account for both low and high noise models). It's hard to make a direct comparison though, I train wan with musubi-tuner
>>107367932Put the Asian waifu head on the roastie body and you have perfection.
>>107368464>They already stated it will be releasedI'm 90% sure they will but this would not be the first time a Chinese company has said they will release a model after experiencing some success and then renege on that promise for unknown reasons.
>>107368484When did this happen, what company and what model was officially announced to be released and then wasn't ?
>>107368492
3D 2.5
Wan 2.5 was originally slated to be open source (if you believe the leadup)
There's also a few literal who video models that got to the "We are waiting for approval" stage and then ghosted.
It definitely happens.
>>107368500
>3D 2.5
What was this model and by whom?
>Wan 2.5 was originally slated to be open source
By twitter randos, nothing official at all
>>107368513 Hunyuan 3D 2.5. And that was confirmed on discord by the staff.
They're awake, active on HF. I trust the plan.
>>107368466Nice what resolution(s) did you train at ?
>get back from doing core day at gym
>lie in bed immobilized and all I want to do is gen
I can solve this with a remote tablet, can't I?
This is taking absolutely FOREVER. What have I done wrong?
>>107366189This
>>107368530
1024 with 0.0002 learning rate
But they're a little overfitted so I'm retraining with more steps at 0.0001 lr
>>107368551>wan textimage2video>huanyun>2.2 vaeI dont have a reaction image for this.
>>107368565hold on, i do:
>>107368565 >>107368574
There's nothing wrong with that
>2.2 vae
This is an actual thing for the 5B model, only 14b uses the 2.1 vae
>>107368582oh, i hadn't even looked at his workflow i just wanted an opportunity to call someone retarded since its been a whileHAHA RETARD >>107368565 his workflow's fine
Are you READY for another beautiful day of BLACKED RICE WHORES? (after I buy my groceries)
so how are you coping with the fact that you have to write your prompts in chinese to get descent results?
>>107368551If you need to cope with 5B then give up on video.
>>107368597youre obsessed
>>107368602just reorder the prompt
>>107368602>descent >not decentGood morning Zhaar
>>107368612no i'm trying to generate images of the 1995 space ship doom clone
>>107368602Erm, achksully *pushes up glasses* it's not necessary, It just helps certain prompts. Retard.Like this one was pure english, starbucks and all. Kek i love that this fucking model has copyright knowledge, so good.
>>107368618Just a crippled chroma
>>107368618
>"Uooh, your 9B crotch smells so nice.. unlike Z-hang's pathetic 6B cock.."
>>107368631she lost a tooth from drinking too much pumpkin spice lattes every season, don't judge.
>Workflow contains unsupported nodes (highlighted red)
>Remove these to run the workflow
>nothing highlighted because I applied the fixes for them
Fuck off, Comfy... they're working. I can't believe 'ol moneybag's open sores program can't tell when something in a workflow is actually functioning or not.
>>107368374
Since Z uses Flux's VAE, you should use one of the EQ VAEs out there when you train. The results are always much better with one than without.
>finetunes will fix it
>>107368652What text encoder does Z use?
>>107368328 >>107368275 >>107368269 Could rank 32 harm the lora, or is the problem just that it's overkill? Is rank 32 good if training a style instead of a single person?
Usecase for Z-image? I'm being dead serious. It's stuck in a middleground of good but not good enough.
>>107368652>The results are always much better with one than without.Your image looks like it's made of fabric
>>107368700deslopper refiner
>>107368687 from what i can understand, a high rank basically lets a lora learn many different things, but a high rank also becomes more diluted, meaning things like a person's likeness won't be as rigid. a rank of 1 will give you as perfect a likeness as possible, but it's going to be very limited in doing things your lora wasn't specifically trained on. so you should use a higher rank if it's a broader concept, and use a lower rank if it's a specific thing you want to generate.
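To put numbers on the rank talk: in the standard LoRA setup, rank is the inner dimension of two small matrices per adapted layer, so trainable parameters (and file size) grow linearly with it. A back-of-the-envelope sketch follows. The layer shapes and layer count are hypothetical and not Z-Image's real architecture, and 2 bytes per parameter assumes fp16/bf16 storage.

```python
def lora_param_count(d_in, d_out, rank):
    """Parameters in one LoRA pair: down (rank x d_in) plus up (d_out x rank)."""
    return rank * d_in + d_out * rank


def lora_size_mib(layer_shapes, rank, bytes_per_param=2):
    """Approximate size in MiB for a LoRA covering every (d_in, d_out) in layer_shapes."""
    total = sum(lora_param_count(d_in, d_out, rank) for d_in, d_out in layer_shapes)
    return total * bytes_per_param / (1024 ** 2)


# Hypothetical model: 100 adapted layers, each 4096 x 4096.
layers = [(4096, 4096)] * 100

for rank in (1, 4, 16, 32):
    params = sum(lora_param_count(i, o, rank) for i, o in layers)
    print(f"rank {rank:>2}: {params:>12,} params, ~{lora_size_mib(layers, rank):.2f} MiB")
```

Going from rank 4 to rank 32 is 8x the parameters for the same dataset, which is why a high rank on a dozen images tends to be seen as overkill.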
>>107368677...anon
>>107368723makes sense, thanks
Fresh thread>>107368734Fresh thread>>107368734Fresh thread>>107368734
>>107368724Anon what nigger, is it just the same as flux on vae and text encoder?
>>107368687
>Could rank 32 harm the lora
Theoretically a lower rank is more 'focused' and thus better when training a single concept (assuming it's not a lot of images, like thousands)
>if training a style instead of a single person?
Rank 16 should be more than enough for a style unless you train a huge amount of images of said style
>>107368777interesting
>>107368574