Ingrate Contrarian Dipshittery Edition

Discussion of Free and Open Source Diffusion Models

Prev: >>108027322

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Anima
https://huggingface.co/circlestone-labs/Anima

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
https://github.com/Haoming02/sd-webui-forge-classic/issues/671 He noticed
>>108028569
>>Maintain Thread Quality
>https://rentry.org/debo
>https://rentry.org/animanon
we know you are mentally ill but could you stop including your schizo nonsense in the OP?
Blessed thread of frenship
>oh, a new ldg thread, surely the schizo op listened and included my wrapper!
>ctrl-f ani
>THERE IT IS HAHAHAHA IM IN THE O-
>Anima
ACKKKKKKKKKKKKKKKKKKKKKKKKK
Interesting, large resolutions seem to work.
>>108028599>ACKKKKKKKKKKKKKKKKKKKKKKKKKthank god you are finally hanging yourself troon
>>108028599
>it's a comfy collab
>if i spam fud in the general comfy will go bankrupt
>if comfy goes bankrupt people will buy commercial licenses for my wrapper
>>108028627
what exactly is working? it looks terrible
>Z-image
What went wrong?
Schizophrenia
>score_9
Ew, proper sloppa tag.
>>108028635
Overall the characters aren't broken.
>e621 is required for a kino anime model
sad. ill wait a bit longer then.
>>108028108
0.6B bf16 is 1.2gb, which is roughly the size of clip_l and clip_g combined.
Maybe it's meant to be a true SDXL successor or some shit.
>>108028721
What? You can prompt anima just fine with danbooru tags or NLP.
>>108028721
e621 has never made anime models better
>>108028735
danbooru isn't enough. i loathe 90% of the posts on e621 but somehow it gives models sovl.
The model truly is impressive because it's more refined in certain aspects, but in my tests it's still behind Newbie (and I'm guessing NetaYume as well) at prompt following (though it's better with text and results are very aesthetic). Newbie still has a better understanding of raw artistic style control.
Here's the prompt I gave Anima (re-formatted from the Newbie XML version):
>masterpiece, best quality, score_9, year 2025, highres, safe, 1girl, 2b_(nier:automata), nier:automata, painterly, impressionism, brushstrokes, An artistic, monochrome black-and-white illustration of 2B from NieR:Automata sitting at a restaurant table. The style is a unique blend of detailed manga linework and painterly impressionism, featuring thick, visible brushstrokes and impasto textures. 2B has her signature short white hair and black headband, leaning one hand against her chin while her other hand gently pets a cat lounging on the table beside her. In the foreground, a wine bottle and a half-filled wine glass sit next to a plate of food. The background consists of blurred restaurant windows and shelves, rendered with soft, atmospheric strokes that contrast with the sharp, rhythmic hatching of the character's clothing and the cat's fur.
>>108028721
just put only score_9 and you will instantly get your ponyv6 sepia sovl
2026 and we still gotta do that "\" shit before a parenthesis if there's a parenthesis in the series or artist name... fuck...
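For anyone building prompts programmatically, escaping is a one-liner; a minimal sketch, assuming A1111/Forge-style syntax where bare parens mean attention weighting:
[code]
import re

def escape_parens(tag: str) -> str:
    # backslash-escape literal ( ) so the UI doesn't parse them as emphasis
    return re.sub(r"([()])", r"\\\1", tag)

print(escape_parens("2b_(nier:automata)"))  # -> 2b_\(nier:automata\)
[/code]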
>>108028735
yeah it handles either much like NetaYume, but it's mostly way more stable than NetaYume so far. And faster
>>108028748
yeah in my 10 minutes of testing i couldn't really get it to do painterly stuff at all sadly.
Soul.
>>108028748
That kinda stuff looks better on newbie because newbie can only do that kinda stuff. Almost all gens on it look smeary, which just kinda works out in this scenario
So uh... what megapixel size can I gen at without getting these gay ass "W (188, 125) should be divisible by spatial_patch_size 2" errors
>>108028731
Even then, the 1.7B model seems like a better sweetspot
>>108028772
writing this down for the retrain/re-license
>>108028680
Outdated mediocre model. In 2024-2025 it would be good, but in 2026 it's meh.
>>108028680
Chinese eyes too small to see BRC (Big Russell Cock) coming their way
Damn it, I have to go to the gym.
I need to set up a way to gen on my pc through my phone.
>>108028770
wat
>>108028772
Obviously a large te would have been better. 1.7B is also still vramlet friendly, can be run under 4GB vram fine.
But still, the 0.6B one is working surprisingly well for its size.
>>108028791
it puts in perspective how outdated 12B T5-XXL is
>>108028770
Anima? 0.5mp to 1mp, it's been mostly trained on 512 pixels so far
Any other model? 1mp to 2mp
>>108028794
what the fuck even are those errors he's getting though, like where / how is that a thing
>>108028770
125 is indeed not divisible by 2 anon
>tdrussell doesn't have a discord server
I don't think you guys understand just how much we have won here. There won't be any sabotaging by duplicitous furries.
Every time some promising model comes around, furries rock up just to poison the dataset or training. I've heard of some even lobbying for synthetic (slop) datasets, or for the addition of low-quality niche fetish content. I swear they do this intentionally too.
>>108028748
Actually that's cut off kek
https://files.catbox.moe/sl91it.png
Here's another one, Newbie understood this prompt a bit more since it needed to split it into panels
https://files.catbox.moe/7stv4h.png
>>108028805
i don't think that's a thing, Chroma is actually LESS similar to Fluffyrock than I expected, overall. Anyone who actually fell for the "muh Chroma anime" or "muh Chroma realism" meme was always retarded
>>108028770
What latent size are you using? I realized that going to 1532 or beyond causes the same error. If you want a good aspect ratio, decrease the lower number instead (e.g. 1280x720)
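The check is on the latent, so pixel dims have to be divisible by the VAE downscale factor times the patch size. A minimal sketch for snapping a resolution, assuming the usual 8x VAE and spatial_patch_size 2 (i.e. multiples of 16; check your model's actual factors):
[code]
def snap(px: int, vae_factor: int = 8, patch: int = 2) -> int:
    # round a pixel dimension to the nearest multiple of vae_factor * patch
    step = vae_factor * patch
    return max(step, round(px / step) * step)

print(snap(1504), snap(1000))  # 1504 992 -> latents 188 and 124, both divisible by 2
[/code]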
>>108028808
I'll wait for an Anima lora or any finetune, as the default art style feels too “AI” for my taste.
>>108028819
>muh fynetoonz
every time
bloatmodel fetishists getting BTFO hard lately
>>108028786
Look into Tailscale, and if you want to self host your own server Headscale is what you want.
>>108028826
acckkk
>omg it's small and fast!
dalits self-owning lately so much lol
>>108028833
what's a dalit
>>108028816
Chroma is different because the creator was too autistic and determined to be sabotaged by others.
It was also different because up until that point, all large scale finetunes were ruined by stupid decisions to cater to furries, or models were shadowdropped.
Anyone remember when noob was training? It had so much potential until some dickhead added a differently captioned non quality-filtered dataset full of furry shit from e621 and pushed for the text encoder to be finetuned too. This caused the model to freak out so hard that it couldn't do basic human anatomy without filling the negative prompt with every furry tag imaginable.
weird, my nano banana pro generates images at 4k in under 10 seconds. maybe something is wrong with your comfyui workflow?
>>108028584
based and Haomingd
Now that we have ramtorch and similar systems, local models should be ~32B parameters, there is literally no reason to use toy models. And yes I know some thirdie will think this post is insincere
>>108028738
never used noobvpred?
>>108028824
yes, waiting~
>>108028848
i'll ram my torch up your bum if you catch my drift
>another troonjak rentries thread
>another failed garbage troon thread
why are anons so fucking retarded? why do they keep falling for ran bullshit?
>upscaling with zit
Kino.
>>108028828
Thanks.
>>108028845
how many ((((((((komfybucks)))))))) does it cost tho
>>108028855
they don't really care that the board has shit skidmarks of unfilled /ldg/ threads. it's rude
>>108028819
Just use the @artist tag.
You can give generic style descriptions as NLP too. It works
>>108028826
Kek so true.
>my ten thousand dollar RTX 6000 pro purchase will give me superior gens over those filthy vraml-
AAAAAACCCCCCKKKKKK!
>>108028848
genuinely, why not just train big models and distill them properly? z-base is dogshit compared to z-turbo, but when z-turbo released everyone celebrated it as the greatest thing ever. so clearly distills give you faster speed and better quality, so why not just train something big and good like qwen 2512 and distill that
>distills
>better quality
>>108028860
anons really are gullible as fuck. they don't stand for what they believe in (if they even believe in anything). they want cozy breads but don't do anything to achieve them
anon shouldve asked tdrussell why he chose to include pony style score tags, i fucking hate them so much
>>108028863
cause you still have to load the model into memory. Qwen 2512 could be better also, it's not even on Qwen 3 for the TE
>>108028870
oh god i'm scooooooring8 uppppaahahaadsdsgsgf
>>108028862
>@artist tag
aighty, imma try em, thanks!
>>108028879
ok but did anyone who wasn't brown have an issue doing that with z-image?
>>108028837
This guy is a dalit, you may also hear him called a brahman sometimes but I assure you they are the same thing.
>>108028870
the censor tags are bloated too and he says they are required sometimes since the model will produce non-euclidean slop without them
>>108028855
>>108028860
>>108028868
Do you think anon will fall for your posts this time
>>108028886
no but qwen way chungser than Z
>>108028837
Another word for indian
>>108028896
i can gen pretty decently on a 4090 with fp8, 24gb is the minimum anyway and a qwen turbo would be insane. let's leave those poorfags in the dust together, just you and me
>>108028855Kill ani
>>108028808
The fridge POV results are much better than what I could get with Newbie on average though, in terms of prompt adherence and overall polish
https://files.catbox.moe/ucp7u9.png
I found this model has a way better representation of objects and backgrounds in certain prompts (though not all).
>>108028833
>>108028882
only jeets care to differentiate jeets
>>108028896
isn't this post against united states law?
>>108028903
where is the noob/ill comparison? nobody actually uses newbie or neta
>>108028908
That's why miscasteing makes them seethe so much :)
>>108028911
It is. Ran probably doesn't care because he's spamming threads from his proxies.
>>108028794
>>108028817
so im just not able to gen at a higher res than 1mp? not even upscale and a second pass?
>>108028915
how would you know what makes indians seethe? sus
>>108028911Probably Kill ani
>>108028922
damage control? big izzat loss!
i don't know what z-image has against reimu but my test booru finetune so far uniquely renders her this way lol
tbf 1:1 aspect is relatively underrepresented in the dataset so maybe that's the issue. i should try a tall aspect
z-image is godly though. training on a wide dataset of booru images, it produces images that look like real drawings. anima is too much of a sidegrade from SDXL
>>108028731
i spent a while last year attempting to distillation train Qwen3 0.6B on T5-xxl as an experiment. i have a lot of output chroma images with Qwen3 0.6B. i don't think it's capable of matching T5-xxl but it's absolutely capable of producing coherent semi-prompt-following images. I wish I'd tried Qwen3 4B because i'm pretty sure it would be able to work as an alternative... 1.7B maybe too, I tried it once and it was doing an okay job.
>>108028937 (Me)
Actually I'm going to post some of these bad Qwen 0.6B + Chroma gens
>>108028908
>>108028915
>>108028922
>>108028926
Call a jeet Pakistani sometime, they'll freak out.
>>108028937
>anima is too much of a sidegrade from SDXL
i think it'll come down to what happens first; tongyi releasing the turbo sauce / anon figuring it out OR anima being "finished". the comparatively slower speed of base seems to be a determinant for many. and "sdxl anime but with a 16ch vae" is what anons wanted for a long time.
What's the token limit of the 0.6B qwen? Still 8k or smaller?
>>108028808
Seems to work if you start the prompt with "it's a 2 panel manga"
this is the power of chroma
This model sometimes decides that there has to be a second person or another viewing angle in your prompt and really fights you for it lol. Need to experiment more for a consistent workaround.
>>108028937
>i spent a while last year attempting to distillation train Qwen3 0.6B on T5-xxl as an experiment.
Interesting. I know this is a long shot but would you mind sharing training code? I am curious about how that's done.
>i don't think it's capable of matching T5-xxl
So far I haven't run into anything it consistently can't do that t5 excelled at. There are certain "world knowledge" limitations with small models, but it's also so much newer and better "per weight", for lack of a better word.
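Not that anon, but the usual recipe is short enough to sketch: freeze the teacher, bolt a trainable projection onto the student so the hidden sizes match, and minimize MSE between the two embedding sequences over a prompt corpus. A minimal sketch assuming HF transformers; the model ids are stand-ins and the token alignment is deliberately naive:
[code]
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer, T5EncoderModel

teacher_tok = AutoTokenizer.from_pretrained("google/t5-v1_1-xxl")
teacher = T5EncoderModel.from_pretrained("google/t5-v1_1-xxl").eval().requires_grad_(False)
student_tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")
student = AutoModel.from_pretrained("Qwen/Qwen3-0.6B")
proj = nn.Linear(student.config.hidden_size, teacher.config.d_model)  # e.g. 1024 -> 4096

opt = torch.optim.AdamW(list(student.parameters()) + list(proj.parameters()), lr=1e-5)

def distill_step(prompts: list[str]) -> float:
    with torch.no_grad():
        t = teacher_tok(prompts, padding=True, truncation=True, return_tensors="pt")
        target = teacher(**t).last_hidden_state                # [B, T_t, 4096]
    s = student_tok(prompts, padding=True, truncation=True, return_tensors="pt")
    pred = proj(student(**s).last_hidden_state)                # [B, T_s, 4096]
    # naive alignment: truncate to the shorter sequence; real code has to deal
    # with the two tokenizers segmenting the same prompt differently
    n = min(pred.shape[1], target.shape[1])
    loss = nn.functional.mse_loss(pred[:, :n], target[:, :n])
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
[/code]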
>>108028632
literally why are you so obsessed? is it impossible for different people to be angry with "comfy" org? is it some retarded shill tactic to call anyone against "comfy" ani?
>>108028963
>This model sometimes decides that there has to be a second person or another viewing angle in your prompt and really fights you for it lol. Need to experiment more for a consistent workaround.
The model inherited a lot of sdxl-style nonsense.
>>108028964
Yes. That's the typical cumfart damage control. Before that it was "voldy shills".
>>108028903
It does quite well with giantess but it's one area where Newb still edges it out a bit
https://files.catbox.moe/gr45j8.png
Another comparison (more difficult so both are unpolished, but Newbie has the basic composition a bit more correct, just objects/streets/people less refined than Anima)
https://files.catbox.moe/mdk00i.jpg
https://files.catbox.moe/o765mf.png
These are both models with great potential once they're both fully trained. Room for improvement? Anima could use a bit more painterly style knowledge to let the DiT shine.
>>108028956
You are also forgetting the modern text encoder that can handle complex natural language, and rectified flow, that anima brings over sdxl.
>comfy is getting objectively more bloated, unstable and insecure with every new release
>more and more new saascuck shit added all the time, new api shit, new comfy coin bullshit
>You're definitely shilling some other interface if you're against all of that though!!
>>108028937
>z-image is godly though. training on a wide dataset of booru images, it produces images that look like real drawings
Unless there is a turbo I really don't care. Waiting that long for images is a big killer for me. No point in this stuff anymore at that point
>>108028960
Nice, but she should be interacting with the cat on the left panel directly, should be a bit more like this (first Newbie gen I got for the prompt)
>>108028987
but we certainly don't need more efficient uis. you're crazy to suggest that
>>108028956
Yeah but those of us finetuning z-image base are going to do step distillation one way or another. AFAIK the thing missing from our knowledge is the RL model they used. And i'll be honest, I think base is still better than any other local model and definitely any local model with a permissive license. But we'll find a way to match the RL model. I think the architecture itself is the important part. Which is why finetuning still produces real-looking images
Re: the image - this is an image you get when you start distillation training the Qwen3 0.6B text encoder for Chroma
>>108028993
relax. cumfart is all there is and will be forever
>>108028848
>ramtorch
Does it work on AMD or Intel cards?
>>108028993
Where did UI come into this? I am sure he will add support for anima on forge neo if that's your thing
>>108028855
What can we do?
>>108028998
It's just a memory management strategy, it shouldn't require any special hardware
>>108028998
why would you care about poothon garbage
Here's the Qwen 0.6B Chroma image that proves you can smash precise text recognition into a tiny model
>>108028999
all python uis will be memory hogs forever by definition
>>108028999
His bot broke and replied to the wrong post.
alright I trained f2k and zim on the same dataset of a person with very similar settings and while both produce solid results i think f2k takes the cake simply because you can use the lora on the distill without any visible degradation, while zit completely shits the bed with a zim character lora.
also f2k seems to properly learn details like moles while zim completely ignores them. i like that zim can use negative prompts but i guess you can also do that with f2k with some workaround
zim takes about a minute for 30 steps on my 5070ti while the f2k distill took 20 seconds for 8 steps (both res_2s)
>>108029014
>you're a bot if you don't like comfyui
Maybe you are a bot?
>>108029018
Looking at the image comparison you posted makes me agree with you wholeheartedly
>>108028998
>>108029006
No, it's not:
> A memory-efficient linear layer implementation that keeps parameters on CPU
> and transfers them to GPU on-demand using asynchronous CUDA streams.
>
> This approach interleave compute and data transfer, making it useful for:
> - Very large models that don't fit in GPU memory
> - Scenarios where GPU memory is limited but CPU memory is abundant
>>108029007
Because AI, especially training, is Python.
>>108028913
>where is the noob/ill comparison
I didn't make one because I don't test prompt following on noob/ill but it'll probably be utter humiliation because CLIP doesn't follow prompts as well as any of the newer text encoders.
>>108028950
>>108028962
>>108028994
>>108029009
Not gonna spam you guys anymore but anyway, discarding the whole 0.6B Qwen Chroma thing for a second, all the z-image nonbelievers are going to feel quite stupid soon. A booru model trained on this is going to be insane. It's literally going to be print-your-own-booru-image. Luddos will scream & cry trying to identify the image as synthetic
yeah this killed illust
>>108029045
You just described a memory management strategy. It's literally just layer offloading but in a way that doesn't slow down inference speed
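For anyone curious what that looks like mechanically, a toy sketch of the idea (not ramtorch's actual code): the weight lives in pinned CPU RAM, a side stream copies it to the GPU while earlier layers are still computing, and the compute stream syncs on the copy before the matmul:
[code]
import torch
import torch.nn as nn

class OffloadedLinear(nn.Module):
    def __init__(self, in_f: int, out_f: int):
        super().__init__()
        self.weight_cpu = torch.randn(out_f, in_f).pin_memory()  # pinned -> async H2D copies
        self.copy_stream = torch.cuda.Stream()
        self.weight_gpu = None

    def prefetch(self):
        # called while the *previous* layer is still computing on the main stream
        with torch.cuda.stream(self.copy_stream):
            self.weight_gpu = self.weight_cpu.to("cuda", non_blocking=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        torch.cuda.current_stream().wait_stream(self.copy_stream)  # copy must be done
        out = x @ self.weight_gpu.t()
        self.weight_gpu = None  # drop from VRAM; master copy stays on CPU
        return out
[/code]
A scheduler then calls layers[i+1].prefetch() right before running layers[i], so the transfer hides behind compute instead of stalling it.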
>>108029075
btw can i just say, thank fuck for musubi tuner and its dev. every other training project is shit
>>108028828
>not just setting up openvpn or wireguard
>implement PROPRIETARY solution instead
lol, bunch of literal retards
>>108029066
>Luddos will scream & cry trying to identify the image as synthetic
I don't get this kind of childish spite desu, why do people keep trying to shove AI art into the face of people who don't wanna see it, this only creates hostility, I don't pretend my gens are hand drawn, what do these people get out of it
>>108029018
>>108029030
i'm still experimenting with zit because it's really stubborn. made some improvements, so now it doesn't *completely* shit the bed but it's still pretty uncanny to me
>>108028986
>>108029018
klein has a turbo lora available so that you can control how much distilled is in the base gen. Why isn't there an equivalent for z-image?
How could i hate api nodes when they're just so damn good?
>>108029085
Tailscale is just Wireguard without having to open ports or fuck with config files, the client is open source and if you want to self host Headscale is an open source server (you will have to open ports for that or faff around to run it over a tor hidden service). Also the Wireguard data never goes through their servers, they just coordinate clients without opening ports/behind cgnat.
You clearly have no clue what you're talking about. https://tailscale.com/blog/how-tailscale-works go have a read, if you're literate and capable of silencing your schizophrenia demons long enough.
>>108029018
Yeah, that's Flux.2's VAE in action for those closeup details, plus I'm sure BFL did extra tuning to ensure it's not slopped and the colors are accurate this time around, unlike Z where colors feel washed.
>>108029149
you're still relying on a relay/coordination server outside of your control to do the discovery phase, and that's 100% closed source
tailscale is NOT local, fuck off shill
>>108029120
>gretafag
Oh you should've said that initially so I could more easily disregard your opinion
>>108029136
Just wait a few days until someone inevitably makes it, or create the diff lora on your own. It's not difficult.
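The diff lora recipe, sketched for a single weight matrix (extraction tooling like kohya's scripts does this per layer across the whole model): subtract base from distilled and low-rank the difference with an SVD:
[code]
import torch

def extract_diff_lora(w_base: torch.Tensor, w_turbo: torch.Tensor, rank: int = 32):
    # low-rank approximation of (turbo - base) for one 2D weight matrix
    delta = (w_turbo - w_base).float()
    u, s, vh = torch.linalg.svd(delta, full_matrices=False)
    lora_up = u[:, :rank] * s[:rank]   # [out, rank], columns scaled by singular values
    lora_down = vh[:rank, :]           # [rank, in]
    return lora_up, lora_down          # lora_up @ lora_down ~= delta

w_base = torch.randn(1024, 1024)
w_turbo = w_base + 0.01 * torch.randn(1024, 1024)
up, down = extract_diff_lora(w_base, w_turbo)
print((w_base + up @ down - w_turbo).abs().mean())  # leftover reconstruction error
[/code]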
>>108029173
eh, i just needed a character to experiment with and it's not like we need yet another sydney sweeney or scarlett johansson lora so whatever
Which model can cope better with shitty datasets? I tried some old datasets that were small and/or blurry on ZIT and the resulting loras were pretty impressive given what it had to work with. How do ZIM and the kleins fare in this regard?
>>108028850
They are just trolling/retarded, anyone with even a remote speck of understanding of this garbage knows that including e621 would colossally improve the model's capabilities, not only because e621 tagging is far superior, it's just more datapoints, even r34 is a good source in this regard, but when you say that the only thing a retard sees is "furry = bad"; that being said, including e621 would also greatly increase training costs, which is probably why they never include it.
>>108029161
I self host my Headscale server and mentioned Anon could do the same, it absolutely is local and 100% open source and enables me and potentially him to access local and open source diffusion models running on our own hardware while outside of our home networks.
>>108029205
I got banned for racism once for posting an image like that
>>108029214
My ZIM LoRa on ZIT needs 1.75 strength before I'd consider it usable and 2.0 performs better when the face gets smaller or has to change direction. The overall performance though is a lot better than my ZIT-trained one (fewer occurrences of body horror and other little things a LoRa can introduce).
Z-Image using Flux.1's VAE also means you can use one of the EQ-VAEs out there for even better quality (those helped out big time on Flux). I haven't trained one with EQ yet, but that's on my to-do list.
>>108029214
Some jannies also don't like Hitler gens
>>108029204
I would say that a small model might degrade under a massive influx of furry slop, when there is a very limited amount of weights to hold data and you are going to get contention at some point.
It's possible that noob for example failed to learn some characters and styles it otherwise could have learned due to furry data. Though again, most likely the superior tagging and additional data helped more than hurt it.
Furshit is fine as long as it is tagged out clearly.
BUT I don't mind models not touching it either in the age of NLP. Anima can survive without e621 tags.
>seaart pruned the loras of my waifu
sigh, is there anywhere to find loras of actresses/celebrities these days?
>>108029232
Trained by yourself on your own hard drive. It's the only way.
>>108029232
Probably on Chinese sites since cheeto hitler made deepfakes illegal in burger land.
>>108029249
>>108029250
that's unfortunate. thanks fellas
>>108029232
depends on the base model. check out /r/ realistic parody OP
i think there is a lot of zit
>>108029120
>zit would just produce sameface
Time to get zim up for some hot Greta gooning
>>108029197
Use an edit model to unblur the images first.
>>108029214
Whatever helps them feel like a woman, amirite?
>>108029232
huggingface but they get periodically jannied.
Is civarchive still around? Maybe there too.
But training your own is the best.
Most lora trainers are jeets who can't train anything even remotely passable.
>>108029197
>How do ZIM and the kleins fare in this regard?
absolutely unforgiving imo
i ((enhanced)) my datasets with klein 9b
something along the lines of "remove artifacts and make high quality"
just be careful it doesn't change too much
>>108029305
kek also that's a nice style anon
>>108029250
What do chinks use? Does modelscope have loras? Tensorart and Seaart are both just as dogshit as civit.
>>108029314
Something unpronounceable I'm sure.
>>108029298
Yeah I know but I'm just interested, it tells you something about a model's learning ability, and sometimes unblurring changes the image too much or introduces unwanted things etc
https://www.modelscope.cn/models?name=z-image&page=1&tabKey=other&tags=LoRA
Oh they have. But the filtering is fucking ass
>>108029075
That's from the ramtorch sources, and that's an implementation of the strategy using CUDA.
>>108029323
Catbox please
>>108029225
Danbooru is nowhere near enough data to saturate that model, and I doubt they have an anime dataset large enough to, it's just much much cheaper (because money is always the constraint with this stuff) since it's less stuff to learn.
>>108029343
still learning the model
https://files.catbox.moe/n5vvfd.png
>>108029204
Actual bullshit, there is no evidence e621 data made it good and way more evidence that it made it worse. Case in point, the vpred Illustrious model that was never released was way better than noob vpred, as was NAIV3, which is still the best XL anime model. Neither has a lick of e621 in it.
>>108029361
Thanks.
Interesting to see so much style stuff in the negatives. Also having TT in the negatives while prompting raven is certainly a choice.
And ughh, can't say I am a fan of trannies either but I don't think schizo prompts about trans stuff in the negatives are helpful.
Just add futanari if you are afraid of the model accidentally genning dick girls.
>>108028964
>is it impossible for different people to be angry with "comfy" org?
>>108029393
>the vpred Illustrious model that was never released was way better than noob vpred
Yes, because the only difference between the two was the dataset and not any hyperparameter, amirite?
>as was NAIV3 which is still the best XL anime model
Now that's a real argument, but again, how do we know NAIV3 wouldn't have been better if it included e621? Because I'm pretty sure the difference between it and the other models is not merely the dataset.
>>108029306
I assume 9b is better at preserving things like facial features or overall coherence than 4b, right? I upscaled things with 4b and it often introduced a shiny slopped look to skin, is 9b better here?
gm ai sisters
>>108029399
its an old trick from nai/illust/noob to rid characters of their canon style. im not editing any of these old noob prompts save for the primer
>I don't think schizo prompts about trans stuff in the negatives are helpful.
they are basically required for naked noob which is where these prompts are from
>>108029407
The Illustrious creator himself blamed the e621 dataset for why noob was worse.
For NAI, they did train an e621 model later on which was not that well liked, if I remember. There is no way they didn't try what you said, they must have seen the results weren't worth it, which is why they made it separate.
>>108029425
Yeah that makes sense.
I had a long schizo list of negatives for noob too. (Although I switched to a more concise one later on)
>>108029189
find some hotty
it's like the anons endlessly experimenting with epstein or trump, at least have the grace of playing with nice girls instead of this genuinely mind rotting stuff
>>108029430
makes sense, here's hoping cumfy's dataset is up to producing any resemblance of a quality model
>Multiple threads-long meltdown over an anime booru tag prompting model
Was it worth it?
>>108029430
>The Illustrious creator himself blamed the e621 dataset for why noob was worse.
The illust creator was seething at Noob too? I should've guessed fucking keeeekkkkkkkkkk
>>108029328
>nothing nsfw
I guess they block it all? Unless it's behind login
>>108029446
Don't they have their own civit clone? NSFW will probably be buried like on HF. Only for people in the know.
Anons playing with video models, is ltx2 finally getting rid of its awful sound quality and random unmoving photo gens?
And is nsfw finally working without looking ridiculously bad?
>>108029413
>is 9b better here?
honestly i felt it was pretty good, yeah. just try it yourself. sometimes it introduced wrinkles that were not there before, but you can always reroll
>>108029461
No and not really. It will be some time before it catches up to 2.2, and I expect wan 3 will be a thing by then.
Haven't seen this many watermark/text hallucinations at the bottom for a really long while. And text, watermark are in the negatives too. (Though it is a lot more coherent than when sdxl hallucinates them, you can really feel the qwen vae.)
>>108029461
Sorta kinda, it will honestly need that vae+model update that they have planned to really fix everything, but they released a new sampler and some settings you can fiddle around with to improve quality. Model has potential but consider this one a beta and just fuck around with it
>>108029529
Is this with watermarks in negatives?
ltx-2 is so fun (default i2v template in comfy, set frames to 240 or 10s)
https://files.catbox.moe/6fivxj.mp4
>>108029535
"with an american style accent":
https://files.catbox.moe/4h0cyf.mp4
>>108029535
How come nobody has taken this whole thing to refer to him as Dorito pope again?
>>108029529
Yes, but I think it misunderstood the prompt altogether. The "You are an AI assistant..." stuff started appearing on other seeds at the top. It's kinda wild how it interprets the prompt sometimes.
>>108029551>You are an AI assistantNigga why are you using this shit. It's a text encoder.
>>108029551Anima doesn't need that part only the lumina models
>flash attention + torch compile + klein 9bwhy doesn't it work?
>>108029489you can have a 2nd pass with klein and something like "Remove watermarks and text"
>>108029549everyone thinks he is shill man now.also there is a ltx2 video extend workflow, which can clone voices or movements, pretty neat imohttps://huggingface.co/RuneXX/LTX-2-Workflows/tree/main
>>108029562
did you remember to set the CUDA_ENABLE_SPEEDHACKS=1 environmental variable?
should i directly train on multiple resolutions or start with 512 for a couple thousand steps and then move to something higher? assuming the use of buckets of course
>>108029555
Not out of necessity, but to improve quality:
You are an assistant designed to produce aesthetically pleasing, high quality images based on user prompts. <Prompt Start>
>>108029557
Not needed, but from limited testing (admittedly needs more) it seemed to improve quality to me. This is the first problem I have run into with it, after more than a hundred gens. Might be worth keeping it even if it causes problems with prompts very occasionally.
>>108029568
I mean I can also just crop images too.
But good idea if the watermark is too big and destructive to remove by cropping.
>>108029569
lmao
it worked well with the geoff clip, video extend workflow: use the skip first frames setting to pick a good start spot, then frame load cap 49 (or whatever) to pick a good end point.
https://files.catbox.moe/d5o0y8.mp4
>>108029415
>zomg! faggotry guys!!111
>>108029578
Just realized how many extra "it"s are in there lol.
>>108029578
>it seemed to improve quality to me
Because it just crunched the numbers. It's the same as the padding slop for ZiT. If you want system prompts, you'd need to use a direct LLM loader that supports it and then turn the output into conditioning.
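A sketch of the difference, assuming an HF-style tokenizer with a chat template (model id is a stand-in for your TE): prepending text just adds tokens, while an actual system prompt is routed through the template the LLM was trained on before encoding:
[code]
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")
enc = AutoModel.from_pretrained("Qwen/Qwen3-0.6B").eval()

system = "You are an assistant designed to produce aesthetically pleasing, high quality images based on user prompts."
prompt = "1girl, 2b_(nier:automata), sitting at a restaurant table"

text = tok.apply_chat_template(
    [{"role": "system", "content": system}, {"role": "user", "content": prompt}],
    tokenize=False, add_generation_prompt=False,
)
with torch.no_grad():
    cond = enc(**tok(text, return_tensors="pt")).last_hidden_state  # -> conditioning
[/code]
Whether the diffusion model actually benefits depends on whether it saw templated embeddings during training in the first place.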
>>108029562
I asked Klein and he said picrel
>>108029562
FA doesn't work with imagegen. Torch compile is pure ass and you'll have to recompile anytime you change prompt or add a lora or change lora strength, wasting any time you could've saved anyway. Use --fast fp16_accumulation if you want speedhacks.
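fwiw that flag appears to map onto a stock PyTorch switch (recent builds only), so the same speedhack works in any script:
[code]
import torch

# believed equivalent of ComfyUI's --fast fp16_accumulation: let fp16 matmuls
# accumulate in fp16 instead of fp32 -- faster on recent NVIDIA GPUs, slightly
# less accurate
torch.backends.cuda.matmul.allow_fp16_accumulation = True
[/code]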
>>108029580
https://huggingface.co/RuneXX/LTX-2-Workflows/blob/main/LTX-2%20-%20V2V%20(extend%20any%20video).json
this is a very good workflow, the other one works but this is more refined imo
>>108029581
Really glad you liked it anon
>>108029651
>xhe only posts gens
>xhe doesn't actually sell the figurines that'd sell very well
>>108029647
for example, doritos pope with the workflow (if you use distilled ltx bypass the lora below the model loader)
10s extend of the first part: it also has nodes for smoothing out the audio.
https://files.catbox.moe/kmsa2l.mp4
>>108029578
>>108029591
Actually despite removing it the problem persisted in some other gens.
So it seems to be something else.
Tried a ZiB lora (on fp8)
Barely any change in the output pic. Do I let it cook for more epochs or bump up the LR even more?
144 seconds to extend/make a 10s clip on a 4080, ltx2 q8 distilled. this model is meme magic.
https://files.catbox.moe/bljr06.mp4
>>108029699
i had to train twice as much compared to my character lora i trained on zit
GEOFF IS BREAKING THE CONDITIONING
https://files.catbox.moe/wu3xj3.mp4
>>108029742
More epochs or LR?
>>108029748
I've doubled the epochs and it was still only able to achieve ~70% likeness, but that also might be my sampling settings. Gonna try tripling it and see what happens
NovaAnimeXL 15 description
>I think this will be the last version before Z-Image version drops
NovaAnimeXL 16 description
>Z-Image Base came out last month so I guess I'll switch the base model into it after someone create Z-Image Illustrious model. Nobody knows whether this is the last version or not but I'm looking forward for newer structures
Why did I expect anything from that retard
>>108029443
Anime booru tag prompting models are literally the only thing of value to come out of AI
>>108029762
skill issue
how does z have more realistic gens than flux 2? is it an architectural difference?
>>108029773
Flux was always shit, however Zit is very rigid. Unfortunately Zib is not a replacement, so that's the trade off you have. If you want more flexibility you are generally forced to use Flux.
>>108029773
it all boils down to the dataset they used for rlhf https://arxiv.org/abs/2512.11883
>>108029773
It doesn't.
>>108029763
I can see what you're prompting right in this thread buddy and I don't see a lot of skill
>>108029549
>this whole thing
you mean the thing that came out a year and a half ago?
>>108029773
The excessive quantization to get it to run on consumer hardware certainly isn't helping.
>>108029651
I didn't one bit. I think you should have your own board.
Is there a way to leverage VRAM from 2 NVIDIA GPUs for image generation in SDXL-like models (Illustrious/Pony for example) on WebUI Forge? I got a 3060 Ti and a 5070 at hand and I'm wondering if I could use one for something like inference only and the other to load weights separately.
>>108029824
How much quantization? A 24GB or 32GB GPU isn't enough?
>>108029834
absolutely not
>>108029699
DO NOT USE WARMUP STEPS WITH PRODIGY
Prodigy has its own warmup-like logic. External warmup messes with it.
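For reference, a minimal Prodigy setup (prodigyopt package; the Linear is a stand-in for your lora params): lr stays at 1.0 and there is no external warmup scheduler on top:
[code]
import torch
from prodigyopt import Prodigy  # pip install prodigyopt

net = torch.nn.Linear(128, 128)  # stand-in for your lora parameters
# lr=1.0: Prodigy estimates the real step size itself; an external LR warmup
# holds lr near zero early on and starves that estimate. safeguard_warmup is
# the optimizer's own built-in mechanism for the same problem.
opt = Prodigy(net.parameters(), lr=1.0, weight_decay=0.01, safeguard_warmup=True)

for _ in range(100):
    loss = net(torch.randn(8, 128)).pow(2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
[/code]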
>>108029839
>>108029834
Nope. Extremely difficult to implement under normal circumstances and basically impossible with your dinky little forge UI.
>>108029839
no, you can run full flux2 with a 3090 (if you have a shit ton of ram (like me ;)))
>>108029699
>01/02/2026
>thinks epochs convey any meaningful information
oh well good luck on your trials and errors
>>108029854
Damn. I passed on buying a 3090 to pair with my 96GB of RAM when I had the chance for a reasonable price, so I have been condemned to use a 12GB GPU as punishment.
https://files.catbox.moe/x7u325.mp4
>>108029856
It was 13 steps per epoch. 60 pic dataset.
>>108029842
Good to know.
>>108029863
>the random guy going "yeah"
kek
>>108028791
I don't really know how these text encoder + image diffusion models work, but is there a reason the text part couldn't be done in RAM? 30b LLMs offer usable performance when most of it is in RAM. And here text is only a part of what needs to be done.
>>108029848
12GB cuckbros... we wasted our savings...
>>108029861
>I have been condemned to use a 12GB GPU as punishment.
16GB is available for under 500 on the used market.
I made this thread
Not a /u/ fag but kino prompt there https://civitai.com/images/119355904
Ran it on anima instead.
>>108029888
You can do that?
There is some node, MultiGPU I think, that lets you run shit on CPU. It will be slower on system memory though, obviously.
Comfy will typically unload the text encoder from VRAM before running the unet, so there is typically not too much point in that.
>>108029849
What about asyncdiff and stuff like that?
>>108029912
make me out
>>108029888
>is there a reason the text part couldn't be done in RAM?
isn't that what people do with colossal models? cpu for text embedding and leave the unet on vram?
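And the split is trivial since only the embeddings cross over; a minimal sketch (model id is a stand-in for whatever TE you run):
[code]
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")
text_encoder = AutoModel.from_pretrained("Qwen/Qwen3-0.6B").to("cpu").eval()

with torch.no_grad():
    emb = text_encoder(**tok("1girl, cherry blossoms", return_tensors="pt")).last_hidden_state

# the embeddings are a few MB vs. gigabytes of TE weights, so only this tensor
# crosses to VRAM; the unet/DiT stays resident on the GPU
cond = emb.to("cuda")
[/code]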
>>108029913
anima is such a slop
>>108029840
Not even model sharing?
>>108029808
but it is
where is the self-refining video comfyui implementation?
get to it nerds
chop chop
>>108029824
flux2 is okay once you really nail down what you want, and describe it in absurdly redundant detail
I think its problem is that it's too general, it has a bajillion parameters, but nobody gives a rats ass about 90% of them
klein is what happens when you strip the fat and focus on the one thing anybody cares about, bitches
>>108029956
i tried it but it has errors with res4lyf
>Prompt: a cute bitch in the desert
>>108029969
proof that's a bitch?
>>108029971
Would it lie to me?
>>108029986
>would a diffusion model lie?
tourist retard
klein edit 9b to make fent man
ltx2 to animate it (i2v workflow from here is nice: https://huggingface.co/RuneXX/LTX-2-Workflows/tree/main)
https://files.catbox.moe/kmh6ua.mp4
>>108029986
if you let it
>>108029989
>tourist calling someone else tourist
>>108029992
I didn't
>>108029989
the only retarded ones are the ones who stick to this website
picrel is a tourist retard, he looks happy, happier than anyone who seems to frequent here enough to know about the """culture"""
the culture btw is an anonymous schizo blood feud
For me the biggest improvement in achieving likeness was to be organized when collecting the dataset. Have a folder structure that has all angles and shot types and fill it and you will know what you are missing and what you have. Also you should have cropped 1:1 headshots from all angles.And if you can't be bothered to manage the bucket sizes just use 1:1 ratio.
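A trivial audit script makes the gaps obvious; the folder layout here is hypothetical:
[code]
from pathlib import Path

# hypothetical layout: dataset/<shot_type>/*.png, e.g. dataset/headshot_left/
root = Path("dataset")
for folder in sorted(p for p in root.iterdir() if p.is_dir()):
    n = sum(1 for f in folder.iterdir() if f.suffix.lower() in {".png", ".jpg", ".webp"})
    print(f"{folder.name:24} {n:4} images" + ("  <- EMPTY" if n == 0 else ""))
[/code]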
time to bake, schizo
>>108029912
Good job
Welp, seems like the character threshold is around 500-600 pics on danbooru at the very least, I tried with characters around 300 and 400 pics but it couldn't shit them out, not even close
>>108029699
>he fell for the fp8 meme
>>108030066
prompt? source is obviously a pepe but is it 1 or 2 images
>>108029996
I made this general mongrel
>>108030090
Presumably it should be because it's not completely trained, else it's total horseshit
>>108030104
just 1 image
>Transfer the image into a vibrant graffiti street-mural style.
>>108030107
i dont believe you
https://files.catbox.moe/ch5bgx.mp4
>>108030108
jej
Anybody else got this issue with Comfy where it corrupts checkpoints or TEs? Had this happen on Klein and now on Anima; for Klein I had to keep redownloading the TE because it got fucked up after every use, and now for Anima it's the checkpoint. Every time I close Comfy after running those models and then open it again later they're fucked up and I don't know what's causing it. It doesn't do that with WAN or LTX or SDXL.
>>108030108
A LOT OF LOYALTY FOR AN OPENAI SHILL
https://files.catbox.moe/o1hgfk.mp4
>>108030115
>FUD attempt #24526
Let's say I am into realistic nude dolphins.
Would Wan2.2 still be the most adequate for the task?
>>108030115
I have never heard of that and cannot even fathom how that would happen. Are you saying the weights on your computer have become corrupted?
>>108030108
I cringed
>>108030119
can you make him give birth
>>108030149
it would be extremely painful
>>108030152
for you
>>108030166
enough shrooms
If I'm training a character lora for klein9B with 70 images, should i also include a regularization dataset? I also want to combine it with other loras and use it for editing.
>>108030171
yeah
>>108030175
Any good regularization dataset you can recommend? The ones i found online all look like sdxl slop. Or should i just grab 30 images off google and caption them myself?
>>108030182
On paper, you should always use a regularization dataset, but in practice few people ever do.
>>108030182
I'm checking, gimme a sec
>>108030196
Yeah I have no idea how it's happening either. When I redownload the models it works just fine, but after closing Comfy and reopening it and loading the models again they either produce only patterned noise (Anima), or Comfy gives me an error for the TE (Klein).
AND JUST AS I MAKE THIS POST AND RUN MORE TESTS IT STOPS DOING IT WTF. Guess it's "nvm fixed :)" now.
>>108030196
NO THERE IT IS AGAIN AAAAAAAAAAAAA
This is the Anima output after reloading it.
>>108030171
What do you want to regularize for? No one uses regularization images because no one cares about overfitting stray tags since it's a single hot-swappable lora and not a fine-tune
>I also want to combine it with other loras and use it for editing.
Won't work 100% well unless they are trained together, lora weights conflict when they are trained separately
https://arxiv.org/abs/2311.13600
https://arxiv.org/abs/2412.04465
>>108030033
it's a good train of thought, but you're never gonna have enough training data to cover all those categories with quality images. your dataset should always be focused on teaching what you want to reproduce. for me it's loose dresses or tops that show the natural breast shape, sucking on things, sticking tongue out, breast shape when lying on her back, etc. of course you should balance all this with close-ups, expressions you like, full body images, different lighting, poses, etc. but if you're too focused on just autistically trying to cover every single angle, you're gonna get a lora that's good at making images of her just standing, and not much else.
lmao video extend is gold. cia talking about moot:
https://files.catbox.moe/7zlwbv.mp4
>its another catbox video gen episode
>>108030229
where is my fent spam
>>108030229
>>108030229
Wow
Incredible
10 seconds of shitty ai generated voice
I'll take fentposting over obsessed namefaggots any day
new thread
>>108030237
>>108030237
>>108030237
>>108030220
The base model should take care of the poses. What you are teaching is the shape. And, yes, it's quite easy to fill all those categories most of the time.
>>108030204
Are you using the Flux2-specific latent?
>>108028761
soul
kino, catbox?