Discussion and Development of Local Image and Video Models
Previous: >>108718184
https://rentry.org/ldg-lazy-getting-started-guide
>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP
>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows
>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe
>Z
https://huggingface.co/Tongyi-MAI/Z-Image
>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/
>Qwen
https://huggingface.co/collections/Qwen/qwen-image
>Klein
https://huggingface.co/collections/black-forest-labs/flux2
>LTX-2
https://huggingface.co/Lightricks/LTX-2
>Wan
https://github.com/Wan-Video/Wan2.2
>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46
>Illustrious
https://rentry.org/comfyui_guide_1girl
>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage
>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg
>Local Text
>>>/g/lmg
>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>108727629>No 2D3DPD, dead on arrival.
>>108727629>illegal>things I didn't like>no animedead
>>108727629>me and a team of peopleare claude agents people?
>>108727656>are claude agents people?+ his daughter/wife from Silly Tavern
>mfw Resource news
04/30/2026
>ProcFunc: Function-Oriented Abstractions for Procedural 3D Generation in Python
https://github.com/princeton-vl/procfunc
>Efficient, VRAM-Constrained xLM Inference on Clients
https://github.com/deepshnv/pipeshard-mlsys26-ae
04/29/2026
>Z-Anime | Full Anime Fine-Tune on Z-Image Base
https://huggingface.co/SeeSee21/Z-Anime
>QuantVideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization
https://github.com/svg-project/Quant-VideoGen
>World-R1: Reinforcing 3D Constraints for Text-to-Video Generation
https://github.com/microsoft/World-R1
>Benchmarking Layout-Guided Diffusion Models through Unified Semantic-Spatial Evaluation in Closed and Open Settings
https://github.com/lparolari/cobench
>VibeToken: Scaling 1D Image Tokenizers and Autoregressive Models for Dynamic Resolution Generations
https://github.com/SonyResearch/VibeToken
>OmniVTG: A Large-Scale Dataset and Training Paradigm for Open-World Video Temporal Grounding
https://github.com/oceanflowlab/OmniVTG
>Refinement via Regeneration: Enlarging Modification Space Boosts Image Refinement in Unified Multimodal Models
https://github.com/LeapLabTHU/RvR
>SketchVLM: Vision language models can annotate images to explain thoughts and guide users
https://sketchvlm.github.io
>Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation
https://tuna-ai.org/tuna-2
>Prefill-Time Intervention for Mitigating Hallucination in Large Vision-Language Models
https://github.com/huaiyi66/PTI
04/28/2026
>Illustrious XL & NoobAI-XL Style Explorer
https://github.com/ThetaCursed/Illustrious-NoobAI-Style-Explorer
>LTX Desktop 1.0.5
https://github.com/Lightricks/LTX-Desktop/releases/tag/v1.0.5
>Meta-CoT: Enhancing Granularity and Generalization in Image Editing
https://shiyi-zh0408.github.io/projectpages/Meta-CoT
04/27/2026
>PixlStash 1.1.0 Update
https://pixlstash.dev/whatsnew.html
>mfw Research news
04/30/2026
>ACPO: Anchor-Constrained Perceptual Optimization for Diffusion Models with No-Reference Quality Guidance
https://arxiv.org/abs/2604.26348
>SpatialFusion: Endowing Unified Image Generation with Intrinsic 3D Geometric Awareness
https://arxiv.org/abs/2604.26341
>Delta Score Matters! Spatial Adaptive Multi Guidance in Diffusion Models
https://arxiv.org/abs/2604.26503
>Beyond Fixed Formulas: Data-Driven Linear Predictor for Efficient Diffusion Models
https://arxiv.org/abs/2604.26365
>MetaSR: Content-Adaptive Metadata Orchestration for Generative Super-Resolution
https://arxiv.org/abs/2604.26244
>SEAL: Semantic-aware Single-image Sticker Personalization with a Large-scale Sticker-tag Dataset
https://cmlab-korea.github.io/SEAL
>SnapPose3D: Diffusion-Based Single-Frame 2D-to-3D Lifting of Human Poses
https://arxiv.org/abs/2604.26620
>Co-generation of Layout and Shape from Text via Autoregressive 3D Diffusion
https://arxiv.org/abs/2604.16552
>TAP into the Patch Tokens: Leveraging Vision Foundation Model Features for AI-Generated Image Detection
https://arxiv.org/abs/2604.26772
>AnimateAnyMesh++: A Flexible 4D Foundation Model for High-Fidelity Text-Driven Mesh Animation
https://arxiv.org/abs/2604.26917
>Delineating Knowledge Boundaries for Honest Large Vision-Language Models
https://arxiv.org/abs/2604.26419
By anons hand blessed be this thread
>made it in
>>108727723
80s b group with lambos, i'd watch it
>>108727723>>108727803Do you have lora for each car or does klein know them?
>>108727833Just passing a high res reference image of each car, it's a klein finetune btw.
>>108727700
>>108727704
fuck off
>>108728311why?
>>108728070Neat breastplate design.
>>108727777And quads! Time to run to the gas station for a couple of scratch-offs before your luck wears off.
>>108728340Thanks
44 mentions of /ldg/ in the current /hgg/ thread. Any idea why would someone keep shilling your thread there for weeks?
>>108728503>44 mentions of /ldg/ in the current /hgg/ thread.kekdldg has mind broken some to such a degree. its crazy.
>>108728402ngl those loras literally make me lol
is the acestep.cpp guy here? I am not able to get dcw to produce clean output with base xl.
Tonight, I'll try sft and turbo.
>>108727629Am beta tester, it does anime i2v without issues.
>>108728653proof: i dreamt the output
>>108727671>>108728438>>108728457>>108728070>>108727803nice
>>108727629
125k videos reads like a serious finetune.
Obviously millions would be desirable, or a better base model than LTX (not that many alternatives exist), but this has a shot at being something cool I guess.
>>108728653
Does it do "her clothes disintegrate and she starts bouncing on cock" slop?
I might finally download LTX if so.
>>108728503
Smells like Animanon* stirring shit again. He does like >90% of the trolling here, usually anti-Anima.
*: May or may not be the actual Ani.
>>108727777check'ddid you do anything for that font or was it just a consequence of the style?
>>108728402MOAR JENNIFER
What cloudkek will never experience:Anima to sulfur betahttps://files.catbox.moe/qa1rn2.mp4
>It's pretty good on realism, though every like 1/3 gens you do get body horror which can be annoying. If you run the model fully undistilled that problem goes down to like 1/8, but regardless.Well at least he is honest.
>>108728751LTX can do 20 second gens?
>>108728752
You can lower that a lot using i2v. Most issues are from T2V gens.
>>108728758
Up to 30, but gen times go insane above 20 in my experience.
>>108728758
hardware permitting, but that's generally the limit of consumer cards. you could get more frames by lowering the resolution but that seriously kills quality
>>108728402She looks autistic anon...
End of this month is two years of LDG :-)
>>108728402Can we see her fee anon
>>108728751But Apicucks can make catalogs and lame meme...
>>108728779Neat, do you know how to use acestep.cpp's dcw? I'm having to use really low values with base xl.
>>108728779crazy how little local progressedat least API is welcome here now thanks to comfyui
>>108728867local community is full of envious and bitter people. comfy made the right call when he betrayed local, nobody actually cares about freedom in computing anymore. any attempt to change the status quo is sabotaged and ridiculed
Holy sperg out with those previous two posts
>>108728962It's like one troll who has been doing a lot of overtime since GPT Image-2 release.I wonder what he gets out of it.
>>108728774Oh, she's genuinely very autistic. Big part of her charm, really.>>108728796Here's an oldie courtesy of Wan, but her actual feet don't feature heavily in my dataset (just two pics with visible toes). I could potentially revise that though.
>>108728989she shits in her hands and smears it on the walls?I've dealt with bonafide autistic people before, it's a fucking nightmare. Only the truly divorced from reality could romanticise that
>>108728995When people say "autistic" they mean neurotypical but somewhat socially awkward in an endearing way.Yes dealing with low functioning autistics (even many high functioning ones) is hell.
>>108729008Read it, he said "very autistic"
>>108728751>short hair>french>brownInto the trash it goes
I will be dumpster diving, I guess.
>load up Wan 2.2 after several months
>lora key not loaded: diffusion_model.blocks.ALL.OF.THEM
Guh, what did Comfy do to these now, do I need some special loader and redo my whole workflow, or what?
>>108728995
>>108729013
BUT! I also didn't say "non-verbal and mentally retarded".
>>108728298Arguably the most impressive line of LoRAs in the history of LoRAs. I kneel.
>>108729132Oof, 16m25s for this with the broken LoRAs. 7m of that was "Model Initializing". I don't even...
>>108729148Latest one turned out really good>>108729181Just saw Escape from New York. Carpenter was one lucky dude
>>108728503
35 stars
>>108728653
When you say anime are you referring to actual anime or 2.5D slop like >>108728751
>>108729202unironically hope you blow your brains out with a shotgun as soon as possible. in 3 years you've been obsessing about this hobby you contributed nothing to the community, in fact you only ever made it worse. i can guarantee that everyone hates your guts irl as well
I'm learning ace step.
haha melty
>>108729221Is it Somali friendly
>>108729273yes very quality
acestep in 500 steps, yes this will be the gen that really starts the local ai music revolution!!!When it finishes '-.-
>>108729181
>fixed the LoRA problem (one of them defaulted and was pointing to nothing)
>same 960x1024 res and prompt
>25m gen time
Goddamn, whatever this "Model Initializing" is doing, it now does it four times per Wan 2.2 gen (3min, 6min, 4min, 6min). This same workflow could do higher resolutions in about 5min total back in the day. It barely uses my GPU now too, it just spikes up and down and the 3D portion only fills up a quarter of the way with VRAM usage topping out at 22GB (resting at about 17-18GB for most of it). It's all fucked up...
I had no idea it got this fucking shitty for video (LTX isn't as slow, but the output is also ass) with all their worthless memory "optimizations". I'm gonna have to set up and freeze an old version just for Wan because this shit as it currently stands is totally unusable.
>>108729353
>500 steps
More steps always seem to help audio diffusion (I use the full 100 with VibeVoice for instance), but 500 seems like overkill.
>>108728402idk what the story is with this character but face looks like my sister when she was younger and it creeps me out a bit.
>>108728503Only you think that an anon from ldg is gonna shill their general indiscriminately and namedrop it. Probably someone trying to make ldg look bad.
>>108728758i can do up to 1 minute on my 12gb card but the problem with ltx is that it was trained on 5 second clips so the further that you go out of bounds, the more stuttering you get. i find it is better to keep it at 5 seconds but do it at the highest FPS that you can go, then just keep extending the clips forever since it degrades slower when the frame rate is higher
>>108729656Local?
...
>>108729722This is related to what? Missing context
>>108729743Context window limit reached
>>108729722SaaS discussion: upgrades, initiatives, discoveries, and research that lead to factual improvements, boosting user experience, quality and speed, SOTALocal discussion: money, "we give money to him but only a little guys, don't worry", Python, making the user experience worse, DOA models, more money talk, subjetive improvements, 1:2 fixes to bugs ratio, vibecoding, chinks&jeets
>>108729819Local talk: weekly types of CFG discovered that shows opinionated prompt understanding
>>108729819don't mention sora
>haven't proompted for months
>boot up
>my old wf from January doesn't even start ('output error' or something), even though it worked before and I haven't updooted since then
>say fuck it and pull
>not only does it work now, but the update didn't break a single node
What kind of black magic is that? Is comfy /based/ again?
>>108728298Anywhere I can get this lora?
>>108729662
Thanks anon. If it is not too much trouble:
Which 12gb card, what are the generation speeds like and how much system memory do you have? Also are you running bf16, fp8, int8, q8 or some other quant?
I wonder if I should finally try LTX with my 3060 and 32gb system memory before that finetune releases.
>>108729873
3080ti and 32gb and int8 distilled. it takes 5 minutes per second of 4k resolution 24fps, so you can imagine that it's quite good if you want to do low resolution and then process it afterwards
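As a rough sanity check on the figure quoted above, the sketch below turns "5 minutes per second of video at 24fps" into frame counts and wall-clock estimates. It assumes gen time scales linearly with clip length, which the thread does not confirm (another anon reports gen times blowing up past 20 seconds), so treat the output as a lower bound. The function name and its defaults are made up for illustration.

```python
def estimate_gen(clip_seconds, fps=24, minutes_per_video_second=5.0):
    """Return (frame_count, estimated_minutes) for one clip.

    Assumption: cost is linear in clip length, using the 3080 Ti figure
    quoted in the thread (5 min per second of 4k 24fps video, int8 distilled).
    """
    frames = int(clip_seconds * fps)
    minutes = clip_seconds * minutes_per_video_second
    return frames, minutes


if __name__ == "__main__":
    for seconds in (1, 5, 20):
        frames, minutes = estimate_gen(seconds)
        print(f"{seconds:>2}s clip: {frames} frames, ~{minutes:.0f} min")
```

Under that linear assumption a 5 second clip at 4k is already a 25 minute job, which is why genning at low resolution and processing afterwards is the practical route.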
>do celeb lora
>result is meh
>use photo as base for img2img
>likeness is perfect
explain this to a brainlet
>>108729902undertrained lora
>>108729869https://civitai.red/models/2280663?modelVersionId=2908963
>>108729959>.red
>>108729899Thanks for the response anon. That's encouraging.>>108729902Klein? It has bad facial likeness in general.
>gemma 4 as prompt enhancer for z image baseit's good. anon should try it
>>108730078give prompt
>>108730083https://huggingface.co/spaces/Tongyi-MAI/Z-Image-Turbo/blob/main/pe.py
fartfags >>> footfags
what do you use to organize and apply your LoRAs in comfy?
>>108728321I thought it was funny. Is it too mean? Originally was gonna do one with her eating popcorn watching herself.
>>108730325Nono I was just surprised at seeing her. I'm also pretty sure 18 year old Mayli would have loved this
>>108727629>look my new model saar>join my discord saarthis jeet just spent 8k to make a worse version of ltx 2.3everyone laugh at himhttps://www.reddit.com/r/StableDiffusion/comments/1t0auqh/comment/oj7txk3/
>>108728989Kino... Sad we lost focus on her face
>>108730437Worse for pron? You have no idea what you're talking about
>>108730551just use a lora lol
So what's the current local meta way to upscale images?
I am asking for real images, not gens.
Like if I have a photo from the 2000s, what can I run that will still make it look like a photo from the 2000s, just higher res?
Most upscale GANs I know either excessively blur the image or sharpen the artifacts to an unnatural degree.
>>108728737Dunno I just put "huge stylized impact font text" but it's probably a combination of both.
>>108730644you could try seedvr2, YMMV if that is "accurate" enough for you. like GAN it obviously has to fill in the additional pixels
>>108730732Workflow? How much does 3B vs 7B matter?
>>108729656scissor time
>>108730778
https://github.com/numz/ComfyUI-SeedVR2_VideoUpscaler/tree/main/example_workflows
don't have anything else at hand rn
dunno, i just used 7b
>>108730798Ok thanks I will see if it works well enough.
>>108730437Any leaked clips yet? Or are we going to have to wait until the public unveiling?>>108730519Oh wow, it got deleted... you know, I think that might be a first. It's never been deleted over on >>>/tv/ that I can recall.
>>108730870>Any leaked clips yet?nothing worth leaking, its just run of the mill body horror slop
>>108730870jebby stuffing her foot in my mouth and pouring that milk down her leg and into my mouth via her foot
>>108730078And who has 32gb vram to run gemma?
>>108731087Gee gee uuufs!
>>108731087Q5 with enough context fits in 20GB. Or just offload to CPU. You don't need to keep it loaded after inference.
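A back-of-the-envelope sketch for the claim above: weight memory is roughly parameter count times bits per weight, divided by 8. The 27B parameter count and the ~5.5 bits/weight figure for a Q5-style quant are assumptions for illustration, not numbers from the thread or from any model card, and the KV cache for your context length comes on top of this.

```python
def quant_weight_gb(n_params, bits_per_weight):
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    # Hypothetical 27B-parameter model; swap in the real parameter count.
    n_params = 27e9
    for label, bpw in (("bf16", 16), ("q8-ish", 8.5), ("q5-ish", 5.5)):
        print(f"{label:>7}: ~{quant_weight_gb(n_params, bpw):.1f} GB of weights")
```

If the weights alone land under your VRAM with a couple of GB to spare for context, it fits; otherwise offloading layers to CPU is the fallback mentioned above.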
>>108730732
>>108730811
I am finding it to be unusable for any real content.
Way too many changes to small details for me.
Extra storage saved on my drive I suppose.
>>108727629Why not just tune wan or ltx?
>>108731154
And I don't mean super small and irrelevant shit either.
Legible text on book covers turns into AI slop gibberish.
According to ACEStep's dev, 30% of ACEStep was trained using LM codes. 50% was trained without it. As he explains
>The purpose of this design is to give you that udio-like experience.
>Because diffusion models are really good at creation.
>LM, on the other hand, easily falls into overfitting, though it has better prompt adherence and higher accuracy.
>Our trade-off is to make diffusion training harder and less dependent on LM codes.
Otherwise, it would lose creativity and degenerate into a mere renderer that decodes codes into latents, rather than having its own personality.
That makes perfect sense. I've been getting "boring outputs" that while accurate aren't on par with Udio, so I thought it's a creativity/RLHF issue. But to my surprise turning the LM off has pleasing results on XL.
This is a LoRA I'm testing, it's about 75% done. But anyways, the result is very high quality, that I couldn't get out of the regular LM. As with image models, the DiT is very creative, and likely the only way to get Udio-like creativity. The cost is just a bit of prompt following (still good lyrics), with seed variance not an issue.
First try of DiT only output on a WiP Kawaii Super Bass LoRA
https://vocaroo.com/1hxGbL1pBTaI
After hearing that, I was certain. Udio is a diffusion model. Which is why it's so damn good and creative. To squeeze even more creativity out of ACEStep, turning the LM off helps bridge that gap even further.
>>108728545
>I am not able to get dcw to produce clean output, with base xl.
DCW has different values that work with it according to the creator. I have not tried the Base XL model, I reserve it exclusively for LoRA training. My advice is to not use DCW with SFT, you will get the most advantages out of the Turbo model, which is more creative than SFT. But if you must, the 0.05 DCW values given by the creator on Turbo have to be lowered on SFT (and possibly also on Base) for it to work properly. Make sure you pair DCW with Scrag's custom VAE.
never touched AI before. I just want to make 5 second loops of secks out of the blender 3D characters I make for my own personal enjoyment. any recs on what I should use?
>>108731288Wan 2.2 I2V + some sex lora.Render characters into appropriate sex position beforehand.
>>108731288Have you tried Blender?
>>108731154
>>108731197
Honestly doesn't feel great for artistic stuff either. I gave it an image where the character was crying and it decided to wipe off the tears in upscale, alongside other changes.
Maybe I am missing something but yeah I think I am done.
I am not trying to be combative with the guy who initially suggested it btw, I appreciate the help. But these were my impressions.
>>108731302
thanks, I'll check that out. for a sex lora, should I be looking for one specifically for 3D or does that not matter?
>>108731304
>AI general
>"have you tried not using AI?"
animation is an enormous time sink. just to get down the fundamentals takes a long time, let alone getting it to look good. I'd rather craft characters and make environments. I just wanna see it in action as I've seen impressive work from r34 """artists""" who take 3D CG images and make great looking AI animations from them
>>108731347
>does that not matter?
Shouldn't matter if it is trained well but I don't think you are gonna find 3d specific ones anyway.
Speaking of which, sex loras for image to video models get nuked from most places under deepfake rules.
They are difficult to find and you need to get them from some place like civitaiarchive (poorly maintained but most accessible)
I figured I'd test the Anima RL Lora with a simple prompt and no artist tag, since russell said it's trained to emulate high danbooru scores. The effect isn't as strong as I expected though.
what do you use to organize or apply your LoRAs in comfy? or do you just download and chain them manually?
>>108731393rgthree power loader
>>108731393I use ComfyUI-Lora-Manager.It runs alongside comfy as an extension. It's very nice you can save recipes and keep your stuff visually organized.
>>108731378Here's what happens if you throw in the Turbo Lora and use 8 steps, cfg 1.
>>108731393I chain them manually because I rarely use more than 1 or 2 loras.
cozy
>>108731277
>First try of DiT only output on a WiP Kawaii Super Bass LoRA
Strategy applied to a Miku LoRA I trained. This is the first result. Keep in mind it was not trained at all on this type of cutesy/techno prompt, it's generalizing the genre (what it saw was exclusively some DECO-27 songs).
https://vocaroo.com/1gonlSXjg3b3
This musicality is insane. I know it's a LoRA, but I don't wanna hear anyone telling me Suno v5.5 is better kek (minus out of the box world knowledge, they're about on par now).
For comparison, not same prompt, but before turning off the LM, this was the best sort of output I was getting from this Miku LoRA.
https://vocaroo.com/1hANeTMzTMF1
This was not a first try, took lots of nitpicking, and it's quite meh, I couldn't get it to produce music exactly how I wanted and it was poorly composed, every other output with the LM was also meh.
Lastly, because I want the difference in musicality to be clear, this is the equivalent of that very first prompt with the LM turned on (5Hz LM codes not removed on ACEStep cpp). Though it's not the same seed, you get the point. Note I also did not bother to master this awful output (and I also noticed with LM off the output quality pre-mastering is higher, though it might be RNG).
https://vocaroo.com/14UM0LfFwjfl
I now have to retest all my "failed" LoRAs kek. If this is v1.5, v2 with its enhanced world knowledge is gonna be some fun stuff.
>>108731845No offence but suno is way better
>>108731872>the closed model with loads of funding and money is better than free thingwow, I'm SHOCKED. utterly SHOCKED!
>>108731845More interested in an ACE-Step lora training writeup.
Can I train Anima Loras with AI toolkit? what architecture?
>>108731902
iirc ostris said there is a license conflict so he can't implement anima.
just use the standalone trainer
https://github.com/gazingstars123/Anima-Standalone-Trainer
>>108731902
No
https://github.com/ostris/ai-toolkit/issues/791
>>108731872Suno peaked at v4.5. It hasn't improved much since then, in fact at v5.5 the audio quality has regressed. ACEStep 1.5 XL has already caught up in musicality.
>>108731936
Well that's too bad. Thanks, i'll look into standalone.
But isn't Anima basically based on Qwen? or was it Flux, and if so, couldn't you technically just train on those architectures? Well, I guess "no" is the correct answer, otherwise Ostris wouldn't specifically mention it. But I kind of feel like I want to experiment with it nonetheless.
>>108731944>https://github.com/ostris/ai-toolkit/issues/791>>108731985nevermind, so much for that bright idea.
>>108727700>>108727704thanks!
>>108732009https://github.com/gazingstars123/Anima-Standalone-Trainer
>>108731985Anima is Cosmos base
>>108732009it's a very easy model to train either way.
Am I missing anything in my program saars?
>>108732052>>108732054Thanks!
>>108732096temp and other sampler settings?
>>108732096Is it not inheriting your global app theme? Or are your window decorations just different? Other than that looks kewl anon
>>108732149it's not inheriting. I probably should make it do that since it's just QT. thanks.>>108732140true!
We need an anima swastika lora, and a separate Adolf Hitler lora. So we can fight those bad guys.
>>108732096Is this for anima? A short natural language paragraph will probably perform better than pseudo-tags. Not that I think it will matter too much for a simpler lora like yours.Jessica is also a bit too generic jessica (rick and morty) might perform better.
>>108732193
Oh wait you are asking for the software.
Listing tags by frequency, ability to remove a single undesired tag from all captions with a simple click (blacklisting), automatically adding a tag to all captions.
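The three caption-editing features suggested above (tag frequency listing, blacklisting, adding a tag everywhere) could look roughly like this. It is a minimal sketch that assumes captions are comma-separated tag strings held in memory; the function names are hypothetical and have nothing to do with anon's actual program.

```python
from collections import Counter


def split_tags(caption):
    """Split a comma-separated caption into stripped, non-empty tags."""
    return [t.strip() for t in caption.split(",") if t.strip()]


def tag_frequencies(captions):
    """Count how many captions contain each tag, most common first."""
    counts = Counter()
    for caption in captions:
        counts.update(set(split_tags(caption)))
    return counts.most_common()


def remove_tag(captions, unwanted):
    """Blacklist: drop one tag from every caption."""
    return [", ".join(t for t in split_tags(c) if t != unwanted) for c in captions]


def ensure_tag(captions, tag):
    """Append a tag to each caption that does not already have it."""
    out = []
    for c in captions:
        tags = split_tags(c)
        if tag not in tags:
            tags.append(tag)
        out.append(", ".join(tags))
    return out
```

`ensure_tag` only appends when the tag is missing, so running it twice is harmless. Natural language captions would need a different approach than splitting on commas.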
>>108732193she's the only person without a real last name for some reason but yeah I'm just testing right now. every image tagger i've used has been too shit/bloated or won't let you change out to whatever model you want so I made this. thanks for the advice, though.>>108732209>automatically adding a tag to all captions.I have that with tag prefix or do you mean something else? also thanks again.
>>108732193>A short natural language paragraph will probably perform better than pseudo-tagsNatural language + booru tags or alternating tags between pure tags and natural description is the way to go. Max the batch size before anything.
>>108731902
use the easyscripts fork
https://github.com/67372a/LoRA_Easy_Training_Scripts/tree/refresh
>>108732096Next step: vibe out inference, booru, and gen history support. Then you'll never have to leave the program.
>>108732096>>108732272*and support for whatever training scripts you use :3
>>108732229
>do you mean something else?
Besides prefix. Can't think of a good example right now but let's say you noticed that your LLM hasn't added large breasts to any of the captions and you want that captioned. A function to easily add that to all captions (if it doesn't exist already) is useful in some cases.
>>108732234
Did you actually test this? Not that I did either, but I've heard that just natural language performs best and the official lora only had natural language captions.
Btw:
>booru tags
Your LLM isn't exactly outputting the proper booru tags anima was trained on. I think the TE inside of it is smart enough that this shouldn't matter too much, but I feel the need to emphasize that what you are doing is different from running WD14 or any other proper booru tagger.
>>108732282>>108732272now that's overscope bloat. no thanks
>>108732266this looks kinda cool. and so well documented. thanks.
trying to learn how to use dcw with ace step base xlso far, it's worse with it, but I'm using very small amounts and may figure out good settings.
>>108732286>Did you actual test this?yes
just use diffusion-pipe. youre not a brainlet... right anon?
>>108729181why such a framerate. just quadruple it with lossless scaling
>>108732343im retarded sorry man, i need everything in a colorfui gui
>>108732343>just use diffusion-pipe. youre not a brainlet... right anon?I have big brian and I'm on windows, so no ty
Any news in Anima ControlNet or Anima edit?
>>108732422base model isnt even done yet bro
>>108732426
Fuck but I need something to add detail to background and textures with good color quality without altering the base image, Klein does not respect style and SDXL ControlNet is washed out
>>108732434sounds like a problem for YOU to solve
>>108731154could be. maybe you just need to do bicubic or w/e no actual change upscales for now and call it a day.
>>108732438Wtf? Give me the solution, bro, I don’t have time for jokes
Anima controlnet status? fuck tdrussell stop sharing experiment slop loras and make a controlnet model bro, depth and canny first
>>108732434How exactly are you proooompting Klein for "improve but no change"? It can do that fairly well.
>>108732422>>108732472Unofficial Controlnet LLLite released for anima. Google it.
>>108731985No, Anima is exactly Cosmos-2 2B modified to use Qwen3 0.6B as a text encoder instead of T5-XXL, and then fine tuned on a few million images.
Any of you have a good captioning prompt for Gemma 4?
i am bored
>>108732577Try genning arm amputees.
>>108732557https://huggingface.co/spaces/fancyfeast/joy-caption-beta-oneJust build a template in extra option and copy it to gemma?
Hello? Where's the arm amputee lora with trigger words for different types of amputation?
>>108732658https://civitai.com/models/1088883/armless-double-forequarter-amputeethis is so inadequate.
>>108732595
>honglu picnice but thats actually the art splash from ring id
Limbus out of nowhere
>>108732675If she were real, her jaw couldn't open that wide.
>>108732557
This is what I personally use with both Gemini 3.1 Pro and Gemini 3 Flash at least (via their direct APIs, and with high thinking on):
https://pastes.io/hdDaVoFt
It covers basically all possible use cases SFW or NSFW and pretty much takes the most possible advantage of their ability to comprehend images, with formatting / phrasing that usually is basically ready to use as-is.
>>108732440
There are a couple images in my dataset I want to keep but they are too small for 1MP training.
I guess I will just let the training script's lanczos do its job. Their numbers are too few to hurt lora quality too much either way, and I would rather not teach the AI synthetic nonsense data.
Upscale models disappoint me, they either change too much or too little.
>>108732675I like how the arm magically appears when she hits the ground.
>>108732786Why don't you just do multires bucketing, where it will still train images that are below the "base" res just at whatever their proper original low res bucket is?
>>108732796yea thats because her stump was off screen for a moment. i have genned better ones but nsfw uhu
>>108732713glory to limboos company
>>108732491
Prompts:
Enhance colors, shaders and lighting = character slop style face and everything
High res = slop lineart and slop details
Sharpness = the same
Klein defaults to a slop style. I would love it if a controlnet existed for Klein, or a denoise control to set how much of the gen I want to change
>>108732806
In sd-scripts there is a "do not upscale" option which buckets low res images without upscaling, I think, but I don't mind images just below the threshold, say 950000 pixels, getting resized a bit and put into a proper bucket with the rest.
I guess I can edit the script, but is it worth it for like <1% of my dataset? This will also probably create tiny buckets below batch size.
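A minimal sketch of the "bucket low-res images at their own size instead of upscaling" idea discussed above. This is not sd-scripts' actual implementation: the 1024x1024 area cap, the 64px step, and the function names are assumptions for illustration. Images over the cap are scaled down to fit, images under it keep their size and are only rounded down to the step.

```python
import math
from collections import defaultdict


def bucket_for(width, height, max_area=1024 * 1024, step=64):
    """Return (bucket_w, bucket_h) for an image without ever upscaling it."""
    # Scale factor is capped at 1.0, so small images are never enlarged.
    scale = min(1.0, math.sqrt(max_area / (width * height)))
    bucket_w = max(step, int(width * scale) // step * step)
    bucket_h = max(step, int(height * scale) // step * step)
    return bucket_w, bucket_h


def group_into_buckets(sizes):
    """Map each bucket resolution to the list of image names that land in it."""
    buckets = defaultdict(list)
    for name, (w, h) in sizes.items():
        buckets[bucket_for(w, h)].append(name)
    return dict(buckets)


if __name__ == "__main__":
    dataset = {"a.png": (1024, 1024), "b.png": (2048, 2048), "c.png": (640, 512)}
    for res, names in group_into_buckets(dataset).items():
        print(res, names)
```

With only a handful of sub-1MP images, each one tends to end up alone in its own bucket, which is exactly the below-batch-size concern raised above.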
>>108732507Will try it. Thanks
>>108732877You should be saying more like: "significantly improve the XYZ of the WhateverStyle image 1 while keeping the ThingYouMayWantToPreserve and OtherThing and blah blah exactly the same as they are."
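The prompt pattern above (name what to improve, then name what must stay exactly the same) is easy to template if you run a lot of edits. A small sketch, with a hypothetical helper name and example values that are not from the thread; whether Klein actually honors the "keep" clause is something to test per image.

```python
def build_edit_prompt(improve, style, preserve, image_ref="image 1"):
    """Fill the 'improve X, keep Y exactly the same' edit-prompt pattern."""
    keep = " and ".join(preserve)
    return (
        f"significantly improve the {improve} of the {style} {image_ref} "
        f"while keeping the {keep} exactly the same as they are."
    )


if __name__ == "__main__":
    # Example values are placeholders; substitute your own.
    print(build_edit_prompt(
        improve="background detail and textures",
        style="anime",
        preserve=["art style", "lineart", "character"],
    ))
```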
>>108732925Really? But that's very gay. Aren't there more manly alternatives like denoise value or ControlNet? I don't want to speciify like a troon.
>>108732942how else would it work?
>>108733005
I think it needs a >zit>sdxl refine
Chat, we're cookin'Need to work on my prompt but this works nicely. Love me some ollama.
>>108727613noob ass question: I just want to get started making nsfw gens, pictures and videos.... where tf do i start? it's overwhelming
>>108733164
2D or 3D?
>>108733174
3d
>>108733164
for images, pony and illustrious models in Forge UI.
Pinokio app with wan2gp for videos (wan 2.2) or editing pics (flux 2 klein).
thats just what i do. if you wanna make it complicated try comfyUI instead of these
>>108733236i have no idea what any of that means but i'll check the OP for info and I'll figure it out. thanks for the tips. i hope its easy lol
>>108733235
https://civitai.red/models - filter to sdxl 1.0 base model, the naughty bits will be better than flux, z-image, and others and it will probably gen faster too. if you have a 24+ GB gfx card then by all means go for something newer that 12 and 16GB vram plebs struggle with.
>>108733236
i second using forge
>>108733284ah, i have 12gb vram. i'll try some simple stuff out. if i need a new card then ill get one fuck it. idk what civitai is or sdxl but im guessing they are like models or something
>pony and illustrious modelsWhat is this 2024? KEK you need to upgrade anon
>>108732769>never do x>dont do x>dont do this Putting those tokens in its context sure does make it more likely to produce those outputs, anon
>>108732966
A Flux Klein compendium of prompt syntax for basic image editing prompts.
>>108733279
sry it wasn't clear. install forge UI (1 minute), then download an illustrious or pony base model on civitai.
put that in the models folder of your Forge UI installation.
then pick it in the drop down menu in Forge UI and start generating.
alternatively do whatever the cool kids recommend but i think that's a pretty good start
>>108733305no one here still uses sdxl let alone posts about it so why would anon even recommend it? idgi is he purposely trolling?
I built the llamacpp memeversion to quant image models but I still get shape errors after quanting. how do I unfuck this?
>>108733372 anon asked for nsfw. here is a blue board
>>108733368 i meant in the folder models>Stable-diffusion, to be more clear
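For anyone following along, the "put it in models>Stable-diffusion" step could be scripted roughly like this. It's a minimal sketch, assuming a Forge install where checkpoints live under `models/Stable-diffusion`; the function name and the example paths are made up for illustration and are not part of Forge itself.

```python
import shutil
from pathlib import Path


def install_checkpoint(checkpoint: Path, forge_root: Path) -> Path:
    """Copy a downloaded checkpoint into <forge_root>/models/Stable-diffusion.

    Both arguments are hypothetical example paths; point them at your own
    download and your own Forge folder.
    """
    if not checkpoint.is_file():
        raise FileNotFoundError(f"no such checkpoint: {checkpoint}")
    target_dir = forge_root / "models" / "Stable-diffusion"
    # Create the folder if a fresh install has not made it yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    destination = target_dir / checkpoint.name
    shutil.copy2(checkpoint, destination)
    return destination


if __name__ == "__main__":
    # Example call with placeholder paths (assumptions, adjust to taste).
    downloaded = Path.home() / "Downloads" / "some_model.safetensors"
    forge = Path.home() / "forge"
    if downloaded.is_file():
        print(install_checkpoint(downloaded, forge))
```

After copying, the model should show up in the checkpoint drop down once the list is refreshed or the UI is restarted.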
>>108733379what does that have to do with recommending an antiquated model tho
>>108733372most of the pics posted here are crap and i generate better stuff with sdxl, that's why i recommended that. then again i just tried to help, i don't know all the newer stuff but i'm not very impressed
hes so confident in his sdxl slop that he never posts it here lul
>>108733388others who use z-image, flux, qwen, just about anything newer or different on other boards use sdxl models to fill in the nsfw details those models are terrible with.
>>108733376 >llamacpp memeversion Que? I know how you quant llms with default llama cpp but no idea what is specifically needed for diffusion models.
>>108733411unfortunate that so many suffer from skill issues with anima but thats usually how it goes with new model
>>108733430https://github.com/city96/ComfyUI-GGUF/blob/auto_convert/tools/README.md
>>108733435 I've started experimenting with anima this week. And while I see the potential, it also gives horrible anatomy fails a lot, despite the way better composition prompting options. It also can't gen my wAIfu until someone trains a lora (or I have enough time to learn this, so probably not anytime soon)
>>108733477>I've started experimenting with anima this weekalright well then let the pros give the recommendations while you work on getting better
>>108733492i look forward to seeing your nsfw anima gens
>>108733435yeah i feel like anima is going to be the base layer in all of my gens for the foreseeable future. it does everything i want on the cartoon side and it's decent enough at realism with some finagling.
>>108733497if youd spent some time lurking youd have already seen mine and others
>>108733492pros wouldn't dismiss illustrious yet since it still may be the best tool for the job.
>>108733497>nsfw anima gensis this the new meme, anima can't do nsfw?
>>108731887cope
Can you run full WAN 2.2 with RX 9070 XT and if you can, how slow is it?
>>108733533>cope is when you realize the facts
>>108733515why would anyone want to use a shitty 4ch VAE model that doesnt even know its lefts from its rights >nb4 upscaling and regional prompting cope
>>108733164new anon wants to get into nsfw>>1087332353D nsfw>>108733236>>108733284good, helpful suggestions based on experience>>108733372resident shit stirrer chimes in>>108733522new meme is born ig
>>108731898 >More interested in an ACE-Step lora training writeup. Made an XL 1.5 LoRA training rentry: https://rentry.co/s8fg8ber If you do everything right then you will get high quality results similar to >>108731845 I did not cover LoKR etc... because it's experimental and I never had good results with that anyways.
>he can't get anima to do 3dHOLY skill issue
>>108733553because the only other choice for anime is anima, which is still in its infancy period.But soon™
>>108733577And yet you wasted electricity and bandwidth for this post.
>>108733236>>108733284>>108733368>>108733381thanks dude(s)I'm just trying img2img gens with a prompt and oh man they are wildy retarded. i guess i'll have to learn what the fuck all these settings and models/loras are... idk just wanted to make my wife naked from some images lel
>>108733577>3dewww
>>108733577The classic "I can't figure out how to work with this model therefore it's the models fault". We saw the same thing when noob first released.
>>108733469Dunno. It sounds simple enough, it's seems to be just adding an extra dimension to the tensor, that someone on the correct trannycord somewhere can tell you how it is fixed most likely.README is 9 months old and probably not up to date.There are some discussions when searching for "is:issue z-image", see anything useful?Or just dump every relevant piece of code into Gemini API and it can most probably also unfuck this.Sorry anon well past my bedtime.
>>108733568 Nice, thanks. >10–25 high-quality songs Feels like unless you want to slop loras by the dozens all the captioning automation isn't necessary. >16 GB (minimum) Fuck, are goofs no good? I got 16 but part is used by the monitor.
>>108727613about time for my bi-annual check - does any of this shit run reasonably on AMD video cards yet? I'm not going to spend the equivalent of a used car on a fucking nvidia GPU just to mess around with running AIgen shit locally
Trve diffusionists KNOW that SD 3.5 Medium prompted a very particular way at the top end of its resolution range is still more kinosovl than anything else since
>>108733643 >bi-annual check that's a pretty long time for ai development speeds. >does any of this shit run reasonably on AMD video cards yet? Yes, but it also did 2 years ago.
>>108733372Speak for yourself, non artist tech addict troon. Just because I don’t post anime in non anime generals doesn’t mean no one uses SDXL. Comfy, Krita, and tool automation allow me to make better and more decisions with SDXL than just prompting and waiting for results to appear on the screen for a dopamine spike, you worthless techtroon.
>>108733546>It's better than Suno 5.5!>Acktually stop comparing it to Suno, it's local.Full damage control lol
>>108733658 >Yes, but it also did 2 years ago. bullshit. I've tried to get something usable up and running every few months for the last few years on an ATI/AMD setup and shit always fucking breaks and keeps breaking until I say fuck it and give up
What's with the "ldg is not an anime thread" meme? Most of the posts here and in the faggollages are in fact anime desu?
>>108733629Yeah I fixed it. Needed a fork of the gguf repo.>>108733678It's the 35stars doing his daily assault.
>>108733305the slop is too good to give up... There are very few things people want that SDXL can't do
>>108733634 Honestly, I've got 24GB so I can't tell you exactly how much is required for lower ones. I grabbed the requirement straight from the official training guide (which is specific to the Gradio). Side-Step is different, so I'll update that part of the guide, but I'm currently not sure if this applies to XL: https://github.com/koda-dernet/Side-Step#vram-profiles >Feels like unless you want to slop loras by the dozens all the captioning automation isn't necessary. When I made the script I was working with 50+ songs on non-XL 1.5, but that's how I unfortunately found out about a bunch of bad practices like automatic lyrics being wrong, training Turbo instead of base, etc...
>>108733305Anima is still not at SDXL level. Wake me up once it's out of the preview
>>108733677 I have it running on two-and-a-half-year-old hardware. My first image on it was genned on August 11th, 2024, but that's just when I finally took the time to set it up; it was possible before that date as well.
>>108733678/ldg/ is not anime and never will be. Nobody wants to post their gens in a fast moving, low effort slop general.
>>108733687 Cool. >a fork of the gguf repo Just in case I run into something similar myself, which fork?
All these XLslop defenders coming out of literally nowhere
>>108733711 >fork I mean branch: https://github.com/city96/ComfyUI-GGUF/tree/auto_convert the autoconvert one. But the issue was that main didn't have z-image support.
>>108733699Did you know that most illust merges and tunes were based on 0.1 even though 1.0 and 1.1 were already out? Talk about a "preview" lol
>>108733353No it doesn't, not in thinking mode. Did you last use an LLM in like 2022 or something?
>>108733722Because they were made before 1.0 came out, and after it many shitmerges mixed that into themselves. Also, 1.0 and 1.1 were disappointing anyway.
>>108733734You clearly were not around during that time. Best to keep quiet.
>>108733734IDK why there's not many 2.0 based ones though, it was much much better than 0.1
>>108733734 >Also, 1.0 and 1.1 were disappointing anyway. And what makes you think we're actually progressing, and that Anima Base 1.0 will be better than the others?
It's up
>advertising
>>108733760I'm still waiting for a Dreamshaper ZTurbo finetune...
>>108733746Because training checkpoints and LoRAs is a matter of chance, anon, it’s not a linear progression. People who trained on 0.1, if they repeat the same work on 2.0, will get different results. It’s all trial and error, and in this hobby you can end up somewhere good without really knowing how you got there or how to replicate the result.
>>108733738 I was around. 1.0 was long awaited, delayed, and when it finally came people were not so happy with it. >>108733748 nta, but we'll have to see where it goes. At least they didn't close up after 0.1 and keep releasing their progress, so there is hope it actually improves.
>>108733760 Do you think propensity for slop, as demonstrated in that gen, is a matter of IQ or your upbringing? I ask myself this question all the time.
>>108733760It's up is an ldg meme anon
spark chroma HD is 90% complete btw
>>108733760Sars we will redeem!
>>108733778>3 months to train on 2500 images
>>108730870I need a fully jenny pron site now...
>>108731288Send one, I'll show you what you can get
What's the SOTA for prompt part-of-speech comprehension? like understanding the difference between 'candle lights fire' and 'fire lights candle'?
>i can suckwhy are localkeks so pathetic?https://litter.catbox.moe/5b9fwjrhcpjlogei.mp4
>>108733866stop trolling, go away
>>108733866Kys
>>108733643 Depends on your GPU model, but the situation's improved for newer hardware. See if you can find yours in this table: https://rocm.docs.amd.com/en/7.12.0-preview/compatibility/compatibility-matrix.html ComfyUI also has AMD portable builds that you can try unzipping and running to see if things just werk. (Might still need the latest AMD Adrenalin drivers installed.)
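If you want a quick sanity check before fighting with a UI, something like the sketch below reports which PyTorch build you have and whether it sees a GPU. It assumes PyTorch is what your UI runs on; `describe_backend` is a made-up helper name. On ROCm builds `torch.version.hip` is set and the GPU is still reported through the `torch.cuda` API.

```python
def describe_backend() -> str:
    """Return a one-line summary of the installed PyTorch build and GPU visibility."""
    try:
        import torch
    except (ImportError, OSError):
        return "PyTorch is not installed (or failed to load)"

    # torch.version.hip is a version string on ROCm builds and None otherwise.
    hip = getattr(torch.version, "hip", None)
    cuda = getattr(torch.version, "cuda", None)
    if hip:
        build = f"ROCm/HIP build ({hip})"
    elif cuda:
        build = f"CUDA build ({cuda})"
    else:
        build = "CPU-only build"

    # ROCm devices are exposed through the same torch.cuda namespace.
    if torch.cuda.is_available():
        return f"{build}, GPU visible: {torch.cuda.get_device_name(0)}"
    return f"{build}, no GPU visible to PyTorch"


if __name__ == "__main__":
    print(describe_backend())
```

If it says "CPU-only build" on an AMD box, the wrong wheel got installed and no amount of UI tweaking will help until that is fixed.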
Which company will give us a true base model in 2026 like how we got SD 1.5?
>>108733866you made this movie?
>>108733864wat
>>108733802I mean he's doing it on just one 4090
I may - or may not - get a supermodel gf at church because I nofapped for a week.
>>108733919part-of-speech: understand what in the sentence is doing the action (subject), what action is being done (verb), and what in the sentence the action is being done to (object)
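To make the point concrete, here's a toy sketch (not a real parser, just a word-order split for three-word sentences, with a made-up function name). Both prompts contain exactly the same words, so anything that treats the prompt as a bag of tokens can't tell them apart; only the roles differ.

```python
def naive_svo(sentence: str) -> dict:
    """Split a three-word 'subject verb object' sentence by position only."""
    words = sentence.lower().split()
    if len(words) != 3:
        raise ValueError("toy parser only handles three-word sentences")
    subject, verb, obj = words
    return {"subject": subject, "verb": verb, "object": obj}


a = naive_svo("candle lights fire")
b = naive_svo("fire lights candle")

# Same bag of words, opposite roles.
same_words = sorted(a.values()) == sorted(b.values())
print(a)           # {'subject': 'candle', 'verb': 'lights', 'object': 'fire'}
print(b)           # {'subject': 'fire', 'verb': 'lights', 'object': 'candle'}
print(same_words)  # True
```

A model that actually comprehends the prompt has to keep that subject/object distinction all the way through to the image.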
>>108733930she's not a 10.
Why do people upload this as the cover image for their checkpoints?
>>108733930You guys goon to that? How old are you, 60?
>>10873393510s are overrated. your point?
>>108733953idk man, do you start being a man after a certain age? or you're just gay forever?
>>108733930this is slop, the head coloring and lighting is detached from the body
>>108733949Because the average slopper believes this is the final form of every anime model.
Where can I find a lora dataset to understand what a good dataset looks like and how it should be tagged? For an anime videogame character. Don't even know if I should rely on 2d fanart or take screenshots of the 2.5d ingame models? Tried with the former and while it picks up elements they're weak, rough and generated images look low quality. Guessing it's because of bad tagging and bad images used for training, but don't know what they should be looking like. Still learning the ropes
>>108733949whats wrong with it
>>108733955If you're gonna attract a 10, you gotta send off getting ten waves.
>>108733960okay boomer
the average slopper has a micro penis.
Fresh>>108733994>>108733994>>108733994>>108733994Fresh
>>108733985I literally just said 10s are overrated. the most beautiful woman I ever saw was a 6, at best.
>>108733068ZIT can't into 1792x2304 anon, and refining with a 4-channel VAE model like SDXL is retarded
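For scale, here's the arithmetic behind the 4-channel complaint at that resolution. Minimal sketch: SDXL's VAE has 4 latent channels with 8x spatial downscaling; the 16-channel figure is what Flux-family VAEs use, and treating ZIT's VAE as 16-channel with the same 8x factor is an assumption here. `latent_shape` is a made-up helper.

```python
def latent_shape(width: int, height: int, channels: int, downscale: int = 8) -> tuple:
    """Return (channels, latent_height, latent_width) for a given image size."""
    if width % downscale or height % downscale:
        raise ValueError("image size must be divisible by the downscale factor")
    return (channels, height // downscale, width // downscale)


def latent_values(shape: tuple) -> int:
    """Total number of values stored in a latent of the given shape."""
    c, h, w = shape
    return c * h * w


sdxl = latent_shape(1792, 2304, channels=4)     # (4, 288, 224)
newer = latent_shape(1792, 2304, channels=16)   # (16, 288, 224), assumed 16-ch VAE

print(sdxl, latent_values(sdxl))    # (4, 288, 224) 258048
print(newer, latent_values(newer))  # (16, 288, 224) 1032192
```

Same spatial grid either way, but the 4-channel latent carries a quarter of the values per position, which is the usual argument for why fine detail survives less well through it.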
>>108734016 i only refine the bodies for a little more detail that I feel ZIT tends to struggle with. ZIT seems to struggle with bodies in general imo. It gives you perfect faces but the bodies of the people you're genning just aren't exactly what you're used to on that person sometimes. Anyway, ZIT is kinda hazy on the details and SDXL really shines with those details. And inpainting is an amazing tool because it can pick up whatever style is on the image you're inpainting.
>>108734039that's a lot of cope anon
"cope" is Swahili for "artistry". What a beautiful language.
>>108734054 perhaps. but it's true cope. besides, I'd rather settle for a smaller image that looks good than a large image that looks like ai slop.
>>108733996no anime collage no posting
>>108734016It totally can