Discussion and Development of Local Image and Video Models

Previous: >>108707111

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
Anima final WHEN?
>inb4 n*gbo
>>108711956Once you stop touching yourself at night
>>108711969Just one night?
>>108711969But at day is ok still?
https://civitai.com/models/2583128/anima-rl
Anima reinforcement learning with custom reward model, to enhance aesthetics and details.
new tech unlocked
>>108712003
Looks like it sloppifies the image. Very much a downgrade, imo.
>>108712003Russell, please release preview4 next Tuesday so I have something to waste time on when recovering from my surgery. PLEASE SIR!!!
>mfw Resource news

04/28/2026

>Illustrious XL & NoobAI-XL Style Explorer
https://github.com/ThetaCursed/Illustrious-NoobAI-Style-Explorer

>LTX Desktop 1.0.5
https://github.com/Lightricks/LTX-Desktop/releases/tag/v1.0.5

>Meta-CoT: Enhancing Granularity and Generalization in Image Editing
https://shiyi-zh0408.github.io/projectpages/Meta-CoT

04/27/2026

>PixlStash 1.1.0 Update
https://pixlstash.dev/whatsnew.html

>AURA AI Studio Vault: One-stop management app for models, images and more
https://github.com/TheGho7t/AURA-AI-Studio-Vault

>UniGeo: Unifying Geometric Guidance for Camera-Controllable Image Editing via Video Models
https://mo230761.github.io/UniGeo.github.io

>LumiPic: Single-Image SDR to HDR LoRA
https://huggingface.co/oumoumad/LumiPic

>SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-Unify Architecture
https://github.com/OpenSenseNova/SenseNova-U1

04/26/2026

>ControlNet-LLLite for Anima
https://github.com/kohya-ss/sd-scripts/pull/2317

>Qwen3.6-27B-Uncensored-HauhauCS-Balanced
https://huggingface.co/HauhauCS/Qwen3.6-27B-Uncensored-HauhauCS-Balanced

>VOID: Video Object and Interaction Deletion [ComfyUI Repackage]
https://huggingface.co/Comfy-Org/void-model

04/25/2026

>StyleID: A Perception-Aware Dataset and Metric for Stylization-Agnostic Facial Identity Recognition
https://kwanyun.github.io/StyleID_page

04/24/2026

>MAI-Image-2
https://playground.microsoft.ai/chat

>ComfyUI-NAG-Extended: NAG support for Flux 2 Klein and Anima
https://github.com/BigStationW/ComfyUI-NAG-Extended

>UniGenDet: A Unified Generative-Discriminative Framework for Co-Evolutionary Image Generation and Generated Image Detection
https://github.com/Zhangyr2022/UniGenDet

>VARestorer: One-Step VAR Distillation for Real-World Image Super-Resolution
https://github.com/EternalEvan/VARestorer

>Sapiens2
https://github.com/facebookresearch/sapiens2

>Vista4D: Video Reshooting with 4D Point Clouds
https://eyeline-labs.github.io/Vista4D
>mfw Research news

04/28/2026

>Talker-T2AV: Joint Talking Audio-Video Generation with Autoregressive Diffusion Modeling
https://arxiv.org/abs/2604.23586

>Latent Inter-Frame Pruning: A Training-Free Method Bridging Traditional Video Compression and Modern Diffusion Transformers for Efficient Generation
https://arxiv.org/abs/2604.23858

>TextGround4M: A Prompt-Aligned Dataset for Layout-Aware Text Rendering
https://arxiv.org/abs/2604.24459

>Diffusion Templates: A Unified Plugin Framework for Controllable Diffusion
https://arxiv.org/abs/2604.24351

>MuSS: A Large-Scale Dataset and Cinematic Narrative Benchmark for Multi-Shot Subject-to-Video Generation
https://arxiv.org/abs/2604.23789

>Oracle Noise: Faster Semantic Spherical Alignment for Interpretable Latent Optimization
https://arxiv.org/abs/2604.23540

>Edit Where You Mean: Region-Aware Adapter Injection for Mask-Free Local Image Editing
https://arxiv.org/abs/2604.23763

>CA-IDD: Cross-Attention Guided Identity-Conditional Diffusion for Identity-Consistent Face Swapping
https://arxiv.org/abs/2604.24493

>$Z^2$-Sampling: Zero-Cost Zigzag Trajectories for Semantic Alignment in Diffusion Models
https://arxiv.org/abs/2604.23536

>V-GRPO: Online Reinforcement Learning for Denoising Generative Models Is Easier than You Think
https://arxiv.org/abs/2604.23380

>PhysLayer: Language-Guided Layered Animation with Depth-Aware Physics
https://arxiv.org/abs/2604.23574

>LearnPruner: Rethinking Attention-based Token Pruning in Vision Language Models
https://arxiv.org/abs/2604.23950

>DeepSignature: Digitally Signed, Content-Encoding Watermarks for Robust and Transparent Image Authentication
https://arxiv.org/abs/2604.23016

>SycoPhantasy: Quantifying Sycophancy and Hallucination in Small Open Weight VLMs for Vision-Language Scoring of Fantasy Characters
https://arxiv.org/abs/2604.24346

>ServImage: An Image Generation and Editing Benchmark from Real-world Commercial Imaging Services
https://arxiv.org/abs/2604.24023
>>108712010you write all this or you let some AI do it?
desu no one here even cared about the tuna model because the examples looked bad desu
>>108712003
Haven't used it long enough, but it seems to help with anatomy/coherence/etc when combined with the artist tags I use. Very cool. I assume this is an experiment done in preparation for an "aesthetic finetune" of the base model once it finishes?
>>108712051you can just ask GPT-image-2 (through comfy API nodes!) to generate a political cartoon about Meta removing layers from their new tuna-2 image model while localkeks cope and api gods celebrate.only fluxtards are still writing 4+ paragraphs, new models can extrapolate
>>108712051The tower is in the reverse order, so obviously not precisely enough.
>>108712003
Seems good... some artist styles get changed a lot by it though.
> >108712033> >108712040fuck off
>>108712003
The ones on the left are the improved images, right?
>>108712144
/>mfw resource news/i
/>mfw research news/i
>>108711866
i just used natural language tags. one thing i found that works decently well, at least while the loras are still baking, is running a double ksampler with low denoise. it blunts the edges and helps kill the rendered look.
still haven't found an upscaler i like.
>>108712179nofuck that rodent
>>108712010That's crazy to read an api LLM "thinking". Instead of following the user's prompt it spends sooo much time thinking about censorship and how to justify it... So much compute lost for bigotry.
>>108712144ranfag...
debo took everything from him
>>108712003Seems to apply a mild darkening filter over the image like what they do whenever there is flashing in an anime or game these days (epilepsy countermeasure). Not a fan of that. Aside from that, seems alright. The effects are subtle.
>>108712084Do you think cloud models are just a single giant safetensor not hooked up to a language model?
>saasussy is still proud to have paid 10 buckslmao
>he still replies to the troll literally every single threadwhy do you do it anon?
Reinforcement learning implementation in AI Toolkit https://github.com/ostris/ai-toolkit/pull/808
I HATE THIS MOTHERFUCKER I HATE HIM I HATE EVERYTHING HE MAKES
uh oh meltie
>>108707312>>108707379you guys are retarded, this isn't a mystery, Anima is a finetune of Cosmos 2 2B, his dataset and / or training approach just isn't big enough or aggressive enough to erase the knowledge of the base model in that regard, I guess.
What happened now? I'm pretty new here.
>>108708015no it won't, not without a lora
>mfw
not a lot of chroma gens lately...
>>108712561chroma status?
>>108712003
@tdrusell
You need to share how you captioned your artist tags.
For your Greg lora, you used prose instead of tags because "prose helps the lora learn better that particular style." That raised many questions.
Can you upload 2 or 3 .txt examples per artist showing the vocabulary and which LLM you used for captioning? That way we can use the words and vocabulary the caption model used and get better results from Anima.
Like, if you captioned Greg Rutkowski with prose using prefixes like "dramatic painting" or "digital oil painting", what vocabulary did the LLM use for artists like Nixeu or WLOP?
Sharing sample caption files would help us match the vocabulary and squeeze more performance out of the model.
>>108712512But it's not a finetune? It's trained from scratch using the same arch, with photo data.
>>108712608no it's not, it's exactly Cosmos 2 2B modified to use Qwen3 0.6B instead of T5-XXL, and finetuned on a few million images. You can clearly see that the way Anima looks for realism is just the Fluxy default style of Cosmos blending with the varied illustrative data
>>108712529same seed with lora
Where is the Lora blocks breakdown for Anima? Only thing I managed to figure out so far is that blocks 13~17 tend to have a lot of influence over composition, and stripping everything else helps avoid style bias.
>>108712604
My reasoning:
Obviously the Rutkowski lora is trained with "@greg rutkowski. A natural language caption by Gemma4 31b with a bad, minimalistic prompt."
For the actual data, the tag order is the same as pic related with @ added to the artist tag, and/or Gemini natural language captions of "This is a..." (Gemma4 is more terse with "A woman is...") separated by \n\n or something. Presumably the NL captions are from the era of men and women being "persons" and not mentioning things like race or skin color.
>>108712003>left soul>right slopthis lora STINKS what were you thinking?
>>108712658How do you know Flux face is not just the result of training on high res posed stock photos?
>>108712710Thanks, yeah. I suppose it's important to know what type of GPT slop the LLM uses since the tag language is basically verifiable with Danbooru, but the type of GPT slop the LLM used we don't know and this type of stuff I think is important and helps increase the image quality because these GPTisms are structural in Anima.
>>108712722Sar please is very aesthetic look at benchmark
>>108712722WAI fags should be all over this mhm
which image model lets me use a reference image to modify another image?
>>108712808Klein 9b.
>>108712003Anima Flux2VAE when??
>>108712003my sloppacopter goes SHLOP SHLOP SHLOP SHLOP SHLOP
>>108712819thanks
>>108712669>linesthat's how you know it's chroma lol
Ok serious question then: what is the ideal default style for a post-trained version of Anima? I can train the reward model to reward *anything* as long as I can quantify it. This super quick prototype is trained to give high scores to high-rated danbooru images. Maximizing that reward function with RL gives the results of the lora that you see. What am I to optimize for if not "look like a high scoring danbooru image"?
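For anyone unsure what "maximizing that reward function with RL" means in practice, here is a toy sketch. It is not the poster's training code: the three "styles", the `REWARDS` scores, and every function name are made up for illustration, and a real setup would update a diffusion model against a learned reward model instead of three logits. It only shows the basic dynamic that a policy-gradient update shifts probability toward whatever the reward scores highest.

```python
import math

# Hypothetical reward-model scores for three output "styles" (illustrative only).
REWARDS = [0.2, 0.5, 0.9]


def softmax(logits):
    """Turn logits into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_reward(probs, rewards):
    """Average reward if outputs are sampled according to probs."""
    return sum(p * r for p, r in zip(probs, rewards))


def policy_gradient_step(logits, rewards, lr=0.5):
    """One exact policy-gradient ascent step on the expected reward.

    For a softmax policy, d E[r] / d logit_i = p_i * (r_i - E[r]).
    """
    probs = softmax(logits)
    baseline = expected_reward(probs, rewards)
    return [l + lr * p * (r - baseline) for l, p, r in zip(logits, probs, rewards)]


def train(steps=200):
    """Start from a uniform policy and repeatedly ascend the expected reward."""
    logits = [0.0, 0.0, 0.0]
    for _ in range(steps):
        logits = policy_gradient_step(logits, REWARDS)
    return softmax(logits)


if __name__ == "__main__":
    print([round(p, 3) for p in train()])
```

The policy ends up concentrated on the highest-scoring option, which is one way to read the style-shift complaints above: whatever the reward model prefers becomes the default.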
Is wan gp actually good?
>>108712943Idk since these are not universal things, but can you do "looking at viewer", "simple/white background" as negatives?
so i'm trying Klein 9b and i can generate a 1080p image with plenty of vram to spare, but when i turn it up to 1440p it crashes with cuda saying it needed an additional 4gb of vram. is this model not able to go higher than 1080p?
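A rough back-of-the-envelope sketch of why that jump is bigger than it sounds. It assumes "1080p" and "1440p" mean 1920x1080 and 2560x1440, and it only computes pixel-count ratios; real VRAM use depends on the model, the attention implementation, and the VAE decode, none of which are measured here. The function names are made up for this example.

```python
def pixel_count(width, height):
    """Number of pixels in a frame."""
    return width * height


def scaling(from_res, to_res):
    """Relative cost of moving between two resolutions.

    'linear' is the pixel (and latent token) ratio, roughly how activation
    memory grows with memory-efficient attention.
    'quadratic' is that ratio squared, roughly how a naive attention matrix
    grows with token count.
    """
    ratio = pixel_count(*to_res) / pixel_count(*from_res)
    return {"linear": ratio, "quadratic": ratio ** 2}


if __name__ == "__main__":
    s = scaling((1920, 1080), (2560, 1440))
    print(f"pixels: x{s['linear']:.2f}, naive attention: x{s['quadratic']:.2f}")
```

So 1440p is about 1.78x the pixels of 1080p, and anything that scales with the square of the token count grows by about 3.16x. Running out of memory at that size is a hardware limit, not necessarily a limit of the model.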
>>108712943It's honestly a really hard question to answer, maybe focus on images from artists that have "good anatomy" as well as those who draw detailed backgrounds? It would be a pain in the ass to figure all that out though.
>>108712943>What am I to optimize for if not "look like a high scoring danbooru image"?What YOU think looks nice. There is no one true single answer for "looks like a nice image".
>>108713019This desu
can anon train an rl on traditionally "bad" images? i bet it would unironically be so kinosovl.
>>108712943i like the turbo lora aesthetics
it's nearly there. Are you jazzed?
>>108713107>2500 images [512/1024]nigga, thats a lora
>>108712943Do you know NoobAI? Well, that's a good starting point. The less bias a model has, the better.
ok i have the correct video models and image editing models. now i can officially stop spamming trial accounts for apis
>>108713101Underrated anime gennerPeace bro
>>108713125Are you retarded?
>>108713144peace to you, kind anon
>>108713144How do you know Denk?
>>108713144>>108713162Sad, pathetic and desperate.
>>108713184Welcome to 4chan.
>>108712003All the replies to this post acting as if aesthetics are objective kek
>>108712003>>108712722100% agree and you're not the only one, unless this is you lol
>>108713239>people providing their opinions means acting as if it were objective
So tdrusell releases an anime lora in his home general and shockingly, nobody even bothers to test it. I mean, this isn’t an anime general, which makes it even funnier. So what’s the deal here? Does he have some kind of ComfyUI arrangement or something? Like, does this so called Comfy grant mean he’s only allowed to post in /ldg/?
>>108713234the humanity!>>108713248sd1.5 was peak tho
>>108713260
you'd be surprised how many believe their "good" is THE good
>>108712992Yes the fact that your computer doesn't have enough VRAM means the model itself cannot do it.
>>108713264In /adt/ there are a couple of anime Twitter influencers. In /edg/ there are two or three well known lora makers from CivitAI and in /h/ there’s no shortage of degenerates. So you’re really telling me that posting this in those places wouldn’t get way more reception and feedback than sharing it here where it just gets blown away like nothing?
>>108712003Kig Russ with another banger
>>108713275I think he means the shitmixes.And you are wrong, 1.4 is peak kino.
>>108713275Underrated anime poster
*yawn*
>>108713275You need an artist tag wildcard ASAP and mog every anime genner
>>108713364this is 1.4. I don't know how to prompt it, but this looks legit hilarious, cherry picked out of 4 anime ones.
>>108713397zit doesn't support artists too well >>108713401sovl
AI / ML engineer here

I mostly work on applying AI to real products and workflows, not research-heavy stuff.
A big part of my work is making LLMs actually usable in production, and connecting them with automation.

Recently I've been working on things like:
- Chatbots backed by internal data (RAG-style setups)
- Improving search / Q&A over docs and knowledge bases
- Automating repetitive workflows (Slack, Notion, internal tools)
- Some image and voice features when needed

Day to day it's mostly Python + FastAPI, LLM APIs, vector search, plus automation tools like n8n / Zapier / webhooks / background jobs, and cloud (AWS, Docker).
Usually integrating everything into existing web or mobile apps.

If you're building something AI-related or need help automating workflows, happy to chat
>>108713413Use Anima!
>>108713414
>>108712003So you could have released a ControlNet for Anima but no, you thought this slopifier was and is more necessary... what kind of idiot...
>>108713414Prompt Engineer here.Would be happy to chat, please pm me.
why are my pictures so ass
nvm that, should i combine multiple controlnets or is it just a waste of time? they don't seem to be made to work together.
>>108713414Sure, can we get DCW working for Ace Step XL, and other gens too, in ComfyUI?my speculation is that some existing custom nodes might do the trick.
>>108713473Did you forget to @ your artist tag?
>>108713473You don't have what it takes. It's okay, not everyone can be successful artists.
>>108712003Cute but NAI (NovelAI) doesn't need that and it's better and a year old model.
>>108713423
I just started using it a few days ago
seeing if I can get it to do gacha-style stuff like I did long ago. need to learn it a bit better
Does that Anima LLLite controlnet actually work well? Has anyone tried it?
>>108713499That's not local. You are in the wrong thread.
We *NEED* a somali anima lora.
>>108713463sent. also sent penis pics.
>>108712943Why would a default style be necessary? You have so much control over style with prompt, I don't really understand who needs that except the terminally lazy. What's the difference between training on high-rated images vs. just prompting "masterpiece, best quality" in the first place?
Why is everyone into reinforcement learning all of the sudden?
>>108711911qrd for videos? using comfy ui and have no idea what im doing just trynna coom
>>108713607>all of the suddenbenchod
>>108713629
https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt
This model is the current meta for coom. You may need to find an appropriate workflow for it though.
>>108713629ltx2.3 is kino
>>108713629im a fan of seed variation animations
>>108712003could be cool but you need a better, modern reward model
>>108713629The only decent options for local are Wan 2.2 or LTX 2.3. Wan2.2 is heavier, can only really do 5 seconds at 16 fps but has higher quality, especially motion. LTX has awful motion but can do audio (of dubious quality), variable framerates, and is much faster. You can use wan 2.2 to gen a 5 second video then extend it with ltx to make it 10+ seconds and add audio.
>>108712943Reward 5 fingers, 5 toes and coherent backgrounds.
>>108712033>>108712040thanks!
>>108712943use the technique tencent did with flux.1-dev-SRPO
>>108713611
>>108713794look at that stupid racist chud stuttering and speaking gibberish as his bike gets stolen
>>108713801don't you mean reverse racist trying to keep the racial minority from increasing total happiness?
>>108713809is he shaped like a reverse swastika
>>108713809which one is the minority?
>>108713794
>>108713819Am I supposed to use a trigger word or no?
>>108713823>Am I supposed to use a trigger wordyou can say nigger, retard, faggot, whichever floats your boat.
>>108713828I mean @stonetoss, stonetoss, stone toss? I just added "nose" to the negative prompt.
>>108713832
strength_model 1.10, strength_clip 0.60
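For context on what those two numbers do, here is a minimal sketch assuming the standard LoRA formulation, where a low-rank delta (up matrix times down matrix) is added to a base weight and scaled by the strength. It is not ComfyUI's actual loader code, it leaves out the alpha/rank scaling, and `matmul` and `apply_lora` are made-up names. The idea is that strength_model scales the deltas applied to the diffusion model weights and strength_clip scales the ones applied to the text encoder.

```python
def matmul(a, b):
    """Plain-Python matrix product of a (n x k) and b (k x m)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def apply_lora(weight, lora_up, lora_down, strength):
    """Return weight + strength * (lora_up @ lora_down).

    strength = 0.0 leaves the base weight untouched, 1.0 applies the LoRA
    as trained, and values above 1.0 exaggerate its effect.
    """
    delta = matmul(lora_up, lora_down)
    return [
        [w + strength * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


if __name__ == "__main__":
    base = [[1.0, 0.0], [0.0, 1.0]]  # toy 2x2 base weight
    up = [[1.0], [0.0]]              # toy rank-1 "up" matrix (2x1)
    down = [[0.0, 2.0]]              # toy rank-1 "down" matrix (1x2)
    print(apply_lora(base, up, down, 1.10))  # model-side strength from the post
    print(apply_lora(base, up, down, 0.60))  # text-encoder-side strength from the post
```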
>>108713832sure you can call stonetoss a retard faggot if you want
>increase batch size and let it run while i go do chores>come back expecting a collection of kinos>torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.55 GiB.
>>108713846You better not, you'll go to HELL.
I hope xinsir will make controlnet union for Anima.
>>108713872man feet
Stylized text with character label
Text Line 1: "magical girl merfolk"
Text Line 2: "Element: plague"
Text Line 3: "Job: beast master"
why doesn't anima like this
>>108713877i dont believe it's trained to handle text
>>108713842
>>108713877learn2prompt, use natural language... "In the upper right side of the image there is a text that spells: "Magical Girl Merfolk" in white glowing typefont, etc etcNo wonder your gens suck so much after all these years, you clearly can't prompt
>>108713883
is there such thing as an image to image model for automatically fixing image errors? like removing random body parts from under desks in busy scenes
>>108713962yeah, its called inpainting
>>108713977i want to click a single button and go from prompt to final product without any effort
>>108713984@grok do this for me im dumb use like nodejs and implement some api and make it a like a uh iphone app grok remembner do not make many mistakes or preferably no mistakes
>>108713984yeah, inpainting with an edit model, you mask, prompt and voila, you can even be more proactive and use an LLM with a vision model to see the mistakes and generate a prompt to fix them too
>>108713832TW is @stonetoss for both loras posted.
>>108714015i tried inpainting and prompted "remove the leg" and it didn't work. it's like it doesn't even realize that's a leg. i'm using klein 9b
>>108714105anima
Thank you for letting us know what plebbit is up to, anon!
t
>>108714105>hello saar, I want to generate nude influencer model with nano banana pro level but I only got 6gigs of vram, why local is so behind?
>>108714138bitch basterd
>>108713397Could you share some wildcards perhaps?
>>108714105There are two uses for diffusion models:porn and scams.Anyone who complains is a scammer, usually of Indian origin.
you guys lied to me. i asked if you could use profile images to put characters into scenes but you said local models aren't good at that. liars!
>>108712722idiots thinking that anime=anime fanart
>>108714105He's right, just not about images.SaaS Video & Audio have local by the balls.
>>108714314there's one nutcase on this thread that you should not listen to
why does entering inpaint mode in gradio cause firefox to use up like 2gb of vram? what kind of jeet poopie doodoo is this?
>>108712943Give me ControlNet or an edit model or don’t show up here at all.
>>108712003>>108714737Faggot make a controlnet model
Wheres the realism lora russel
/ourguy/ with the news(They will be some dogshit 2B research preview models with only 240p training)
>>108714903given this guy's recent track record of hyping up models like happyhorse as 'local', i'm eagerly anticipating the release of seedance 3.0!
>>108714903Wheres the z anime illustrious motherfucker
>>108715048I think Z team got disbanded or moved on to another project at this point.Z-Edit is MIA too.
>>108715048Not illustrious but this just dropped https://huggingface.co/SeeSee21/Z-Anime
does a node like this exist? i want to feed an image to a node, draw a mask on the image, and then output that mask.
Now adapted for anime-style generation !!!
>>108715116
If things still work the same you should be able to right click the loaded image and pick mask editor or whatever it was. But masking in comfy is definitely a thing
>>108715109>brutalslopped title image>obvious WAI slop in preview gallerynext
Someone keeps shilling this thread in /h/, been like a week now. Any idea what that's about? Do you guys need some hentai posted in here or can I go back
>>108715109This is not Z Team on Noob dataset.And desu if you actually claim to make a full finetune including how many steps you run on how many images, even rough ballpark figures, would help.Using actual artist styles instead of generic illustrious slop too.The "Partially NSFW capable" part makes me think <100k images even if a proper finetune.The Engineer LLM thing seems interesting though. I will try it on ZIT to see if it helps or if it is a snake oil.
>>108715116The load image node allows you to draw masks?
>>108715158you can go back
>>108715131>>108715171yeah the load image node lets you do that. but i load the image from somewhere else. but i just realized i was retarded and used an automatic masker instead.
>>108715116there is a built in node called "Painter"
>>108715109Tell me, please, what will you do with a 6B anime model, 1girl, cowboy shot? Anima has proven that no matter how much technology and parameters change, your creativity remains the same and stagnant. Sure, maybe you can decorate your 1girl with pins and tiny things that won't melt from the VAE, but the slopper spirit and low effort approach are still the same, you are after all /ldf/ - local debugging general
>>108715158Failed dev, please undersrand. >>108715189
>>108715199I'm not your boogeyman. As local consoomers you are, you have no right to even think about wanting a new anime model. You're years behind the artistic vision of any anime general. Best you can do is shut up and smile when a new model drops.
>>108715189take your meds
>>108715167Tried Qwen 4B Engineer.Don't bother. It generates slightly less blurry and slightly more detailed images than base Qwen 4B, but prompt adherence is slightly worse, details aren't necessarily coherent and I managed to get one melted hand and extra arm in the few test gens I made.Pointless side grade like almost all text encoder experiments people made for diffusion models.
Just want to let everybody know that I updated my ComfyUI and nothing exploded.This is a miracle.
winner of yesterday's prompt to image challenge
congrats to @Jonatan83
https://www.reddit.com/r/StableDiffusion/comments/1syfihl/winner_of_yesterdays_prompt_to_image_challenge/
>>108715048Any Alibaba models we were promised are just not happening. Even stuff hyped at different periods by researchers but then memory holed (music model being one of them).
SDXL till the end.SDXLを最後まで。SDXL hasta la muerte.SDXL até ao fim.SDXL дo кoнцa.
>>108715441bit cringe lad
Are there any snake oils (like SAG/PAG for sdxl) for anima?
>>108715473how?? im trying to make the AI space fun and engaging. i made a site where people can post their AI content so be sure to check it out
>>108715479ComfyUI-NAG-Extended works now
>>>108712484
>https://www.reddit.com/r/StableDiffusion/comments/1syhp27/reinforcement_learning_implementation_in_ai/
Anyone know if this is applicable to ACEStep? Would be massive for steering the model more towards Udio and closing the gap even further. I know ACEStep 1.5 is a hybrid flow-matching DiT, so not entirely sure.
Is there a guide somewhere on porting your nodes to nodes 2.0 (or at least making them compatible)?
The documentation is sorely lacking and only ever scratches the surface.
>>108713485ACEStep cpp should work on every platform though. If it doesn't work for your AMD setup then you'd have better luck opening an issue on Github since the dev there is more responsive than someone as big as Comfy.
>>108714903LOCAL IS SAVED BABY!!!!
>>108714903that faggot said z image base training will be fixed but nothing came of itfuck that clown
>>108712003
Do not come back here again until you make ControlNet or Anima Edit work or get your LoRA training script running on Windows.
>>108715870kohya already made a controlnet and there is a standalone lora trainer. anima edit would be pretty nice.
It seems like whoever made this isn't lazy at least
>Remove the piercing. Do not alter anything else.
>Klein decides to delete the anatomical concept of belly button altogether
I like this model but this is the most ludicrous edit request gone wrong I had so far.
>>108715904
>kohya already made a controlnet and
Huh?
>>108715923Does it know artists?
>>108715931
https://huggingface.co/kohya-ss/Anima-LLLite
https://github.com/gazingstars123/Anima-Standalone-Trainer
>>108715955Interesting. I should try it but is it stable or does it gen body horror?The effect doesn't seem too strong in the examples (The sword is down in the CN and up in the generated images.)
>>108715996its shit, dont bother
>>108715923>whoever made thishis Claude agent
>>108716013if ai made all that then we are cooked
>>108716008Can you say "Python and tdrussell are great" so that I can know this is real review of it and not thread schizo fud?
>>108715923
More anime models??? When is /ldg/ going to go back to its videogens and chroma roots? I'm sick of this anime spam every single day.
>>108714490>SaaS Video & Audio have local by the balls.Voice to text is in the worst state. There's nothing I've seen so far that comes close to Eleven Labs.
>>108715996i only fucked with it a bit yesterday so maybe it's user error, but there were a lot of body horror or washed out gens.
>>108716013lol, yeah, I hate vibecoded shitty "finetunes", shitty custom nodes, fuck
should i overclock my gpu?
>>108715189Z Anima looks uber slopped. Architecture is the issue there, not the fact that it's 6B.
Man, Kaleidoscope could have been so good if Kekstone didn't train it at retardedly low resolution. This is one Lora I trained on Klein 4B Base at 1024px, on 216 images captioned with Gemini 3 Flash, running on Klein 4B Distilled. Same seed and overall prompt for both images, style change is just changing one line of the prompt.
https://files.catbox.moe/4nz443.jpg
https://github.com/OpenSenseNova/SenseNova-U1
seems like these unified models that can generate interleaved image-text are the next step for /ldg/. thoughts?
>>108716491I should have listened...
>>108716504
>>108716504i don't see anyone using it for anything outside of throwaway memes, and it's easier to just use chatgpt for that.
I wish there was an edit model that could
>capture the essence of a cartoon (incl. anime) character
>gen photo real version of the character
Illustrious with a character lora was as close as it got to that, however that was not really an edit but (re)creating a new image.
>>108716618
that was before the 5 million
copy the inference code to a few nodes is what, an hour?
>>108716633gpt image 2
>>108716650
>>108714105>free thing is lagging behind expensive thinguhhhh... yeah? what the fuck?
>>108716693doesn't help that their understanding of local models ended 3-4 years ago.
>>108714105>6th fingerUh I couldn't care less. Just like inpaint the hand until it works?
turdrussel status?
>>108714105>Compartilhar
>>108716765Unbothered. Moisturized. Happy. In His Lane. Focused. Flourishing.
>>108716366The "realism" is so plasticky. That's by all means a failure, unless there was a way to get rid of that plastickiness. Have you forgotten what Chroma.1 can do?
so freakin cute
>>108716633>capture the essence of a cartoon (incl. anime) character>gen photo real version of the characterIsn't that already possible with Klein?
>>108716809They took their pills, right?
>>108716818
>Isn't that already possible with Klein?
possible =/= good
>>108716818
no, unless the cartoon is already sort of realistic (not deformed or anime)
>>108716843probably better than the sdxl shitmix you were using.
>>108716819
no, and he went in raw
ACK
https://files.catbox.moe/7deje1.jpg
>>108716907Haha. Aids.
>>108716844works on my machine
What is the best model that you can run on a GTX 1070? Is it still just SDXL?
t. haven't paid attention to image generation in 2 years, after models got too gud for me :(
>>108716960Screenshot?
>>108716973yes, still sdxl on a 3090/4090 too
>>108716973
You can run anima on a CPU, so I guess anima would work
apichad gens:>cute girls>funny memes>useful infolocalkek gens:>degen porn (homosexuality, bestiality, etc)>tranime slop>normie cartoons (family guy)
>>108716269
>should I burn my gpu
If you want to do that
>>108717005>funny memes
>>108716987What about models like Qwen-Image-2512? Or is it not worth it for the quality it outputs?
>>108717005You can't do any kind of porn, degen or not even if you wanted to APIkek.
>>108717023
oh yeah, forgot about that, if it's not for pure coom/porn klein is probably the best, qwen is on a similar level but slower
>>108716491Thanks thanos
>>108717005Can you show one of these funny memes?
>>108716844
Didn't look for long
>https://www.reddit.com/r/StableDiffusion/comments/1r32rd7/comment/o53fgov/
Maybe you have a skill issue anon. Anyways, even if the model is weak out of the box at this task, it can still be tuned to get better results. There was one Anime2Real LoRA for Qwen Image, but that's archaic, Klein is the fun new stuff now.
>>108717060>flux chin
you aint getting that on api, thats for sure
https://files.catbox.moe/qubn68.jpg
what the fuck is a flux chin? my flux chins seem completely normal
>>108717025
grok. high quality, instant results
seedance 2 if you want to be fancy
>>108717060Also, pretty soon we're going to have this model. This is massive because an autoregressive model understands prompts pixel by pixel as an LLM understands next tokens, so it will be able to reason about turning the image into realism, not just doing a task it doesn't understand. Local has already almost entirely caught up with API thanks to that model.
>>108717112>ernie chin
>>108717005
>normie cartoons (family guy)
Woah, you must be so cool for disliking a popular thing. Very edgy!
>posts sydney sweeney
Oh... just a different shade of normie.
>>108717105cool porn
>>108717123
no no, but you see she's CTO of open ai
that's ridiculous, she's a model/actress, she has no business being a CTO of an ai company
don't you get it? that's humor motherfuck, read a book
>>108717112>hernia edit
>>108717123normies hate sydney sweeney
>>108717060
>big eyes
>huge heads
>sameface
Yeah cool. That's as far as I managed to get as well.
>>108717138Yeah bro, they totally hate her. Celebrity slop is totally not normie, I sweaar saar I'm different!
This is an APIcuck in his natural habitat.
>>108717005>apinormalfag and jeet central>localfor refined otaku and indies
>>108717112There's no point in waiting for chink promises to be fulfilled, they also said we would get Z-Edit and Qwen Image 2
>>108717157what does follower count have to do with anything?only an indian would judge something purely by the numbers. its clear you have zero understanding of the american political climatenormies hate syndey sweeney because shes white, mr. patel
>>108717235>only an indian would judge something purely by the numbersindeed. thats why we shouldn't judge a UI by the number of github stars it has
>>108717235I'll just leave this here.
>>108717138abnormie here, I also hate her
Today's gens
>>108717295Repost them there >>>/g/adt
>>108716790I mean it wasn't really trained on actual photographic data, it's more of a fantasy NSFW Lora, mixed anime / western cartoon / 3DPD dataset. Not everything is always actively supposed to look like fucking flash smartphone photography kek
>>108717005
That's the thing about GPT. This image proves it. The colors are not natural. They are washed out. The images all lack dynamic range. Do APIkeks not notice how gray their images are? It's soulless. Luckily, local hasn't suffered from that since SDXL days.
nothing beats sdxl prompting
>>108717122?
>>108716973AnimaZIT
>>108717360*vomits*
>>108717355
Sar, you don't understand, I need photorealistic dataset model for my AI influencer saar
can klein do some artistic stuff or is it just 1girl and realism?
>>108717558
Klein can do "artistic" stuff, depending on what exactly you mean by that.
It's much better than Flux 1 but worse than ZIT for realism. I wouldn't call realism its forte.
And almost any model (well besides SD3 maybe) can do 1girl.
>>108717558
Klein's outputs are kinda slopped but it can still steer away from realism. The best use for the model is editing though, and the sky's the limit in that regard.
>>108717595how is zit for editing real photos? and how many reference images does it support?
>>108717652If this is not a troll post about Z-Edit never releasing, ZIT can't do edits.
>>108717614do you use the full fp16 model of klein?
>>108717652
>>108717658
ZIT can do inpainting though, so editing real photos is possible
>>108717683oh, i need the whole reference image thing. i started with klein 9b and it's quite good for putting people into scenes but body modification prompts give bad results. i am trying to find some loras to fix it
Is there a node for comfy that allows you to organize your own list of image dimensions from a dropdown menu? Writing them manually is such a pain.
Why is there no good general purpose NSFW tune for LTX? And no this isn't a stealth request, all the civitai crap is snake oil
>>108717817what are you looking for? DR34ML4Y is excellent and has proper audio training as well
>>108717809I just keep nodes with my favorites around and reroute as needed.
Is there a custom node that does everything for me? I'm too much of a lazy loser who can't put any effort into anything and I pretty much don't want to do anything that requires a little work, I just want the AI to do everything for me so I can produce slop and flood every platform out there so I can monetize. Why is local so behind?
>>108717834I said LTX, wan actually has decent NSFW comparatively speaking, but doesn't support V2V (no I'm not talking about controlnet / vace, I'm talking about actual V2V)
>>108717855well i am using it with ltx and it's working. i haven't tried v2v but i2v certainly works
>>108717809i just made a subgraph with toggles so i can switch ratios and size easily.
can anon bake soon? I have to go to lunch
>>108717809Just use resolution master
>>108717854uhh, acktually thats the whole purpose of AI, in the future we will do nothing and the AI will do everything for us, thats the whole purpose of AI, how dumb are you?
>>108717907The only thing ai will do is drop a nerve gas grenade from a drone into your window
>>108712943reward larger breasts, no volume limit
>>108717809
Empty latent size picker.
comfyui-essentials
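If none of the existing packs fit, rolling your own dropdown node is not much code. Below is a minimal sketch, assuming the usual ComfyUI custom node layout (a class with `INPUT_TYPES`, `RETURN_TYPES`, `FUNCTION` and a `NODE_CLASS_MAPPINGS` dict in a file under `custom_nodes/`). The class name `ResolutionPreset`, the `PRESETS` table and the category string are made up for illustration and are not taken from any existing node pack.

```python
# Hypothetical preset table: edit to taste. Format is label -> (width, height).
PRESETS = {
    "1024x1024 (square)": (1024, 1024),
    "832x1216 (portrait)": (832, 1216),
    "1216x832 (landscape)": (1216, 832),
    "1344x768 (wide)": (1344, 768),
}


class ResolutionPreset:
    """Dropdown of saved resolutions that outputs width and height as INTs."""

    @classmethod
    def INPUT_TYPES(cls):
        # Giving a list of strings as the input type renders a dropdown widget.
        return {"required": {"preset": (list(PRESETS.keys()),)}}

    RETURN_TYPES = ("INT", "INT")
    RETURN_NAMES = ("width", "height")
    FUNCTION = "pick"  # name of the method ComfyUI calls on execution
    CATEGORY = "utils/resolution"  # assumed menu location, purely cosmetic

    def pick(self, preset):
        # Look up the selected label and hand both dimensions downstream.
        width, height = PRESETS[preset]
        return (width, height)


# ComfyUI reads these mappings when it imports the file.
NODE_CLASS_MAPPINGS = {"ResolutionPreset": ResolutionPreset}
NODE_DISPLAY_NAME_MAPPINGS = {"ResolutionPreset": "Resolution Preset"}
```

Wire the two INT outputs into the width and height inputs of an Empty Latent Image node (convert those widgets to inputs first). Adding a new size is one more line in `PRESETS` and a restart.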
>>108717667
NTA but yes.
No reason not to unless you are seriously VRAMlet (runs fine on my 12 GB VRAMlet setup)
>>108716722dogshit gen, as expected of a third worlder
Blessed thread of frenship
>>108716366
Chroma is run on donations, 1024 is exponentially more expensive.
Also seek help anon, not saying this to be mean.
>>108717903beg more fag
>>108716366It just downed on me, did you have any qualms about sending bestiality images to google's servers?
>>108718153>It just downed on mekek
Fresh
>>108718184
>>108718184
>>108718184
>>108718184
Fresh
>>108716504
Was already the case last year with the first Qwen...
We're just not into shitty menus and catalogs unlike apifags