Discussion and Development of Local Image and Video Models

Previous: >>108887123

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, & Upscalers
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/tdrussell/diffusion-pipe
https://github.com/kohya-ss/sd-scripts
https://github.com/kohya-ss/musubi-tuner

>Z
https://huggingface.co/Tongyi-MAI/Z-Image

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>Wan
https://github.com/Wan-Video/Wan2.2

>LTX-2.3
https://huggingface.co/collections/Lightricks/ltx-23

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
(penis weenus:1.9)
>>108891701
>https://github.com/BigStationW/ComfyUi-Untwisting-RoPE
I tried this and it's kinda ass.
>Poor prompt adherence.
>Larger changes from the reference image are impossible, prominent content will seep into your gens.
>Decreased anatomical stability
I didn't bother tinkering with the dozen different settings it has, maybe it just has poor defaults, I dunno.
(Was typing a longer post, but alas I refreshed by accident.)
>mfw Resource news

05/23/2026
>Klein Tiled Upscaler for ComfyUI
https://github.com/Gavr728/ComfyUI_KleinTiledUpscaler
>Anima AI Character & Artist search engine with 49,000 sample images
https://animadex.net
>ComfyUi-Untwisting-RoPE (Training-Free Style Transfer)
https://github.com/BigStationW/ComfyUi-Untwisting-RoPE
>LongCat-Video-Avatar-1.5
https://huggingface.co/meituan-longcat/LongCat-Video-Avatar-1.5
>IMG Dataset Refiner v4.3
https://github.com/NyxAwroo/IMG-Dataset-Refiner/releases/tag/v4.3
>Sulphur-2-base
https://huggingface.co/SulphurAI/Sulphur-2-base

05/22/2026
>[real] Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models
https://github.com/microsoft/Lens
>L2P: Unlocking Latent Potential for Pixel Generation
https://nju-pcalab.github.io/projects/L2P
>GenEvolve: Self-Evolving Image Generation Agents via Tool-Orchestrated Visual Experience Distillation
https://ephemeral182.github.io/GenEvolve
>EasyVFX: Frequency-Driven Decoupling for Resource-Efficient VFX Generation
https://easy-vfx.github.io
>FashionLens: Toward Versatile Fashion Image Retrieval via Task-Adaptive Learning
https://github.com/haokunwen/FashionLens
>PALLAIDIUM Refactored
https://github.com/tin2tin/pallaidium_refactor

05/21/2026
>Follow the Mean: controlling flow-matching generative models by shifting endpoint means with reference examples
https://github.com/pedrocurvo/follow-the-mean
>LiTo: Surface Light Field Tokenization
https://github.com/apple/ml-lito
>Q-ARVD: Quantizing Autoregressive Video Diffusion Models
https://github.com/tsa18/Q-ARVD
>StreamGVE: Training-Free Video Editing via Few-Step Streaming Video Generation
https://dsl-lab.github.io/StreamGVE
>PGC: Peak-Guided Calibration for Generalizable AI-Generated Image Detection
https://github.com/xiaoyu6868/PGC
>OcclusionFormer: Arranging Z-Order for Layout-Grounded Image Generation
https://henghuiding.com/OcclusionFormer
>mfw Research news

05/23/2026
>MaTe: Images Are All You Need for Material Transfer via Diffusion Transformer
https://arxiv.org/abs/2605.15660
>ASAP: Attention Sink Anchored Pruning
https://arxiv.org/abs/2605.22372
>TAME: Test-Time Adversarial Prompt Tuning via Mixture-of-Experts for Vision-Language Models
https://arxiv.org/abs/2605.17577
>Findings of the Counter Turing Test: AI-Generated Image Detection
https://arxiv.org/abs/2605.20787
>Rethinking Cross-Layer Information Routing in Diffusion Transformers
https://arxiv.org/abs/2605.20708
>Conflict-Aware Additive Guidance for Flow Models under Compositional Rewards
https://arxiv.org/abs/2605.20758
>Semantic Granularity Navigation in Image Editing
https://arxiv.org/abs/2605.21190
>TASTE: A Designer-Annotated Multi-Dimensional Preference Dataset for AI-Generated Graphic Design
https://arxiv.org/abs/2605.20731
>PFlow-T: A Persistence-Driven Forward Process for Topology-Controlled Generation
https://arxiv.org/abs/2605.17555
>MARR: Module-Adaptive Residual Reconstruction for Low-Bit Post-Training Quantization
https://arxiv.org/abs/2605.17997
>PERL: Parameter Efficient Reasoning in CLIP Latent Space
https://arxiv.org/abs/2605.18464
>Edit-GRPO: A Locality-Preserving Policy Optimization Framework for Image Editing
https://arxiv.org/abs/2605.16951
>RoPeSLR: 3D RoPE-driven Sparse-LowRank Attention for Efficient Diffusion Transformers
https://arxiv.org/abs/2605.20659
>Factorized Latent Dynamics for Video JEPA: An Empirical Study of Auxiliary Objectives
https://arxiv.org/abs/2605.17165
>Ablate-to-Validate: Are Vision-Language Models Really Using Continuous Thought Tokens?
https://arxiv.org/abs/2605.21642
>Focus-then-Context: Subject-Centric Progressive Visual Token Reduction for Vision-Language Models
https://arxiv.org/abs/2605.20950
>Exact Linear Attention
https://arxiv.org/abs/2605.18848
>Mind Your Margin and Boundary: Are Your Distilled Datasets Truly Robust?
https://arxiv.org/abs/2605.20606
>>108891780
>>108891785
Thanks for the news

>>108891790
my pleasure :)

>>108891780
>>Klein Tiled Upscaler for ComfyUI
>This custom node is 100% vibecoded
derp

>>108891863
AGENTIC ENGINEERING
100% vibeart thread
>>108891877
no vibecrits please

>>108891877
I prefer my images to be slopped, not the code I am executing on my machine.

>>108891780
>https://animadex.net
is there an offline version?

>>108891863
>I love ai, look at this 1girl it made
>wow so cool
>i love ai, look at this node i vibecoded
>hello HR???
I should start making fake HR complaints at companies I've never even visited.
People kind of do this on tiktok. But I don't think they see the real potential.
Give me obscure artist tags to try.
>>108891907
sort of
https://github.com/ThetaCursed/Anima-Style-Explorer

>>108891945
but if i tell you it'll no longer be obscure

>>108891945
'booru image count threshold to be considered obscure?

>>108891690
the TE is the most important part

>https://huggingface.co/Comfy-Org/Lens
>https://civitai.com/models/2645928/microsoft-lens?modelVersionId=2970963
I wonder what the maximum i2v generation speed achievable on a 3090 is while still keeping coherent output. I'm talking TeaCache, Sage Attention, low-quant Wan 5B with taesd and a low-step lightning lora kinda speeds. Has anyone done it before? You know, for the sake of competition.
>>108891945
Greg Rutkowski

>>108892021
my hormuz isn't straight anymore

>>108891692
Is there a new inpainting/mask detailing workflow?
The current one doesn't work because it depends on the Impact Pack, which was compiled against NumPy 1.
Downgrading to NumPy 1 doesn't work anymore, at least not in ComfyUI portable.

>>108892041
inpainting is cope, just keep pulling the gacha lever on today's fast models until you get what you like, stack good images and wait a few months until we finally get a good pixelspace edit model, after which you won't have to fuck with inpainting nearly as much, if at all
no matter what happens, 5 years from now you'll be able to point your ai to your old gen folders and just tell it "fix all inconsistencies" and be done with it.

>>108892016
A fine choice.
Computer, analyze all of my RP chats dating back to c.ai days and generate a perfect world with my preferences in mind. Then analyze all of my 1girls and populate the world with them, randomizing their backstory, outfits, and personalities. Turn on 8k per eye VR realtime video generation, engage the masturbation machine suction, disable safety protocols.
How over will it be for you?

>>108892113
I don't think I'll ever truly accept it until it actually is like the holodeck.

>>108892091
Good ol' Greg doing wonders still.
anything new since anima?
>>108892113
Did you know that the first Matrix was designed to be a perfect human world? Where none suffered, where everyone would be happy. It was a disaster. No one would accept the program. Entire crops were lost. Some believed we lacked the programming language to describe your perfect world. But I believe that, as a species, human beings define their reality through suffering and misery. The perfect world was a dream that your primitive cerebrum kept trying to wake up from.

>>108892163
>>108892113
it's the smell

>>108892163
I wouldn't get bored, and people could finetune the amount of struggle they want to feel in their own matrix if needed anyway. Everything in life is cope, it's just different forms.

>>108892141
No doubt.
And one Raven gen from me.
>>108892187
>>108892216
Based Ravenchads.
You know what would be cool? A lora trained in the style of those illustrations in the NWO Illuminati cards. That artstyle is really appealing to me.
does anima work better with nat language or tags
>>108892298
I remember these! But I don't know where I remember them from, I am certain I never owned them.
No promises, but I am somewhat interested, assuming there is a reasonably high quality dataset of the art somewhere.
>>108892310
Depends on what you are trying to do.
Tag prompts are often significantly more accurate than nat lang, but also more limited.

>>108891928
>Imageslop is a little bit off in one small area
>Inpaint it and it's good to go, or just dgaf who cares
>Codeslop is a little bit off in one small area
>Delet you are entire computer

>>108892328
I used to save some pngs from ebay listings

>>108892353
These are a bit too low res for my taste, I wouldn't bother with anything less than 768p (I mean the art, not the card) and I don't like training on upscaled images.

>>108891954
I'm dying of AIDS, I won't share them with anyone.
>>108891970
anything

>giant Jenny
I didn't prompt for it, but I'll take it.
>>108891991
So, will we be able to just drop one of the uncensored GPT models into this? Vanilla GPT OSS usually checks for copyright and explicit content before doing anything else and I can see that getting in the way real fast.
>>108892113
You were supposed to relate to Data, not Reggie!

>>108892366
Yeah. That's a shame, I don't think you're gonna find many in an ok resolution. The game is ancient.

>>108892375
Ok.
@garrett hanna
@shirosaki london
@diathorn
>>108892414
Maybe they published a book or something with higher quality assets, but yeah, besides that, tough luck.
I am going to bed now. I might attempt a brief search tomorrow to see if anything comes up.

>>108892456
>I might attempt a brief search tomorrow to see if anything comes up
Nice. Good luck.
>justified rows
>recursive bisect
>"collages are solved"
I prefer white gaps (v2)

>>108892465
how did you make this?

>>108892550
And yes, of course I chose the (approximate) relative sizes of the images myself. And I chose which ones were allowed to crop* and along which dimensions.
*mandatory up to 5% aspect fudging to fake tighter gaps is currently turned on however
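
For anyone unfamiliar with the "justified rows" layout mentioned above, here is a minimal Python sketch of the usual greedy version. This is not the poster's code: the function name `justified_rows` and the `target_h` and `gap` parameters are made up for illustration, and the recursive bisect variant, cropping, and the 5% aspect fudging are not covered.

```python
from typing import List, Tuple

Box = Tuple[int, float, float]  # (image index, width, height)


def justified_rows(aspects: List[float], width: float,
                   target_h: float, gap: float = 0.0) -> List[List[Box]]:
    """Greedy justified-rows layout (illustrative sketch).

    aspects: width/height ratio of each image, in display order.
    Images are added to a row until, at target_h, they would fill the
    container; the row is then rescaled so it spans the width exactly.
    """
    rows: List[List[Box]] = []
    current: List[int] = []

    def flush(indices: List[int], justify: bool) -> None:
        total_aspect = sum(aspects[i] for i in indices)
        avail = width - gap * (len(indices) - 1)
        fit_h = avail / total_aspect  # height at which the row fills avail
        # The last, incomplete row is not stretched past target_h.
        h = fit_h if justify else min(target_h, fit_h)
        rows.append([(i, aspects[i] * h, h) for i in indices])

    for i in range(len(aspects)):
        current.append(i)
        avail = width - gap * (len(current) - 1)
        if sum(aspects[j] for j in current) * target_h >= avail:
            flush(current, justify=True)
            current = []

    if current:
        flush(current, justify=False)
    return rows


if __name__ == "__main__":
    for row in justified_rows([1.5, 1.0, 0.5, 2.0], width=320,
                              target_h=100, gap=10):
        print([(i, round(w), round(h)) for i, w, h in row])
```

Every full row ends up exactly `width` wide including gaps, at the cost of row heights drifting away from `target_h`; white gaps are just a nonzero `gap`.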
>>108892465
i kneel
whats the purpose of using both z image base and turbo? is turbo a finisher for things like skin texture? if i am doing img2img do i even need base?
>>108892747
i dont know

>>108892747
They're different models. You choose one or the other.

>>108892747
turbo is a distilled model, which means it brings in strong opinions. although this limits composition and gen variety, it works well as a refiner. base gives better initial gens, with turbo tag-teamed in for a second pass to improve detail

>>108892772
how do I choose?

>>108892562
Anima
>>108892715
o7

>>108892810
very nice
can I have the prompt?

>>108892813
nyo
>>108892814
>>108892813
Sure, using a gits 1995 lora from civitai
>masterpiece, best quality, score_7, akagi ritsuko, ayanami rei, ikari gendou, ikari shinji, katsuragi misato, neon genesis evangelion, 1girl, arm up, armpits, bath, bathing, bathroom, bathtub, blue hair, blush, breasts, character doll, closed mouth, collarbone, completely nude, convenient censoring, doll, dripping, eyelashes, hair between eyes, hand up, indoors, knees up, lego, lego minifig, light blush, light smile, looking at object, looking down, nude, otaku, partially submerged, pink water, playing, red eyes, scene reference, shadow, short hair, sitting, small breasts, smile, solo, tile wall, tiles, toy, water, wet, wet hair, window

>>108892834
Thanks

>>108892010
Prompt?
hm the movement seems a little weird
>>108892864
The image displays a promotional graphic featuring an anime-style character against a pink background with a subtle grid pattern. The background is decorated with white four-pointed stars, pink hearts, and thin white lines.

In the top left corner, white text reads "Skibidi Archive."

Dominating the right side of the image is a female character with long pink hair styled in twin buns on top, with the rest of the hair flowing down in loose waves. She wears a black headband with cat ears. Her expression involves winking with her left eye closed and smiling with her mouth open. She wears a white short-sleeved shirt with red trim, a red bow tie, and a black pleated mini-skirt with a zipper detail on the front. On her right leg, a black thigh-high stocking is visible. She holds a large assault rifle that is colored pink and black with yellow stars on the magazine. Her right hand is raised near her face making a peace sign, and she wears a red bracelet on that wrist.

Behind the main character on the right is a larger, semi-transparent close-up of the same character's face, showing her smiling with her mouth open.

To the left of the main character, large text in a bubbly brown font with white outlines reads "Edging, and all set to mog that skibidi!" in yellow, with smaller text "fr fr ong" below.

In the bottom left area, there is a smartphone screen displaying a smaller image of the character winking. Various objects are scattered around the left side: a bottle of pink nail polish near the top, a red beaded bracelet, a pair of gold earrings, and a bottle of red nail polish at the very bottom.