Discussion and Development of Local Image, Video, and Music Models

Previous: >>109140721

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
SDWebUI: https://rentry.org/ldg-lazy-getting-started-guide#the-stable-diffusion-web-ui-lineage
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, & Upscalers
https://huggingface.co/models
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/tdrussell/diffusion-pipe
https://github.com/kohya-ss/sd-scripts
https://github.com/kohya-ss/musubi-tuner

>Krea 2
https://huggingface.co/krea/Krea-2-Raw
https://huggingface.co/krea/Krea-2-Turbo

>Z
https://huggingface.co/Tongyi-MAI/Z-Image

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/
https://animadex.net

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2.3
https://huggingface.co/collections/Lightricks/ltx-23

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>mfw Resource news

06/26/2026
>OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation
https://correr-zhou.github.io/OmniShow
>Adobe to Acquire Topaz Labs
https://news.adobe.com/news/2026/06/adobe-to-acquire-topaz-labs
>LiveEdit: Towards Real-Time Diffusion-Based Streaming Video Editing
https://live-edit.github.io
>PhysRAG: Enhancing Physics-Awareness in Video Generation via Retrieval-Augmented Generation
https://github.com/sediment1024/PhysRAG
>SAM2Matting: Generalized Image and Video Matting
https://henghuiding.com/SAM2Matting
>Unison: Benchmarking Unified Multimodal Models via Synergistic Understanding and Generation
https://github.com/FudanCVL/Unison
>ComfyUI-AppleSilicon-FP8 - a compatibility layer custom node for Apple Silicon
https://github.com/pawel-mazurkiewicz/ComfyUI-AppleSilicon-FP8

06/25/2026
>Bernini-R — GGUF (high & low noise experts)
https://huggingface.co/neuregex/Bernini-R-GGUF
>Physics Question Scene Graph: Fine-grained Evaluation of Physical Plausibility in Text-to-Video Generation
https://github.com/atinpothiraj/pqsg
>VPA-Guard: Defending and Benchmarking Image-to-Video Generation Against Visual Prompt Attacks
https://huggingface.co/datasets/CSU-JPG/VVA-Bench
>Minimalist Preprocessing Approach for Image Synthesis Detection
https://github.com/vohoaidanh/adof

06/24/2026
>Krea-2-Turbo Training Adapter
https://huggingface.co/ostris/krea2_turbo_training_adapter
>Vera: A Layered Diffusion Model for Content-Preserving Video Editing
https://vera-layered-diffusion.github.io
>Advancing WordArt-Oriented Scene Text Recognition: Datasets and Methods
https://github.com/YesianRohn/WATER
>DramaDirector: Geometry-Guided Short Drama Generation
https://github.com/iLearn-Lab/DramaDirector
>PG-MAP: Joint MAP Optimization for Inference-Time Alignment of Diffusion and Flow-Matching Models
https://github.com/sophialanlan/PG-MAP
>Safe Few-Step Generation via Velocity Editing
https://uzn36.github.io/VESFlow
>mfw Research news

06/26/2026
>From Celebrities to Anyone: Characterizing AI Nudification Content, Technology, and Community Dynamics on 4chan
https://arxiv.org/abs/2606.27234
>LearniBridge: Learnable Calibration of Feature Caching for Diffusion Models Acceleration
https://arxiv.org/abs/2606.26778
>LCG: Long-Context Consistent Image Generation with Sparse Relational Attention
https://arxiv.org/abs/2606.26171
>Disco-LoRA: Disentangled Composition of Content, Style, and Motion for Multi-concept Video Customization
https://arxiv.org/abs/2606.26668
>ResilPhase: Plug-and-Play Phase Mapping for Diffusion Acceleration
https://arxiv.org/abs/2606.26769
>NaviCache: Test-Time Self-Calibration Caching for Video Generation
https://arxiv.org/abs/2606.26795
>DanceDuo: Bridging Human Movement and AI Choreography
https://arxiv.org/abs/2606.26507
>PhyEditBench: A Real-World Multi-Stage Benchmark for Physics-Aware Image Editing
https://arxiv.org/abs/2606.26551
>TMP: Tree-structured Mixed-policy Pruning for Large-scale Image Generation and Editing
https://arxiv.org/abs/2606.27089
>DanceOPD: On-Policy Generative Field Distillation
https://danceopd.github.io
>Do Image Editing Models Understand Lighting?
https://arxiv.org/abs/2606.26738
>Qwen-Image-Agent: Bridging the Context Gap in Real-World Image Generation
https://arxiv.org/abs/2606.26907
>Adversarial Diffusion Across Modalities: A Fusion Survey of Attacks, Defenses, and Evaluation for Text, Vision, and VLMs
https://arxiv.org/abs/2606.26566
>Safe Autoregressive Image Generation with Iterative Self-Improving Codebooks
https://arxiv.org/abs/2606.27147
>SpatialFlow-GRPO: Where Spatial Credit Drives Image Editing
https://arxiv.org/abs/2606.26872
>Ask, Solve, Generate: Self-Evolving Unified Multimodal Understanding and Generation via Self-Consistency Rewards
https://arxiv.org/abs/2606.27376
>Scaling Multi-Reference Image Generation with Dynamic Reward Optimization
https://arxiv.org/abs/2606.26947
>>109143792
>>109143828
It may have done alright in those cases, but the main issue with it is that it fucks up too often. Bad hands, etc... Also it was terrible at styles, worse than Z-Image, and obviously way worse than Krea 2. I'm not doing anything wrong here, I don't think
>mfw API news

>ByteDance launches Seed Audio 1.0 Unified AI Audio Generation for Speech, Music and Ambient Sound Creation
https://fal.ai/models/bytedance/seed-audio-1.0
>Midjourney goes from generating cat images to full-body ultrasound scans
https://www.theverge.com/ai-artificial-intelligence/952011/midjourney-medical-ai-ultrasound-scan
>Alibaba releases HappyHorse 1.1 Available on Alibaba Cloud
https://www.alibabacloud.com/blog/happyhorse-gets-stronger-motion-expressiveness-higher-generation-consistency-and-enhanced-visual-quality_603293
>ByteDance's New AI Video Model Can Make 30-Second Clips From a Single Prompt
https://www.cnet.com/tech/services-and-software/bytedance-introduces-new-seedance-2-5-video-model/
>Luma Introduces Ray3.2 Model & API: Complete Creative Control for Video Generation
https://lumalabs.ai/news
>The Layout Bet — Reve 2.0
https://blog.reve.com/posts/the-layout-bet
>Introducing Gemini Omni — Google’s multimodal video creation/editing model
https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-omni/
>Nano Banana 2 and Nano Banana Pro are generally available via Gemini Enterprise Agent Platform
https://cloud.google.com/blog/products/ai-machine-learning/nano-banana-2-and-nano-banana-pro-are-generally-available
>Grok Imagine 1.5 Preview
https://x.ai/news/grok-imagine-1-5
>Seedance 2.0 in Runway API
https://docs.dev.runwayml.com/api-details/api_changelog/
>>109143898
Krea 2 is good at styles except the faces. Every oil painting gen I've seen from it has an out of place Dreamshaper XL face
I know absolutely nothing about anime genning but I'm having fun.
>>109141439
>I like to believe 4chan appeals to individuals who value open discussion of ideas above all else.
Not really.
I have a hard time getting people to actually post sources for their claims. I've been on 4chan for years and getting people to act intellectually honest is nearly impossible in some discussions. might be a /pol/ thing to some extent but i see it on other boards too.
>>109141614
I'd say it's just that the barrier to entry on reddit is lower. most normies dont like being insulted over trivial things or some of the more negative aspects of 4chan.
people exaggerate how bad the janitors are on reddit. it's probably better for discussion in a lot of areas than here.
i think it also might be that tech savvy people on reddit tend to be more anti-AI than on here.
>>109143928
looks good anon. keep it up. i like the colors
>>109143930
Whatever problems this site has, they're much worse on reddit. It's a cesspool of toxicity.
>>109143957
Classic SDXL 1.0 vibes.
>>109143930/pol/ is almost entirely schizos and government agencies talking to each other spreading misinformation on purpose. it's an entirely useless board for conversation of any kind. most people on 4chan are just retard social rejects in general, though.
Krea 2 is the first local model I used that could oneshot this without inpainting or finetuning.
>An Anime screenshot of Hatsune Miku taking a selfie. Her eyes are wide opened and move independently. Her left eye is rolling upwards, and her right eye is rolling downwards.
>>109144045
But her left eye is not rolling upwards, and her right is not rolling downwards.
is this a fucking joke
>>109144060
>gay man makes and runs civitai
>surprised he supports gay people
>>109144058
I'll let it slide and take the W. Every model I've tried just makes them stare at the camera, kek.
>>109143930
reddit is way more corporate and censored
krea is pissing me the fuck off
>>109144085
pick up a pencil then
>>109144066
i wouldn't even be mad but he logged me out and i have no idea what my login details were
god AI is so hard
>>109144085
whatll make you feel better is sperging out about it for the whole thread and calling anyone who says anything good about it at all a shill then you could pivot into calling local a sham or whatever but only after youve done the first thing
>>109144060were you under the impression that CivitAI was run by /pol/tards or something? Why is this surprising in any way whatsoever lmao
>>109144103no he'll start talking about the VAE and crying about it all day.
>>109144105normal people exist too bruhit's not just "gigantic faggots" and "le heckin' other side"
>>109144108
i do understand why it uses qwens vae but i cant say it doesnt make me a bit sad. not sad enough to constantly harp on it tho
>>109144045
I'm not committed enough to that prompt to try it out more than once on a couple of models so I'll take your word for it
>>109144111normal people don't whine about sharing spaces
>>109143837
What do "kino" and "sovl" mean to you, anon?
>>109144091
and do what shove it in your asshole, asshole?
>>109144103
i was hoping to get some helpful tips on how to make my gens more close to what they look like in my imagination. i'm not the vae asshole
>>109144139make the white guy charlie kirk
>>109144150doesn't know him :(
anons how do i change my password on civitai
>>109144170
Tell me your old password and new password and I'll update it for you
>>109144170
I haven't used it in months but don't you log in through an email verification code?
>>109144177
i have no idea lmao
>>109144178
yeah but any logical person would give the option to change password afterwards. does he really want me to make a new account every time he decides to log me out
>>109144184retard
>>109144060this is just silicon wafer reflection, bro, chill
>>109144224fucking fag chips
>>109144111
Heh you clearly aren't black pilled
>>109144269I WILL NOT EAT YOUR FROZEN MEALS GORDON
is it possible to gen the exact image you imagined inside your head? has anyone ever done this?
>>109144226dont care. still not paying your subscription altman
>>109144301
Yes, just write
New experiment. Chroma 1 HD Flash for initial composition, then taken through Krea for refinement, in a hybrid Chroma-Krea workflow. Same prompts. The results aren't half bad, oneshot fixed the incoherent background on Chroma https://files.catbox.moe/oh4w8f.png
New experiment. Just use Krea2.
>>109144600The chroma faggot is back
>>109144600
>The results aren't half bad, oneshot fixed the incoherent background on Chroma
Agreed but I enjoy the look of Chroma's more.
>>109144616The faggot is back
>>109144362
stop lying, it's literally impossible without bounding boxes
how much more life does DiT have in it really
>>109144622
you can spot ideogram from the fucked up eyes. good gen though.
113th iteration and it's NOT exactly the image inside my head
>>109144600
Did something similar a while back. Used Chroma's great understanding of different cinematic and photographic styles to generate lots of ~640 res images, then ran cherry-picked ones through img2img at ~1280 res on Z-Image Turbo to get detail and nice anatomy, and trained a set on the images. Results were really good as far as purely synthetic sets go
>>109144636
its krea2
what drives people to lie on the internet?
>>109144301
Like generative music i.e. eurorack-esque, it's more about setting conditions that produce interesting or novel outcomes.
>>109144642
Nice gen.
>>109144616he's been here, he's the only one shilling krea. for someone obsessed with blurry analog ''realism'' he somehow always picks plastic models to shitgen with
>>109144659Ideogram is dead, go cry in a corner
>>109144667considering you're the only jeet left shilling chroma in 2026 i'd say you have a pretty bad track record for picking models
>>109144657
yeah, it's painful getting something precise with krea right now. the future of local ai will be like making little 3d dolls and objects in blender, or some other program like that, posing/composing them exactly how you want, then give that to comfy somehow, or take a screenshot, and put bounding boxes around them describing what they are so that an edit model can turn it into a real picture. at least for getting things from our imagination perfectly right.
huh, CivitAI trainer has a lot of models now
>>109144690
>the future of local ai will be like making little 3d dolls and objects in blender, or some other program like that, posing/composing them exactly how you want, then give that to comfy somehow, or take a screenshot, and put bounding boxes around them describing what they are so that an edit model can turn it into a real picture. at least for getting things from our imagination perfectly right.
this is already possible
>>109144611
This works extremely well. It's like a finetune with no LoRA. You will quickly find small nuances Krea does not understand and can tune them.
>>109144611
Try doing picrel (unblurred soles) with just Krea. Turns out this idea works extremely well. Until a Chroma style finetune comes around this is the best way to experience a hypothetical Chroma Krea. This was the initial img https://files.catbox.moe/xegzqc.png
For all you haters who complained about Chroma, you sure aren't very creative. It also brings all of Chroma's styles to Krea, all of which just needed some further inaccessible form of tuning to look half decent.