Previous /sdg/ thread : >>108815950

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Advanced UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge Classic: https://github.com/Haoming02/sd-webui-forge-classic
Stability Matrix: https://github.com/LykosAI/StabilityMatrix

>Z-Image
https://comfyanonymous.github.io/ComfyUI_examples/z_image
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Flux.2 Dev/Klein
https://comfyanonymous.github.io/ComfyUI_examples/flux2
https://huggingface.co/black-forest-labs/FLUX.2-dev
https://huggingface.co/black-forest-labs/FLUX.2-klein-4B
https://huggingface.co/black-forest-labs/FLUX.2-klein-9B

>Chroma
https://comfyanonymous.github.io/ComfyUI_examples/chroma
https://huggingface.co/lodestones/Chroma1-HD
https://huggingface.co/silveroxides/Chroma-GGUF

>Anima
https://huggingface.co/circlestone-labs/Anima

>Qwen Image & Edit
https://docs.comfy.org/tutorials/image/qwen/qwen-image
https://huggingface.co/Qwen/Qwen-Image

>Text & image to video - Wan 2.2
https://docs.comfy.org/tutorials/video/wan/wan2_2

>Models, LoRAs & upscaling
https://civitai.com
https://huggingface.co
https://tungsten.run
https://yodayo.com/models
https://www.diffusionarc.com
https://miyukiai.com
https://civitaiarchive.com
https://civitasbay.org
https://www.stablebay.org
https://openmodeldb.info

>Index of guides and other tools
https://rentry.org/sdg-link

>Related boards
>>>/aco/sdg
>>>/b/degen
>>>/d/ddg
>>>/e/edg
>>>/gif/vdg
>>>/h/hdg
>>>/r/realistic+parody
>>>/tg/slop
>>>/trash/sdg
>>>/u/udg
>>>/vp/napt
>>>/vt/vtai

OP https://rentry.co/twkuk8tz
>mfw Resource news

05/15/2026

>Causal Forcing++: Scalable Few-Step Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation
https://github.com/thu-ml/Causal-Forcing
>EntityBench: Towards Entity-Consistent Long-Range Multi-Shot Video Generation
https://catherine-r-he.github.io/EntityBench
>ClickRemoval: An Interactive Open-Source Tool for Object Removal in Diffusion Models
https://github.com/zld-make/ClickRemoval
>Warp-as-History: Generalizable Camera-Controlled Video Generation from One Training Video
https://yyfz.github.io/warp-as-history
>RAVEN: Real-time Autoregressive Video Extrapolation with Consistency-model GRPO
https://yanzuo.lu/raven
>Image Restoration via Diffusion Models with Dynamic Resolution
https://github.com/StarNextDay/SubDAPS.git
>Does Synthetic Layered Design Data Benefit Layered Design Decomposition?
https://github.com/YangHaolin0526/SynLayers
>InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation
https://github.com/LeapLabTHU/InsightTok
>ComfyUI-AsymFlow
https://github.com/CanFromEarth/ComfyUI-Klein9B-AsymFlow
>Microsoft Lens: 3.8B parameter text-to-image model, available in RL-tuned and fast 4-step Lens-Turbo variants
https://huggingface.co/microsoft/Lens
https://huggingface.co/microsoft/Lens-Turbo
>snwy/SD1.5-DALLE-2
https://huggingface.co/snwy/SD1.5-DALLE-2/tree/main
>stable-diffusion-webui-codex v0.3.0-beta
https://github.com/sangoi-exe/stable-diffusion-webui-codex

05/14/2026

>Anima base v1.0 released
https://huggingface.co/circlestone-labs/Anima
>TrackCraft3R: Repurposing Video Diffusion Transformers for Dense 3D Tracking
https://cvlab-kaist.github.io/TrackCraft3r
>AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation
https://nvlabs.github.io/AnyFlow
>Skill-Aligned Annotation for Reliable Evaluation in Text-to-Image Generation
https://abdo-eldesokey.github.io/skill-aligned-eval
>mfw Research news

05/15/2026

>Head Forcing: Long Autoregressive Video Generation via Head Heterogeneity
https://jiahaotian-sjtu.github.io/headforcing.github.io
>HASTE: Training-Free Video Diffusion Acceleration via Head-Wise Adaptive Sparse Attention
https://arxiv.org/abs/2605.14513
>Reduce the Artifacts Bias for More Generalizable AI-Generated Image Detection
https://arxiv.org/abs/2605.14486
>ACE-LoRA: Adaptive Orthogonal Decoupling for Continual Image Editing
https://arxiv.org/abs/2605.14948
>Compositional Video Generation via Inference-Time Guidance
https://arxiv.org/abs/2605.14988
>Few Channels Draw The Whole Picture: Revealing Massive Activations in Diffusion Transformers
https://aimagelab.github.io/MAs-DiT
>Supersampling Stable Diffusion and Beyond: A Seamless, Training-Free Approach for Scaling Neural Networks Using Common Interpolation Methods
https://arxiv.org/abs/2605.08698
>Delta Forcing: Trust Region Steering for Interactive Autoregressive Video Generation
https://arxiv.org/abs/2605.14382
>Think When Needed: Adaptive Reasoning-Driven Multimodal Embeddings with a Dual-LoRA Architecture
https://arxiv.org/abs/2605.14448
>MiVE: Multiscale Vision-language features for reference-guided video Editing
https://arxiv.org/abs/2605.14664
>HDRFace: Rethinking Face Restoration with High-Dimensional Representation
https://arxiv.org/abs/2605.14821
>Unlocking Complex Visual Generation via Closed-Loop Verified Reasoning
https://arxiv.org/abs/2605.14876
>Diagnosing and Correcting Concept Omission in Multimodal Diffusion Transformers
https://arxiv.org/abs/2605.14270
>KVPO: ODE-Native GRPO for Autoregressive Video Alignment via KV Semantic Exploration
https://arxiv.org/abs/2605.14278
>DiffusionOPD: A Unified Perspective of On-Policy Distillation in Diffusion Models
https://arxiv.org/abs/2605.15055
>RefDecoder: Enhancing Visual Generation with Conditional Video Decoding
https://arxiv.org/abs/2605.15196
>mfw MORE Research news

>CreFlow: Corrective Reflow for Sparse-Reward Embodied Video Diffusion RL
https://arxiv.org/abs/2605.14274
>Aligning Latent Geometry for Spherical Flow Matching in Image Generation
https://arxiv.org/abs/2605.15193
>CoReDiT: Spatial Coherence-Guided Token Pruning and Reconstruction for Efficient Diffusion Transformers
https://arxiv.org/abs/2605.14191
>Mitigating Mask Prior Drift and Positional Attention Collapse in Large Diffusion Vision-Language Models
https://arxiv.org/abs/2605.14530
>Unified Pix Token And Word Token Generative Language Model
https://arxiv.org/abs/2605.14028
>TeDiO: Temporal Diagonal Optimization for Training-Free Coherent Video Diffusion
https://arxiv.org/abs/2605.14136
>LiWi: Layering in the Wild
https://arxiv.org/abs/2605.14552
>HeatKV: Head-tuned KV-cache Compression for Visual Autoregressive Modeling
https://arxiv.org/abs/2605.14877
>From Plans to Pixels: Learning to Plan and Orchestrate for Open-Ended Image Editing
https://arxiv.org/abs/2605.15181
>MHSA: A Lightweight Framework for Mitigating Hallucinations via Steered Attention in LVLMs
https://arxiv.org/abs/2605.14966
>SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer
https://nvlabs.github.io/Sana/WM
>Your CLIP has 164 dimensions of noise: Exploring the embeddings covariance eigenspectrum of contrastively pretrained vision-language transformers
https://arxiv.org/abs/2605.14893
>Representative Attention For Vision Transformers
https://arxiv.org/abs/2605.14913
>The Velocity Deficit: Initial Energy Injection for Flow Matching
https://arxiv.org/abs/2605.14819
>Editor's Choice: Evaluating Abstract Intent in Image Editing through Atomic Entity Analysis
https://arxiv.org/abs/2605.14842
>On the Cultural Anachronism and Temporal Reasoning in VLMs
https://khushboo0012.github.io/tab-vlm-webpage
>CoCoEdit: Content-Consistent Image Editing via Region Regularized Reinforcement Learning
https://arxiv.org/abs/2602.14068