/g/ - Technology



Previous /sdg/ thread : >>108815950

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Advanced UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge Classic: https://github.com/Haoming02/sd-webui-forge-classic
Stability Matrix: https://github.com/LykosAI/StabilityMatrix

>Z-Image
https://comfyanonymous.github.io/ComfyUI_examples/z_image
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Flux.2 Dev/Klein
https://comfyanonymous.github.io/ComfyUI_examples/flux2
https://huggingface.co/black-forest-labs/FLUX.2-dev
https://huggingface.co/black-forest-labs/FLUX.2-klein-4B
https://huggingface.co/black-forest-labs/FLUX.2-klein-9B

>Chroma
https://comfyanonymous.github.io/ComfyUI_examples/chroma
https://huggingface.co/lodestones/Chroma1-HD
https://huggingface.co/silveroxides/Chroma-GGUF

>Anima
https://huggingface.co/circlestone-labs/Anima

>Qwen Image & Edit
https://docs.comfy.org/tutorials/image/qwen/qwen-image
https://huggingface.co/Qwen/Qwen-Image

>Text & image to video - Wan 2.2
https://docs.comfy.org/tutorials/video/wan/wan2_2

>Models, LoRAs & upscaling
https://civitai.com
https://huggingface.co
https://tungsten.run
https://yodayo.com/models
https://www.diffusionarc.com
https://miyukiai.com
https://civitaiarchive.com
https://civitasbay.org
https://www.stablebay.org
https://openmodeldb.info

>Index of guides and other tools
https://rentry.org/sdg-link

>Related boards
>>>/aco/sdg
>>>/b/degen
>>>/d/ddg
>>>/e/edg
>>>/gif/vdg
>>>/h/hdg
>>>/r/realistic+parody
>>>/tg/slop
>>>/trash/sdg
>>>/u/udg
>>>/vp/napt
>>>/vt/vtai

OP https://rentry.co/twkuk8tz
>>
>mfw Resource news

05/15/2026

>Causal Forcing++: Scalable Few-Step Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation
https://github.com/thu-ml/Causal-Forcing

>EntityBench: Towards Entity-Consistent Long-Range Multi-Shot Video Generation
https://catherine-r-he.github.io/EntityBench

>ClickRemoval: An Interactive Open-Source Tool for Object Removal in Diffusion Models
https://github.com/zld-make/ClickRemoval

>Warp-as-History: Generalizable Camera-Controlled Video Generation from One Training Video
https://yyfz.github.io/warp-as-history

>RAVEN: Real-time Autoregressive Video Extrapolation with Consistency-model GRPO
https://yanzuo.lu/raven

>Image Restoration via Diffusion Models with Dynamic Resolution
https://github.com/StarNextDay/SubDAPS.git

>Does Synthetic Layered Design Data Benefit Layered Design Decomposition?
https://github.com/YangHaolin0526/SynLayers

>InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation
https://github.com/LeapLabTHU/InsightTok

>ComfyUI-AsymFlow
https://github.com/CanFromEarth/ComfyUI-Klein9B-AsymFlow

>Microsoft Lens: 3.8B-parameter text-to-image model. Available in RL-tuned and fast 4-step (Lens-Turbo) versions
https://huggingface.co/microsoft/Lens
https://huggingface.co/microsoft/Lens-Turbo

>snwy/SD1.5-DALLE-2
https://huggingface.co/snwy/SD1.5-DALLE-2/tree/main

>stable-diffusion-webui-codex v0.3.0-beta
https://github.com/sangoi-exe/stable-diffusion-webui-codex

05/14/2026

>Anima base v1.0 released
https://huggingface.co/circlestone-labs/Anima

>TrackCraft3R: Repurposing Video Diffusion Transformers for Dense 3D Tracking
https://cvlab-kaist.github.io/TrackCraft3r

>AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation
https://nvlabs.github.io/AnyFlow

>Skill-Aligned Annotation for Reliable Evaluation in Text-to-Image Generation
https://abdo-eldesokey.github.io/skill-aligned-eval
>>
>mfw Research news

05/15/2026

>Head Forcing: Long Autoregressive Video Generation via Head Heterogeneity
https://jiahaotian-sjtu.github.io/headforcing.github.io

>HASTE: Training-Free Video Diffusion Acceleration via Head-Wise Adaptive Sparse Attention
https://arxiv.org/abs/2605.14513

>Reduce the Artifacts Bias for More Generalizable AI-Generated Image Detection
https://arxiv.org/abs/2605.14486

>ACE-LoRA: Adaptive Orthogonal Decoupling for Continual Image Editing
https://arxiv.org/abs/2605.14948

>Compositional Video Generation via Inference-Time Guidance
https://arxiv.org/abs/2605.14988

>Few Channels Draw The Whole Picture: Revealing Massive Activations in Diffusion Transformers
https://aimagelab.github.io/MAs-DiT

>Supersampling Stable Diffusion and Beyond: A Seamless, Training-Free Approach for Scaling Neural Networks Using Common Interpolation Methods
https://arxiv.org/abs/2605.08698

>Delta Forcing: Trust Region Steering for Interactive Autoregressive Video Generation
https://arxiv.org/abs/2605.14382

>Think When Needed: Adaptive Reasoning-Driven Multimodal Embeddings with a Dual-LoRA Architecture
https://arxiv.org/abs/2605.14448

>MiVE: Multiscale Vision-language features for reference-guided video Editing
https://arxiv.org/abs/2605.14664

>HDRFace: Rethinking Face Restoration with High-Dimensional Representation
https://arxiv.org/abs/2605.14821

>Unlocking Complex Visual Generation via Closed-Loop Verified Reasoning
https://arxiv.org/abs/2605.14876

>Diagnosing and Correcting Concept Omission in Multimodal Diffusion Transformers
https://arxiv.org/abs/2605.14270

>KVPO: ODE-Native GRPO for Autoregressive Video Alignment via KV Semantic Exploration
https://arxiv.org/abs/2605.14278

>DiffusionOPD: A Unified Perspective of On-Policy Distillation in Diffusion Models
https://arxiv.org/abs/2605.15055

>RefDecoder: Enhancing Visual Generation with Conditional Video Decoding
https://arxiv.org/abs/2605.15196
>>
>mfw MORE Research news

>CreFlow: Corrective Reflow for Sparse-Reward Embodied Video Diffusion RL
https://arxiv.org/abs/2605.14274

>Aligning Latent Geometry for Spherical Flow Matching in Image Generation
https://arxiv.org/abs/2605.15193

>CoReDiT: Spatial Coherence-Guided Token Pruning and Reconstruction for Efficient Diffusion Transformers
https://arxiv.org/abs/2605.14191

>Mitigating Mask Prior Drift and Positional Attention Collapse in Large Diffusion Vision-Language Models
https://arxiv.org/abs/2605.14530

>Unified Pix Token And Word Token Generative Language Model
https://arxiv.org/abs/2605.14028

>TeDiO: Temporal Diagonal Optimization for Training-Free Coherent Video Diffusion
https://arxiv.org/abs/2605.14136

>LiWi: Layering in the Wild
https://arxiv.org/abs/2605.14552

>HeatKV: Head-tuned KV-cache Compression for Visual Autoregressive Modeling
https://arxiv.org/abs/2605.14877

>From Plans to Pixels: Learning to Plan and Orchestrate for Open-Ended Image Editing
https://arxiv.org/abs/2605.15181

>MHSA: A Lightweight Framework for Mitigating Hallucinations via Steered Attention in LVLMs
https://arxiv.org/abs/2605.14966

>SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer
https://nvlabs.github.io/Sana/WM

>Your CLIP has 164 dimensions of noise: Exploring the embeddings covariance eigenspectrum of contrastively pretrained vision-language transformers
https://arxiv.org/abs/2605.14893

>Representative Attention For Vision Transformers
https://arxiv.org/abs/2605.14913

>The Velocity Deficit: Initial Energy Injection for Flow Matching
https://arxiv.org/abs/2605.14819

>Editor's Choice: Evaluating Abstract Intent in Image Editing through Atomic Entity Analysis
https://arxiv.org/abs/2605.14842

>On the Cultural Anachronism and Temporal Reasoning in VLMs
https://khushboo0012.github.io/tab-vlm-webpage

>CoCoEdit: Content-Consistent Image Editing via Region Regularized Reinforcement Learning
https://arxiv.org/abs/2602.14068
>>
File: debo_cs_anima1_00027_.png (1.94 MB, 1408x1126)
>>
File: debo_cs_anima1_00031_.png (1.9 MB, 1408x1126)


