Previous /sdg/ thread : >>107766236

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Advanced UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge Classic: https://github.com/Haoming02/sd-webui-forge-classic
Stability Matrix: https://github.com/LykosAI/StabilityMatrix

>Z-Image Turbo
https://comfyanonymous.github.io/ComfyUI_examples/z_image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
https://huggingface.co/jayn7/Z-Image-Turbo-GGUF

>Flux.2 Dev
https://comfyanonymous.github.io/ComfyUI_examples/flux2
https://huggingface.co/black-forest-labs/FLUX.2-dev
https://huggingface.co/city96/FLUX.2-dev-gguf

>Qwen Image & Edit
https://docs.comfy.org/tutorials/image/qwen/qwen-image
https://huggingface.co/Qwen/Qwen-Image
https://huggingface.co/QuantStack/Qwen-Image-GGUF
https://huggingface.co/QuantStack/Qwen-Image-Distill-GGUF
https://huggingface.co/QuantStack/Qwen-Image-Edit-2509-GGUF

>Text & image to video - Wan 2.2
https://docs.comfy.org/tutorials/video/wan/wan2_2
https://huggingface.co/QuantStack/Wan2.2-TI2V-5B-GGUF
https://huggingface.co/QuantStack/Wan2.2-T2V-A14B-GGUF
https://huggingface.co/QuantStack/Wan2.2-I2V-A14B-GGUF

>Chroma
https://comfyanonymous.github.io/ComfyUI_examples/chroma
https://github.com/maybleMyers/chromaforge
https://huggingface.co/lodestones/Chroma1-HD
https://huggingface.co/silveroxides/Chroma-GGUF

>Models, LoRAs & upscaling
https://civitai.com
https://huggingface.co
https://tungsten.run
https://yodayo.com/models
https://www.diffusionarc.com
https://miyukiai.com
https://civitaiarchive.com
https://civitasbay.org
https://www.stablebay.org
https://openmodeldb.info

>Index of guides and other tools
https://rentry.org/sdg-link

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/u/udg
>>>/tg/slop
>>>/trash/sdg
>>>/vp/napt
>>>/r/realistic+parody
Thanks for baking anon.
>>107785867
you're welcome. i am sorry, i had bad timing, didn't mean to take away the baking opportunity
>>107785902
It's okay :)
>mfw Resource news

01/06/2026

>LTX-2: DiT-based audio-video foundation model
https://github.com/Lightricks/LTX-2

>DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer
https://guoxu1233.github.io/DreamID-V

>Diffusion Toolkit v1.10
https://github.com/RupertAvery/DiffusionToolkit/releases/tag/v1.10

>NVIDIA RTX Accelerates 4K AI Video Generation on PC With LTX-2 and ComfyUI Upgrades
https://blogs.nvidia.com/blog/rtx-ai-garage-ces-2026-open-models-video-generation

>fp8_e4m3fn conversion of Gemma 3 12b it text encoder
https://huggingface.co/GitMylo/LTX-2-comfy_gemma_fp8_e4m3fn

>Diffuse: Windows desktop UI for Huggingface Diffusers
https://github.com/TensorStack-AI/Diffuse

>SwinIFS: Landmark Guided Swin Transformer For Identity Preserving Face Super Resolution
https://github.com/Habiba123-stack/SwinIFS

>PartImageNet++ Dataset: Enhancing Visual Models with High-Quality Part Annotations
https://github.com/LixiaoTHU/PartImageNetPP

>E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models
https://github.com/shengjun-zhang/VisualGRPO

>A Comprehensive Dataset for Human vs. AI Generated Image Detection
https://huggingface.co/datasets/Rajarshi-Roy-research/Defactify_Image_Dataset

>MagicFight: Personalized Martial Arts Combat Video Generation
https://MingfuYAN.github.io/MagicFight

01/04/2026

>Invoke AI 6.10 - now supports Z-Image Turbo
https://github.com/invoke-ai/InvokeAI/releases/tag/v6.10.0rc2

>ComfyUI Wan VACE Video Joiner
https://github.com/stuttlepress/ComfyUI-Wan-VACE-Video-Joiner

>UltraShape 1.0: High-Fidelity 3D Shape Generation via Scalable Geometric Refinement
https://pku-yuangroup.github.io/UltraShape-1.0

>OpenVINO AI Plugins for GIMP
https://github.com/intel/openvino-ai-plugins-gimp/releases/tag/3.2.0

>Comfyui-GeminiWeb
https://github.com/Koko-boya/Comfyui-GeminiWeb
>mfw Research news

01/06/2026

>NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation
https://arxiv.org/abs/2601.02204

>Unraveling MMDiT Blocks: Training-free Analysis and Enhancement of Text-conditioned Diffusion
https://arxiv.org/abs/2601.02211

>BiPrompt: Bilateral Prompt Optimization for Visual and Textual Debiasing in Vision-Language Models
https://arxiv.org/abs/2601.02147

>VIBE: Visual Instruction Based Editor
https://arxiv.org/abs/2601.02242

>ExposeAnyone: Personalized Audio-to-Expression Diffusion Models Are Robust Zero-Shot Face Forgery Detectors
https://mapooon.github.io/ExposeAnyonePage

>Agentic Retoucher for Text-To-Image Generation
https://arxiv.org/abs/2601.02046

>HyperCLOVA X 8B Omni
https://arxiv.org/abs/2601.01792

>Forget Less by Learning from Parents Through Hierarchical Relationships
https://arxiv.org/abs/2601.01892

>TalkPhoto: A Versatile Training-Free Conversational Assistant for Intelligent Image Editing
https://arxiv.org/abs/2601.01915

>MotionAdapter: Video Motion Transfer via Content-Aware Attention Customization
https://arxiv.org/abs/2601.01955

>AFTER: Mitigating the Object Hallucination of LVLM via Adaptive Factual-Guided Activation Editing
https://arxiv.org/abs/2601.01957

>GDRO: Group-level Reward Post-training Suitable for Diffusion Models
https://arxiv.org/abs/2601.02036

>VINO: A Unified Visual Generator with Interleaved OmniModal Context
https://sotamak1r.github.io/VINO-web

>DatBench: Discriminative, Faithful, and Efficient VLM Evaluations
https://arxiv.org/abs/2601.02316

>A Comparative Study of Custom CNNs, Pre-trained Models, and Transfer Learning Across Multiple Visual Datasets
https://arxiv.org/abs/2601.02246

>VAR RL Done Right: Tackling Asynchronous Policy Conflicts in Visual Autoregressive Generation
https://arxiv.org/abs/2601.02256

>DeepInv: A Novel Self-supervised Learning Approach for Fast and Accurate Diffusion Inversion
https://arxiv.org/abs/2601.01487
>mfw MORE Research news

>FFP-300K: Scaling First-Frame Propagation for Generalizable Video Editing
https://arxiv.org/abs/2601.01720

>Improving Flexible Image Tokenizers for Autoregressive Image Generation
https://arxiv.org/abs/2601.01535

>Unified Generation and Self-Verification for Vision-Language Models via Advantage Decoupled Preference Optimization
https://arxiv.org/abs/2601.01483

>Guiding Token-Sparse Diffusion Models
https://arxiv.org/abs/2601.01608

>Image Synthesis Using Spintronic Deep Convolutional Generative Adversarial Network
https://arxiv.org/abs/2601.01441

>Slot-ID: Identity-Preserving Video Generation from Reference Videos via Slot-Based Temporal Identity Encoding
https://arxiv.org/abs/2601.01352

>Luminark: Training-free, Probabilistically-Certified Watermarking for General Vision Generative Models
https://arxiv.org/abs/2601.01085

>Improved Object-Centric Diffusion Learning with Registers and Contrastive Alignment
https://arxiv.org/abs/2601.01224

>Evolving CNN Architectures: From Custom Designs to Deep Residual Models for Diverse Image Classification and Detection Tasks
https://arxiv.org/abs/2601.01099

>YODA: Yet Another One-step Diffusion-based Video Compressor
https://arxiv.org/abs/2601.01141

>NarrativeTrack: Evaluating Video Language Models Beyond the Frame
https://arxiv.org/abs/2601.01095

>CRoPS: A Training-Free Hallucination Mitigation Framework for Vision-Language Models
https://arxiv.org/abs/2601.00659

>TimeColor: Flexible Reference Colorization via Temporal Concatenation
https://bconstantine.github.io/TimeColor

>DynaDrag: Dynamic Drag-Style Image Editing by Motion Prediction
https://arxiv.org/abs/2601.00542

>FreeText: Training-Free Text Rendering in Diffusion Transformers via Attention Localization and Spectral Glyph Injection
https://arxiv.org/abs/2601.00535
desuarchive.org/g/thread/103782080
1 year ago
>>107786576
RIP dance anon
>>107786576
>First for schizo containment general
appears this anon was finally defeated
>some comfy posts
maybe the last we'd ever get
>>107786690
we lost all the schizos except the one
>>107786781
???
gn
why comfy github repo returning 403 on fetch
>>107790572
gn
>>107790581
he moved it to the org
https://github.com/Comfy-Org/ComfyUI
There is much untruth in a dream
i miss schizo anon
>>107790664
It's working now, but when I posted that the org was "disabled", or so said the remote (GitHub).
Never seen that before except when repos get DMCA requests.
Cakes on the griddle
I am using ChatGPT to try to create this Weird Tales art style like the one on the left.
I was trying a prompt like this:

Prompt: 1930s Weird Tales pulp magazine cover art, commercial gouache illustration style. Flat opaque color application with simplified modeling, poster-aesthetic with naturalistic forms. Bold color zones with controlled edges, limited soft transitions only for form modeling. Strong value contrast, theatrical stage lighting, hard-edged cast shadows. Vintage four-color printing look, slight registration offset. Illustrative realism with posterized color planes.

A chaotic, night-time battle scene at a bandit camp. Sir Gottfried, wounded and bleeding, is a "whirlwind of silver and death," fighting multiple bandits simultaneously with his sword and shield. His pose is one of desperate, exaggerated heroism. In the background, there are burning tents and a crossbowman reloading in the shadows. The moon is a "sliver of bone," casting eerie light, while the fires create a warm, dramatic glow on the armor and faces.

Saturated dramatic lighting. 1930s adventure pulp, heroic fantasy illustration, magazine cover composition

NOT / Negative Prompt: NOT: oil painting, visible brushstrokes, impasto, heavy texture, painterly, blended edges, soft focus, atmospheric, chiaroscuro, tenebrism, academic painting, fine art, canvas texture, alla prima, glazing, scumbling, loose brushwork, impressionist, expressionist, textured surface, thick paint application, Rembrandt lighting, naturalistic rendering, photorealistic, digital art, concept art, artstation, modern illustration, 3D render, airbrushed, gradient mesh

but it produced the image on the right. What am I doing wrong? I imagine I'm overprompting, but I'm not sure what to remove and what to include.
more new captcha puzzles
3 tries to submit
Morning anons
Qwen made a huge quokka dog lmao.
>>107795929
happy work anniversary, office quokka dog
think i am gonna try to train, with civitai, a koff3 lora for flux. idk if flux is dead but whatever
>>107795929
morning
>>107795941
Train The Ninth Gate lora... from the book images...
>>107795941
could always try a z-image lora if you're worried about model relevance
>>107795940
That was back in December, today is just regular birthday
>>107795941
Nice pomni, morning
>>107796233
oh, happy birthday office quokka dog
>>107796233
Thank you Debo.
>>107795977
meh, i'm going to hold off anyway, the civitai site is annoying me. i guess there will be a base z-image released soon?, though i will be unable to run it locally anyway
>>107796274
>there will be a base z-image released soon?
no idea. there's been mixed messages about the base model release cuz they've suggested it's almost ready but then have been mostly silent otherwise
>>107796274
>though i will be unable to run it locally anyway
z-image is pretty small. have you tried the ggufs to see if they'd work locally? surely something can fit
model:
https://huggingface.co/jayn7/Z-Image-Turbo-GGUF/tree/main
encoder:
https://huggingface.co/unsloth/Qwen3-4B-GGUF/tree/main
>>107796274
If you have 6GB and at least an Ampere card you can run the fp16 (non-GGUF) one with 16GB shared RAM. It's not a big deal.
GGUF Q8 takes even less memory but it is slightly slower as it is not hardware accelerated.
If I can run it, you can do it too.
Flux is way heavier.
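For a rough feel of the fp16 vs Q8 memory claim, here is a back-of-the-envelope sketch. It assumes a model of about 6 billion parameters (an assumption for illustration, check the model card) and counts weights only, ignoring the text encoder, VAE and activations. The 8.5 bits per weight for Q8_0 comes from the GGUF block layout (32 int8 values plus one fp16 scale per block).

```python
# Rough weight-only memory estimate; ignores text encoder, VAE and activations.

def weight_gib(num_params: float, bits_per_weight: float) -> float:
    """Return the size of the weights in GiB for a given precision."""
    return num_params * bits_per_weight / 8 / 1024**3

# ASSUMPTION: ~6e9 parameters, used here only as an illustrative figure.
PARAMS = 6e9

FORMATS = {
    "fp16": 16.0,   # 2 bytes per weight
    "Q8_0": 8.5,    # 32 int8 weights + one fp16 scale = 34 bytes per 32 weights
}

for name, bits in FORMATS.items():
    print(f"{name}: ~{weight_gib(PARAMS, bits):.1f} GiB of weights")
```

Under those assumptions fp16 lands around 11 GiB, which does not fit in 6GB of VRAM on its own and is why the shared RAM offload matters, while Q8_0 is a bit under 6 GiB.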
>>107796400
>>107796474
guess i'll start the process of installing and trying to use comfyui, again
>>107796507
New comfyui is a somewhat bad experience. I don't have a cutting edge version but it has memory management issues, which means it will oom after every few gens. It's not unusably bad but slightly so.
>>107796641
I'm not pulling as long as possible
hopefully someone forks a ComfyClassic and unfucks a lot of their bad decisions
well.. got it working, though i don't yet understand how to get the results i want
https://files.catbox.moe/jnbti2.jpg
rate the flow/offer guidance as far as the settings i should be using
done some performance optimizations, nice one is fused geglu, also added flash attention, should be ready for merge soon, it's tested and working
>UNet2DConditionModel runwayml/stable-diffusion-v1-5 @ 512x512 with batch 2 (batch 1 CFG) on 3090
>22.8ms
>~43.86 it/s
very nice
i think i can get faster, haven't fused qkv yet, groupnorm and layernorm kernels aren't the best, maybe some more gemm/conv2d fusions are possible or maybe more tile description tuning, and try tuning concat and elementwise kernels more
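For anyone wondering what "fused geglu" buys you, here is a minimal pure-Python sketch of the math, not the anon's actual kernel. GEGLU projects the input to twice the hidden width, splits the result into a value half and a gate half, and returns value * gelu(gate). The unfused version materializes the full projection first; the "fused" version computes each output element in one pass without the intermediate buffer. All function names here are made up for illustration.

```python
import math

def gelu(x: float) -> float:
    """Exact (erf-based) GELU."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def linear(x, weight, bias):
    """Plain matrix-vector product; weight is a list of rows (out x in)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]

def geglu_unfused(x, weight, bias):
    """Project, store the full 2*d result, then split and gate."""
    projected = linear(x, weight, bias)
    half = len(projected) // 2
    value, gate = projected[:half], projected[half:]
    return [v * gelu(g) for v, g in zip(value, gate)]

def geglu_fused(x, weight, bias):
    """Same math, but each output is produced in one pass with no 2*d buffer."""
    half = len(weight) // 2
    out = []
    for i in range(half):
        v = sum(w * xi for w, xi in zip(weight[i], x)) + bias[i]
        g = sum(w * xi for w, xi in zip(weight[half + i], x)) + bias[half + i]
        out.append(v * gelu(g))
    return out

# Tiny example: 2 inputs, hidden width 2 (so 4 projection rows).
x = [1.0, 2.0]
weight = [[0.5, -1.0], [1.5, 0.25], [0.1, 0.2], [-0.3, 0.4]]
bias = [0.0, 0.1, -0.2, 0.3]
print(geglu_unfused(x, weight, bias))
print(geglu_fused(x, weight, bias))
```

On a GPU the win comes from skipping the round trip of the 2*d intermediate through memory; the results should match the unfused path up to floating-point noise.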
>>107797148
forgot an image
>>107796768
>>107797067
>[Elton John's "imagine" piano melody playing]
>*dum* *dum* *dum* *dum* *dum* *dum* *dum* *dum*
>*dururururumdum*
>Imagine there's weird industrial machinery in every roof
>*dururururumdum*
Lunch
>>107797095
here's one of my most recent workflows for reference:
https://files.catbox.moe/56hg3d.png
the auraflow node with shift param [supposedly] improves encoding; people typically use a range between 5-7
>>107797148
>>22.8ms
>>~43.86 it/s
how does this compare with the non-modular models?
>>107797191
>>Imagine there's weird industrial machinery in every roof
this is the future according to AI. it's not like we're using roofs for anything else
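On the shift param mentioned above: a minimal sketch, assuming the node applies the usual flow-matching time shift, sigma' = shift * sigma / (1 + (shift - 1) * sigma). That formula is an assumption about what the node does internally, and the function name is made up here. The sketch only shows how shift values in the 5-7 range push the sampling schedule toward the high-noise end.

```python
def time_shift(shift: float, t: float) -> float:
    """Flow-matching time shift; shift == 1.0 leaves the schedule unchanged."""
    if shift == 1.0:
        return t
    return shift * t / (1.0 + (shift - 1.0) * t)

# A uniform schedule from 1.0 (pure noise) down to 0.0 (clean image).
uniform = [i / 10 for i in range(10, -1, -1)]

for shift in (1.0, 5.0, 7.0):
    shifted = [time_shift(shift, t) for t in uniform]
    print(f"shift={shift}: " + " ".join(f"{s:.2f}" for s in shifted))
```

The endpoints stay at 1 and 0, while everything in between moves up, so more of the steps are spent at high noise levels where overall composition is decided.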
>>107796768
It will be a long project. I wish I was a dev but I can barely read books.
Now would be the time to capture the codebase I think.
>>107797365
Before the git gets erased or closed.
>>107797338
I just found it funny
>>107797338
>the non-modular models
wdym
>>107797487
dino transforms models into optimized standalone models, right? I'm always wondering about the benchmarks, non-optimized model vs optimized dino-module
>>107797525
ohh
>diffusers 23.05 it/s
>diffusers with torch.compile max-autotune fullgraph 29.29 it/s
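To put the numbers posted in this thread side by side, a small sketch that converts the 22.8ms step time to it/s and computes speedups relative to plain diffusers. It assumes all three figures were measured on the same setup (3090, SD 1.5 UNet, 512x512, batch 2), which the posts imply but don't state outright; the labels are just descriptive.

```python
def its_per_second(ms_per_step: float) -> float:
    """Convert a per-step latency in milliseconds to iterations per second."""
    return 1000.0 / ms_per_step

def speedup(candidate_its: float, baseline_its: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return candidate_its / baseline_its

# Figures as posted in the thread; 22.8ms is converted rather than re-typed.
results = {
    "diffusers": 23.05,
    "diffusers + torch.compile (max-autotune, fullgraph)": 29.29,
    "optimized standalone build": its_per_second(22.8),
}

baseline = results["diffusers"]
for name, its in results.items():
    print(f"{name}: {its:.2f} it/s ({speedup(its, baseline):.2f}x)")
```

That works out to roughly 1.9x over plain diffusers and about 1.5x over the torch.compile run.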
>update comfy
>workflow that worked before now ooms
wowzer!
>>107798245
Most of the 3rd party nodes were not using the API's memory management anyway.
It was more and more obvious with Chinese animation models.
>>107798280
e.g. they are wrappers.
see the stargs. these text goof-ups humor me
>>107798245
>he pulled
oh no
>>107798330
ah, my stocks seem to be doing well
>>107798488
very cool. would play (the pc port)
Entombed, Left Hand Path.
https://www.youtube.com/watch?v=weyYzWU-FNI
>>107799099
Yewtube has decided that my current song is this:<
>>107799158
https://www.youtube.com/watch?v=hdp7snFv7YI
https://www.youtube.com/watch?v=qeeOc8naPIg
https://suno.com/s/cqz2js53aL29kxGX
nice when it ignores my dumb lyrics and instead generates a decent 8 minute instrumental
>>107799340
idk why but the drum beat gives me early 90s vibes. fun energy
also, I'd like 4lbs of slame meat please
#wisdom
>>107799980
I approve this gen