Previous /sdg/ thread : >>107766236
>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
>Advanced UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge Classic: https://github.com/Haoming02/sd-webui-forge-classic
Stability Matrix: https://github.com/LykosAI/StabilityMatrix
>Z-Image Turbo
https://comfyanonymous.github.io/ComfyUI_examples/z_image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
https://huggingface.co/jayn7/Z-Image-Turbo-GGUF
>Flux.2 Dev
https://comfyanonymous.github.io/ComfyUI_examples/flux2
https://huggingface.co/black-forest-labs/FLUX.2-dev
https://huggingface.co/city96/FLUX.2-dev-gguf
>Qwen Image & Edit
https://docs.comfy.org/tutorials/image/qwen/qwen-image
https://huggingface.co/Qwen/Qwen-Image
https://huggingface.co/QuantStack/Qwen-Image-GGUF
https://huggingface.co/QuantStack/Qwen-Image-Distill-GGUF
https://huggingface.co/QuantStack/Qwen-Image-Edit-2509-GGUF
>Text & image to video - Wan 2.2
https://docs.comfy.org/tutorials/video/wan/wan2_2
https://huggingface.co/QuantStack/Wan2.2-TI2V-5B-GGUF
https://huggingface.co/QuantStack/Wan2.2-T2V-A14B-GGUF
https://huggingface.co/QuantStack/Wan2.2-I2V-A14B-GGUF
>Chroma
https://comfyanonymous.github.io/ComfyUI_examples/chroma
https://github.com/maybleMyers/chromaforge
https://huggingface.co/lodestones/Chroma1-HD
https://huggingface.co/silveroxides/Chroma-GGUF
>Models, LoRAs & upscaling
https://civitai.com
https://huggingface.co
https://tungsten.run
https://yodayo.com/models
https://www.diffusionarc.com
https://miyukiai.com
https://civitaiarchive.com
https://civitasbay.org
https://www.stablebay.org
https://openmodeldb.info
>Index of guides and other tools
https://rentry.org/sdg-link
>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/u/udg
>>>/tg/slop
>>>/trash/sdg
>>>/vp/napt
>>>/r/realistic+parody
Thanks for baking anon.
>>107785867
you're welcome. i am sorry, i had bad timing, didn't mean to take away the baking opportunity
>>107785902
It's okay :)
>mfw Resource news
01/06/2026
>LTX-2: DiT-based audio-video foundation model
https://github.com/Lightricks/LTX-2
>DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer
https://guoxu1233.github.io/DreamID-V
>Diffusion Toolkit v1.10
https://github.com/RupertAvery/DiffusionToolkit/releases/tag/v1.10
>NVIDIA RTX Accelerates 4K AI Video Generation on PC With LTX-2 and ComfyUI Upgrades
https://blogs.nvidia.com/blog/rtx-ai-garage-ces-2026-open-models-video-generation
>fp8_e4m3fn conversion of Gemma 3 12b it text encoder
https://huggingface.co/GitMylo/LTX-2-comfy_gemma_fp8_e4m3fn
>Diffuse: Windows desktop UI for Huggingface Diffusers
https://github.com/TensorStack-AI/Diffuse
>SwinIFS: Landmark Guided Swin Transformer For Identity Preserving Face Super Resolution
https://github.com/Habiba123-stack/SwinIFS
>PartImageNet++ Dataset: Enhancing Visual Models with High-Quality Part Annotations
https://github.com/LixiaoTHU/PartImageNetPP
>E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models
https://github.com/shengjun-zhang/VisualGRPO
>A Comprehensive Dataset for Human vs. AI Generated Image Detection
https://huggingface.co/datasets/Rajarshi-Roy-research/Defactify_Image_Dataset
>MagicFight: Personalized Martial Arts Combat Video Generation
https://MingfuYAN.github.io/MagicFight
01/04/2026
>Invoke AI 6.10 - now supports Z-Image Turbo
https://github.com/invoke-ai/InvokeAI/releases/tag/v6.10.0rc2
>ComfyUI Wan VACE Video Joiner
https://github.com/stuttlepress/ComfyUI-Wan-VACE-Video-Joiner
>UltraShape 1.0: High-Fidelity 3D Shape Generation via Scalable Geometric Refinement
https://pku-yuangroup.github.io/UltraShape-1.0
>OpenVINO AI Plugins for GIMP
https://github.com/intel/openvino-ai-plugins-gimp/releases/tag/3.2.0
>Comfyui-GeminiWeb
https://github.com/Koko-boya/Comfyui-GeminiWeb
>mfw Research news
01/06/2026
>NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation
https://arxiv.org/abs/2601.02204
>Unraveling MMDiT Blocks: Training-free Analysis and Enhancement of Text-conditioned Diffusion
https://arxiv.org/abs/2601.02211
>BiPrompt: Bilateral Prompt Optimization for Visual and Textual Debiasing in Vision-Language Models
https://arxiv.org/abs/2601.02147
>VIBE: Visual Instruction Based Editor
https://arxiv.org/abs/2601.02242
>ExposeAnyone: Personalized Audio-to-Expression Diffusion Models Are Robust Zero-Shot Face Forgery Detectors
https://mapooon.github.io/ExposeAnyonePage
>Agentic Retoucher for Text-To-Image Generation
https://arxiv.org/abs/2601.02046
>HyperCLOVA X 8B Omni
https://arxiv.org/abs/2601.01792
>Forget Less by Learning from Parents Through Hierarchical Relationships
https://arxiv.org/abs/2601.01892
>TalkPhoto: A Versatile Training-Free Conversational Assistant for Intelligent Image Editing
https://arxiv.org/abs/2601.01915
>MotionAdapter: Video Motion Transfer via Content-Aware Attention Customization
https://arxiv.org/abs/2601.01955
>AFTER: Mitigating the Object Hallucination of LVLM via Adaptive Factual-Guided Activation Editing
https://arxiv.org/abs/2601.01957
>GDRO: Group-level Reward Post-training Suitable for Diffusion Models
https://arxiv.org/abs/2601.02036
>VINO: A Unified Visual Generator with Interleaved OmniModal Context
https://sotamak1r.github.io/VINO-web
>DatBench: Discriminative, Faithful, and Efficient VLM Evaluations
https://arxiv.org/abs/2601.02316
>A Comparative Study of Custom CNNs, Pre-trained Models, and Transfer Learning Across Multiple Visual Datasets
https://arxiv.org/abs/2601.02246
>VAR RL Done Right: Tackling Asynchronous Policy Conflicts in Visual Autoregressive Generation
https://arxiv.org/abs/2601.02256
>DeepInv: A Novel Self-supervised Learning Approach for Fast and Accurate Diffusion Inversion
https://arxiv.org/abs/2601.01487
>mfw MORE Research news
>FFP-300K: Scaling First-Frame Propagation for Generalizable Video Editing
https://arxiv.org/abs/2601.01720
>Improving Flexible Image Tokenizers for Autoregressive Image Generation
https://arxiv.org/abs/2601.01535
>Unified Generation and Self-Verification for Vision-Language Models via Advantage Decoupled Preference Optimization
https://arxiv.org/abs/2601.01483
>Guiding Token-Sparse Diffusion Models
https://arxiv.org/abs/2601.01608
>Image Synthesis Using Spintronic Deep Convolutional Generative Adversarial Network
https://arxiv.org/abs/2601.01441
>Slot-ID: Identity-Preserving Video Generation from Reference Videos via Slot-Based Temporal Identity Encoding
https://arxiv.org/abs/2601.01352
>Luminark: Training-free, Probabilistically-Certified Watermarking for General Vision Generative Models
https://arxiv.org/abs/2601.01085
>Improved Object-Centric Diffusion Learning with Registers and Contrastive Alignment
https://arxiv.org/abs/2601.01224
>Evolving CNN Architectures: From Custom Designs to Deep Residual Models for Diverse Image Classification and Detection Tasks
https://arxiv.org/abs/2601.01099
>YODA: Yet Another One-step Diffusion-based Video Compressor
https://arxiv.org/abs/2601.01141
>NarrativeTrack: Evaluating Video Language Models Beyond the Frame
https://arxiv.org/abs/2601.01095
>CRoPS: A Training-Free Hallucination Mitigation Framework for Vision-Language Models
https://arxiv.org/abs/2601.00659
>TimeColor: Flexible Reference Colorization via Temporal Concatenation
https://bconstantine.github.io/TimeColor
>DynaDrag: Dynamic Drag-Style Image Editing by Motion Prediction
https://arxiv.org/abs/2601.00542
>FreeText: Training-Free Text Rendering in Diffusion Transformers via Attention Localization and Spectral Glyph Injection
https://arxiv.org/abs/2601.00535
desuarchive.org/g/thread/103782080
1 year ago
>>107786576
RIP dance anon
>>107786576
>First for schizo containment general appears this anon was finally defeated
>some comfy posts
maybe the last we'd ever get
>>107786690
we lost all the schizos except the one
>>107786781
???
gn
why comfy github repo returning 403 on fetch
>>107790572
gn
>>107790581
he moved it to the org
https://github.com/Comfy-Org/ComfyUI
There is much untruth in a dream
i miss schizo anon
>>107790664
It's working now, but when I posted that the org was "disabled", or so said the remote (GitHub).
Never seen that before except when repos get DMCA requests.
Cakes on the griddle
I am using ChatGPT to try to create this Weird Tales art style like the one on the left.
I was trying a prompt like this:
Prompt: 1930s Weird Tales pulp magazine cover art, commercial gouache illustration style. Flat opaque color application with simplified modeling, poster-aesthetic with naturalistic forms. Bold color zones with controlled edges, limited soft transitions only for form modeling. Strong value contrast, theatrical stage lighting, hard-edged cast shadows. Vintage four-color printing look, slight registration offset. Illustrative realism with posterized color planes.
A chaotic, night-time battle scene at a bandit camp. Sir Gottfried, wounded and bleeding, is a "whirlwind of silver and death," fighting multiple bandits simultaneously with his sword and shield. His pose is one of desperate, exaggerated heroism. In the background, there are burning tents and a crossbowman reloading in the shadows. The moon is a "sliver of bone," casting eerie light, while the fires create a warm, dramatic glow on the armor and faces.
Saturated dramatic lighting. 1930s adventure pulp, heroic fantasy illustration, magazine cover composition
NOT / Negative Prompt: NOT: oil painting, visible brushstrokes, impasto, heavy texture, painterly, blended edges, soft focus, atmospheric, chiaroscuro, tenebrism, academic painting, fine art, canvas texture, alla prima, glazing, scumbling, loose brushwork, impressionist, expressionist, textured surface, thick paint application, Rembrandt lighting, naturalistic rendering, photorealistic, digital art, concept art, artstation, modern illustration, 3D render, airbrushed, gradient mesh
but it produced the image on the right. What am I doing wrong? I imagine overprompting, but I'm not sure what to remove and what to include
more new captcha puzzles
3 tries to submit
Morning anons
Qwen made a huge quokka dog lmao.
>>107795929
happy work anniversary, office quokka dog
think i am gonna try to train, with civitai, a koff3 lora for flux. idk if flux is dead but whatever
>>107795929
morning
>>107795941
Train The Ninth Gate lora... from the book images...
>>107795941
could always try a z-image lora if you're worried about model relevance
>>107795940
That was back in December, today is just regular birthday
>>107795941
Nice pomni, morning
>>107796195
oh, happy birthday office quokka dog
>>107796233
Thank you Debo.
>>107795977
meh, i'm going to hold off anyway, the civitai site is annoying me. i guess there will be a base z-image released soon?, though i will be unable to run it locally anyway
>>107796274
>there will be a base z-image released soon?
no idea. there's been mixed messages about the base model release cuz they've suggested it's almost ready but then have been mostly silent otherwise
>>107796274
>though i will be unable to run it locally anyway
z-image is pretty small. have you tried the ggufs to see if they'd work locally? surely something can fit
model: https://huggingface.co/jayn7/Z-Image-Turbo-GGUF/tree/main
encoder: https://huggingface.co/unsloth/Qwen3-4B-GGUF/tree/main
>>107796274
If you have 6GB of VRAM and at least an Ampere card, you can run the fp16 (non-GGUF) one with 16GB of shared RAM. It's not a big deal.
GGUF Q8 takes even less memory but it is slightly slower as it is not hardware accelerated.
If I can run it, you can do it too.
Flux is way heavier.
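To put rough numbers on the fp16 vs GGUF Q8 point above, here is a back-of-envelope sketch of weight memory only. The parameter count (about 6 billion) is an assumption that is not stated in the thread, and the helper names are made up for illustration. The Q8_0 figure uses the GGUF block layout of 32 int8 weights plus one fp16 scale (34 bytes per 32 weights). Text encoder, VAE and activations are not counted.

```python
# Rough weight-memory estimate: fp16 vs GGUF Q8_0.
# ASSUMPTION: ~6e9 parameters for the image model (not given in the thread).
PARAMS = 6_000_000_000


def weight_bytes_fp16(n_params: int) -> int:
    """fp16 stores 2 bytes per weight."""
    return n_params * 2


def weight_bytes_q8_0(n_params: int) -> int:
    """Q8_0 stores blocks of 32 int8 weights plus one fp16 scale (34 bytes)."""
    return n_params * 34 // 32


def to_gib(n_bytes: int) -> float:
    """Convert bytes to GiB."""
    return n_bytes / 2**30


fp16_gib = to_gib(weight_bytes_fp16(PARAMS))
q8_gib = to_gib(weight_bytes_q8_0(PARAMS))

print(f"fp16 weights: {fp16_gib:.1f} GiB")
print(f"Q8_0 weights: {q8_gib:.1f} GiB")
```

Under that assumption the fp16 weights alone would not fit in 6GB of VRAM, which would be consistent with the post leaning on shared RAM for the fp16 model, while Q8 roughly halves the footprint.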
>>107796400
>>107796474
guess i'll start the process of installing and trying to use comfyui, again
>>107796507
New comfyui is a somewhat bad experience. I don't have a cutting edge version but it has memory management issues. This means it will OOM every few gens. It's not unusably bad but slightly so.
>>107796641
I'm not pulling as long as possible
hopefully someone forks a ComfyClassic and unfucks a lot of their bad decisions
well.. got it working, though i don't yet understand how to get the results i want
https://files.catbox.moe/jnbti2.jpg
rate the flow/offer guidance as far as the settings i should be using
done some performance optimizations, nice one is fused geglu, also added flash attention, be ready for merge soon but it's tested and working
>UNet2DConditionModel runwayml/stable-diffusion-v1-5 @ 512x512 with batch 2 (batch 1 CFG) on 3090
>22.8ms
>~43.86 it/s
very nice
i think i can get faster, haven't fused qkv yet, groupnorm and layernorm kernels aren't the best, maybe some more gemm/conv2d fusions are possible or maybe more tile description tuning, and try tuning concat and elementwise kernels more
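The two figures quoted above are the same measurement in different units. A minimal sketch of the conversion, with a hypothetical helper name:

```python
def iterations_per_second(latency_ms: float) -> float:
    """Convert per-iteration latency in milliseconds to iterations per second."""
    return 1000.0 / latency_ms


its = iterations_per_second(22.8)
print(f"{its:.2f} it/s")
```

So 22.8ms per UNet call works out to the ~43.86 it/s reported in the post.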
>>107797148
forgot an image
>>107796768
>>107797067
>[Elton John's "imagine" piano melody playing]
>*dum**dum**dum**dum**dum**dum**dum**dum*
>*dururururumdum*
>Imagine there's weird industrial machinery in every roof
>*dururururumdum*
Lunch
>>107797095
here's one of my most recent workflows for reference:
https://files.catbox.moe/56hg3d.png
the auraflow node with shift param [supposedly] improves encoding; people typically use a range between 5-7
>>107797148
>>22.8ms
>>~43.86 it/s
how does this compare with the non-modular models?
>>107797191
>>Imagine there's weird industrial machinery in every roof
this is the future according to AI. it's not like we're using roofs for anything else
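For the shift param mentioned above, here is a minimal sketch of the timestep shift commonly used with flow-matching samplers. Treat it as an assumption that the node applies a remap of this form; the thread only gives the 5-7 range, and the function name shift_sigma is made up for illustration.

```python
def shift_sigma(sigma: float, shift: float) -> float:
    """Remap a noise level in [0, 1]; shift > 1 pushes values toward 1 (more noise)."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


samples = [0.25, 0.5, 0.75]
for shift in (1.0, 5.0, 7.0):
    remapped = [round(shift_sigma(s, shift), 3) for s in samples]
    print(f"shift={shift}: {remapped}")
```

With shift=1 the schedule is unchanged. Larger values keep the endpoints at 0 and 1 but spend more of the schedule at high noise levels, which is one plausible reading of why the 5-7 range gets recommended.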
>>107796768
It will be a long project. I wish I was a dev but I can barely read books.
Now would be the time to capture the codebase I think.
>>107797365
Before the git gets erased or closed.
>>107797338
I just found it funny
>>107797338
>the non-modular models
wdym
>>107797487
dino transforms models into optimized standalone models, right? I'm always wondering about the benchmarks, non-optimized model vs optimized dino-module
>>107797525
ohh
>diffusers 23.05 it/s
>diffusers with torch.compile max-autotune fullgraph 29.29 it/s
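A quick sketch of what those baselines imply against the ~43.86 it/s figure posted earlier in the thread. This assumes all three numbers were measured on the same card with the same settings, which the posts suggest but do not state outright.

```python
# Throughput figures quoted in the thread, in it/s.
OPTIMIZED = 43.86
BASELINES = {
    "diffusers": 23.05,
    "diffusers + torch.compile (max-autotune, fullgraph)": 29.29,
}


def speedup(new_its: float, old_its: float) -> float:
    """Ratio of new throughput to old throughput."""
    return new_its / old_its


speedups = {name: speedup(OPTIMIZED, its) for name, its in BASELINES.items()}
for name, ratio in speedups.items():
    print(f"vs {name}: {ratio:.2f}x")
```

That is roughly 1.9x over plain diffusers and about 1.5x over the torch.compile run.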
>update comfy
>workflow that worked before now ooms
wowzer!
>>107798245
Most of the 3rd party nodes were not using the API's memory management anyway.
It was more and more obvious with Chinese animation models.
>>107798280
eg. They are wrappers.
see the stargs. these text goof-ups humor me
>>107798245
>he pulled
oh no
>>107798330
ah, my stocks seem to be doing well
>>107798488
very cool. would play (the pc port)
Entombed - Left Hand Path.
https://www.youtube.com/watch?v=weyYzWU-FNI
>>107799099
Yewtube has decided that my current song is this:<
>>107799158
https://www.youtube.com/watch?v=hdp7snFv7YI
https://www.youtube.com/watch?v=qeeOc8naPIg
https://suno.com/s/cqz2js53aL29kxGX
nice when it ignores my dumb lyrics and instead generates a decent 8 minute instrumental
>>107799340
idk why but the drum beat gives me early 90s vibes. fun energy
also, I'd like 4lbs of slame meat please
#wisdom
>>107799980
I approve this gen
>>107785765
catbox?
>>107801484
Sent you a pm.
>>107801730
i have this in my files but i dont remember why i made it
>>107801956
I don't think you have ever announced a specific reason for your 24/7 spam.
>>107801968
ok
>>107801956
Thats.... Interesting.
>>107802011
>ran took everything from me
>>107802196
>Interesting
EYE thought so too, heh heh heh
>>107802248
EYE SEE what you did there lol. What's good man. Very ominous gen.
>>107802259
nm, just makin it through the week. slowly but surely. are you back in your groove?
>Very ominous gen.
I grabbed an interesting lora earlier in the week that I've been having fun with. it adds a neat mood and interpretation to gens
>>107802253
he tried to warn us...
>>107802354
Yep, I'm back in my groove. Had a very productive few days at work so far. I'm one of these people who can't stand to be idle. I don't understand how people can sit in front of a television and binge watch episodes of a TV show for hours on end. I have to be up and moving. Anyway, yeah, those are some pretty cool gens. What exactly is the Lora theme?
>>107802405
Thank you for letting us know.
>>107802411
You're welcome. Didn't mean to strike a nerve if you're one of these people who can watch long bouts of television shows or movies or anything. I didn't mean to imply that those people were lazy. Just that I don't have the capacity for that.
>>107802405
>I'm one of these people who can't stand to be idle.
good for you. I'm always kind of envious of people who always keep moving. my natural state is more sedentary
>What exactly is the Lora theme?
it's supposed to be 'cassette futurism' or something like that. I didn't quite get the original aesthetic I was aiming for out of it but have been getting lots of other cool stuff. did you see the space gens earlier in the thread? I thought those were super cool
>>107802468
Ah, I see. That's a cool idea for a Lora. Anyway, you talk frequently about going to the gym, so you're at least combating your sedentary nature with some healthy habits.
>>107802504
I do stuff, but there's always a barrier in front of doing stuff. doing stuff is more of a duty than a desire. I wish I had that innate pull towards motion. if I did, I prob would have conquered the world by now
but for now, I'm being innately pulled towards my pillows. gn
>>107802523
I get it, but you should embrace who you are and the positive qualities that you have. Everyone has negative qualities, but wishing they were different is just to harp on yourself in a negative manner. I'm sure you have a ton of good qualities that you can espouse and be proud of. Anyway, have a good night's sleep.
fuckin rip
https://files.catbox.moe/ycwd3z.jpg
G'mornin Anons, have a great day!
>>107803942
Needs more jpeg
>>107803942
she's now a lizard
yo niggas what's the go to realistic 1girl sloppa nowi used pony a year ago and updated and it's worse now, not interested in video gen (only image) but never messed with the funny qwen and flux and chinese models and all
>>107804603
ZIT is based but you have to go a little out of your way to avoid sameface + samebackground + sameangle. It also doesn't change much if you randomize seeds if you use it raw
>>107804603
>>107804617
zit and chroma are best for realism
>>107804617damn. sameeverything is always annyoing>>107804628realism or actual photorealism? seems like everyone forgot how to train skin tones that don't look like a doll in the past year. goddamn chink datasets
>>107804628
I like z-image but can't do lora easily.
qwen is also nice.
the more elaborate the captchas are, the more intelligent we will become. no longer are captchas being used to train AI, they're being used to train ourselves
>>107805685
i'm still quite unable to just glance and see the differences in the star patterns
it's pretty annoying, mobile is even worse
>>107805753
one of my laptops only has to do 1 captcha puzzle and it's the easiest one. it seems like maybe there's some cookie or some useragent that can trigger easier captchas.....
>>107805825
I'm more of a jameson guy if I had to go with big brands, though I've never met a whiskey I didn't like
>>107805887
I didn't ask.
>>107805887
I don't know, I'm speculating it is Cloudflare that calculates your IP's risk factor, plus it is based on the age of the cookies too. We are Amazon Turk workers now...
gm
using fal . ai, queued up a lora, trained on a few punk rock album covers, for zit, will see if it works
>>107805994
Cute
>>107805994
cool, hope it turns out well!
>>107806027
>budget keanu
that pizza looks good. I want a pizza...
prompting like 'there is ___, there is ___, there is ___' random things
>>107806154
can you try '___ is occluded by a ___ in the foreground'?
>>107806264
i'll try
Morning anons
It was a fun birthday yesterday :)
>>107806513
gm
lora possibly turned out alright
>>107806581
nice. that seemed fast
>>107785765which one of those lets me make porn and also I want Lara Croft Legend porn so bad
>>107806621
fal . ai seems to run pretty quick, much faster than civitai, which is an all-day process of waiting
>>107807035
what are your thoughts about z-image so far?
>>107807081
thoughts are limited to: upgrade from flux, loras seem essential
hey anons, sorry if this isn't the place to ask (pls let me know where instead):
what's the best way to make myself a "live" 2d/3d avatar for youtube videos where i'm just talking with different backgrounds like gaming and/or text/articles?
imagine a design as simple as imu (pic related) that's just mimicking myself talking to the camera during the video, is that explanation good enough?
i know people have been doing it with live2d and/or the free 3d thing but it's between mega expensive shit and super ugly
thank you and have an awesome day/night
>>107807234
unless there is a better place on here, you will probably be better off having chatgpt help you set that up.
>>107807234
you want to make an avatar of yourself and use it to generate youtube content?
>>107807234
avatar generation is a very well-traversed topic, but it's not something much talked about here so I can't tell you what is "the best way". what you can try is checking out avatar spaces on huggingface and seeing if any of the popular/active projects meet your needs
https://huggingface.co/spaces?q=avatar
otherwise, here are a few of the more recent avatar tools I have links for. again, unsure which perform best:
https://huggingface.co/meituan-longcat/LongCat-Video-Avatar
https://liveavatar.github.io
https://github.com/AA-Factory/aafactory
ought eye bake
>>107807180>Anal Cunt: I just Saw The Gayest Guy On Earth
>>107807627
>>107807627
>>107807627
>>107807632
god damnit i forgot subject
>>107807370
yes, it won't look like me at all though, the whole persona will be more of a shadow thing, think TheBurntPeanut but less zoomer (i also considered his Snapchat filter method but it only looks good on streams where the avatar can be wonky 24/7)
>>107807428
thanks, i'm very new at the whole AI creation thing (only have made 2-3 static anime images with comfyui like months ago) but aren't most of these models trying to mimic humans? my intent is more of an object and/or thing like a blob or a slime or a cat/bear without a background
Thanks!