/g/ - Technology

User Error Edition

Discussion and Development of Local Image and Video Models

Previous: >>108524999

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
>mfw Resource news

04/04/2026

>STAGE: Storyboard-Anchored Generation for Cinematic Multi-shot Narrative
https://github.com/escapistmost/Storyboard-Anchored-Generation

>Regularizing Attention with Bootstrapping
https://github.com/ncchung/AttentionRegularization

>LTX2.3-Multifunctional: Functionality optimization based on LTX desktop version
https://github.com/hero8152/LTX2.3-Multifunctional

>Gemma 4 31B IT NVFP4 model is quantized with NVIDIA Model Optimizer
https://huggingface.co/nvidia/Gemma-4-31B-IT-NVFP4

>AP Netflix VOID – ComfyUI Custom Nodes
https://github.com/adampolczynski/AP_Netflix_VOID

04/03/2026

>JoyAI-Image: Awakening Spatial Intelligence in Unified Multimodal Understanding and Generation
https://github.com/jd-opensource/JoyAI-Image

>Netflix VOID: Video Object and Interaction Deletion
https://huggingface.co/netflix/void-model

>OmniWeaving: Towards Unified Video Generation with Free-form Composition and Reasoning
https://huggingface.co/tencent/HY-OmniWeaving

>Bias mitigation in graph diffusion models
https://github.com/kunzhan/spp

>Control-DINO: Feature Space Conditioning for Controllable Image-to-Video Diffusion
https://dedoardo.github.io/projects/control-dino

>FlowSlider: Training-Free Continuous Image Editing via Fidelity-Steering Decomposition
https://huggingface.co/spaces/dominoer/FlowSlider

>SafeRoPE: Risk-specific Head-wise Embedding Rotation for Safe Generation in Rectified Flow Transformers
https://github.com/deng12yx/SafeRoPE

>NearID: Identity Representation Learning via Near-identity Distractors
https://gorluxor.github.io/NearID

>Generative World Renderer
https://alaya-studio.github.io/renderer

>Universal Hypernetworks for Arbitrary Models
https://github.com/Xuanfeng-Zhou/UHN

>InTraGen: Trajectory-controlled Video Generation for Object Interactions
https://github.com/insait-institute/InTraGen

>SDXL Node Merger: A visual, node-based model merging tool for Stable Diffusion XL
https://github.com/georgebanjog/sdxl-node-merger
>>
No joke, SaaS models hoarding their weights are causing a second dark age. Just imagine how far ahead we'd be in AI if everyone actually released their stuff.
>>
>mfw Research news

04/04/2026

>PhysVid: Physics Aware Local Conditioning for Generative Video Models
https://arxiv.org/abs/2603.26285

>GaussianGPT: Towards Autoregressive 3D Gaussian Scene Generation
https://nicolasvonluetzow.github.io/GaussianGPT

>From Natural Alignment to Conditional Controllability in Multimodal Dialogue
https://arxiv.org/abs/2603.29162

>RecycleLoRA: Rank-Revealing QR-Based Dual-LoRA Subspace Adaptation for Domain Generalized Semantic Segmentation
https://arxiv.org/abs/2603.28142

>LogiStory: A Logic-Aware Framework for Multi-Image Story Visualization
https://arxiv.org/abs/2603.28082

>ConceptWeaver: Weaving Disentangled Concepts with Flow
https://arxiv.org/abs/2603.28493

>IP-Bench: Benchmark for Image Protection Methods in Image-to-Video Generation Scenarios
https://arxiv.org/abs/2603.26154

>AIBench: Evaluating Visual-Logical Consistency in Academic Illustration Generation
https://arxiv.org/abs/2603.28068

>Engineering Mythology: A Digital-Physical Framework for Culturally-Inspired Public Art
https://arxiv.org/abs/2603.27801

>ANVIL: Accelerator-Native Video Interpolation via Codec Motion Vector Priors
https://arxiv.org/abs/2603.26835

>Understanding Semantic Perturbations on In-Processing Generative Image Watermarks
https://arxiv.org/abs/2603.27513

>ObjectMorpher: 3D-Aware Image Editing via Deformable 3DGS Models
https://arxiv.org/abs/2603.28152

>LongCat-Next: Lexicalizing Modalities as Discrete Tokens
https://arxiv.org/abs/2603.27538

>On Token's Dilemma: Dynamic MoE with Drift-Aware Token Assignment for Continual Learning of Large Vision Language Models
https://zhaoc5.github.io/DyMoE

>A Comprehensive Information-Decomposition Analysis of Large Vision-Language Models
https://riishin.github.io/pid-lvlm-iclr26

>Explaining CLIP Zero-shot Predictions Through Concepts
https://arxiv.org/abs/2603.28211

>WorldFlow3D: Flowing Through 3D Distributions for Unbounded World Generation
https://light.princeton.edu/worldflow3d
>>
>>108528957
>doomposting about saas
Just use API Nodes

MYTH: api models are censored
FACT: api models are less censored than local models and are in fact trained on NSFW imagery

MYTH: api models are too expensive
FACT: it's actually quite cheap to use API through ComfyUI API Nodes. the price for api has gone down compared to the price of hardware

MYTH: api nodes collect your data and are unsafe to use
FACT: api is safer than local because nothing is stored on your hard drive. with local models, you need to download hundreds of loras and custom nodes, any of which could be infected

MYTH: an api can pull the plug at any time, why use something like that?
FACT: everything you generate can be saved to your desktop so nothing is lost

MYTH: it's impossible to train a custom style of character with api, loras make local way better
FACT: api can learn any style or character with a single image reference, which is much faster and smarter than loras

MYTH: if i buy api credits and don't like the model, that's money wasted
FACT: ComfyUI's API Nodes credit system allows you to prompt hundreds of cutting-edge api models. the credits are shared between models so you aren't locked into any one ecosystem

MYTH: api users are poor and from third world countries
FACT: the top hollywood productions and anime studios all use api models. api is the weapon of choice for everyone world-wide

MYTH: discussion of api models is off-topic
FACT: api models are part of the comfyui experience and are relevant to this thread. combining api models with local workflows is still local
>>
File: 1747666648534082.png (1.66 MB, 1168x1704)
>>
>of all the images in the previous thread, these were the ones baker selected

lmao
>>
>>108528964
>MYTH: an api can pull the plug at any time, why use something like that?
>FACT: everything you generate can be saved to your desktop so nothing is lost
this one is just retarded, pretty good otherwise
>>
>>108528988
it's important to free yourself from over-reliance on one specific model. we see it with the sdxl cult, the chromakeks, the mentally ill gpt-4 addicts, etc. the great thing about API models is when one shuts down, 3 better ones take its place. sora may be shutting down, but better models like seedance are emerging. this is why it's extremely important to use API nodes instead of subscribing directly to openAI. with API nodes, your credits are never lost and can be used on any model.
>>
Blessed thread of frenship
>>
has ran killed himself yet?
>>
ive had moderate success using OFTv2 for training noob loras. it doesn't overfit to the style and stays mostly coherent on unseen characters. or maybe that's just because the number of epochs is low, i usually do 3
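one thing worth sanity-checking before blaming OFT vs lora: with a small dataset, 3 epochs can mean very few optimizer steps, which underfits rather than bakes in a style. a minimal sketch of the kohya-style step math (all numbers below are hypothetical placeholders, not a recommended config):

```python
# Rough training-length math for a LoRA/OFT run.
# steps per epoch = (images * repeats) // batch_size, kohya-style.

def total_steps(num_images: int, repeats: int, epochs: int, batch_size: int) -> int:
    """Optimizer steps for the whole run."""
    steps_per_epoch = (num_images * repeats) // batch_size
    return steps_per_epoch * epochs

# e.g. 40 images, 10 repeats, batch 4, 3 epochs -> 300 steps
print(total_steps(40, 10, 3, 4))
```

if that comes out in the low hundreds, low epoch count is doing as much work as the adapter choice.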
>>
Does anyone have configs to share for a wan or ltx lora? There are barely any resources for I2V loras.
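for lack of shared configs, here is the general shape of a video-dataset config for an I2V lora run, loosely modeled on musubi-tuner's TOML dataset files. every field name and value here is an assumption — check the trainer's own docs before using; this only shows the moving parts (resolution, frame buckets, repeats):

```python
# Writes a hypothetical I2V dataset config. Field names are assumptions
# based on musubi-tuner's documented TOML shape, not verified values.

dataset_toml = """\
[general]
resolution = [512, 512]
caption_extension = ".txt"
batch_size = 1

[[datasets]]
video_directory = "/path/to/clips"
cache_directory = "/path/to/cache"
target_frames = [1, 25, 45]
frame_extraction = "head"
num_repeats = 1
"""

with open("i2v_dataset.toml", "w") as f:
    f.write(dataset_toml)
print("wrote i2v_dataset.toml")
```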
>>
>>108529159
you won't find that here. we only fling feces around like monkeys in this general
>>
>we
>>
File: 1745517288762417.jpg (242 KB, 850x480)
>try wan 2.2 image to video tutorial workflow in comfyui
>immediately starts using SWAP and slows my system down
>VRAM only at at 5GB out of 24 (according to system monitor)
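one hedged mitigation while dynamic memory is missing on your setup: ComfyUI has launch flags that change how aggressively it caches models in RAM and how it loads weights. flag availability varies by build, so confirm with `python main.py --help` first:

```shell
# Hypothetical launch line for ComfyUI spilling into swap:
# --disable-smart-memory frees model weights from RAM more aggressively,
# --lowvram splits model loading to reduce peak memory use.
# Verify both flags exist in your build before relying on them.
CMD="python main.py --disable-smart-memory --lowvram"
echo "$CMD"
```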
>>
>>108529229
prompt?
>>
>>108529229
>starts using SWAP
Amd?
This was fixed with dynamic memory.
>>
File: 1759783423067820.png (1.76 MB, 1072x1880)
>>
>>108529183
Thought I'd at least try but it does seem unfortunate
>>
>>108529247
>Amd
Yes
>>
File: 00126-1690926911.jpg (877 KB, 1664x2432)
>>
>>108529489
ToT
>>
File: 1761993571768767.png (1.67 MB, 1256x1704)
>>
>>108529352
Joever.
I don't think Comfy has any reason to bother fixing traditional memory allocation now that the superior method is in place, so you gotta wait until (if) he implements dynamic memory for AMD.
>>
I don't know how finetuners train their models, but the real challenge is avoiding two types of narcissism:
-the ones who train for clout and overtrain their models on CivitAI slop,
-the ones who finetune based on whatever images they personally like while ignoring whether the model can actually do anything.
>>
how do i become a professional diffuser?
>>
I think I found a more objective way to judge whether a model is actually good or not, and that is by testing it with img2img without using ControlNet.
Badly or poorly trained models tend to have various blind spots when it comes to seeing, interpreting, and transforming an already existing image
>>
And that's where, after testing, I realized that WAI 16 (SDXL) and Anima are the only two anime models that can reliably convert different kinds of realistic or semi-realistic illustrations into anime without falling apart or showing blind spots.

All the other popular shit merges, including Chenkin and Noob, have noticeable blind spots, whether in scene composition, background consistency, or, worst of all, character positioning and body parts.
>>
Noob variants struggle with img2img unless the input is already a clean anime style image. They do not understand fingers, they fail at rendering pupils, and they break down as soon as they are pushed outside the type of data they were trained on. That tells me these are still immature models.
>>
Some might argue that these models were trained only on anime, and that trying to convert realistic images to anime through img2img is obviously not what they are meant for.
But that argument does not really hold up. This is exactly where Anima proves its value. It is a model trained specifically on anime, just like WAI, yet both are capable of adapting to almost any input image, regardless of style or scene. They see the image, understand it, and reinterpret it in an anime way without falling apart, while still respecting the initial input.
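one caveat for anyone reproducing this test: the img2img result depends heavily on denoise strength — too low and any model just traces the input, too high and even a good model ignores it. in diffusers-style img2img the strength parameter decides how many of the scheduled steps actually run; a sketch of that mapping (the rounding mirrors diffusers' behavior, but treat the exact formula as an assumption for your pipeline):

```python
# How denoise strength maps to actual sampling work in a typical
# diffusers-style img2img pipeline: the input image is noised partway
# up the schedule and only the remaining steps are denoised.

def img2img_steps(num_inference_steps: int, strength: float) -> int:
    """Number of denoising steps actually executed for a given strength."""
    return min(int(num_inference_steps * strength), num_inference_steps)

for s in (0.3, 0.6, 0.9):
    print(s, img2img_steps(28, s))
```

so a fair comparison between models means pinning strength, steps, and seed, not just eyeballing one gen each.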
>>
File: 1772320734907389.png (1.59 MB, 1432x1432)
>>
>>108529640
There's been progress on AMD support, though I think it's currently waiting on a couple of external bugs to be fixed for it to work right.

https://github.com/Comfy-Org/comfy-aimdo/pull/2
>>
File: 00548-913746885.jpg (793 KB, 1664x2432)
main model: NoobAI-XL vpred
refiner: WAI-realism (LOL)

i mostly gen with anima now but i kinda miss that noob style.... -_-
>>
File: ComfyUI_00080_.png (3.23 MB, 1248x1824)
>>108529944
just main pass anima and upscale with noob. You have to prompt it differently, but it can really lock in style and bring out texture.
>>
>>108529745
Do it for pleasure not for ego
>>
>>108529992
I actually do exactly that currently.
I still think Anima favors simplicity, while the last official version of Noob is somewhat overtuned and favors high detail/complexity.....
>>
>>108530101
>I still think Anima favors simplicity, while the last official version of Noob is somewhat overtuned and favors high detail/complexity.....
Can you be more specific?
>>
>>108529831
Would you be willing to share your prompt?
>>
File: 1748361619147590.png (1.8 MB, 1168x1792)
>>108530140
it was something like https://pastebin.com/tmjA5vaJ
>>
>>108530117
Well, not being able to blend artist styles in quite the same way as SDXL is a pain in the ass. I'd like to reuse my prompts from when Noob was the hottest model here, but it's just not the same and you have to rewrite them. The seamless blending of styles was a side effect of CLIP. Anima can blend styles, but it seems like one really dominates.

I think Noob still has better lighting and special effects, I use a fair amount of those. Depth of field, etc all that stuff.
>>
File: 858568782472427.jpg (1.75 MB, 1664x2432)
daily anima gen
>>
File: 1769103042479694.png (2.98 MB, 1920x1080)
>>
File: 0679485683583.jpg (2.46 MB, 1664x2432)
>>
File: 1760114763583281.png (2.55 MB, 1920x1080)
>>
File: deNS_zi_00043_.png (3.7 MB, 1663x1164)
>>
Netayumesisters.....
>>
File: 1753426776398516.png (2.82 MB, 1920x1080)
>>
File: 86479357835683568.jpg (1.39 MB, 1664x2432)
>>
This is probably a massive leap of faith, but is there any good, local, 3d model generation? One that can even generate animatable stuff such as characters, and animations for it.
>>
File: 1754896974466774.png (3.01 MB, 1920x1080)
>>108530734
not really
>>
File: 1766412509021832.png (1.14 MB, 1024x1024)
>>108528965
>>108529314
>>108529831
Cute
>>
File: 1763746277223535.png (3.16 MB, 1920x1080)
>>
>>108530777
why are you praising yourself? are you schizophrenic?
>>
File: 1771821502546281.png (1.45 MB, 1024x1024)
Is there a good tag autocomplete for comfy? Typing out character \(series\) is tedious, especially for Fate shit.

>>108530813
I'm a different anon thoughever.
>>
>>108531056
Perhaps https://github.com/newtextdoc1111/ComfyUI-Autocomplete-Plus
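if the node doesn't fit your setup, the core of tag autocomplete is simple enough to script yourself: prefix-match against a sorted danbooru tag list and escape parentheses the way ComfyUI prompts expect. a minimal sketch — the tag list here is a tiny placeholder for the usual CSV dump:

```python
import bisect

# Tiny stand-in for a danbooru tag dump (normally ~100k tags from a CSV).
TAGS = sorted([
    "artoria_pendragon_(fate)",
    "artoria_pendragon_(lancer)_(fate)",
    "ishtar_(fate)",
    "hatsune_miku",
])

def escape(tag: str) -> str:
    """Escape parens so ComfyUI doesn't parse them as attention weights."""
    return tag.replace("(", "\\(").replace(")", "\\)")

def complete(prefix: str, limit: int = 5) -> list[str]:
    """Return escaped tags starting with prefix, via binary search."""
    i = bisect.bisect_left(TAGS, prefix)
    out = []
    while i < len(TAGS) and TAGS[i].startswith(prefix) and len(out) < limit:
        out.append(escape(TAGS[i]))
        i += 1
    return out

print(complete("artoria"))
```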
>>
File: 1770267914494594.png (287 KB, 562x805)
Gemma 4 is definitely competent and based in captioning, please google, give us a local model as well :( >>108531320
>>
>>108528965
>>108529314
>>108529530
>>108529831
This style is so sick. What model and loras are you using?
>>
>>108530513
>>108530514
>>108530568
Daily anima fud then
>>
File: 1764997542240478.webm (3.68 MB, 1920x1056)
Babe wake up, they released a decent local world model
https://xcancel.com/Skywork_ai/status/2039305679966720411
>>
File: schizo.png (140 KB, 1150x312)
>>108531438
Daily Anifart fud then


