/g/ - /ldg/ - Local Diffusion General - Technology


File: highlights_g_108664784_17(...).jpg (2.47 MB, 3418x4647)

/ldg/ - Local Diffusion General Anonymous 04/23/26(Thu)11:09:13 No.108668921

Discussion and Development of Local Image and Video Models

Previous: >>108664784

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon

Anonymous 04/23/26(Thu)11:11:30 No.108668934

is it over or are we back

Anonymous 04/23/26(Thu)11:13:39 No.108668948

File: 1755007656454961.jpg (392 KB, 1448x1086)

Anonymous 04/23/26(Thu)11:13:41 No.108668949

File: ComfyUI_10703_.png (368 KB, 1024x1024)

Anonymous 04/23/26(Thu)11:14:37 No.108668954

>>108668948
why is it so brown

Anonymous 04/23/26(Thu)11:15:34 No.108668958

>>108668954
ghibli niggers did this

Anonymous 04/23/26(Thu)11:16:48 No.108668960

>>108668948
>GreasePT

Anonymous 04/23/26(Thu)11:18:27 No.108668972

Why is civitai full of new accounts literally named "abc123abc" commenting on every single z-image lora asking for an Ernie version? For fuck's sake, just take a look at the Commodore64 lora for Ernie. It's disgusting, it makes me want to puke just looking at the images.

Anonymous 04/23/26(Thu)11:19:29 No.108668978

my gpu fans are starting to rattle. the end is near

Anonymous 04/23/26(Thu)11:19:45 No.108668983

>>108668948
get out! >>108653190

Shankism on Discord 04/23/26(Thu)11:21:22 No.108668989

>civitai split between red boards and blue/green boards

Anonymous 04/23/26(Thu)11:29:40 No.108669029

QRD on Ernie? Is it a meme or can it actually save local?

Anonymous 04/23/26(Thu)11:30:43 No.108669034

>>108669029
infographic generator

Anonymous 04/23/26(Thu)11:31:22 No.108669037

>>108668948
that's a lot of inpainting and many hours in gimp

Anonymous 04/23/26(Thu)11:35:27 No.108669058

File: image.png (32 KB, 373x418)

>>108668972
chinks shill army nothing new
they are also shilling chink models in r/localllama right now

Anonymous 04/23/26(Thu)11:36:38 No.108669064

File: 1760920978918124.png (26 KB, 119x119)

>>108668954
the room was prompted to be bathed in warm light with a dusty color palette because it looks cozy
>>108669037
facts. i really like what it did with grok's coffee cup

Anonymous 04/23/26(Thu)11:38:09 No.108669070

Why do ai images look like ai? I can't pin down the exact reason. How can you make your gens look less like ai?

Anonymous 04/23/26(Thu)11:42:27 No.108669088

>mfw Resource news

04/23/2026

>ParetoSlider: Diffusion Models Post-Training for Continuous Reward Control
https://shelley-golan.github.io/ParetoSlider-webpage

>DynamicRad: Content-Adaptive Sparse Attention for Long Video Diffusion
https://github.com/Adamlong3/DynamicRad

>Normalizing Flows with Iterative Denoising
https://github.com/apple/ml-itarflow

>LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model
https://github.com/inclusionAI/LLaDA2.0-Uni

>Illustrious XL & NoobAI-XL Style Explorer
https://github.com/ThetaCursed/Illustrious-NoobAI-Style-Explorer

>AI Model & ‘MAGA’ Influencer Emily Hart Unmasked as Indian Man
https://www.yahoo.com/news/articles/ai-model-maga-influencer-emily-091027504.html

04/22/2026

>Embedding Arithmetic: A Lightweight, Tuning-Free Framework for Post-hoc Bias Mitigation in Text-to-Image Models
https://github.com/cvims/EMBEDDING-ARITHMETIC

>Denoising, Fast and Slow: Difficulty-Aware Adaptive Sampling for Image Generation
https://github.com/CompVis/patch-forcing

>TS-Attn: Temporal-wise Separable Attention for Multi-Event Video Generation
https://github.com/Hong-yu-Zhang/TS-Attn

>AnyRecon: Arbitrary-View 3D Reconstruction with Video Diffusion Model
https://yutian10.github.io/AnyRecon

>SmartPhotoCrafter: Unified Reasoning, Generation and Optimization for Automatic Photographic Image Editing
https://github.com/vivoCameraResearch/SmartPhotoCrafter

>Soft Label Pruning and Quantization for Large-Scale Dataset Distillation
https://github.com/he-y/soft-label-pruning-quantization-for-dataset-distillation

>Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation
https://github.com/AMAP-ML/EMF

>Enhancing Continual Learning of Vision-Language Models via Dynamic Prefix Weighting
https://github.com/YonseiML/dpw

>IR-Flow: Bridging Discriminative and Generative Image Restoration via Rectified Flow
https://github.com/fanzh03/IR-Flow

Anonymous 04/23/26(Thu)11:43:05 No.108669089

>>108669070
put "AI" in the negative prompt

Anonymous 04/23/26(Thu)11:43:27 No.108669090

>mfw Research news

04/23/2026

>Image Generators are Generalist Vision Learners
http://vision-banana.github.io

>Camera Control for Text-to-Image Generation via Learning Viewpoint Tokens
https://randdl.github.io/viewtoken_control

>Hallucination Early Detection in Diffusion Models
https://arxiv.org/abs/2604.20354

>Wan-Image: Pushing the Boundaries of Generative Visual Intelligence
https://arxiv.org/abs/2604.19858

>MMCORE: MultiModal COnnection with Representation Aligned Latent Embeddings
https://arxiv.org/abs/2604.19902

>Rethinking Where to Edit: Task-Aware Localization for Instruction-Based Image Editing
https://arxiv.org/abs/2604.20258

>Amodal SAM: A Unified Amodal Segmentation Framework with Generalization
https://arxiv.org/abs/2604.20748

>FluSplat: Sparse-View 3D Editing without Test-Time Optimization
https://arxiv.org/abs/2604.20038

>HumanScore: Benchmarking Human Motions in Generated Videos
https://arxiv.org/abs/2604.20157

>Render-in-the-Loop: Vector Graphics Generation via Visual Self-Feedback
https://arxiv.org/abs/2604.20730

>Mitigating Hallucinations in Large Vision-Language Models without Performance Degradation
https://arxiv.org/abs/2604.20366

>Cognitive Alignment At No Cost: Inducing Human Attention Biases For Interpretable Vision Transformers
https://arxiv.org/abs/2604.20027

>X-Cache: Cross-Chunk Block Caching for Few-Step Autoregressive World Models Inference
https://arxiv.org/abs/2604.20289

>Self-supervised pretraining for an iterative image size agnostic vision transformer
https://arxiv.org/abs/2604.20392

>Efficient INT8 Single-Image Super-Resolution via Deployment-Aware Quantization and Teacher-Guided Training
https://arxiv.org/abs/2604.20291

>From Diffusion to Flow: Efficient Motion Generation in MotionGPT3
https://arxiv.org/abs/2603.26747

Anonymous 04/23/26(Thu)11:43:43 No.108669092

>>108669037
that's basically what image 2 is doing.
it's a second pass that projects the text onto the genned image. the easiest way to spot it is on clothing: the X, for example, is just sitting on her dress. it's an almost pixel-perfect match for the X on the laptop.

Anonymous 04/23/26(Thu)11:44:00 No.108669093

>>108669070
Hire an artist to paint over it.

Anonymous 04/23/26(Thu)11:46:35 No.108669107

>>108669092
why can't local models do that?

Anonymous 04/23/26(Thu)11:50:02 No.108669117

>>108669088
>>108669090
thanks

Anonymous 04/23/26(Thu)11:52:27 No.108669129

>>108669107
they probably can but nobody is developing the tooling for it, at least not in a user friendly way

Anonymous 04/23/26(Thu)11:53:28 No.108669135

File: Untitled-1.png (191 KB, 586x664)

>>108669107
probably because they don't care, it's a parlor trick to impress indians and boomer investors. sorry to pull the curtain back.
case in point, the gen uses the same X, it just has a slight skew on the dress. same with the OpenAI logo, it's just sitting on her shirt.

Anonymous 04/23/26(Thu)11:54:15 No.108669137

>>108669117
Put the phone away fag

Anonymous 04/23/26(Thu)12:02:44 No.108669182

>>108669093
Gay
>>108669089
There is no way it's that simple. But now that I think of it, putting in tags like "masterpiece" seems to help

Anonymous 04/23/26(Thu)12:03:44 No.108669190

File: image.png (44 KB, 687x567)

>>108669137
?

Anonymous 04/23/26(Thu)12:08:56 No.108669231

>>108669190
api image thread is here >>108653190

Anonymous 04/23/26(Thu)12:09:29 No.108669237

>>108669182
>masterpiece in the positives helps make outputs not look ai
Erm..... Anon? When's the last time you saw an optometrist??

Anonymous 04/23/26(Thu)12:09:34 No.108669238

>>108669070
Can we talk about this pls?

Anonymous 04/23/26(Thu)12:10:36 No.108669243

>>108669135
it's a cool trick honestly
hopefully the chinese will be able to reverse engineer it for local models

Anonymous 04/23/26(Thu)12:14:47 No.108669272

>>108669231
?

Anonymous 04/23/26(Thu)12:18:34 No.108669294

>>108669243
honestly i think a random person could figure out a better implementation in a few days, local has a lot more headroom to fuck around. there are 3d models, so i assume they have some kind of texture projection.
you could probably jury-rig something from preexisting nodes: convert a masked area into a plane or 3d topology, project text or an image onto it, then lay it on top of the gen.
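
rough sketch of the "project onto a plane, then lay it on top" step, assuming the masked area can be approximated by a flat quad with four known corners. none of this is an existing node: `homography`, `apply_h` and `project_overlay` are made-up names, images are plain grayscale lists of lists, and sampling is nearest-neighbour. a real version would probably pull the corners from a mask or depth estimate and use a proper image library.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Image = List[List[float]]  # grayscale rows, values in [0, 1]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate quad: corners are collinear or repeated")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src: Sequence[Point], dst: Sequence[Point]) -> List[float]:
    """Row-major 3x3 homography (last entry fixed to 1) mapping src[i] to dst[i]."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b) + [1.0]


def apply_h(h: Sequence[float], x: float, y: float) -> Point:
    """Map one point through the homography (perspective divide included)."""
    w = h[6] * x + h[7] * y + h[8]
    return ((h[0] * x + h[1] * y + h[2]) / w, (h[3] * x + h[4] * y + h[5]) / w)


def project_overlay(base: Image, overlay: Image, quad: Sequence[Point],
                    alpha: float = 1.0) -> Image:
    """Return a copy of base with overlay warped onto quad (TL, TR, BR, BL)."""
    oh, ow = len(overlay), len(overlay[0])
    corners = [(0, 0), (ow, 0), (ow, oh), (0, oh)]
    inv = homography(quad, corners)  # gen pixel -> overlay coordinates
    out = [row[:] for row in base]
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    # Only visit the bounding box of the quad, clipped to the image.
    for y in range(max(0, int(min(ys))), min(len(base), int(max(ys)) + 1)):
        for x in range(max(0, int(min(xs))), min(len(base[0]), int(max(xs)) + 1)):
            u, v = apply_h(inv, x + 0.5, y + 0.5)  # sample at pixel centre
            if 0 <= u < ow and 0 <= v < oh:
                out[y][x] = (1 - alpha) * base[y][x] + alpha * overlay[int(v)][int(u)]
    return out
```

usage would be something like `project_overlay(gen, logo, [(x0, y0), (x1, y1), (x2, y2), (x3, y3)], alpha=0.9)`. a flat quad only gets you the "slightly skewed sticker" look from the image above; following cloth folds would need an actual mesh or displacement map.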

Anonymous 04/23/26(Thu)12:36:59 No.108669426

>I haven't checked in on /ldg/ in a while what are they up to
>Thread gets diverged again
Still at it huh
