Discussion and Development of Local Image and Video Models

Previous: >>108664784

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
is it over or are we back
>>
File: 1755007656454961.jpg (392 KB, 1448x1086)
>>
File: ComfyUI_10703_.png (368 KB, 1024x1024)
>>
>>108668948
why is it so brown
>>
>>108668954
ghibli niggers did this
>>
>>108668948
>GreasePT
>>
Why is civitai full of new accounts literally named "abc123abc" commenting on every single z-image lora asking for an Ernie version? For fuck's sake, just take a look at the Commodore64 lora for Ernie. It's disgusting, it makes me puke just to look at the images.
>>
my gpu fans are starting to rattle. the end is near
>>
>>108668948
get out! >>108653190
>>
>civitai split between red boards and blue/green boards
>>
QRD on Ernie? Is it a meme or can it actually save local?
>>
>>108669029
infographic generator
>>
>>108668948
that's a lot of inpainting and many hours in gimp
>>
File: image.png (32 KB, 373x418)
>>108668972
chinks shill army nothing new
they are also shilling chink models in r/localllama right now
>>
File: 1760920978918124.png (26 KB, 119x119)
>>108668954
the room was prompted to be bathed in warm light with a dusty color palette because it looks cozy
>>108669037
facts. i really like what it did with grok's coffee cup
>>
Why do ai images look like ai? I can't pin down the exact reason. How can you make your gens look less like ai?
>>
>mfw Resource news

04/23/2026

>ParetoSlider: Diffusion Models Post-Training for Continuous Reward Control
https://shelley-golan.github.io/ParetoSlider-webpage

>DynamicRad: Content-Adaptive Sparse Attention for Long Video Diffusion
https://github.com/Adamlong3/DynamicRad

>Normalizing Flows with Iterative Denoising
https://github.com/apple/ml-itarflow

>LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model
https://github.com/inclusionAI/LLaDA2.0-Uni

>Illustrious XL & NoobAI-XL Style Explorer
https://github.com/ThetaCursed/Illustrious-NoobAI-Style-Explorer

>AI Model & ‘MAGA’ Influencer Emily Hart Unmasked as Indian Man
https://www.yahoo.com/news/articles/ai-model-maga-influencer-emily-091027504.html

04/22/2026

>Embedding Arithmetic: A Lightweight, Tuning-Free Framework for Post-hoc Bias Mitigation in Text-to-Image Models
https://github.com/cvims/EMBEDDING-ARITHMETIC

>Denoising, Fast and Slow: Difficulty-Aware Adaptive Sampling for Image Generation
https://github.com/CompVis/patch-forcing

>TS-Attn: Temporal-wise Separable Attention for Multi-Event Video Generation
https://github.com/Hong-yu-Zhang/TS-Attn

>AnyRecon: Arbitrary-View 3D Reconstruction with Video Diffusion Model
https://yutian10.github.io/AnyRecon

>SmartPhotoCrafter: Unified Reasoning, Generation and Optimization for Automatic Photographic Image Editing
https://github.com/vivoCameraResearch/SmartPhotoCrafter

>Soft Label Pruning and Quantization for Large-Scale Dataset Distillation
https://github.com/he-y/soft-label-pruning-quantization-for-dataset-distillation

>Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation
https://github.com/AMAP-ML/EMF

>Enhancing Continual Learning of Vision-Language Models via Dynamic Prefix Weighting
https://github.com/YonseiML/dpw

>IR-Flow: Bridging Discriminative and Generative Image Restoration via Rectified Flow
https://github.com/fanzh03/IR-Flow
>>
>>108669070
put "AI" in the negative prompt
>>
>mfw Research news

04/23/2026

>Image Generators are Generalist Vision Learners
http://vision-banana.github.io

>Camera Control for Text-to-Image Generation via Learning Viewpoint Tokens
https://randdl.github.io/viewtoken_control

>Hallucination Early Detection in Diffusion Models
https://arxiv.org/abs/2604.20354

>Wan-Image: Pushing the Boundaries of Generative Visual Intelligence
https://arxiv.org/abs/2604.19858

>MMCORE: MultiModal COnnection with Representation Aligned Latent Embeddings
https://arxiv.org/abs/2604.19902

>Rethinking Where to Edit: Task-Aware Localization for Instruction-Based Image Editing
https://arxiv.org/abs/2604.20258

>Amodal SAM: A Unified Amodal Segmentation Framework with Generalization
https://arxiv.org/abs/2604.20748

>FluSplat: Sparse-View 3D Editing without Test-Time Optimization
https://arxiv.org/abs/2604.20038

>HumanScore: Benchmarking Human Motions in Generated Videos
https://arxiv.org/abs/2604.20157

>Render-in-the-Loop: Vector Graphics Generation via Visual Self-Feedback
https://arxiv.org/abs/2604.20730

>Mitigating Hallucinations in Large Vision-Language Models without Performance Degradation
https://arxiv.org/abs/2604.20366

>Cognitive Alignment At No Cost: Inducing Human Attention Biases For Interpretable Vision Transformers
https://arxiv.org/abs/2604.20027

>X-Cache: Cross-Chunk Block Caching for Few-Step Autoregressive World Models Inference
https://arxiv.org/abs/2604.20289

>Self-supervised pretraining for an iterative image size agnostic vision transformer
https://arxiv.org/abs/2604.20392

>Efficient INT8 Single-Image Super-Resolution via Deployment-Aware Quantization and Teacher-Guided Training
https://arxiv.org/abs/2604.20291

>From Diffusion to Flow: Efficient Motion Generation in MotionGPT3
https://arxiv.org/abs/2603.26747
>>
>>108669037
that's basically what image 2 is doing.
it's a second pass that projects the text onto the genned image. the easiest way to spot it is on clothing, the X for example, it's just sitting on her dress. it's actually almost pixel perfect with the X on the laptop.
>>
>>108669070
Hire an artist to paint over it.
>>
>>108669092
why can't local models do that?
>>
>>108669088
>>108669090
thanks
>>
>>108669107
they probably can but nobody is developing the tooling for it, at least not in a user friendly way
>>
File: Untitled-1.png (191 KB, 586x664)
>>108669107
probably because they don't care, it's a parlor trick to impress indians and boomer investors. sorry to pull the curtain back.
case in point, the gen uses the same X, it just has a slight skew on the dress. same with the openAI logo, it's just sitting on her shirt.
>>
>>108669117
Put the phone away fag
>>
>>108669093
Gay
>>108669089
There is no way it's that simple. But now that I think of it, putting tags like "masterpiece" seems to help
>>
File: image.png (44 KB, 687x567)
>>108669137
?
>>
>>108669190
api image thread is here >>108653190
>>
>>108669182
>masterpiece in the positives helps make outputs not look ai
Erm..... Anon? When's the last time you saw an optometrist??
>>
>>108669070
Can we talk about this pls?
>>
>>108669135
it's a cool trick honestly
hopefully the chinese will be able to reverse engineer it for local models
>>
>>108669231
?
>>
>>108669243
honestly i think a random person could figure out a better implementation in a few days, local has a lot more headroom to fuck around. there are 3d models, so i assume they have some kind of texture projection.
you could probably jury-rig something from preexisting nodes. convert a masked area into a plane or 3d topology, project text or an image onto it, then lay it on top of the gen.
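
rough sketch of the "project onto a plane, then lay it on top" step, in case it helps. this is not how image 2 or any existing node does it, just the plain math under some assumptions: the masked area has already been reduced to four corner points (`quad`), images are plain lists of rows, the overlay is at least 2x2, and sampling is nearest-neighbor. all function names here are made up for illustration.

```python
import math


def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting.

    Raises ZeroDivisionError if the system is singular (degenerate quad).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Return h0..h7 of the plane-to-plane map taking 4 src points to 4 dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b)


def project_overlay(base, overlay, quad):
    """Paste `overlay` onto a copy of `base`, warped to fit `quad`.

    quad: where the overlay's corners land in the base image, ordered
    top-left, top-right, bottom-right, bottom-left, as (x, y) pairs.
    """
    oh, ow = len(overlay), len(overlay[0])
    corners = [(0, 0), (ow - 1, 0), (ow - 1, oh - 1), (0, oh - 1)]
    # Inverse mapping: for each base pixel, find which overlay pixel covers it.
    h = homography(quad, corners)
    out = [row[:] for row in base]
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    y0 = max(0, math.floor(min(ys)))
    y1 = min(len(base) - 1, math.ceil(max(ys)))
    x0 = max(0, math.floor(min(xs)))
    x1 = min(len(base[0]) - 1, math.ceil(max(xs)))
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            den = h[6] * x + h[7] * y + 1
            if abs(den) < 1e-12:
                continue  # pixel sits on the horizon line of the plane
            u = round((h[0] * x + h[1] * y + h[2]) / den)
            v = round((h[3] * x + h[4] * y + h[5]) / den)
            if 0 <= u < ow and 0 <= v < oh:
                out[y][x] = overlay[v][u]
    return out


if __name__ == "__main__":
    base = [[0] * 8 for _ in range(6)]
    logo = [[1, 2], [3, 4]]
    # Skewed quad, like a logo sitting on a tilted surface.
    for row in project_overlay(base, logo, [(1, 1), (6, 2), (5, 4), (2, 4)]):
        print(row)
```

in practice you'd swap the pixel loop for a proper warp with filtering and alpha blending, and get the quad from the mask (or a depth/normal estimate) instead of typing it in by hand.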
>>
>I haven't checked in on /ldg/ in a while what are they up to
>Thread gets diverged again
Still at it huh


