/g/ - Technology


File: collage_1774492893_1.jpg (1.32 MB, 3646x1693)
Discussion of Free and Open Source Diffusion Models

Previous: >>108450838

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
File: 1753961902954257.png (256 KB, 1481x1203)
https://xcancel.com/GoogleResearch/status/2036533564158910740#m
Could this be used on diffusion models as well? Seems like it could be an improvement on SageAttention
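For context on how a kernel like that would even plug in: SageAttention-style speedups usually work by routing every attention call through one entry point and swapping in an optimized implementation behind it. A toy sketch of that patching pattern in plain Python — all names here are illustrative, not the actual Google or SageAttention API:

```python
# Toy sketch of the "drop-in attention kernel" pattern: the model funnels
# every attention call through one entry point, so a faster kernel can be
# patched in without touching the model code itself.
import math

def reference_attention(q, k, v):
    """Naive scaled dot-product attention with a single key/value pair."""
    scale = 1.0 / math.sqrt(len(k))
    score = sum(qi * ki for qi, ki in zip(q, k)) * scale
    weight = math.exp(score) / math.exp(score)  # softmax over one key is 1
    return [weight * vi for vi in v]

# The single entry point the "model" routes all attention through.
ATTENTION_IMPL = reference_attention

def model_forward(q, k, v):
    return ATTENTION_IMPL(q, k, v)

def install_fast_attention():
    """Swap in a (hypothetically) faster kernel. Same math here, plus a
    call counter so the demo can verify the patched path is taken."""
    global ATTENTION_IMPL
    def fast_attention(q, k, v):
        fast_attention.calls += 1
        return reference_attention(q, k, v)
    fast_attention.calls = 0
    ATTENTION_IMPL = fast_attention
    return fast_attention
```

This is roughly the idea behind flags like ComfyUI's `--use-sage-attention`: modules that would call stock attention get routed through the replacement kernel instead, so any new kernel that keeps the same interface is a candidate drop-in.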
>>
>>108456712
>SageAttention
yes, but only on a $10k niche GPU sold exclusively to OpenAI!
>>
File: get fucked APIcucks.png (87 KB, 1419x599)
I'm having the time of my life, top kek!
>>
File: ai.jpg (78 KB, 1024x1024)
>>108456700
Which GPU are you using at the moment, and are you happy with it? Do you plan to upgrade?
>>
>>108456745
works on my computer
>>
File: 1771268565696394.png (1.01 MB, 1963x1316)
>>108456779
I understand that they want to stop it because it's not profitable anymore, but that also shows they weren't able to make an improved, smaller model that would cost less and be as good as Sora 2. Not a good sign; if OpenAI couldn't make it, will it ever happen? :(
>>
>>108456779
>SaaS models are so far ahead of local ones that users would rather degrade themselves and beg like a dog than switch to local
grim
>>
>>108456833
>SaaS models are so far ahead of local ones
proof?
>>
>>108456833
it's a bit of that, a bit of "I'm too poor to buy a 3090", and a bit of "do you really think I'm smart enough to run ComfyUI? Just give me a button to generate, that's it"
>>108456840
for video models they destroy us (especially Seedance 2.0)
>>>/wsg/6115441
>>>/wsg/6115335
>>>/wsg/6109254
>>>/wsg/6105840
>>
>>108456840
i can't provide the proof anymore but trust me
>>
>>108456840
why did you take the b8 anon why
>>
File: this.png (664 KB, 1597x1600)
>>108456855
>Saying that Seedance 2.0 is way better than LTX 2.3 is "bait"
Now that's a quality bait!
>>
>>108456847
64GB of DDR4 RAM costs anywhere from $150 to $600 thanks to this shit, btw
>>
>mfw Resource news

03/25/2026

>End-to-End Training for Unified Tokenization and Latent Denoising
https://xingjianbai.com/unite-tokenization-generation

>Uncertainty-guided Compositional Alignment with Part-to-Whole Semantic Representativeness in Hyperbolic Vision-Language Models
https://github.com/jeeit17/UNCHA.git

>SeaCache: Spectral-Evolution-Aware Cache for Accelerating Diffusion Models
https://jiwoogit.github.io/SeaCache

>Metaphor-based Jailbreak Attacks on Text-to-Image Models
https://github.com/datar001/metaphor-based-jailbreaking-attack

>Dynamic VRAM in ComfyUI: Saving Local Models from RAMmageddon
https://blog.comfy.org/p/dynamic-vram-in-comfyui-saving-local

>ComfyUI-Flux2Klein-Enhancer
https://github.com/capitan01R/ComfyUI-Flux2Klein-Enhancer

03/24/2026

>daVinci-MagiHuman: Single-Stream Architecture for Fast Audio-Video Generative Foundation Model
https://huggingface.co/GAIR/daVinci-MagiHuman

>SparkVSR: Interactive Video Super-Resolution via Sparse Keyframe Propagation
https://sparkvsr.github.io

>Manifold-Aware Exploration for Reinforcement Learning in Video Generation
https://dungeonmassster.github.io/SAGE-GRPO-Page

>PROBE: Diagnosing Residual Concept Capacity in Erased Text-to-Video Diffusion Models
https://github.com/YiweiXie/PRObingBasedEvaluation

>LPNSR: Prior-Enhanced Diffusion Image Super-Resolution via LR-Guided Noise Prediction
https://github.com/Faze-Hsw/LPNSR

>Text-Image Conditioned 3D Generation
https://jumpat.github.io/tigon-page

>Improving Diffusion Generalization with Weak-to-Strong Segmented Guidance
https://github.com/851695e35/SGG

>The Golden Subspace: Where Efficiency Meets Generalization in Continual Test-Time Adaptation
https://github.com/AIGNLAI/GOLD

>Style Organizer v6.0: Style Grid for Forge
https://github.com/KazeKaze93/sd-webui-style-organizer

03/23/2026

>Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models
https://franklinz233.github.io/projects/astrolabe
>>
File: 1666553265970688.png (181 KB, 483x470)
>>108456821
>if OpenAI couldn't make it, will it ever happen?
how do you expect AI to evolve with so much selfishness? look how hailuo isn't evolving anymore. stupid, selfish, shitty companies
>>
>>108456861
if local models were actually good, people would rush to buy a lot of RAM and the price would've increased anyway
>>
>mfw Research news

03/25/2026

>ViBe: Ultra-High-Resolution Video Synthesis Born from Pure Images
https://arxiv.org/abs/2603.23326

>SpatialReward: Verifiable Spatial Reward Modeling for Fine-Grained Spatial Consistency in Text-to-Image Generation
https://arxiv.org/abs/2603.22228

>Foveated Diffusion: Efficient Spatially Adaptive Image and Video Generation
https://bchao1.github.io/foveated-diffusion

>GO-Renderer: Generative Object Rendering with 3D-aware Controllable Video Diffusion Models
https://igl-hkust.github.io/GO-Renderer

>InverFill: One-Step Inversion for Enhanced Few-Step Diffusion Inpainting
https://arxiv.org/abs/2603.23463

>WorldMesh: Generating Navigable Multi-Room 3D Scenes via Mesh-Conditioned Image Diffusion
https://mschneider456.github.io/world-mesh

>InterDyad: Interactive Dyadic Speech-to-Video Generation by Querying Intermediate Visual Guidance
https://interdyad.github.io

>P-Flow: Prompting Visual Effects Generation
https://arxiv.org/abs/2603.22091

>DA-VAE: Plug-in Latent Compression for Diffusion via Detail Alignment
https://arxiv.org/abs/2603.22125

>DTVI: Dual-Stage Textual and Visual Intervention for Safe Text-to-Image Generation
https://arxiv.org/abs/2603.22041

>Group Editing: Edit Multiple Images in One Go
https://arxiv.org/abs/2603.22883

>Few-Shot Generative Model Adaption via Identity Injection and Preservation
https://arxiv.org/abs/2603.22965

>DUO-VSR: Dual-Stream Distillation for One-Step Video Super-Resolution
https://arxiv.org/abs/2603.22271

>DA-Flow: Degradation-Aware Optical Flow Estimation with Diffusion Models
https://cvlab-kaist.github.io/DA-Flow

>Policy-based Tuning of Autoregressive Image Models with Instance- and Distribution-Level Rewards
https://arxiv.org/abs/2603.23086

>UniGRPO: Unified Policy Optimization for Reasoning-Driven Visual Generation
https://arxiv.org/abs/2603.23500

>Tiny Inference-Time Scaling with Latent Verifiers
https://arxiv.org/abs/2603.22492
>>
>>108456860
>video models were only mentioned after the original b8
ah fuck now im taking it
>>
>>108456876
>nooooo, you see? when he says "SaaS models", we obviously don't talk about video models somehow
ah fuck now I'm taking your bait
>>
>>108456866
>>108456874
why are you still here schizo? get out of this place and go back to your sdg asylum
https://rentry.org/debo
>>
File: kino-alert.gif (577 KB, 498x498)
>>108456847
i generated this kinoplex talkie in a single shot. it only took 10 minutes on my 12gb card. no wonder sora is shutting down
https://files.catbox.moe/z71bgh.mp4
>>
>>108456893
did you use udio or suno to make that music? you have great taste anon
>>
>>108456900
no the video model made it from a text prompt
>>
>>108456907
https://www.youtube.com/watch?v=hpbGz9JPadM
>>
>>108456893
Desu new jack swing would've increased the kinometer reading
>>
File: 1758052895197041.jpg (63 KB, 550x777)
>Not local
Not worth it
Fuck Seedance, Fuck Sora, and Fuck Grok
Especially Seedance. Don't give a shit about your cringey cutesy gogo gaga babyshit gachacrap stuff
And Grok. Holy fucking shit. Elon cucked hard. You used to be the top
>>
>>108456922
it's a sad period for the APIkeks: Grok is not free anymore and Sora 2 will be shut down in August. that's why you go local, guys, I hope you learned your lesson
>>
any local model that can do braces that well >>108453591 ?
>>
>>108456935
make a lora
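For what it's worth, the merge step of a LoRA is simple enough to sketch: the tuned weight is the frozen base weight plus a scaled low-rank product, W' = W + (alpha/rank)·(up @ down). A minimal illustration in plain Python — the actual training loop lives in tools like musubi-tuner or OneTrainer from the OP; this only shows the weight math:

```python
# Minimal sketch of the LoRA weight merge: the fine-tuned weight is the
# frozen base weight plus a scaled low-rank product (up @ down), where
# up is (out_dim x rank) and down is (rank x in_dim).

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def merge_lora(base, down, up, alpha, rank):
    """Return base + (alpha / rank) * (up @ down), elementwise."""
    scale = alpha / rank
    delta = matmul(up, down)  # (out, rank) @ (rank, in) -> (out, in)
    return [[base[i][j] + scale * delta[i][j]
             for j in range(len(base[0]))] for i in range(len(base))]
```

Because the update is only `rank * (in_dim + out_dim)` extra parameters per layer instead of `in_dim * out_dim`, training a concept like braces on a consumer card is feasible where a full finetune is not.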
>>
File: 1758292650770307.png (3.26 MB, 1152x1728)
>>108456935
Z-Image Base
>>
File: 00022-297062542-seedvr2.jpg (758 KB, 2048x2048)
>>108456833
it's very bad to be too reliant on SaaS models because of unpredictable cucked censorship that can be implemented, and because you lose access to those models whenever the owners decide to cut them offline. ultimately local will always be my number one preference.
>>108456847
on the video side of things, local is just going to have a difficult time catching up to SaaS due to serious hardware bottlenecks on VRAM. If a local model could get close to the quality of OG Veo 3.0 and Minimax Hailuo 2.3 in terms of basic prompt adherence and visual fidelity, it would be absolute heaven on earth. Not to be a shill, but the photorealistic visuals and animation quality of Seedance 2.0 are phenomenally great. I had my short fun making basic talking-head content with LTX 2 last year on the LTX API before the open-source release, but I want to make more than talking-head videos. Just going to share my old LTX gens here. For some reason I can't get the same decent quality on my own hardware that I can easily get with the LTX API studio.
https://files.catbox.moe/w25bkt.mp4
https://files.catbox.moe/19b45z.mp4
https://files.catbox.moe/9omlfc.mp4
https://files.catbox.moe/2gul5o.mp4
https://files.catbox.moe/cuvxe1.mp4
https://files.catbox.moe/pvdat6.mp4
https://files.catbox.moe/s9qox2.mp4
https://files.catbox.moe/un171a.mp4
https://files.catbox.moe/59zuro.mp4
https://files.catbox.moe/hcyv31.mp4
>>108456935
I'm the anon who made that gen; you can achieve braces like that with Qwen Image 2512
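On the VRAM bottleneck point above, a quick back-of-envelope shows why weights alone already squeeze consumer cards; the 14B parameter count is an illustrative assumption, and activations, text encoder, and VAE all come on top of this lower bound:

```python
# Rough weights-only VRAM estimate: parameter count times bytes per
# parameter. This is a floor, not a measurement -- activations and the
# other model components push real usage higher.

def weight_vram_gb(params_billion, bytes_per_param):
    """GB (2**30 bytes) needed just to hold the weights."""
    return params_billion * 1e9 * bytes_per_param / 2**30

# Hypothetical 14B-parameter video model at common precisions:
fp16 = weight_vram_gb(14, 2)    # ~26 GB: already over a 24 GB card
int8 = weight_vram_gb(14, 1)    # ~13 GB: fits, before activations
nf4  = weight_vram_gb(14, 0.5)  # ~6.5 GB: why 4-bit quants exist
```

This is the arithmetic behind block-swapping, GGUF quants, and the dynamic-VRAM work linked in the news post: a datacenter card with 80+ GB never has to make these trade-offs, which is most of the SaaS video-quality gap.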
>>
File: Visual Generation.png (989 KB, 896x896)
https://longcat.chat/longcat-next/intro
looks like ass not gonna lie
>>
File: 1751985402777328.png (3.25 MB, 1152x1728)
3.25 MB
3.25 MB PNG
>>108456935
>>
File: 1772388827340316.png (78 KB, 1114x608)
78 KB
78 KB PNG
>>108456779
>It's your fault that OpenAI didn't have a coherent business plan!
lmao they're fighting with each other now
>>
>>108457042
Benchmark pics are always shit.
>>
>>108457047
>they're fighting with each other
wow, imagine if we ever did that
>>
>>108457054
we don't blame each other for the failure of open-source projects, for example; it's not the same thing
>>
>>108457047
KEK same thing happened in grok communities when they cut the free plan
>>
File: 1763011898089533.png (312 KB, 2044x1233)
312 KB
312 KB PNG
>>108457047
I think they're mentally ill, holy shit
>>
>>108457082
>GarglingMyDadsPiss
what the fuck?
>>
how is local so far behind compared to Sora 2? didn't that model come out 2 years ago? why can't local catch up to the premium API models available in ComfyUI?
>>
>plebbit


