/g/ - Technology
File: 1762840306016417.jpg (1.67 MB, 2190x2355)
Discussion of Free and Open Source Diffusion Models

Prev: >>107851707

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>NetaYume
https://huggingface.co/duongve/NetaYume-Lumina-Image-2.0
https://nieta-art.feishu.cn/wiki/RZAawlH2ci74qckRLRPc9tOynrb

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe|https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg
>>
>>107855134
Whoaa that first pic is super duper realistic o_0
>>
>>107855134
>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
there is no qwen inpaint, right? you have to use mask and crop/stitch?
>>
>>107855138
They're still around it seems
>https://github.com/nunchaku-ai/ComfyUI-nunchaku
>v1.2.0 Released! Enjoy a 20–30% Z-Image performance boost, seamless LoRA support with native ComfyUI nodes, and INT4 support for 20-series GPUs!

Safe to say wan is officially abandoned
>>
>>107855189
Yes. I just used that lol.
But they are not training anything right now.
No pull requests or discussion of anything being in the works somewhere.
ZiT's PR was open for a while before they merged and released it.
>>
Sorry if this is a dumb question but I'm looking to do realistic nsfw gens with loras. I have plenty of loras for flux, do these work with chroma? What's the best chroma checkpoint? Should I be using something better than chroma? Should I retrain the loras for chroma somehow? Been kind of out of the loop for a bit.
>>
>>107855268
Did you not like the answer anon?
>>107855181
>>
File: file.png (6 KB, 326x105)
Beijing time tracker anon here
it's 7:30AM there, soon they will wake up and be preparing to drop the GLM Image model on us
>>
>>107855277
ah sorry, i didnt see it, thought it got swept up before new thread.

so SDXL is STILL the best for nsfw?? im kinda shocked, i got much better results with flux overall.
>>
>>107854980
i've been downloading basically everything since XL was released so i'm good on that. I have practically every lora for SDXL, ZIT, Qwen, Chroma, Wan2.1 & Wan2.2.
>>
File: 1757214287690012.png (1.28 MB, 1728x1149)
LTX2 is amazing. source image here:

Spongebob Squarepants grabs a rifle and says "Hey Patrick, my memory costs five thousand dollars, lets take over the data center!". Patrick says "okay, Spongebob!".

https://files.catbox.moe/r8v8q1.mp4
>>
what do we do when comfyui goes ipo?
>>
>>107855372
Short it.
>>
>>107855372
straight into the S&P500, your retirement fund will be 20% comfyui stocks
>>
https://files.catbox.moe/w8qizx.mp4
>>
>>107855372
sell immediately before comfy is allowed to
>>
>tranibake
>>
>>107855444
can you guys agree to a truce this week? two models might drop and it'd be a shame if the discussion is drowned out by nonsense
>>
>>107855453
we're not getting base so it really doesn't matter.
>>
unblessed thread
>>
File: 1739643256259775.jpg (68 KB, 978x1094)
>>107855453
>might
>>
File: unsloth vs nunchaku zit.jpg (2.96 MB, 3584x4608)
Unsloth seems to have released higher quality quants for some image models a few days ago. Certain layers are kept at slightly higher quants like Q5 and Q6 to boost quality. I think it does a good job for a Q4 quant of a 6B model. Shame the quantization implementation for diffusion models sucks: it runs slower than bf16 and q8, so the use case seems very niche. It runs similarly slower at high resolutions too, so you can't use it for that either.
Much better quality than nunchaku though. I almost wonder if they fucked up the nunchaku implementation somehow? The 32-rank one is a completely mangled, blurry, unusable mess. The 128 and 256 rank ones are better, but still blurry, while Q4_K_M is noticeably closer to the original image. Nunchaku does have the advantage of running three times faster, but that's not worth it at the current quality imo. I wonder if a schizo 1024-rank model would perform well. It should still run faster than normal quants and use less memory than q8, if my assumptions are correct.
Tested on a 3060 12GB. And no, I'm fine with bf16 for ZiT; tested this out of curiosity and to see whether it will be useful for the base model.
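If anyone wants numbers instead of eyeballing the collage, a quick PSNR check against the bf16 gen (same seed and prompt) does the job; file names here are obviously placeholders for your own outputs:
[code]
# quick-and-dirty PSNR of each quant's output vs the bf16 baseline.
# file names are placeholders; use the same seed/prompt for every image.
import numpy as np
from PIL import Image

def psnr(a_path, b_path):
    a = np.asarray(Image.open(a_path).convert("RGB"), dtype=np.float64)
    b = np.asarray(Image.open(b_path).convert("RGB"), dtype=np.float64)
    mse = np.mean((a - b) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(255.0 ** 2 / mse)

for tag in ["q4_k_m", "nunchaku_r32", "nunchaku_r128", "nunchaku_r256"]:
    print(tag, round(psnr("zit_bf16.png", f"zit_{tag}.png"), 2))
[/code]
Higher dB means closer to bf16, so the nunchaku ranks should sort themselves out immediately.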
Thanks for reading my blog.
>>107855453
>if
It WILL get drowned out by that.
>>
>>107855347
rich fucker
>>
>>107855302
Flux has higher quality, but it doesn't know nsfw.
Mixing multiple flux loras together to do nsfw is very iffy.
Chroma knows NSFW and has a far higher quality ceiling than SDXL, but it's several times slower and commonly shits out anatomy worse than SDXL.
As I said: if you want low quality but fast and reliable gens, go with SDXL; if you want to play seed lottery with slow gen times but occasionally get great gens, go with Chroma.
That's more or less all there is to it.
>>
>>107855134
Why is AniStudio not in OP?
>>
>>107855478
yeah ZiT is so small that i'm surprised people bother with quants
the video models are a different story. what's interesting is some anon in the wsg thread found that the Q8 quants differ a lot from the bf16 outputs, which usually isn't the case
>>
>>107855498
it's less than 20TB. you can buy a single 24TB HDD for $500 and fit nearly all of it. now if you wanted to keep all the loras on SSDs, then yeah, be prepared to spend thousands
>>
>>107855506
I meant to type worse than SD 1.5
>>107855523
In the current thread? Curious about precisely how.
>>
>>107855536
>In the current thread? Curious about precisely how.
yeah it's the current thread and i somehow forgot to mention i was talking about LTX. anon did a few comparison videos between different quants
>>>/wsg/6069549
>>
I almost miss working with SDXL
The better the models get, the more it's about seed lottery and "prompt engineering", and I'm tired of doing so many iterations that it's cluttering my SSD
>>
>>107855564
what was sdxl about unc?
>>
>>107855353
Spongebob speaks the truth
>>
>>107855574
fast gens, mixing loras, masking and photoshop edits, because the prompt comprehension was atrocious
newer models like Flux 2, Qwen2511 have superior comprehension but you need to do a lot of prompt engineering to unlock their potential
>>
>>107855564
for me? it's the wild variation unets offered. all the DiT models sometimes take everything too literally and leave little room for a more exciting result
>>
>>107855564
Wasn't the meta with sdxl to gen a boatload and pick the best from the monstrosities lol
>>
so since they cancelled z base, what is next on the horizon?
>>
>>107855599
ltx 2.1
>>
>>107855552
Possibly a bug?
The model is new, and major backend stuff (comfy-kitchen) was merged recently.
It's possible something is implemented wrongly and handles the quantized data incorrectly, rather than the quant itself sucking that much.
A Q8 of a 19b model differing that much from the baseline warrants a search for a decent explanation.
>>
What's the best model for pixel art?
I mean both sprites and portrait stuff, I need placeholder art for my game so prototyping is the closest thing to the final stuff.
I remember an anon uploading really good sprites here a while ago.
>>
>>107855615
lmao >>>/wsg/6071832
>>
>>107855510
see >>107855175
>>
What's the difference between a text encoder and a llm?
>>
>>107855506
so qwen and zimage aren't contenders for nsfw anymore? i'm just confused by the discourse, it seems like it should be clear what the best image generators are in each area. what about inpainting and such? i just tried the newest qwen image edit 2511 and it was still terrible, and flux inpainting still seems to be the best for nsfw? i just wish there was somewhere i could get up-to-date info on this stuff
>>
>>107855652
Nothing, same thing.
>>
>>107855652
there's generally only one LLM trained against a given diffusion model, so that LLM effectively is the model's text encoder

it might also output a strange space of concepts rather than plain text
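they're often literally the same architecture too, the difference is usage: the diffusion model cross-attends to the LM's hidden states instead of reading its decoded text. rough sketch of the idea (model name is just an example, not what any particular checkpoint actually uses):
[code]
import torch
from transformers import AutoModel, AutoTokenizer

# model name is only an example here, any causal LM works the same way
tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B")
lm = AutoModel.from_pretrained("Qwen/Qwen2.5-0.5B", torch_dtype=torch.bfloat16)

ids = tok("1girl, pixel art, holding ice cream", return_tensors="pt")
with torch.no_grad():
    out = lm(**ids, output_hidden_states=True)

# the diffusion transformer cross-attends to these embeddings;
# nothing ever gets decoded back into text
cond = out.hidden_states[-1]  # shape: [1, seq_len, hidden_dim]
print(cond.shape)
[/code]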
>>
>>107855660
if you want nsfw use SDXL. Illustrious finetunes are the most coherent SDXL models. My personal favorite finetune is UncannyValley.
>>
I started sending my ai slop to some of my friends and I'm starting to worry because most of them don't realize it is AI
>>
File: 1753961513639870.png (1.56 MB, 1584x1056)
>>107855353
also for fun, a qwen edit 2511 edit

give them ak-47s:
>>
>>107855660
you can do a limited amount of nsfw with qwen, zimage, hyimage, whatever.

but illustrious/noob and chroma are the models that are more broadly nsfw trained, they understand vastly more in that regard
>>
>>107855617
not saying this is the best but you can try a few of the loras i've published:
https://civitai.com/user/n1eze
>>
>>107855660
Neither Qwen nor ZiT knows NSFW out of the box. NSFW loras either barely exist (Qwen) or are shit quality and don't mix with other loras (Z-Image).
Don't expect anyone to make a major NSFW finetune of a big model like Qwen, but people will likely try to beat NSFW into Z-Image properly once the base version releases (the current version is distilled and sucks for finetuning). We'll see if that works out, but for now Chroma and SDXL are your two options for nsfw.
I don't know jack shit about inpainting and I am tired of typing paragraphs.
>>
>>107855693
qwen can do nsfw fine if you use an abliterated text encoder
>>
>>107855694
can you make a basedjack lora
>>
>>107855706
some other anon already made one for Z image and published it, i don't have a dataset for that
>>
>>107855709
benchod
>>
>>107855714
??? idk what that means
>>
File: 1755646012483881.png (1.44 MB, 1584x1056)
>>107855690
ice cream:
>>
>>107855699
It still doesn't know what genitalia or sex acts look like.
Don't bother with this.
>>
>>107855727
sex is when penis goes into vagina
>>
>>107855696
ZIT loras are abysmal. Almost every one changes the subject too much or just gives poor results.
>>
>>107855749
Or poophole
>>
File: 1740476363934013.png (14 KB, 976x207)
What do you use to load the new LTX-2 vae by itself? Using picrel standard vae loader just gives me a black screen.
>>
>>
>>107855189
literally making nunchaku where it's the least needed
>>
the porn lora trainings I've seen for ltx2 look like the most basic stuff, without any dataset filtering to get some nice girls out of it. do you think the model will be able to generalize beyond "40-45 yo heavy smoker milf"?
>>
>>107855775
you need to use kijai vae loader
>>
anyone know a good inpainting comfyui workflow for nsfw? so far the best one i've found is still flux kontext, there has to be something better at this point isnt there? i've been fairly frustrated with this so far, if anyone can help i would really appreciate it
>>
File: 8986.png (1.21 MB, 1048x1048)
>>
>>107855808
flux kontext is not inpainting, inpainting is when you use a masking node and it's usually done with SDXL. attaching controlnets helps
>>
>>107855808
>>107855825
Flux fill I believe is the dedicated inpainting variant.
You can use kontext with masks, though I am not sure about the quality.
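Not sure it beats whatever ComfyUI workflow you settle on, but if you'd rather script it, diffusers ships a dedicated fill pipeline for flux. Minimal sketch; input.png/mask.png are placeholders, and bf16 fill is heavy so expect to need offloading on smaller cards:
[code]
import torch
from diffusers import FluxFillPipeline
from diffusers.utils import load_image

pipe = FluxFillPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Fill-dev", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # needed on <24GB cards, it's a big model

result = pipe(
    prompt="your prompt here",
    image=load_image("input.png"),
    mask_image=load_image("mask.png"),  # white = region to repaint
    guidance_scale=30.0,  # the fill variant is tuned for high guidance
    num_inference_steps=50,
).images[0]
result.save("inpainted.png")
[/code]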
>>
>>107855798
That worked, thanks.
>>
>>107855882
any time
>>
>>107855825
i use it for inpainting and it seems to do better than flux fill, especially for nsfw

>>107855839
the kontext inpainting is the best i've seen so far, seems to outperform qwen but i do need to test a qwen inpainting nsfw lora i found


anyone have anything better?
>>
>>107854466
nice
>>
>>107855723
kek
>>
imagine contributing to software because you believe in foss and then the guy sells out
lmao
>>
At what strength do you use the detailer lora for i2v in ltx-2?
>>
saars
https://huggingface.co/zai-org/GLM-Image
>>
>>107856008
I'll wait for GoFuckem to test it first and iron out all the kinks.
>>
>>107855928
I took some kpin ok I wasnt paying atention
>>
>>107856008
Seems promising if example images aren't heavily cherry picked.
I feel like I should wait until proper cumfart support + workflows + quants appear though.
>>
>>107856008
this is some unholy qwen distill isnt it
>>
File: zimg_00112.png (1.46 MB, 960x1280)
downloadin... wish me luck
>>
an autoregressive model just flew over my house
>>
>>107856051
Should have zoomed in and looked for more than two seconds, feeling embarrassed rn.
Sloppy look.
It's DOA unless it runs very fast and/or trains extremely well.
>>
File: 1762048031180424.png (60 KB, 1376x314)
>already in damage control mode
>>
my senile uncle keeps falling for picrel type AI slop. anyone know which services people are using to make these vids? i want to make a video of him saying ridiculous shit so that maybe he'll believe me that this type of shit isn't real.
>>
plastic? check
brown tint? check
generic showcase of outdated boring prompts? check
slower yet worse? check
chinkshit? check
it's culture time
>>
https://files.catbox.moe/firn4v.mp4
https://files.catbox.moe/ue5iur.mp4
https://files.catbox.moe/swlmta.mp4
>>
>>107856051
>>107856070
Adding on: the fact that they never advertised images, plus that it's not available on an API like FAL right now, means they had no confidence in it.
Probably just wanted to be able to tell investors "we made an image model" in the next quarterly.
Will still give it a shot once the workflows are out.
>>107856081
Not local models, doesn't belong in this thread.
But to answer your question, such videos are primarily made with veo or sora.
>>
how to get less terrible sound with ltx2?
>>
>>107856081
this was generated with comfyui so it belongs in this thread
>>
File: glm image.png (1.6 MB, 1385x786)
Take the giraffe, for example. You would want an autoregressive model that "reasons and iterates on your prompt" to add coherent fancy details: decide on a nice cover for its book, pick quirky background posters that fit the anthropomorphic-animals-in-daily-life theme. And yet it's all gibberish sloppa. Its outfit isn't even wet despite being out in the rain.
>>
File: 00002-855848590.png (3.19 MB, 1536x1536)
>>107855302
yes sdxl is the best for nsfw stuff.
>>
>>107856143
Either continue from previous audio or wait for zeeb to drop the fixes
>>
>>107856143
Reroll the seed, gen at higher quality, or upscale. Honestly it will always be bad until a fix, just less bad at higher res.
>>
>>107855785
>tfw we got cultured with wanchaku before chinese culture

holy shit, i dont know whether to kek or cry
>>
File: IMG_0265.jpg (165 KB, 1206x1347)
>do you like our open source model?
>>
>>107856179
>>107856186
damn, ok
>>
ltx2 update (minor) if you got the kijai distilled q8 or whatever:

13th of January 2026 update !!IMPORTANT!!
Turns out the video VAE in the initial distilled checkpoints has been the wrong one all this time, which (of course) was the one I initially extracted. It has now been replaced with the correct one, which should provide much higher detail.

at the moment this requires using the updated KJNodes VAELoader to work correctly

https://huggingface.co/Kijai/LTXV2_comfy

https://huggingface.co/Kijai/LTXV2_comfy/blob/main/VAE/LTX2_video_vae_bf16.safetensors
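if you're not sure whether the file you already have is the old broken extract or the fixed one, hash it and compare against the sha256 shown on the HF file page (path is wherever you put it, obviously):
[code]
# sha256 your local VAE and compare against the hash on the HF file page
import hashlib

def sha256sum(path, chunk=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while data := f.read(chunk):
            h.update(data)
    return h.hexdigest()

print(sha256sum("models/vae/LTX2_video_vae_bf16.safetensors"))
[/code]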
>>
File: creations.jpg (816 KB, 2846x1438)
>>107856116
Today's creations, not many were worth posting.
>>
File: 1753201761898368.png (19 KB, 936x196)
Do people use this? It's not used in the default workflow.
>>
>Because the inference optimizations for this architecture are currently limited, the runtime cost is still relatively high. It requires either a single GPU with more than 80GB of memory, or a multi-GPU setup.
How would it even work with multiple GPUs?
>The target image resolution must be divisible by 32. Otherwise, it will throw an error.
So a 32x downsample per side, i.e. 1024 pixels per latent token? No wonder the images look like shit.
>Guidance scale rather than CFG in the example inference code
Looks like the anon who guessed Qwen distill might be right >>107856053
>>
>bro just autoregressively generate 4096 tokens and only then do you get to feed it into a diffusion model to actually make your image
Have I missed something or is this thing destined to be insanely slow? What is the tokens per second of a 9b LLM on a 5090, like 50 or something at best? It's gonna be well over a minute just for those tokens to gen, then you have a whole ass 7b diffusion model on top of that.
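napkin math with those (assumed) numbers:
[code]
# both numbers are assumptions from the post above
tokens = 4096     # tokens the AR stage generates before diffusion starts
tok_per_s = 50    # optimistic decode speed for a 9b LLM on a 5090
print(tokens / tok_per_s, "s")  # ~82s before the 7b diffusion stage even runs
[/code]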
>>
>>107856239
i downloaded this and keep running OOM because i did NOT read the manual lel
>>
File: 00186-578589517.png (2.75 MB, 1824x1248)
>>
It requires 80gb because all of these retarded inference scripts load everything at once and keep it loaded during every stage.
>>
File: 00200-4040628510.png (3.03 MB, 1728x1344)
>>
File: output_t2i.png (964 KB, 1152x1024)
i was able to gen a glm image locally with a 3090 24GB and cpu offloading (i have 128GB of system ram).

took about 2 minutes for an image though.
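for the curious, the shape of the offload trick looks roughly like this; pure sketch ASSUMING a DiffusionPipeline-style wrapper even loads this repo, the actual script from the model page may wire it up differently:
[code]
import torch
from diffusers import DiffusionPipeline

# ASSUMPTION: a DiffusionPipeline-style wrapper works for this repo at all;
# the official inference script may differ
pipe = DiffusionPipeline.from_pretrained(
    "zai-org/GLM-Image", torch_dtype=torch.bfloat16, trust_remote_code=True
)
# streams weights between system ram and the gpu one module at a time:
# slow, but that's how the whole stack fits on a 24GB card with 128GB ram
pipe.enable_sequential_cpu_offload()

image = pipe("a cat reading a newspaper", num_inference_steps=30).images[0]
image.save("output_t2i.png")
[/code]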
>>
>>107856353
I also have 128GB, what did you use? Comfy?
>>
>>107856353
Step 2 sounds like something pol would say.
>>
>>107856353
did you write all the text in the final image?
>>
File: output_t2i.png (745 KB, 768x768)
attempt two is, naturally, 1girl
30 steps, 768x768. it takes a long time before starting to gen, just beating the shit out of my ram, but this one only took 40 secs to gen after that. it clearly does not like the smaller resolution.

>>107856355
i used the inference script from the model page. attempting to drop the resolution and steps to see what kinda timing i get but it'll probably look like ass.

>>107856364
the prompt is from the official page
>>
>>107856378
OK thanks anon, I wonder if it'll be able to do llamacpp style offloading to ram in comfy.
>>
>>107856417
dropping the res also made it eat shit, going back to defaults and starting again. all the offloading stuff is already present in the gen pipeline so i don't see why it wouldn't be in comfy, it's all python. it's a big as fuck offload though.



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.