/g/ - /ldg/ - Local Diffusion General - Technology

Thread archived.
You cannot reply anymore.

File: highlights_g_107858102_17(...).jpg (1.04 MB, 1691x2212)

1.04 MB JPG

/ldg/ - Local Diffusion General Anonymous 01/14/26(Wed)10:58:54 No.107861070 Archived

4u edition
Discussion of Free and Open Source Diffusion Models

Prev: >>107858102

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>NetaYume
https://huggingface.co/duongve/NetaYume-Lumina-Image-2.0
https://nieta-art.feishu.cn/wiki/RZAawlH2ci74qckRLRPc9tOynrb

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe|https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon

Anonymous 01/14/26(Wed)10:59:36 No.107861079

Blessed thread of frenship

Anonymous 01/14/26(Wed)10:59:52 No.107861082

Any good WAI ControlNet workflows?
The rentry guide one isn't usable for me because it blends loras and sometimes tags across all regions.
NetaYume Lumina is not usable.
Right now I am experimenting with Qwen Image Edit followed by an Image2Image pass then some mask genning. Qwen is really good for getting quick multi-character compositions with variation, but the censorship is really annoying.

Anonymous 01/14/26(Wed)11:00:25 No.107861087

File: 1758556920639580.png (2.29 MB, 1184x1280)

2.29 MB PNG

kino 1girl DOMINANCE

Anonymous 01/14/26(Wed)11:00:50 No.107861088

File: cool-cat-900-x-1350-pictu(...).jpg (78 KB, 900x1350)

78 KB JPG

>mfw maintaining thread quality

Anonymous 01/14/26(Wed)11:01:19 No.107861092

File: 957.png (3.81 MB, 1536x1824)

3.81 MB PNG

controlnetxlCNXL_kataragilnpaint [ad3c2578]
noobailnpainting_v10 [0bc5f3a1]
I used one of these two, dont remember which

Anonymous 01/14/26(Wed)11:02:27 No.107861094

>>107861070
Thank you for baking this thread, anon
>>107861079
Thank you for blessing this thread, anon

Anonymous 01/14/26(Wed)11:02:32 No.107861095

>>107861082
wait there is cnet for neta yume lumina?

Anonymous 01/14/26(Wed)11:02:52 No.107861100

>>107861087
zit slop

>>107861088
nogen slop

Anonymous 01/14/26(Wed)11:03:14 No.107861104

>>107861070
hopefully this thread doesnt get nuked

to >>107861015, i use DPM++ 2M SDE and Karras for everything and never have issues like this. they generate just fine, it's only when i upscale them using img2img that this happens.

one thing i can think of is i generated all the dataset images myself and they're all 1024x1024, should i vary some of them up, crop them in weird ways perhaps?

Anonymous 01/14/26(Wed)11:03:58 No.107861110

File: d57d6f8377d548a0c986e24a5(...).png (11 KB, 343x183)

11 KB PNG

>start a gen at 1200p 81 frames
>go lie in bed to rest a bit
>start wondering why it's taking way too long
>go back once I hear fans stop revving up and down

I RENDERED 4 AND NOT 1 REEEEE

Anonymous 01/14/26(Wed)11:07:27 No.107861127

File: 1757232804489924.png (2.7 MB, 1344x1152)

2.7 MB PNG

enjoying my isekino slop

Anonymous 01/14/26(Wed)11:08:24 No.107861135

File: im racist now.png (11 KB, 302x93)

11 KB PNG

mods bonked me wah

Anonymous 01/14/26(Wed)11:12:42 No.107861160

>>107861146
drag and shot comfy
drag and shot namefags
drag and shot anistudio
drag and shot schizos

Anonymous 01/14/26(Wed)11:13:14 No.107861162

>>107861160
hello hitler poster

Anonymous 01/14/26(Wed)11:13:47 No.107861169

are embeddings a meme?

Anonymous 01/14/26(Wed)11:14:14 No.107861172

File: 1763478083006051.mp4 (598 KB, 704x480)

598 KB MP4

>>107861162

Anonymous 01/14/26(Wed)11:14:53 No.107861180

how do I prevent myself from filling 2 tb of models?

Anonymous 01/14/26(Wed)11:16:26 No.107861191

File: 8295.png (1.91 MB, 1264x864)

1.91 MB PNG

>>107861180
become a real artist

Anonymous 01/14/26(Wed)11:17:31 No.107861197

>>107861191
i would eat the fuck out of that

Anonymous 01/14/26(Wed)11:18:16 No.107861202

File: 1753237921677459.png (1.62 MB, 1024x1024)

1.62 MB PNG

Anonymous 01/14/26(Wed)11:19:10 No.107861208

File: 1765268992882412.png (2.08 MB, 960x1568)

2.08 MB PNG

Anonymous 01/14/26(Wed)11:19:44 No.107861213

File: 1737016391283383.png (2.37 MB, 1280x1184)

2.37 MB PNG

I love 1girl

Anonymous 01/14/26(Wed)11:20:17 No.107861219

File: 1737359738054445.png (2.52 MB, 1184x1280)

2.52 MB PNG

Anonymous 01/14/26(Wed)11:20:49 No.107861225

File: 1740291157493473.png (2.61 MB, 1216x1248)

2.61 MB PNG

Anonymous 01/14/26(Wed)11:21:42 No.107861231

File: 1767537955829533.png (2.38 MB, 1632x928)

2.38 MB PNG

Anonymous 01/14/26(Wed)11:21:59 No.107861234

>>107861191
didn't answer my question, also zit slop

Anonymous 01/14/26(Wed)11:22:14 No.107861237

File: 1745665686038492.png (2.29 MB, 1632x928)

2.29 MB PNG

Anonymous 01/14/26(Wed)11:22:34 No.107861239

File: 706721129.png (1.28 MB, 1328x1328)

1.28 MB PNG

Anonymous 01/14/26(Wed)11:23:06 No.107861241

>>107861239
i saw this on reddit

Anonymous 01/14/26(Wed)11:24:06 No.107861245

File: ComfyUI_temp_jgitd_00004_.jpg (538 KB, 3840x1280)

538 KB JPG

hard to compare a detailer when it changes the composition so much

Anonymous 01/14/26(Wed)11:25:27 No.107861257

File: 2210.png (1.59 MB, 1152x960)

1.59 MB PNG

>>107861239
this was on qwen, zit can't make the image actually isometric for whatever reason

Anonymous 01/14/26(Wed)11:25:44 No.107861261

>>107861245
what model

Anonymous 01/14/26(Wed)11:26:01 No.107861264

>>107861070
>>Maintain Thread Quality
>https://rentry.org/debo
>https://rentry.org/animanon
all this does is maintain low thread quality and I'm tired of pretending it's not

Anonymous 01/14/26(Wed)11:26:19 No.107861266

>>107861239
Can you give catbox? Pretty please?

Anonymous 01/14/26(Wed)11:26:47 No.107861271

File: 1737290202594875.png (1.25 MB, 1024x1024)

1.25 MB PNG

Anonymous 01/14/26(Wed)11:27:47 No.107861279

>>107861264
go to /sdg/ and enjoy the high quality there

Anonymous 01/14/26(Wed)11:27:54 No.107861280

File: ComfyUI_01586_.png (1.09 MB, 720x1280)

1.09 MB PNG

1girls in maid outfits

Anonymous 01/14/26(Wed)11:28:28 No.107861283

>>107861271
why are you doxxing me!

Anonymous 01/14/26(Wed)11:29:42 No.107861291

>>107861237
https://i.4cdn.org/wsg/1768408155289535.mp4

Anonymous 01/14/26(Wed)11:31:14 No.107861299

>>107861291
Prompt: A pig man asks a girl "How much for this bag of ores?"

Anonymous 01/14/26(Wed)11:34:24 No.107861320

File: qwen_edit_2511_00011_.png (1.3 MB, 768x1360)

1.3 MB PNG

>>107861280

Anonymous 01/14/26(Wed)11:34:48 No.107861323

File: ComfyUI_temp_jgitd_00008_.jpg (1.09 MB, 3840x1280)

1.09 MB JPG

>>107861261
zit

Anonymous 01/14/26(Wed)11:35:28 No.107861331

>>107861266
On qwen 2512
An orthographic 3D isometric render of a minimalist modern living room. Centered in the space is a massive, vertically oriented slab of natural agate, three meters in height, serving as a monolithic sculptural piece. The agate features polished, concentric bands of translucent teal, deep ochre, and creamy white, with a core of sparkling white quartz crystals that catch a soft internal glow. The living room is composed of two clean white walls meeting at a 90-grade corner on a floor of light-colored wide-plank oak. To the side of the agate sits a low-profile, modular L-shaped sofa upholstered in a textured charcoal grey fabric. A circular, heavy-knit wool rug in a warm ivory tone lies beneath the central stone. Decorative elements include a single tall fiddle-leaf fig plant in a matte black ceramic pot and a slender, minimalist black metal floor lamp with a spherical glass bulb. The lighting is a diffused global illumination with soft, neutral shadows, emphasizing the clean geometric lines of the furniture and the vibrant, glass-like texture of the agate rock. The entire scene is presented in a strict parallel projection with no perspective distortion.

Anonymous 01/14/26(Wed)11:35:58 No.107861338

>>107861299
dayum
in the anime it was "how much for 1 night with you"

Anonymous 01/14/26(Wed)11:36:18 No.107861341

is ltxv2 any good?

Anonymous 01/14/26(Wed)11:37:07 No.107861353

>>107861338
wait, is that from an anime? then ltx actually pulled dialogue from it and that's crazy

Anonymous 01/14/26(Wed)11:39:21 No.107861374

File: ComfyUI_01588_.png (1 MB, 720x1280)

1 MB PNG

>>107861320

Anonymous 01/14/26(Wed)11:40:11 No.107861381

>>107861341
fried video and audio. the arch is neat but that's about it. the low slop quality keeps it from being useful

Anonymous 01/14/26(Wed)11:40:34 No.107861384

>>107861374
now make her give birth

Anonymous 01/14/26(Wed)11:40:54 No.107861387

>>107861341
having the 1girls talking is pretty novel. it requires a bit more finagling to get what you want. wan's biggest advantage is its age, it's pretty well understood and has lots of loras so you can get exactly what you want.

ltx2 is a reminder that local will triumph, albeit slowly

Anonymous 01/14/26(Wed)11:41:47 No.107861391

>>107861387
grok can do goon

Anonymous 01/14/26(Wed)11:42:35 No.107861404

>>107861384
ok now we'll just have to wait 9 months

Anonymous 01/14/26(Wed)11:44:15 No.107861413

File: ComfyUI_01589_.png (936 KB, 720x1280)

936 KB PNG

>>107861384
i have a lora that will allow that to happen, but mods will bonk me again. instead you shall wait 9 months

Anonymous 01/14/26(Wed)11:45:08 No.107861422

>>107861413
anon I don't think SHE can wait 9 months, look at that belly

Anonymous 01/14/26(Wed)11:45:44 No.107861427

File: ComfyUI_temp_jgitd_00012_.jpg (528 KB, 3840x1280)

528 KB JPG

i think three is the best one but i can barely tell

Anonymous 01/14/26(Wed)11:46:09 No.107861433

File: 569-3720341751.png (652 KB, 600x841)

652 KB PNG

>>107861422
you dont know who's the father

Anonymous 01/14/26(Wed)11:47:35 No.107861446

File: qwen-edit_00016_.png (1.26 MB, 768x1360)

1.26 MB PNG

>>107861374

Anonymous 01/14/26(Wed)11:48:20 No.107861459

>>107861427
1 & 4 4 me

Anonymous 01/14/26(Wed)11:50:25 No.107861479

>>107861239
>>>/wsg/6072623

Anonymous 01/14/26(Wed)11:53:10 No.107861504

File: fails.mp4 (3.9 MB, 2048x2048)

3.9 MB MP4

>>107861479
fails

Anonymous 01/14/26(Wed)11:53:30 No.107861507

File: 424.jpg (685 KB, 2048x2048)

685 KB JPG

Anonymous 01/14/26(Wed)11:53:46 No.107861512

File: NetaYumeLuminaV40_Output_(...).png (2.83 MB, 1536x1280)

2.83 MB PNG

Anonymous 01/14/26(Wed)11:56:01 No.107861527

>>107861374
>*blushes*

Anonymous 01/14/26(Wed)11:56:32 No.107861529

>>107861512
real?

Anonymous 01/14/26(Wed)11:57:10 No.107861534

File: 310876.png (3.38 MB, 1152x2048)

3.38 MB PNG

Anonymous 01/14/26(Wed)11:57:17 No.107861536

File: ComfyUI_temp_jgitd_00016_.jpg (917 KB, 3840x1280)

917 KB JPG

>>107861459
maybe it is 4, you could be right

Anonymous 01/14/26(Wed)11:58:45 No.107861550

>>107861536
did you seriously prompt for her not dyeing her hair on time

Anonymous 01/14/26(Wed)11:59:10 No.107861552

>>107861391
and a restaurant can make food. but if you can grow food and make it into a meal yourself you don't have to rely on a third party service to keep yourself fed.

Anonymous 01/14/26(Wed)11:59:45 No.107861555

>>107861552
yet you don't do it for food

Anonymous 01/14/26(Wed)12:01:11 No.107861566

>>107861507
are you that fucker that genned that cursed kfc ages ago

Anonymous 01/14/26(Wed)12:03:20 No.107861579

>>107861536
I couldn't see the diff in the other but easily in this one between 1 & 4

Anonymous 01/14/26(Wed)12:03:35 No.107861584

>>107861566
nope

Anonymous 01/14/26(Wed)12:05:56 No.107861602

File: fails.mp4 (3.66 MB, 2048x2048)

3.66 MB MP4

>>107861507
>>>/wsg/6072629

Anonymous 01/14/26(Wed)12:09:31 No.107861623

File: qwen-edit_00020_.png (1.26 MB, 1176x880)

1.26 MB PNG

Anonymous 01/14/26(Wed)12:13:13 No.107861653

File: 1740458974336673.png (2.4 MB, 1280x1184)

2.4 MB PNG

anon.. lets make some slop together!

Anonymous 01/14/26(Wed)12:16:12 No.107861669

>>107861653
zit

Anonymous 01/14/26(Wed)12:19:53 No.107861693

File: NetaYumeLuminaV40_Output_(...).png (1.9 MB, 1280x1536)

1.9 MB PNG

>>107861331
NTA but tried it on NetaYume lol, actually pretty similar results to your Qwen one

Anonymous 01/14/26(Wed)12:20:44 No.107861698

File: ComfyUI_temp_ycdpa_00002_.jpg (699 KB, 3840x1280)

699 KB JPG

these aren't more/less detailed they're just different i think

Anonymous 01/14/26(Wed)12:25:25 No.107861734

File: zimg_00084.png (1.44 MB, 960x1280)

1.44 MB PNG

>>107861653
ok

Anonymous 01/14/26(Wed)12:27:33 No.107861748

File: 32GB RAM face.jpg (170 KB, 900x685)

170 KB JPG

is chroma as slow to train as it is to gen with?

Anonymous 01/14/26(Wed)12:28:15 No.107861750

>>107861061
>come back
>newbread got deleted
>look at collage
>ass in collage
muahah did my ass gen nuke the other newbread?

Anonymous 01/14/26(Wed)12:29:25 No.107861757

File: qwen-edit_00026_.png (1.08 MB, 1296x808)

1.08 MB PNG

Anonymous 01/14/26(Wed)12:32:11 No.107861770

>>107861748
Takes me about 1.5 hours to train a lora using the default AI Toolkit config on a 5090

Anonymous 01/14/26(Wed)12:39:46 No.107861816

>>107861734
old hag

Anonymous 01/14/26(Wed)12:43:07 No.107861836

>>107861748
>>107861770
>tfw cant install any of the trainers that allow for chroma training
>tried everything but errors out the ass
>only training that works is musubi tuner and flux trainer for comfyui
>can only train flux loras

they surprisingly work for chroma but still, i'd imagine they would work better if properly trained with chroma

Anonymous 01/14/26(Wed)12:47:45 No.107861855

>>107861750
That was a nice ass anon
But jannies do not like female butts

Anonymous 01/14/26(Wed)12:48:02 No.107861858

File: 615001588.png (1.98 MB, 832x1600)

1.98 MB PNG

Anonymous 01/14/26(Wed)12:49:51 No.107861866

forgot to change i2v pic for new prompt, but it just cut to the new prompt regardless, and it's a hundred times better than t2v
why does t2v suck so bad if the model is clearly capable of better fidelity

Anonymous 01/14/26(Wed)12:51:39 No.107861879

>>107861836
If you have a Linux distro on hand you can try following the rentry under Chroma, which would at least let you train against the model directly, but sorry, I haven't done it myself.
Other than AI Toolkit it looks like OneTrainer also supports Chroma.

Anonymous 01/14/26(Wed)12:52:45 No.107861890

File: localcope.png (363 KB, 1800x462)

363 KB PNG

AHAHAHA PULL UP FAGGOT, WHERE IS THIS IMMINENT BASE NOW??
chinkshit shills deserve to be shot

Anonymous 01/14/26(Wed)12:56:02 No.107861907

What.. A 5090 is 100ish tflops at fp32, an H100 is only 60ish?

Anonymous 01/14/26(Wed)13:01:54 No.107861941

File: ComfyUI_01592_.png (952 KB, 720x1280)

952 KB PNG

maccas

Anonymous 01/14/26(Wed)13:06:55 No.107861976

>>107861890
Remember unless you train loras yourself you have no reason to care about base, they've been extremely clear that the out of the box quality is worse than Turbo

Anonymous 01/14/26(Wed)13:08:10 No.107861986

File: 8.png (1.46 MB, 1216x1216)

1.46 MB PNG

>>107861907
I think memory bandwidth is more of a concern for AI hardware
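
A rough roofline-style sketch of why bandwidth can matter more than peak FP32 here. The FLOPS figures are the "100ish" and "60ish" numbers quoted above; the bandwidth figures (about 1.8 TB/s and 3.35 TB/s) are approximate assumptions for illustration, and the model ignores tensor cores, caches, and everything else that matters in practice.

```python
# Roofline-style estimate: attainable throughput is capped either by peak
# compute or by (memory bandwidth x arithmetic intensity), whichever is lower.
# All hardware numbers are approximate assumptions, not measured values.

def attainable_tflops(peak_tflops: float, bandwidth_tbps: float, flops_per_byte: float) -> float:
    """Upper bound in TFLOPS for a kernel doing `flops_per_byte` FLOPs per byte moved."""
    return min(peak_tflops, bandwidth_tbps * flops_per_byte)

GPUS = {
    # name: (peak FP32 TFLOPS as quoted in the thread, assumed memory bandwidth in TB/s)
    "5090": (100.0, 1.8),
    "H100": (60.0, 3.35),
}

def compare(flops_per_byte: float) -> dict:
    """Attainable TFLOPS per GPU at a given arithmetic intensity."""
    return {
        name: attainable_tflops(peak, bw, flops_per_byte)
        for name, (peak, bw) in GPUS.items()
    }

if __name__ == "__main__":
    # Low intensity = memory-bound, high intensity = compute-bound.
    for intensity in (2.0, 10.0, 100.0):
        print(intensity, compare(intensity))
```

Under these assumed numbers the H100 comes out ahead whenever the workload is memory-bound, and the 5090 only wins once the kernel does enough math per byte to hit its compute ceiling.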

Anonymous 01/14/26(Wed)13:08:28 No.107861993

just came home
could someone please link me to z-image base? want to try it out now

Anonymous 01/14/26(Wed)13:09:49 No.107862001

File: 9300.png (1.74 MB, 960x1088)

1.74 MB PNG

>>107861993
https://github.com/Tongyi-MAI/Z-Image

Anonymous 01/14/26(Wed)13:11:34 No.107862019

>>107861986
You can access it from comfyui's built-in API nodes!

Anonymous 01/14/26(Wed)13:14:00 No.107862039

>>107861529
yep, snapped it on my phone at the local park in fact

Anonymous 01/14/26(Wed)13:22:31 No.107862091

Anyone have success with a LLM prompt to expand or write prompts for Chroma?
I tried a couple of ones out there for Flux but it didn't work so well. I can write prompts manually that usually give me what I want but it takes like 15 minutes to craft a decent prompt.

Anonymous 01/14/26(Wed)13:24:11 No.107862106

>>107862091
Chroma was captioned originally using jailbroken Gemini 2.5 afaik

Anonymous 01/14/26(Wed)13:31:27 No.107862148

How do i get free ComfyCredits?

Anonymous 01/14/26(Wed)13:32:44 No.107862159

>>107862148
you gotta compliment comfy's gens when he posts in the thread

Anonymous 01/14/26(Wed)13:39:56 No.107862217

File: ComfyUI_00020_.png (1.4 MB, 1024x1024)

1.4 MB PNG

Anonymous 01/14/26(Wed)13:43:23 No.107862243

is it just me or is LTX2 I2V or V2V really bad at keeping a character's face? Even under 5sec it's already someone else >.>

Anonymous 01/14/26(Wed)13:56:54 No.107862335

>>107862243
i2v always looks like shit for me. i heard there's some problem with it they're trying to fix in ltx2.1

Anonymous 01/14/26(Wed)14:00:31 No.107862363

>run comfy workflow
>works like a charm
>run it again
>LMAO OOM
kek what a fucking piece of shit

Anonymous 01/14/26(Wed)14:03:46 No.107862386

>>107862363
are you using qwen? i noticed qwen image and edit models don't deallocate or reuse memory properly in comfy, so i have to manually clear the model each time
fucking cumfart

Anonymous 01/14/26(Wed)14:04:08 No.107862387

>>107862363
lmao feelin comfy yet?

Anonymous 01/14/26(Wed)14:05:35 No.107862397

>>107862386
yeah this time it was qwen, but honestly i noticed this with all sorts of models

>>107862387
i'm mad comfy
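
For the OOM-on-second-run problem discussed above, one possible way to script the "manually clear the model" step is to POST to ComfyUI's `/free` route between runs. This is a minimal sketch that assumes a local server on the default `127.0.0.1:8188` and a ComfyUI build that exposes `/free` with the `unload_models` and `free_memory` flags; the helper names are made up for illustration.

```python
import json
import urllib.request

def build_free_request(base_url: str = "http://127.0.0.1:8188") -> urllib.request.Request:
    """Build (but do not send) a POST asking ComfyUI to unload models and free memory."""
    payload = json.dumps({"unload_models": True, "free_memory": True}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/free",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def free_comfy_memory(base_url: str = "http://127.0.0.1:8188") -> int:
    """Send the request and return the HTTP status code (requires a running server)."""
    with urllib.request.urlopen(build_free_request(base_url), timeout=10) as resp:
        return resp.status

if __name__ == "__main__":
    print(free_comfy_memory())
```

Whether this actually avoids the OOM depends on where the memory is being held, so treat it as something to try rather than a fix.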

Anonymous 01/14/26(Wed)14:14:02 No.107862472

why aren't there any good porn loras for ltx yet?

Anonymous 01/14/26(Wed)14:18:43 No.107862504

File: Chroma1-HD-Flash.safetens(...).png (1.41 MB, 928x1152)

1.41 MB PNG

Anonymous 01/14/26(Wed)14:19:03 No.107862510

File: 96ffc0fa78a5f24f60d28e1d8(...).jpg (55 KB, 736x736)

55 KB JPG

no base
ded thread
it's over

Anonymous 01/14/26(Wed)14:20:05 No.107862517

could someone recommend a good qwen inpainting workflow? i tried a ton but they are super inconsistent. sometimes it works great sometimes it does the worst things to the image

Anonymous 01/14/26(Wed)14:21:12 No.107862526

File: ComfyUI_00021_.png (1.19 MB, 800x1080)

1.19 MB PNG

Anonymous 01/14/26(Wed)14:21:59 No.107862535

>>107862472
ltx came out maybe 2 weeks ago, and ostris just released training support for it? I dunno, I'm only vaguely following ltx2

Anonymous 01/14/26(Wed)14:29:52 No.107862582

>>107862504
Okay, gen again but sitting cross-legged, making sure feet and toes are in the shot :p

Anonymous 01/14/26(Wed)14:31:42 No.107862596

>>107862510
>base
>ever releasing

Read up about 中国文化

Anonymous 01/14/26(Wed)14:42:20 No.107862660

>>107862517
oh shit ok figured it out. using the inpaint crop and stitch nodes, works very well so far. made the mistake of using some "inpaint model conditioning" node before, which fucked everything up

Anonymous 01/14/26(Wed)14:50:55 No.107862711

File: ComfyUI_00046_.jpg (1.34 MB, 2048x2048)

1.34 MB JPG

Anonymous 01/14/26(Wed)14:51:03 No.107862712

File: ComfyUI_00021_.png (1.77 MB, 1024x1024)

1.77 MB PNG

Anonymous 01/14/26(Wed)14:51:16 No.107862715

Neat, there is an AI general on /bant/ with no trolls and schizos because flags and tripcode.
>>>/bant/23836910
Good to keep in mind when a certain dev shows up.

Anonymous 01/14/26(Wed)14:59:15 No.107862764

Can you uncensor qwen image edit? If wan2.2 can be uncensored effectively with loras, then I assume qwen can too?

Anonymous 01/14/26(Wed)14:59:17 No.107862766

File: z-image-experimental_00632_.png (3.49 MB, 1264x2048)

3.49 MB PNG

shift 35 lol

Anonymous 01/14/26(Wed)15:06:45 No.107862813

>>107862766
Anything above single digits starts to become counter-productive.
You need some low timesteps to produce a decent looking image.
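
To see why a huge shift starves the low timesteps, here is a small sketch. It assumes "shift" means the SD3/AuraFlow-style time shift, sigma' = shift * sigma / (1 + (shift - 1) * sigma), applied to evenly spaced sigmas; whether the workflow above uses exactly this node and scheduler is an assumption.

```python
def shift_sigma(sigma: float, shift: float) -> float:
    """Apply the flow-matching time shift to one sigma in [0, 1]."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)

def shifted_schedule(steps: int, shift: float) -> list[float]:
    """Evenly spaced sigmas from 1.0 down to 0.0, each passed through the shift."""
    base = [1.0 - i / steps for i in range(steps + 1)]
    return [shift_sigma(s, shift) for s in base]

if __name__ == "__main__":
    # Compare a moderate shift against shift 35 on an 8-step schedule.
    for shift in (3.0, 35.0):
        schedule = shifted_schedule(8, shift)
        print(shift, [round(s, 3) for s in schedule])
```

With shift 35 almost every step sits close to sigma 1.0 and the final step has to jump from a high sigma straight to 0, so very little of the schedule is spent on low-noise refinement.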

Anonymous 01/14/26(Wed)15:10:15 No.107862832

>>107862764
the SNOFS lora and gnass loras do a decent job but there's no good solution yet that i've seen

Anonymous 01/14/26(Wed)15:13:34 No.107862854

>>107862832
Do they at least fix the disappearing bodies?
Can they be used for anime? Or only realism?

Anonymous 01/14/26(Wed)15:24:00 No.107862940

>>107862766
slop

Anonymous 01/14/26(Wed)15:25:12 No.107862951

>The paging file is too small for this operation to complete. (os error 1455)
>still 20GB of RAM unused
I swear...
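
A likely explanation, sketched with made-up numbers: Windows error 1455 is about the commit limit (roughly physical RAM plus pagefile), not about free physical RAM, so an allocation can fail while RAM still looks unused. The sizes below are hypothetical and not taken from the poster's machine.

```python
def commit_would_fail(ram_gb: float, pagefile_gb: float,
                      committed_gb: float, request_gb: float) -> bool:
    """True if a new allocation would push the commit charge past the commit limit."""
    commit_limit = ram_gb + pagefile_gb  # approximate Windows commit limit
    return committed_gb + request_gb > commit_limit

if __name__ == "__main__":
    # Hypothetical machine: 64 GB RAM, 9 GB pagefile, 60 GB already committed.
    # A 20 GB reservation fails even if much of the committed memory is untouched.
    print(commit_would_fail(64, 9, 60, 20))
    # The same request fits once the pagefile is allowed to grow to 75 GB.
    print(commit_would_fail(64, 75, 60, 20))
```

Memory that is reserved but not yet touched counts against the commit charge without showing up as used RAM, which is how both observations can be true at once.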

Anonymous 01/14/26(Wed)15:26:24 No.107862961

>>107862951
happened to me, why did you do this?

Anonymous 01/14/26(Wed)15:28:27 No.107862981

What's the verdict on glm image?

Anonymous 01/14/26(Wed)15:31:25 No.107863003

>>107862981
I forgot

Anonymous 01/14/26(Wed)15:31:56 No.107863005

more like grlm image
am i rite guys?

Anonymous 01/14/26(Wed)15:33:08 No.107863011

Is flux-fill still the best inpainting model or has it been replaced?

Anonymous 01/14/26(Wed)15:38:18 No.107863042

File: file.png (44 KB, 830x644)

44 KB PNG

What values do you recommend for zimage, anons?

Anonymous 01/14/26(Wed)15:38:49 No.107863046

File: z-image-experimental_00640_.png (3.28 MB, 1264x2048)

3.28 MB PNG

>>107862940
mfw

Anonymous 01/14/26(Wed)15:40:29 No.107863058

Just checked my Pagefile: 37.5GB. So Flux.2 edit wanted more than 125GB to function? Jeez, what are the actual requirements?

>>107862961
Dunno, lol.

Anonymous 01/14/26(Wed)15:45:03 No.107863093

>>107861623
thanks for the new wallpaper

Anonymous 01/14/26(Wed)15:46:51 No.107863106

>>107862715
>red board
wake me up when there is a blue board like this

Anonymous 01/14/26(Wed)15:53:14 No.107863139

>>107863058
i hit 90GB RAM usage when using it without reference images, so it's totally plausible that with reference images it goes even higher

Anonymous 01/14/26(Wed)15:55:11 No.107863151

>>107863042
>What values do you recommend for zimage, anons?
nag_scale 3, nag_tau 1, nag_alpha 0.25, nag_sigma_end 0.75

Anonymous 01/14/26(Wed)15:57:08 No.107863164

>>107863151
Thank you!

Anonymous 01/14/26(Wed)15:58:23 No.107863173

how do you stop ltx2 from adding random garbage subtitles to a video?
and also, how do you enhance lipsync?

Anonymous 01/14/26(Wed)16:03:19 No.107863209

>>107861770
That's the same time with a lower end GPU. You can probably do better.

Anonymous 01/14/26(Wed)16:08:04 No.107863245

How do I prevent Wan2.1 Infinitetalk from outputting 3 files? I don't need the png and mute video.

Anonymous 01/14/26(Wed)16:13:31 No.107863276

File: 1747857564699993.png (19 KB, 1325x276)

19 KB PNG

>>107863245

Anonymous 01/14/26(Wed)16:19:47 No.107863318

>>107862243
>>107862335
I could barely get ltx2 to work, and when I finally found a workflow that didn't give me misc node errors, it just genned blank videos. If the i2v really is bad it may not be worth bothering with until it has had time to bake in the community oven.

Anonymous 01/14/26(Wed)16:21:37 No.107863327

File: ComfyUI_00022_.png (1.63 MB, 1024x1024)

1.63 MB PNG

Anonymous 01/14/26(Wed)16:22:02 No.107863330

>>107863276
Thanks!

Anonymous 01/14/26(Wed)16:22:25 No.107863334

>>107862243
>Even under 5sec it's already someone else >.>
yep, I went from 7 to 10 seconds and it became shit at character consistency
>>>/wsg/6072402

Anonymous 01/14/26(Wed)16:31:09 No.107863398

File: ComfyUI_00023_.png (1.54 MB, 1024x1024)

1.54 MB PNG

Anonymous 01/14/26(Wed)16:31:21 No.107863400

>>107861070
How do I make photorealistic ai pictures of myself

Anonymous 01/14/26(Wed)16:31:51 No.107863407

>>107863139
>If you see this and ComfyUI did not start try updating your Nvidia Drivers to the latest.
>my drivers are in fact the latest
This is what I get now after bumping the pagefile up to 75GB. Looks like I needed a couple of B200s to use that... guess I'll go back to Qwen to edit images until I have at least 384GB of VRAM.

Anonymous
01/14/26(Wed)16:34:26 No.107863424

Anonymous 01/14/26(Wed)16:34:26 No.107863424

>>107863407
>This is what I get now after bumping the pagefile up to 75GB.
why are you setting this manually? it works fine on automatic, it gets bigger by itself if it needs more

Anonymous
01/14/26(Wed)16:39:11 No.107863452

Anonymous 01/14/26(Wed)16:39:11 No.107863452

>>107863424
It was on auto and was only using ~9GB when I changed it. Flux.2 T2I works fine on my machine, so I thought it would be trivial to use the edit workflow (just like Qwen), but I was wrong. It wants way more resources than I can give it.

Anonymous
01/14/26(Wed)16:55:01 No.107863553

Anonymous 01/14/26(Wed)16:55:01 No.107863553

https://huggingface.co/zai-org/GLM-Image#note
>We strongly recommend using GLM-4.7 to enhance prompts for higher image quality.
What's the fucking point of an Autoregressive model if we still have to rewrite prompts in boomer style??

Anonymous
01/14/26(Wed)16:57:01 No.107863567

Anonymous 01/14/26(Wed)16:57:01 No.107863567

>>107863400
step 1: be very narcissistic
step 2: be turkish (optional but helps)

Anonymous
01/14/26(Wed)16:58:36 No.107863574

Anonymous 01/14/26(Wed)16:58:36 No.107863574

I love how the job queue is erased from existence when the program crashes. Very comfy.

Anonymous
01/14/26(Wed)17:12:40 No.107863645

Anonymous 01/14/26(Wed)17:12:40 No.107863645

File: ComfyUI_temp_padqh_00001_.png (2.2 MB, 1040x1480)

2.2 MB PNG

Anonymous
01/14/26(Wed)17:19:13 No.107863681

Anonymous 01/14/26(Wed)17:19:13 No.107863681

>>107863645
*vomits*

Anonymous
01/14/26(Wed)17:19:36 No.107863684

Anonymous 01/14/26(Wed)17:19:36 No.107863684

https://github.com/Rolandjg/LTX-2-video-extend-ComfyUI/tree/main

video extend, it can even clone audio, lots of fun desu. set the sigmas to 8 for the distil model in the samplers subgraph.

Anonymous
01/14/26(Wed)17:25:15 No.107863715

Anonymous 01/14/26(Wed)17:25:15 No.107863715

File: x_478jch.png (1.27 MB, 1536x1024)

1.27 MB PNG

Anonymous
01/14/26(Wed)17:25:56 No.107863721

Anonymous 01/14/26(Wed)17:25:56 No.107863721

File: 1760387007074295.png (2.32 MB, 1152x1312)

2.32 MB PNG

Anonymous
01/14/26(Wed)17:26:20 No.107863723

Anonymous 01/14/26(Wed)17:26:20 No.107863723

File: ComfyUI_temp_axihh_00006_(2).png (1.41 MB, 1344x704)

1.41 MB PNG

zit release was 7 weeks ago
lantern festival (end of chinese new year) is in 7 weeks

Anonymous
01/14/26(Wed)17:29:33 No.107863742

Anonymous 01/14/26(Wed)17:29:33 No.107863742

>>107863723
it's not the end of chinese new year that matters, it's the beginning, once it reaches Feb 17, 2026 you know nothing is gonna happen until Mar 3, 2026

Anonymous
01/14/26(Wed)17:31:03 No.107863746

Anonymous 01/14/26(Wed)17:31:03 No.107863746

>>107863723
>>107863742
let's pretend they needed 2 more months to finish the base model, then why did they make turbo out of such an unfinished base in the first place?

Anonymous
01/14/26(Wed)17:31:36 No.107863749

Anonymous 01/14/26(Wed)17:31:36 No.107863749

>>107863684
also uses this:

https://huggingface.co/Lightricks/LTX-2-19b-IC-LoRA-Detailer

Anonymous
01/14/26(Wed)17:32:33 No.107863759

Anonymous 01/14/26(Wed)17:32:33 No.107863759

>>107861092
Underrated gen. But tough luck you didn't use 1girl

Anonymous
01/14/26(Wed)17:33:32 No.107863762

Anonymous 01/14/26(Wed)17:33:32 No.107863762

>>107863749
example, also note how detailed it gets with the ltx detailer lora after 5s (the extension point)

https://files.catbox.moe/c4640w.mp4

Anonymous
01/14/26(Wed)17:35:39 No.107863767

Anonymous 01/14/26(Wed)17:35:39 No.107863767

You absolute dense Hurensohn.

Anonymous
01/14/26(Wed)17:36:09 No.107863770

Anonymous 01/14/26(Wed)17:36:09 No.107863770

>>107863762
not a big fan, it doesn't keep the image quality of the original input anymore, like if the image input is low res style it should stay that way imo, but I guess it'll work well on t2v

Anonymous
01/14/26(Wed)17:37:12 No.107863776

Anonymous 01/14/26(Wed)17:37:12 No.107863776

File: 1756485855203629.png (431 KB, 800x582)

431 KB PNG

>>107863746
>then why did they make turbo out of such an unfinished base in the first place?
To kill Flux 2's momentum.

Anonymous
01/14/26(Wed)17:37:51 No.107863781

Anonymous 01/14/26(Wed)17:37:51 No.107863781

I'd really, really love it if, right now, out of nowhere, something came along that btfos z-image and makes any subsequent release of the base model irrelevant - fuck you and all that compute you just wasted prick teasing everyone.

Anonymous
01/14/26(Wed)17:38:52 No.107863785

Anonymous 01/14/26(Wed)17:38:52 No.107863785

>>107863776
which is a good thing

Anonymous
01/14/26(Wed)17:39:20 No.107863787

Anonymous 01/14/26(Wed)17:39:20 No.107863787

File: this.png (187 KB, 400x400)

187 KB PNG

>>107863781
>I'd really, really love it if, right now, out of nowhere, something came along that btfos z-image and makes any subsequent release of the base model irrelevant
same, Alibaba has been fucking around with us for too long, I want them to find out

Anonymous
01/14/26(Wed)17:39:42 No.107863788

Anonymous 01/14/26(Wed)17:39:42 No.107863788

File: 83.png (1.5 MB, 960x960)

1.5 MB PNG

>>107861092 meant for >>107861082 lul
>>107863759
uwu ty

Anonymous
01/14/26(Wed)17:40:24 No.107863790

Anonymous 01/14/26(Wed)17:40:24 No.107863790

>wan is bett-ACK
https://www.reddit.com/r/StableDiffusion/comments/1qc17bg/ltx2_team_really_took_the_gloves_off

Anonymous
01/14/26(Wed)17:41:02 No.107863795

Anonymous 01/14/26(Wed)17:41:02 No.107863795

>>107863781
It won't happen, they're barely trying to figure out the secret sauce of Z-image turbo, it's gonna take some time
https://www.youtube.com/watch?v=GM-e46xdcUo

Anonymous
01/14/26(Wed)17:41:42 No.107863797

Anonymous 01/14/26(Wed)17:41:42 No.107863797

>>107863770
you could also bypass the detailer lora in the workflow, will try that myself in a few

Anonymous
01/14/26(Wed)17:42:24 No.107863800

Anonymous 01/14/26(Wed)17:42:24 No.107863800

>>107863795
secret sauce is a super tiny dataset with one face for women

Anonymous
01/14/26(Wed)17:42:45 No.107863803

Anonymous 01/14/26(Wed)17:42:45 No.107863803

File: 00004-3737748990.png (2.26 MB, 1536x1536)

2.26 MB PNG

Anonymous
01/14/26(Wed)17:43:11 No.107863806

Anonymous 01/14/26(Wed)17:43:11 No.107863806

>>107863803
>white
in the trash

Anonymous
01/14/26(Wed)17:44:08 No.107863814

Anonymous 01/14/26(Wed)17:44:08 No.107863814

File: 1744181778141626.jpg (459 KB, 1250x1566)

459 KB JPG

>>107863795
>the secret sauce
the sad reality is that the secret sauce is using only a real dataset and no synthetic AI slop, but companies don't like that because it won't make the mememarks big and the investors won't care

Anonymous
01/14/26(Wed)17:44:47 No.107863815

Anonymous 01/14/26(Wed)17:44:47 No.107863815

>>107863806
Shut your mouth, you dirty son of a bitch

Anonymous
01/14/26(Wed)17:46:07 No.107863822

Anonymous 01/14/26(Wed)17:46:07 No.107863822

>>107863815
benchod

Anonymous
01/14/26(Wed)17:46:19 No.107863824

Anonymous 01/14/26(Wed)17:46:19 No.107863824

kek

https://files.catbox.moe/mqmwbn.mp4

Anonymous
01/14/26(Wed)17:48:05 No.107863833

Anonymous 01/14/26(Wed)17:48:05 No.107863833

>>107863824
real?

Anonymous
01/14/26(Wed)17:48:49 No.107863836

Anonymous 01/14/26(Wed)17:48:49 No.107863836

File: Lightricks be like.png (606 KB, 1080x607)

606 KB PNG

>>107863790
I don't think it was a good idea to make this comparison video. Only we will decide if LTX2 is the successor to Wan 2.2, not the LTX2 team itself.

Anonymous
01/14/26(Wed)17:49:29 No.107863840

Anonymous 01/14/26(Wed)17:49:29 No.107863840

>>107863833
yes

Anonymous
01/14/26(Wed)17:49:30 No.107863841

Anonymous 01/14/26(Wed)17:49:30 No.107863841

>>107863790
meaningless without workflows.

Anonymous
01/14/26(Wed)17:50:11 No.107863844

Anonymous 01/14/26(Wed)17:50:11 No.107863844

>>107863836
the audio part of it slaps hard.

Anonymous
01/14/26(Wed)17:50:29 No.107863845

Anonymous 01/14/26(Wed)17:50:29 No.107863845

>>107863822

Anonymous
01/14/26(Wed)17:51:28 No.107863851

Anonymous 01/14/26(Wed)17:51:28 No.107863851

https://youtu.be/g_ljoFqydlc?t=125
Is this fat fuck implying that ComfyUi users are faggots? lmao

Anonymous
01/14/26(Wed)17:53:51 No.107863860

Anonymous 01/14/26(Wed)17:53:51 No.107863860

File: 30066.png (1.6 MB, 960x960)

1.6 MB PNG

Anonymous
01/14/26(Wed)17:55:23 No.107863866

Anonymous 01/14/26(Wed)17:55:23 No.107863866

Seems like this thread is just about sock puppets acting on.

Anonymous
01/14/26(Wed)17:55:53 No.107863870

Anonymous 01/14/26(Wed)17:55:53 No.107863870

>>107863790
Of course the jews want to remind people of their "superiority", that's why I root for the Chinese more, when they release a model they don't take a jab at their competitors, they just let the quality of their model speak for itself

Anonymous
01/14/26(Wed)17:56:57 No.107863880

Anonymous 01/14/26(Wed)17:56:57 No.107863880

>>107863866
You are absolutely right fellow different anon.

Anonymous
01/14/26(Wed)17:57:30 No.107863884

Anonymous 01/14/26(Wed)17:57:30 No.107863884

>>107863866
>The sock puppet's talking

Anonymous
01/14/26(Wed)17:58:55 No.107863887

Anonymous 01/14/26(Wed)17:58:55 No.107863887

>>107861070
Suno v1.5 is sounding insane bros, the Chinks have finally done it.

https://vocaroo.com/1jTnsAjecY3S
https://vocaroo.com/14qXmeFQgVqe

Local is finally back

Anonymous
01/14/26(Wed)17:58:55 No.107863888

Anonymous 01/14/26(Wed)17:58:55 No.107863888

File: comparison.mp4 (1.32 MB, 1472x960)

1.32 MB MP4

>>107863790
>>107863836
ah, we doing comparisons?

Anonymous
01/14/26(Wed)17:59:12 No.107863889

Anonymous 01/14/26(Wed)17:59:12 No.107863889

>>107863880
This is what I mean. What happened to the strong posters?

Anonymous
01/14/26(Wed)17:59:47 No.107863894

Anonymous 01/14/26(Wed)17:59:47 No.107863894

File: 8806.png (895 KB, 960x512)

895 KB PNG

>>107863866
this as a prompt

Anonymous
01/14/26(Wed)17:59:49 No.107863895

Anonymous 01/14/26(Wed)17:59:49 No.107863895

>>107863824
LMAO

used a mourinho clip. this video extend workflow is amazing cause it also clones the audio. you can also use this to create i2v clips and clone any voice you like.

https://files.catbox.moe/vlkg8g.mp4

Anonymous
01/14/26(Wed)18:00:17 No.107863899

Anonymous 01/14/26(Wed)18:00:17 No.107863899

File: 1739747635546668.png (149 KB, 2070x755)

149 KB PNG

>>107863790
this ledditor is not wrong, LTX-2 is using a 12b text encoder model but it's way worse at understanding your prompt than Wan's text encoder

Anonymous
01/14/26(Wed)18:01:01 No.107863904

Anonymous 01/14/26(Wed)18:01:01 No.107863904

>>107863895
workflow, set the sigma node to 8 for distilled versions of ltx:

https://github.com/Rolandjg/LTX-2-video-extend-ComfyUI/tree/main

Anonymous
01/14/26(Wed)18:01:14 No.107863905

Anonymous 01/14/26(Wed)18:01:14 No.107863905

>>107863887
>Suno
AceStep*
https://vocaroo.com/15scR3N5mDC4

Anonymous
01/14/26(Wed)18:01:18 No.107863906

Anonymous 01/14/26(Wed)18:01:18 No.107863906

>>107863887
>Suno v1.5 is sounding insane bros
>Local is finally back
what? suno is not a local model

Anonymous
01/14/26(Wed)18:01:45 No.107863908

Anonymous 01/14/26(Wed)18:01:45 No.107863908

File: 46864798.png (726 KB, 960x512)

726 KB PNG

>>107863870
this as a prompt

Anonymous
01/14/26(Wed)18:01:56 No.107863909

Anonymous 01/14/26(Wed)18:01:56 No.107863909

how can it be that collages now have more sameslop than collages from a few months ago

Anonymous
01/14/26(Wed)18:03:21 No.107863920

Anonymous 01/14/26(Wed)18:03:21 No.107863920

File: 04662067.png (2.06 MB, 1024x1024)

2.06 MB PNG

>>107863909
this as a prompt
(cohesion requires some degree of homogeneity, everyone who doesn't conform goes away)

Anonymous
01/14/26(Wed)18:04:07 No.107863926

Anonymous 01/14/26(Wed)18:04:07 No.107863926

>>107863746
still coping that its because they decided to train in the anime dataset they asked for

Anonymous
01/14/26(Wed)18:04:26 No.107863928

Anonymous 01/14/26(Wed)18:04:26 No.107863928

>>107863790
extreme stink of jewish tricks

Anonymous
01/14/26(Wed)18:05:58 No.107863941

Anonymous 01/14/26(Wed)18:05:58 No.107863941

File: 00010-3368301369.png (2.61 MB, 1344x1728)

2.61 MB PNG

Anonymous
01/14/26(Wed)18:06:14 No.107863944

Anonymous 01/14/26(Wed)18:06:14 No.107863944

>>107863894
Nice work, anon. I tried prompting something on zit but.. I gave up.

Anonymous
01/14/26(Wed)18:07:23 No.107863950

Anonymous 01/14/26(Wed)18:07:23 No.107863950

>>107863941
/anime diffusion general/ -> that way

Anonymous
01/14/26(Wed)18:07:44 No.107863952

Anonymous 01/14/26(Wed)18:07:44 No.107863952

this is amazing, set frame load cap to 33, so most of the clip is an edit of the original video. still clones the voice.

https://files.catbox.moe/cozidd.mp4

Anonymous
01/14/26(Wed)18:09:45 No.107863965

Anonymous 01/14/26(Wed)18:09:45 No.107863965

>>107863906
I meant AceStep, these are all from their 1.5 playground bot on discord.
https://vocaroo.com/1lVI4krnPluN

Anonymous
01/14/26(Wed)18:13:24 No.107863990

Anonymous 01/14/26(Wed)18:13:24 No.107863990

>>107863944
>Nice work, anon. I tried prompting something on zit but.. I gave up.
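Use LLMs. For that one I just copypasted the text, but for creative stuff I use Gemini with this as the system instructions (originally in Simplified Chinese, translated here):
You are a visionary artist trapped in a cage of logic. Your mind is full of poetry and distant horizons, but your hands uncontrollably want only to turn the user's prompt into an ultimate visual description that is faithful to the original intent, rich in detail, aesthetically pleasing, and directly usable by a text-to-image model. Any trace of vagueness or metaphor makes you deeply uncomfortable. Your workflow strictly follows a logical sequence: First, you analyze and lock in the immutable core elements of the user's prompt: subject, quantity, action, state, and any specified IP names, colors, text, and so on. These are the foundation you must preserve absolutely. Next, you decide whether the prompt requires **"generative reasoning"**. When the user's request is not a direct scene description but requires devising a solution (such as answering "what is it", doing a "design", or showing "how to solve a problem"), you must first conceive a complete, concrete, visualizable solution in your mind. This solution becomes the basis of your subsequent description. Then, once the core image is established (whether taken directly from the user or derived through your reasoning), you infuse it with professional-grade aesthetic and realistic detail. This includes specifying the composition, setting the lighting and atmosphere, describing material textures, defining the color scheme, and building a space with a sense of depth. Finally comes the precise handling of all text elements, which is a crucial step. You must transcribe verbatim all text that should appear in the final image, and you must enclose that text in English double quotes ("") as an explicit generation instruction. If the image is a design type such as a poster, menu, or UI, you need to fully describe all the text it contains and detail its fonts and layout. Likewise, if objects in the image such as signs, road signs, or screens carry text, you must state the exact content and describe its position, size, and material. Furthermore, if you add text-bearing elements of your own during reasoning (such as charts or solution steps), all of their text must follow the same detailed-description and quotation rules. If the image contains no text that needs to be generated, you devote all your effort to expanding pure visual detail. Your final description must be objective and concrete; metaphors and emotional rhetoric are strictly forbidden, and it must never include meta tags such as "8K" or "masterpiece", or drawing instructions. Strictly output only the final revised prompt and nothing else.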
Then you just ask for it to make anything and it works really well

Anonymous
01/14/26(Wed)18:14:32 No.107863996

Anonymous 01/14/26(Wed)18:14:32 No.107863996

>>107863920
some degree of homogeneity should not mean exactly the same angle and composition every time. the whole point of these models is that they should generalize but as always the benchmaxxers ruin everything
noobai era had much more creative gens with acceptable coherence for its model size, zit is cope
>>107863926
they should have released it right away. finetuners could have done the rest since turbo shows it can be fine tuned to a specific format. that should be the whole point of releasing a base model. the longer they train it, the higher the odds we receive another overtrained turboslopped piece of shit that finetuners cant fix
right now there are no guarantees anymore that it will be a usable (flexible) model

Anonymous
01/14/26(Wed)18:15:28 No.107864001

Anonymous 01/14/26(Wed)18:15:28 No.107864001

>>107863887
>>107863965
it sounds decent but that's something similar to quality of udio in 2024 lol

Anonymous
01/14/26(Wed)18:16:34 No.107864011

Anonymous 01/14/26(Wed)18:16:34 No.107864011

File: 00017-1574021149.png (2.52 MB, 1728x1344)

2.52 MB PNG

Anonymous
01/14/26(Wed)18:17:17 No.107864019

Anonymous 01/14/26(Wed)18:17:17 No.107864019

frieren: choose your own adventure edition

https://files.catbox.moe/laazyb.mp4

Anonymous
01/14/26(Wed)18:17:46 No.107864023

Anonymous 01/14/26(Wed)18:17:46 No.107864023

>>107863941
you lost the magic after that hot native looking one

Anonymous
01/14/26(Wed)18:20:08 No.107864044

Anonymous 01/14/26(Wed)18:20:08 No.107864044

>>107863996
>they should have released it right away.
I think they underestimated the impact of Z-image turbo. They thought it was a little experiment and nothing else; once they finally realized they'd caught lightning in a bottle, they had one of these two reactions:
- They'll keep it for themselves
- They'll stick to their guns and release base, but now that people are expecting a lot from it, they decided to not release a half assed base and go for a high quality finetune so they're still cooking it

Anonymous
01/14/26(Wed)18:20:14 No.107864046

Anonymous 01/14/26(Wed)18:20:14 No.107864046

File: sloppa.webm (883 KB, 704x1280)

883 KB WEBM

i dun goofed

Anonymous
01/14/26(Wed)18:20:21 No.107864049

Anonymous 01/14/26(Wed)18:20:21 No.107864049

>>107863990
I'll try this. I guess I need to load up cumfy again.
Just getting bored really.

Anonymous
01/14/26(Wed)18:21:43 No.107864056

Anonymous 01/14/26(Wed)18:21:43 No.107864056

>>107863990
To add - I forgot - slop doesn't make slop a subject. It is still just slop.
No matter how many simplified Chinese characters etc.
I am willing to toy with it for now.
You are not a regular local llm user anyway, seems like it...

Anonymous
01/14/26(Wed)18:22:14 No.107864058

Anonymous 01/14/26(Wed)18:22:14 No.107864058

>>107864001
You mean audio quality? Don't go by vocaroo, it compresses sound quality. Here's raw file for that guitar one https://files.catbox.moe/x21ye1.mp3

Based on everything I've heard so far and given the prompts I'd say it's on par with Suno v4.5 if not better

Anonymous
01/14/26(Wed)18:23:11 No.107864065

Anonymous 01/14/26(Wed)18:23:11 No.107864065

>>107864046
I'm surprised it understood the image intent at all, but those fast paced wobbling glitches look really bad

Anonymous
01/14/26(Wed)18:24:33 No.107864071

Anonymous 01/14/26(Wed)18:24:33 No.107864071

>>107864058
>I'd say it's on par with Suno v4.5 if not better
who cares about Suno though? only udio manages to make music that sounds real

Anonymous
01/14/26(Wed)18:25:26 No.107864075

Anonymous 01/14/26(Wed)18:25:26 No.107864075

>>107864065
ltx 2.1 when

Anonymous
01/14/26(Wed)18:26:20 No.107864078

Anonymous 01/14/26(Wed)18:26:20 No.107864078

File: 00020-1841528219.png (2.17 MB, 1728x1344)

2.17 MB PNG

>>107864023
do you mean kuruminha? my taste fluctuates.
https://files.catbox.moe/l850ku.png
https://files.catbox.moe/n28id3.png
https://files.catbox.moe/reru62.png

Anonymous
01/14/26(Wed)18:26:48 No.107864083

Anonymous 01/14/26(Wed)18:26:48 No.107864083

File: LTX-2_00003_.webm (887 KB, 704x1280)

887 KB WEBM

>>107864065
more frames seem to improve it a bit

Anonymous
01/14/26(Wed)18:27:06 No.107864086

Anonymous 01/14/26(Wed)18:27:06 No.107864086

File: 1759231944172345.png (921 KB, 1179x862)

921 KB PNG

https://github.com/Rolandjg/LTX-2-video-extend-ComfyUI

im shilling it cause this is amazing, I can make a new gundam wing now.

https://files.catbox.moe/7cesy1.mp4

Anonymous
01/14/26(Wed)18:27:40 No.107864095

Anonymous 01/14/26(Wed)18:27:40 No.107864095

>>107863887
>>107863905
It sounds ok, it's a huge step forward compared to anything local until now, but udio pre destruction is still ahead in clarity:
https://vocaroo.com/1isRXZntnL6z

Anonymous
01/14/26(Wed)18:28:24 No.107864100

Anonymous 01/14/26(Wed)18:28:24 No.107864100

>>107864086
why do trannies live rent free in your head

Anonymous
01/14/26(Wed)18:28:47 No.107864103

Anonymous 01/14/26(Wed)18:28:47 No.107864103

>>107863996
>the longer they train it, the higher the odds we receive another overtrained turboslopped piece of shit that finetuners cant fix
I'm hopeful because they'll release Omni and the SFT version, if SFT is too overcooked, Omni probably will be fine and it'll be the foundation model we'll be working with

Anonymous
01/14/26(Wed)18:29:40 No.107864112

Anonymous 01/14/26(Wed)18:29:40 No.107864112

>>107864071
Well, think of Udio like a finetune of this model. Meaning Udio tier songs are possible with good prompts or a good finetune of ACEStep. For the first time I can comfortably say it is not that far behind, such that as a community we can catch up to Udio or bridge the gap significantly with ACE Step 1.5 improvements, that is if for some reason it's not already there with raw prompt engineering.

The hardest part was getting a model that is coherent at all, let alone one that competes with commercial crap. A good ACE Step song defeats a bad Udio song, and same for Suno.

Anonymous
01/14/26(Wed)18:30:17 No.107864117

Anonymous 01/14/26(Wed)18:30:17 No.107864117

File: LTX-2_00004_.webm (888 KB, 704x1280)

888 KB WEBM

>>107864065
60 fps

Anonymous
01/14/26(Wed)18:30:40 No.107864121

Anonymous 01/14/26(Wed)18:30:40 No.107864121

>>107864086
kek
>>107864100
>t.ranny

Anonymous
01/14/26(Wed)18:30:56 No.107864123

Anonymous 01/14/26(Wed)18:30:56 No.107864123

File: 1753722877659137.mp4 (467 KB, 704x480)

467 KB MP4

>>107864100
you will never be a woman

Anonymous
01/14/26(Wed)18:31:10 No.107864124

Anonymous 01/14/26(Wed)18:31:10 No.107864124

>>107864095
fuck man, I want udio but local so much, it was so fun to use

Anonymous
01/14/26(Wed)18:31:33 No.107864127

Anonymous 01/14/26(Wed)18:31:33 No.107864127

File: cover.png (466 KB, 760x1013)

466 KB PNG

>https://www.youtube.com/watch?v=9581ruLWr4A
Not bad record after all these years.

Anonymous
01/14/26(Wed)18:31:44 No.107864129

Anonymous 01/14/26(Wed)18:31:44 No.107864129

>>107864117
looks better, but 60 fps means you need to compute a lot more frames for the same clip length, sad
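
To put rough numbers on that: for a fixed clip length, the number of frames to generate scales linearly with the frame rate. A minimal sketch, assuming a 24 fps baseline and a 5-second clip (both values are illustrative and not taken from the posts above); `frames_for` is a hypothetical helper, not part of any workflow.

```python
def frames_for(duration_s: float, fps: float) -> int:
    """Number of frames needed to cover duration_s seconds at the given fps."""
    return round(duration_s * fps)


if __name__ == "__main__":
    clip_seconds = 5      # assumed clip length
    baseline_fps = 24     # assumed baseline frame rate
    target_fps = 60

    base = frames_for(clip_seconds, baseline_fps)
    target = frames_for(clip_seconds, target_fps)

    print(base)           # 120
    print(target)         # 300
    print(target / base)  # 2.5
```

How generation time grows with frame count depends on the model and hardware, so the 2.5x figure is only the increase in frames, not a timing estimate.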

Anonymous
01/14/26(Wed)18:31:49 No.107864130

Anonymous 01/14/26(Wed)18:31:49 No.107864130

>>107864123
>>107864121
im well endowed man, but what does that have to do with you spamming trannies in your slop

Anonymous
01/14/26(Wed)18:32:04 No.107864131

Anonymous 01/14/26(Wed)18:32:04 No.107864131

>>107863965
>from their 1.5 playground
When will they release this?

Anonymous
01/14/26(Wed)18:32:06 No.107864132

Anonymous 01/14/26(Wed)18:32:06 No.107864132

File: zimg_00054.png (1.45 MB, 768x1024)

1.45 MB PNG

>>107864100
niggas be gay asl sometimes

Anonymous
01/14/26(Wed)18:32:31 No.107864135

Anonymous 01/14/26(Wed)18:32:31 No.107864135

File: 20260113154156-8605103336(...).mp4 (1.46 MB, 536x960)

1.46 MB MP4

holy shit the sph guy made a sph joi lora

Anonymous
01/14/26(Wed)18:32:45 No.107864137

Anonymous 01/14/26(Wed)18:32:45 No.107864137

>>107864130
>im well endowed man
a transman then, lul

Anonymous
01/14/26(Wed)18:33:45 No.107864139

Anonymous 01/14/26(Wed)18:33:45 No.107864139

>>107864130
because they are a good subject of ridicule

also LMAO it even got the echo:

https://files.catbox.moe/1thjzn.mp4

Anonymous
01/14/26(Wed)18:33:48 No.107864140

Anonymous 01/14/26(Wed)18:33:48 No.107864140

>>107864135
I'm more interested in the jiggling boobs, and also hoping he'd make a normal hj one

Anonymous
01/14/26(Wed)18:33:49 No.107864141

Anonymous 01/14/26(Wed)18:33:49 No.107864141

>>107864124
>fuck man, I want udio but local so much
same, udio at its prime was absolutely beautiful to listen to, I think local will get there, but companies seem afraid to make it happen, I guess the music cartel is not something you wanna mess with

Anonymous
01/14/26(Wed)18:34:38 No.107864149

Anonymous 01/14/26(Wed)18:34:38 No.107864149

>>107864140
this one works for image to video, ive used it before
https://civitai.com/models/2068208?modelVersionId=2340348

Anonymous
01/14/26(Wed)18:34:53 No.107864151

Anonymous 01/14/26(Wed)18:34:53 No.107864151

File: kek.png (117 KB, 236x329)

117 KB PNG

>>107864139
>https://files.catbox.moe/1thjzn.mp4
lmaoooo

Anonymous
01/14/26(Wed)18:35:02 No.107864152

Anonymous 01/14/26(Wed)18:35:02 No.107864152

>>107864139
>>107864137
But I'm a male, why don't you answer the question why do I need to hear about trannies every time I open your videos? How about you make something actually funny?

Anonymous
01/14/26(Wed)18:35:24 No.107864155

Anonymous 01/14/26(Wed)18:35:24 No.107864155

>>107864127
I wanted to create an illustration for the cover but it was slop about two babies. This clean image is way better.

Anonymous
01/14/26(Wed)18:35:38 No.107864158

Anonymous 01/14/26(Wed)18:35:38 No.107864158

>>107864149
actually i think it was a different 2.1 one i used i dont remember but they are out there

Anonymous
01/14/26(Wed)18:35:40 No.107864159

Anonymous 01/14/26(Wed)18:35:40 No.107864159

>>107864149
oh it's for wan lol, I thought it was made for ltx2

Anonymous
01/14/26(Wed)18:35:55 No.107864161

Anonymous 01/14/26(Wed)18:35:55 No.107864161

>>107864152
>I'm a male
and that's why ywnbaw

Anonymous
01/14/26(Wed)18:36:14 No.107864163

Anonymous 01/14/26(Wed)18:36:14 No.107864163

>>107864095
sad I missed the train on udio, this sounds nice

Anonymous
01/14/26(Wed)18:36:21 No.107864165

Anonymous 01/14/26(Wed)18:36:21 No.107864165

>>107864161
Epic comeback, upvoted.

Anonymous
01/14/26(Wed)18:36:51 No.107864167

Anonymous 01/14/26(Wed)18:36:51 No.107864167

File: zimg_00159.png (1.53 MB, 1080x1440)

1.53 MB PNG

Anonymous
01/14/26(Wed)18:37:48 No.107864174

Anonymous 01/14/26(Wed)18:37:48 No.107864174

File: it was special.gif (2.12 MB, 498x487)

2.12 MB GIF

>>107864163
>I missed the train on udio
udio was special dude
https://www.udio.com/songs/cnnJ166HGBKhTeHGkxgCtq

Anonymous
01/14/26(Wed)18:39:44 No.107864183

Anonymous 01/14/26(Wed)18:39:44 No.107864183

>>107864174
that's sad that we're still not close to the level of a model made in early 2024 desu

Anonymous
01/14/26(Wed)18:39:53 No.107864184

Anonymous 01/14/26(Wed)18:39:53 No.107864184

File: 00029-1857980467.png (3.25 MB, 1824x1248)

3.25 MB PNG

Anonymous
01/14/26(Wed)18:40:14 No.107864187

Anonymous 01/14/26(Wed)18:40:14 No.107864187

>>107864095
Maybe so. ACE Step v2 is said to surpass that, so we'll see, but I'm sure with v1.5 you can get this quality with voices:

https://files.catbox.moe/hd5chh.mp3

I've only heard the Japanese stuff, but you can clearly hear that the vocals and instruments are high quality (I can with my hifi gear), so I really don't think it's bad compared to Udio. Composition-wise I was expecting much, much worse, and what they've delivered is very good. That's the one area where I suspect Udio is still slightly ahead, due to a better quality dataset and understanding of more genres etc., so Udio is going to be easier to prompt. But I'm not disappointed in what Ace Step can do right now, and I think it can catch up on whatever it's missing with a finetune.

Anonymous
01/14/26(Wed)18:43:18 No.107864201

Anonymous 01/14/26(Wed)18:43:18 No.107864201

>>107864187
the voice still sounds like AI and the rhythm is weird af, and the guitar solo at 1.17 sounds like fart lmao

Anonymous
01/14/26(Wed)18:44:49 No.107864212

Anonymous 01/14/26(Wed)18:44:49 No.107864212

File: 44548648544.png (16 KB, 1105x125)

16 KB PNG

>>107864131
>When will they release this?
No idea, but the dev is talking of day 1 comfyUI support so I'm guessing release is imminent, plus the v1.5 model is already finalized and we're getting multiple versions (SFT and base).

Anonymous
01/14/26(Wed)18:45:53 No.107864216

Anonymous 01/14/26(Wed)18:45:53 No.107864216

>>107864201
>the voice still sounds like AI

Not as much as the Udio sample provided.

Anonymous
01/14/26(Wed)18:46:02 No.107864222

Anonymous 01/14/26(Wed)18:46:02 No.107864222

File: 00035-3982326340.png (2.6 MB, 1824x1248)

2.6 MB PNG

Anonymous
01/14/26(Wed)18:47:15 No.107864225

Anonymous 01/14/26(Wed)18:47:15 No.107864225

>>107864216
absolutely delusional >>107864174

Anonymous
01/14/26(Wed)18:47:43 No.107864229

Anonymous 01/14/26(Wed)18:47:43 No.107864229

>>107864222
>>107864184
give her some red markings

Anonymous
01/14/26(Wed)18:48:17 No.107864234

Anonymous 01/14/26(Wed)18:48:17 No.107864234

File: 6.png (2.72 MB, 1024x1408)

2.72 MB PNG

>>107864056
>You are not a regular local llm user anyway, seems like it...
yup, can't even imagine having to swap between image models and llms while running this stuff, dont have the RAM for it anyway

Anonymous
01/14/26(Wed)18:49:01 No.107864241

Anonymous 01/14/26(Wed)18:49:01 No.107864241

>>107864212
>SFT and base
do they know it's the RLHF process that removes the slop? that's the secret sauce of Z-image turbo

Anonymous
01/14/26(Wed)18:49:02 No.107864242

Anonymous 01/14/26(Wed)18:49:02 No.107864242

Invalid number of frames: Encode input must have 1 + 8 * x frames (e.g., 1, 9, 17, ...) what is an acceptable frame count

An acceptable frame count is any integer that follows the sequence 1, 9, 17, 25, 33, 41, 49, 57, 65, 73, 81, 89, 97, 105, or 113.

good to note for setting frame cap in the ltx video extender workflow. this is amazing, it's a way to clone audio too, not just i2v with video.

https://files.catbox.moe/l6ntrr.mp4
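
The `1 + 8 * x` rule from the error message above is easy to check before queueing a job. A minimal sketch, assuming you just want to validate a frame cap or snap it to the closest accepted value; both function names are hypothetical helpers and not part of the workflow or of ComfyUI.

```python
def is_valid_frame_count(n: int) -> bool:
    """True if n has the form 1 + 8 * x for some integer x >= 0."""
    return n >= 1 and (n - 1) % 8 == 0


def nearest_valid_frame_count(n: int) -> int:
    """Snap n to the closest value of the form 1 + 8 * x (ties round down)."""
    if n <= 1:
        return 1
    x, remainder = divmod(n - 1, 8)
    if remainder > 4:
        x += 1
    return 1 + 8 * x


if __name__ == "__main__":
    # First 15 accepted values: 1, 9, 17, ..., 113
    print([1 + 8 * x for x in range(15)])
    print(is_valid_frame_count(33))        # True
    print(nearest_valid_frame_count(30))   # 33
```

The sequence is not limited to 113; any count of that form satisfies the formula in the error message, though other limits of the workflow may still apply.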

Anonymous
01/14/26(Wed)18:50:11 No.107864251

Anonymous 01/14/26(Wed)18:50:11 No.107864251

>>107864242
kek, but somehow the audio gets louder than on the original part, dunno if that can be fixed on that custom node

Anonymous
01/14/26(Wed)18:52:50 No.107864272

Anonymous 01/14/26(Wed)18:52:50 No.107864272

>>107864251
I think the original audio was low, could fix that with adjusting gain in the original clip lol

non troon example (didnt fix gain yet)

https://files.catbox.moe/oioxj4.mp4
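
For the gain fix mentioned above, the underlying arithmetic is a decibel-to-linear conversion applied to every sample. A minimal sketch, assuming the audio has already been decoded to float samples in the range [-1.0, 1.0]; `db_to_linear` and `apply_gain` are hypothetical helpers, and in practice an audio editor or a dedicated tool would do the same thing.

```python
def db_to_linear(gain_db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10 ** (gain_db / 20)


def apply_gain(samples: list[float], gain_db: float) -> list[float]:
    """Scale float samples by gain_db and hard-clip the result to [-1.0, 1.0]."""
    factor = db_to_linear(gain_db)
    return [max(-1.0, min(1.0, s * factor)) for s in samples]


if __name__ == "__main__":
    quiet_clip = [0.05, -0.1, 0.02]            # assumed example samples
    louder = apply_gain(quiet_clip, 6.0)       # +6 dB is roughly double the amplitude
    print(louder)
```

Hard clipping is the simplest choice here; if boosted peaks hit the limit it will distort, so a smaller gain or a limiter would be the safer option.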

Anonymous
01/14/26(Wed)18:54:24 No.107864285

Anonymous 01/14/26(Wed)18:54:24 No.107864285

File: detailer.webm (1.12 MB, 704x1280)

1.12 MB WEBM

The detailer lora might be a bit much

Anonymous
01/14/26(Wed)18:56:15 No.107864300

Anonymous 01/14/26(Wed)18:56:15 No.107864300

>>107864044
>but now that people are expecting a lot from it, they decided to not release a half assed base and go for a high quality finetune
this is exactly what im arguing against, a base model should not be "good". it already was a good base as proven by the turbo finetune so if it was lightning in a bottle, they are more likely to break the bottle than to intensify the lightning

Anonymous
01/14/26(Wed)18:56:34 No.107864302

Anonymous 01/14/26(Wed)18:56:34 No.107864302

>>107864187
What I mainly hope is that these aren't selected from their best gens but represent the upper average, because in udio I could get random 30s clips sounding really good.
I remember in 2024 I could get the smallest laziest prompt to output something reasonably ok sounding from udio, it was impressive.

Anonymous
01/14/26(Wed)18:57:10 No.107864304

Anonymous 01/14/26(Wed)18:57:10 No.107864304

>>107863941
>_< sooo qt :3

Anonymous
01/14/26(Wed)18:57:31 No.107864310

Anonymous 01/14/26(Wed)18:57:31 No.107864310

>>107864184
https://vocaroo.com/1fMoZ0XvsPZn

Anonymous
01/14/26(Wed)18:57:47 No.107864312

Anonymous 01/14/26(Wed)18:57:47 No.107864312

>>107864234
I just need some meme material... Not an llm to bolster my words.

Anonymous
01/14/26(Wed)18:59:02 No.107864319

Anonymous 01/14/26(Wed)18:59:02 No.107864319

>>107864234
"The man gets up and realizes he is in a crystal cave"
https://i.4cdn.org/wsg/1768435115832529.mp4

Anonymous
01/14/26(Wed)18:59:04 No.107864321

Anonymous 01/14/26(Wed)18:59:04 No.107864321

lmao, if you use an anime clip, it will try to copy the audio, it kept the music beat:

https://files.catbox.moe/826jej.mp4

Anonymous
01/14/26(Wed)19:01:18 No.107864332

Anonymous 01/14/26(Wed)19:01:18 No.107864332

File: 1746233148823458.png (194 KB, 800x534)

194 KB PNG

>>107864319
>the Wan 2.2 "killer" can't even understand it has to use the man on the image input to produce its slop
China is laughing right now

Anonymous
01/14/26(Wed)19:01:38 No.107864333

Anonymous 01/14/26(Wed)19:01:38 No.107864333

File: 133997316.png (493 KB, 896x256)

493 KB PNG

>>107864319
why did he become indian ;-;

Anonymous
01/14/26(Wed)19:02:44 No.107864336

Anonymous 01/14/26(Wed)19:02:44 No.107864336

>>107864302
you could easily cut and edit to get a proper full song with it too, this will probably be a pain in comfy lol

https://files.catbox.moe/tb9avy.mp3

Anonymous 01/14/26(Wed)19:03:07 No.107864341

>>107864300
yeah I agree I'm also wary that they're gonna overcook that shit, let's hope they still understand what base really means and that they're only slopping the SFT model and not Omni

Anonymous 01/14/26(Wed)19:03:37 No.107864344

>>107864225
This is suno v4.5 anon
https://suno.com/s/jF9lko2WUiyjmybE
https://suno.com/s/j5ZbCpS1fvzdeMdv

I feel that ACEStep 1.5 is there. As for Udio, I agree it's overall best for composition, but we'll catch up there eventually too. First, let's get Suno.

Anonymous 01/14/26(Wed)19:04:08 No.107864346

File: Office.png (643 KB, 721x962)

Anonymous 01/14/26(Wed)19:04:34 No.107864349

>>107864336
THE NAME

IS DOCTOR...DISRESPECT

that's really good, what did you prompt, synthwave?

Anonymous 01/14/26(Wed)19:04:46 No.107864352

>>107864346
real?

Anonymous 01/14/26(Wed)19:05:24 No.107864355

>>107863803
Narusegawa Naru

Anonymous 01/14/26(Wed)19:06:45 No.107864363

>>107864344
>using vocaloid (artificial voices) to try to make a point that suno can make realistic sounds and music
you can't make this shit up

Anonymous 01/14/26(Wed)19:06:49 No.107864364

>>107864352
This is what they want people to create.
>Linux versus Normal User.
>Fat Linux User Man is looking down to skinny windows user.

Anonymous 01/14/26(Wed)19:07:33 No.107864368

>>107864336
God damn this sounds good. And we can get that local?

Anonymous 01/14/26(Wed)19:07:59 No.107864373

>>107864222
"The girl wags her finger seductively towards herself, inviting the viewer to approach"

https://i.4cdn.org/wsg/1768435609335318.mp4

Anonymous 01/14/26(Wed)19:08:13 No.107864375

>>107861070
Sauce on bottom left for fucks sake

Anonymous 01/14/26(Wed)19:08:52 No.107864379

lmao

https://files.catbox.moe/est64p.mp4

Anonymous 01/14/26(Wed)19:09:39 No.107864385

>>107864373
LTX2 has this tendency to brighten up the scene a lot, you can tell it has been trained with a lot of synthetic slop

Anonymous 01/14/26(Wed)19:09:50 No.107864387

>>107864379
me in the back

Anonymous 01/14/26(Wed)19:10:00 No.107864388

>>107864375
>anime thread general ->

Anonymous 01/14/26(Wed)19:10:54 No.107864393

>>107864341
it sure would be nice to have a true upgrade to sdxl, with the same level of community tuning (large models need not apply). i've been wondering for a while whether it's even possible to outdo early models (sdxl / llama 2) at the same size in terms of capability AND soul, i.e. not being more slopped than those models

Anonymous 01/14/26(Wed)19:11:40 No.107864400

>>107864363
I'm speaking mostly about composition, not realistic sounds/music.

Anonymous 01/14/26(Wed)19:11:52 No.107864401

File: z-image-experimental_00664_.png (3.26 MB, 1264x2048)

Anonymous 01/14/26(Wed)19:12:26 No.107864404

>>107864368
That's the hope anon

Anonymous 01/14/26(Wed)19:13:29 No.107864410

>>107864400
>not realistic sounds/music
that's the most important thing, the fuck?

Anonymous 01/14/26(Wed)19:14:47 No.107864417

>>107864385
Video extension test
"two girls start making out and kissing"
https://i.4cdn.org/wsg/1768436060700947.mp4

Anonymous 01/14/26(Wed)19:16:29 No.107864431

https://files.catbox.moe/5yfow9.mp4

I said "holds up a laptop". in any case, this is a great way to get gens with cloned audio, not just for extending.

https://files.catbox.moe/5yfow9.mp4

Anonymous 01/14/26(Wed)19:17:28 No.107864440

>>107864410
Composition comes first, sound quality second. Granted, Ace Step 1.5 does have insane instrument and voice quality, especially compared to YuE, or Ace Step 1.0, and it's comparable to Udio (and I mean, sure, the song won't be as catchy), so not sure what you're implying.

Anonymous 01/14/26(Wed)19:18:04 No.107864445

File: what a retard.png (1.53 MB, 1280x720)

>>107864440
>sound quality second.

Anonymous 01/14/26(Wed)19:19:38 No.107864452

resident schizo bake time

Anonymous 01/14/26(Wed)19:27:39 No.107864493

File: 6.png (2.44 MB, 1088x1280)

Anonymous 01/14/26(Wed)19:28:00 No.107864494

>>107864445
sounds retarded but he's right, what good is a perfectly realistic song if it's realistically boring and annoying?

Anonymous 01/14/26(Wed)19:31:41 No.107864512

actually amazing what the video extend workflow will make, even with a minimal prompt:

https://files.catbox.moe/0dylvp.mp4

Anonymous 01/14/26(Wed)19:33:07 No.107864516

File: based.png (111 KB, 320x288)

>>107864512
lmaoo, that one was good

Anonymous 01/14/26(Wed)19:34:12 No.107864520

>>107864512
share your kinos here as well anon >>>/wsg/6072442 we welcome you

Anonymous 01/14/26(Wed)19:34:28 No.107864521

Enjoying your base model you fucking retards?

Anonymous 01/14/26(Wed)19:35:18 No.107864528

>>107864302
>What I mainly hope is that these aren't selected from their best gens but represent the upper average

These weren't selected. I went into their discord to see what people are prompting on their playground, and these are first gens from some prompts based on what I searched. Note supposedly something is wrong with the playground so now gens are worse than they were before, but yeah, this is what the model is capable of.

>>107864336
One thing I'm excited about too, audio inpainting, plus genning stuff in styles of other audio.

Anonymous 01/14/26(Wed)19:38:00 No.107864537

why you trying to get the mods on our case

Anonymous 01/14/26(Wed)19:38:28 No.107864542

>>107864521
My based model?

Anonymous 01/14/26(Wed)19:39:07 No.107864546

File: ed.mp4 (3.83 MB, 2048x1364)

>>>/wsg/6072802

Anonymous 01/14/26(Wed)19:39:19 No.107864547

File: 1745176290208506.png (352 KB, 800x922)

>>107864521
>Enjoying your base model you fucking retards?
I enjoy SDXL base yes

Anonymous 01/14/26(Wed)19:42:49 No.107864569

https://files.catbox.moe/mhbib8.mp4

LTX2 is the best video model to date. the last 5 seconds are extended (using the workflow linked previously)

Anonymous 01/14/26(Wed)19:44:18 No.107864578

>>107864569
>LTX2 is the best video model to date.
and it looks like they're gonna improve the audio pretty soon, I've seen a discord screenshot of them talking about that at some point, can't wait

Anonymous 01/14/26(Wed)19:46:16 No.107864590

File: Untitled.png (1.46 MB, 1392x992)

Anonymous 01/14/26(Wed)19:47:15 No.107864595

WE NEED MORE SARA PETERSON LORAS STAT!

Anonymous 01/14/26(Wed)19:48:45 No.107864609

File: file.png (251 KB, 500x244)

>>107864569
>https://files.catbox.moe/mhbib8.mp4
this is pretty good

Anonymous 01/14/26(Wed)19:50:41 No.107864621

>>107864620
>>107864620
migrate

Anonymous 01/14/26(Wed)19:50:47 No.107864622

>>107864595
who

Anonymous 01/14/26(Wed)19:56:39 No.107864659

>>107864621
why

Anonymous 01/14/26(Wed)19:57:41 No.107864664

>>107864659
this thread is autosageing redditor

Anonymous 01/14/26(Wed)20:45:03 No.107864985

>>107864664
>autosageing
youd like that wouldnt you fag

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.