/g/ - /ldg/ - Local Diffusion General - Technology


File: highlights_g_107885702_17(...).png (3.99 MB, 2045x1162)

/ldg/ - Local Diffusion General Anonymous 01/17/26(Sat)01:37:31 No.107887535

Discussion of Free and Open Source Diffusion Models

Prev: >>107885702

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Flux Klein
https://huggingface.co/collections/black-forest-labs/flux2

>WanX
https://github.com/Wan-Video/Wan2.2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>NetaYume
https://huggingface.co/duongve/NetaYume-Lumina-Image-2.0
https://nieta-art.feishu.cn/wiki/RZAawlH2ci74qckRLRPc9tOynrb

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe|https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon

Anonymous 01/17/26(Sat)01:40:55 No.107887552

>>107887524
>>107887529
So they didn't give a shit about Qwen Image and Qwen Image Edit, but Z-image was enough to spook them?
I guess it makes sense

Anonymous 01/17/26(Sat)01:41:14 No.107887554

>>107887547
File Not found
>>107887541
>>107887540
I checked CivitAI and there are barely any LTX2 LoRAs... are you trolling?

Anonymous 01/17/26(Sat)01:41:47 No.107887555

File: 1750333742531075.png (1.44 MB, 848x1216)

>>107887537
the girl on the left is wearing white gundam armor.

Anonymous 01/17/26(Sat)01:42:18 No.107887559

>>107887554
are you slow or something

Anonymous 01/17/26(Sat)01:44:01 No.107887565

File: 1763462820275090.png (1.38 MB, 848x1216)

>>107887546
seems to work

Anonymous 01/17/26(Sat)01:44:41 No.107887568

File: 1758054221086937.png (53 KB, 200x200)

>>107887559
It's been a week since I bought a 5070 Ti. I downloaded Wan2GP, but people said Comfy was better, so I downloaded that too, and I have no idea how to use it. I've wasted like 100 GB and still don't get it.

Anonymous 01/17/26(Sat)01:44:50 No.107887570

Any local model able to create nice music locally? I miss my udio's catchy kpop gen abilities

https://files.catbox.moe/tij84e.mp3
https://files.catbox.moe/ylh0uh.mp3
https://files.catbox.moe/h18hrp.mp3

Anonymous 01/17/26(Sat)01:44:57 No.107887572

>>107887535
based collage

Anonymous 01/17/26(Sat)01:45:52 No.107887578

>>107887565
very conservative for small panties

Anonymous 01/17/26(Sat)01:46:10 No.107887579

File: 1745832634552343.png (1.5 MB, 848x1216)

>>107887565
the girls are dressed as hatsune miku.

Anonymous 01/17/26(Sat)01:47:25 No.107887582

>>107887568
keep at it, you'll start to learn
also yes that other anon was trolling

Anonymous 01/17/26(Sat)01:47:30 No.107887584

>>107887570
https://github.com/HeartMuLa/heartlib?tab=readme-ov-file

This came out yesterday and has k-pop as a tag, but if I'm being honest, it's hard to control and very hit or miss. The clarity itself is pretty good though. Also takes like 3-5 minutes per gen.

Anonymous 01/17/26(Sat)01:48:52 No.107887588

>>107887570
catchy it is, I don't think anything local can do that yet

Anonymous 01/17/26(Sat)01:49:22 No.107887598

>>107887568
anon im gonna help you out of pity, people here are too evil for people like you

Get this workflow: https://civitai.com/models/1824027/wan-22-aio-t2v-i2v-s2v-t2i-mmaudio-4-6-stepsloop-svi-video-extendwanvideowrapper-workflowk3nk
and download the nodes and models it tells you to. Also check this for loras:
https://civitai.com/user/K3NK/models?sort=Newest

You don't need to use the actual workflow if it is too complex, but put it in your comfy so it at least tells you what models and stuff to download
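
If you'd rather see the list of files without opening the workflow in Comfy, something like this might work. It's only a sketch: it assumes the workflow was saved from the UI as JSON with a top-level `nodes` list whose entries carry `widgets_values`, and `find_model_files` plus the extension list are made-up names for illustration, not part of ComfyUI.

```python
# Extensions treated as "model files". This list is an assumption, extend it as needed.
MODEL_EXTS = (".safetensors", ".ckpt", ".pt", ".pth", ".gguf")


def find_model_files(workflow: dict) -> list[str]:
    """Collect strings in each node's widgets_values that look like model filenames."""
    found = set()
    for node in workflow.get("nodes", []):
        values = node.get("widgets_values") or []
        # widgets_values is usually a list, but some custom nodes store a dict.
        if isinstance(values, dict):
            values = list(values.values())
        for value in values:
            if isinstance(value, str) and value.lower().endswith(MODEL_EXTS):
                found.add(value)
    return sorted(found)
```

Load the file with `json.load` and pass the resulting dict in. It only tells you the filenames the workflow references, not where to download them.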

Anonymous 01/17/26(Sat)01:49:39 No.107887600

File: 1766479750833096.png (1.19 MB, 1024x1024)

replace the text "DEUS EX" with "LDG General". replace the man with sunglasses with hatsune miku wearing the same sunglasses.

it did this prompt better than qwen edit did, I remember trying this one.

Anonymous 01/17/26(Sat)01:50:42 No.107887604

File: 1747856660009803.png (1.23 MB, 1024x1024)

>>107887600

Anonymous 01/17/26(Sat)01:50:50 No.107887607

>>107887600
damn nigga this is crazy

Anonymous 01/17/26(Sat)01:51:01 No.107887610

>>107887570
it makes me irrationally angry that the model behind this quality will never be released

Anonymous 01/17/26(Sat)01:51:05 No.107887612

File: 1751615103401558.jpg (5 KB, 250x250)

>>107887598

Thanks bro. I've uninstalled and reinstalled ComfyUI 3 times now

Anonymous 01/17/26(Sat)01:52:38 No.107887619

>>107887610
sudo is better

Anonymous 01/17/26(Sat)01:53:53 No.107887626

File: 1763185167171884.png (835 KB, 5360x3137)

>>107887584
It's supposed to be better than udio, somehow I doubt it but I'll try it, also :

> Release the HeartMuLa-oss-7B version.
Hopefully it'll be good

Anonymous 01/17/26(Sat)01:54:25 No.107887627

>>107887619
the superadmin model

Anonymous 01/17/26(Sat)01:54:28 No.107887628

File: img_00065_.jpg (671 KB, 1376x1824)

Anonymous 01/17/26(Sat)01:55:03 No.107887632

>>107887612
Use ChatGPT with thinking enabled for the most common questions on how to install Comfy and how it works.

Anonymous 01/17/26(Sat)01:55:52 No.107887636

File: 1759520505198026.png (537 KB, 1104x928)

kek

change the black man on the left into a jewish rabbi wearing a yarmulke.

Anonymous 01/17/26(Sat)01:56:09 No.107887638

File: 1768629580205425.png (3.03 MB, 1632x928)

my champignon wife

Anonymous 01/17/26(Sat)01:56:53 No.107887641

>>107887626
>It's supposed to be better than udio
It's not.

All I can say is sometimes the 3B outputs a bop then goes back to being shit. Also I'm willing to bet a stick of RAM that 7B never releases.

Anonymous 01/17/26(Sat)01:57:02 No.107887643

File: 1748242895535112.png (2.5 MB, 1152x1312)

1girl bros?

Anonymous 01/17/26(Sat)01:57:35 No.107887648

File: 1751757223385443.webm (2.96 MB, 1874x666)

>>107887626
>>107887584

Anonymous 01/17/26(Sat)01:57:38 No.107887649

File: 1759443664365187.png (690 KB, 1104x928)

>>107887636
give the black man on the left a baseball cap, white t-shirt, and blue jeans. he is smoking a joint.

Anonymous 01/17/26(Sat)01:58:07 No.107887652

File: 1752614773074239.png (3.18 MB, 2896x4096)

why didn't you put this in the collage?

Anonymous 01/17/26(Sat)01:58:23 No.107887653

>>107887626
It won't beat the mp3 you linked here >>107887570 anon, I know, I tested it, it doesn't sound nearly as good.

Anonymous 01/17/26(Sat)01:58:30 No.107887654

File: 1747177242182925.png (2.61 MB, 1280x1184)

>>107887652
is she pulling a stallman?

Anonymous 01/17/26(Sat)01:58:58 No.107887656

File: 1759314070994625.png (1.16 MB, 1104x928)

change the location to a sunny beach.

Anonymous 01/17/26(Sat)01:59:05 No.107887657

>>107887648
The sound of silence.

Anonymous 01/17/26(Sat)01:59:48 No.107887659

File: 1765235285156769.png (2.53 MB, 1632x928)

Anonymous 01/17/26(Sat)02:00:10 No.107887660

>>107887648
Funny webm

Anonymous 01/17/26(Sat)02:00:51 No.107887663

File: 1747798070842528.png (2.51 MB, 1632x928)

I wonder if our resident frieren porn slopper is happy about new episode

Anonymous 01/17/26(Sat)02:01:23 No.107887668

>>107887663
who watches that again

Anonymous 01/17/26(Sat)02:01:28 No.107887669

>>107887641
>>107887653
OK, back to waiting for a local competition that can't be destroyed again

Anonymous 01/17/26(Sat)02:02:14 No.107887673

File: 1766913449121986.png (2.38 MB, 1632x928)

>>107887668
I do, I consume around 30~ shows per season.

Anonymous 01/17/26(Sat)02:03:06 No.107887674

>coma for 3 years
>sdxl still the best for 2d goon
goddam

Anonymous 01/17/26(Sat)02:03:30 No.107887677

File: 1739054407712040.png (2.5 MB, 1504x1024)

damn my gens are coming out gigaslopped today. sad.

Anonymous 01/17/26(Sat)02:03:50 No.107887678

>>107887677
use chroma

Anonymous 01/17/26(Sat)02:04:37 No.107887680

File: 1757942727340196.png (2.03 MB, 1056x1440)

>>107887678
I don't want to sit through chroma's gacha lottery + detailing steps. All my gens are basically 1-shot

Anonymous 01/17/26(Sat)02:11:40 No.107887705

File: a33fcc93f65af661cc2e806ef(...).png (713 KB, 2235x475)

We're like a week into this shit.

>reported cp and it took over 24hours to get taken down

Jeet staff was a mistake.

Anonymous 01/17/26(Sat)02:12:20 No.107887708

File: img_00086_.jpg (521 KB, 1520x1152)

Anonymous 01/17/26(Sat)02:13:04 No.107887713

File: David_Cronenberg_533558.jpg (16 KB, 264x348)

I've trained a ZiT celeb lora (on deTurbo) and while the likeness comes out well, the hands are often chroma tier flesh lumps. Is this a sign of overtraining? Or a dataset problem, should I just crop the images to leave out hands or something? Or does this simply happen because it's trained on dedistilled Turbo instead of base?

Anonymous 01/17/26(Sat)02:13:11 No.107887714

File: 1764704962625934.jpg (81 KB, 850x873)

>>107887535
I know Flux can't do NSFW
But can Flux2 Klein edit NSFW images, like changing the background or the costume they wear?

Image for easy (You)

Anonymous 01/17/26(Sat)02:13:28 No.107887717

>>107887674
>sdxl + ipadapters
>chroma
>wan2.1/2.2
>maybe qwen edit here and there

im set and don't need any new models, unless much faster versions release without quality loss (IM LOOKING AT YOU CACHEDIT)

Anonymous 01/17/26(Sat)02:14:04 No.107887724

What the hell is going on with base? It's not distilled for low steps, but it's still distilled from the larger flux 2 presumably, yet it uses CFG > 1, and yet you are supposed to leave negative prompt empty otherwise it deforms your image... WTF is this?
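
For reference, a minimal sketch of where the negative prompt enters when CFG is above 1, written on plain Python lists so it runs anywhere. This is the standard classifier-free guidance formula, not anything confirmed about how Klein base was trained or distilled, and `cfg_combine` is a made-up name.

```python
def cfg_combine(cond, uncond, scale):
    """Classifier-free guidance: push the prediction away from `uncond` toward `cond`.

    cond   -- model prediction for the positive prompt
    uncond -- model prediction for the negative (or empty) prompt
    scale  -- CFG scale; at 1.0 the result equals `cond`, so the negative prompt has no effect
    """
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]
```

At any scale above 1 whatever you type into the negative prompt replaces the empty-prompt branch as `uncond`, so a model that only ever saw an empty negative during tuning could plausibly get pushed somewhere it was never trained to go. That part is a guess.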

Anonymous 01/17/26(Sat)02:14:32 No.107887725

>>107887680
how original

Anonymous 01/17/26(Sat)02:14:46 No.107887728

>>107887717
which chroma though

Anonymous 01/17/26(Sat)02:17:48 No.107887739

Has the slow motion curse of lightx2v been broken yet?

Anonymous 01/17/26(Sat)02:18:13 No.107887742

>>107887714
I just tried a background change and it did it without changing the naked lady.

Anonymous 01/17/26(Sat)02:19:19 No.107887747

>>107887739
>slow motion curse
You mean wan? Slow motion was never an issue for light.

Anonymous 01/17/26(Sat)02:19:51 No.107887750

>>107887739
Isn't there a 3 sampler strategy where you do like 4 steps high model without lora and then high model with lora, and then low model with lora?
Never tried myself though.
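
If anyone wants to try it, the step bookkeeping would look something like this. Sketch only: the `split_steps` helper and any particular split (say 2/2/4 out of 8) are assumptions for illustration, not a tested recipe.

```python
def split_steps(total, high_no_lora, high_lora):
    """Split `total` sampling steps into three consecutive (label, start, end) ranges.

    Stage 1: high-noise model without the speed lora
    Stage 2: high-noise model with the lora
    Stage 3: low-noise model with the lora (gets whatever steps remain)
    """
    if high_no_lora + high_lora >= total:
        raise ValueError("no steps left for the low-noise stage")
    first_end = high_no_lora
    second_end = high_no_lora + high_lora
    return [
        ("high, no lora", 0, first_end),
        ("high, lora", first_end, second_end),
        ("low, lora", second_end, total),
    ]
```

Each range would go into the `start_at_step` / `end_at_step` inputs of its own KSampler (Advanced) node, with only the first stage adding noise. The stage without the lora would presumably want a normal CFG while the lora stages run at CFG 1, but that is an assumption too.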

Anonymous 01/17/26(Sat)02:19:54 No.107887751

>>107887674
We're at the dawn of a new age though, either Flux klein 4b will dethrone XL, or Z Image if it's ever released. Coomers will be eating good in 2026

Anonymous 01/17/26(Sat)02:20:24 No.107887753

>>107887747
lightx2v 4step distillation loras are the root cause of the slow motion it's known for

Anonymous 01/17/26(Sat)02:20:48 No.107887757

>>107887753
lies

Anonymous 01/17/26(Sat)02:21:44 No.107887762

>>107887742
Also at least the 4b one looks like it sucks for backgrounds. I am just getting slop.

Anonymous 01/17/26(Sat)02:22:19 No.107887766

>>107887753
Oh. I got my names mixed up. No. Probably.

Anonymous 01/17/26(Sat)02:22:35 No.107887769

>>107887713
It can be overtrained, or you aren't taking enough steps. You could try adding a simple i2i to your workflow, it can fix hands. I've noticed that backgrounds go messy very easily if you overtrain z lora. It's also possible that the sweetspot for your specific lora is way lower than 1. You might get strong resemblance with 0.7
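
Rough idea of what the strength value does, as far as I understand it: the lora's low-rank update gets scaled before it is added to the base weight. Plain-Python sketch with tiny matrices; `matmul` and `apply_lora` are hypothetical helpers, not any trainer's or UI's actual function, and real loaders also fold in an alpha/rank factor that is left out here.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, up, down, strength):
    """Return weight + strength * (up @ down), the usual lora merge formula."""
    delta = matmul(up, down)
    return [
        [w + strength * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]
```

So 0.7 just means 70% of the learned change gets applied. That would be why a slightly overcooked lora can still look fine at lower strength.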

Anonymous 01/17/26(Sat)02:23:39 No.107887773

>>107887728
whichever works best for you. i like exaggerated realism so uncanny photorealism, spark preview and chroma1 base, these have a little less body horror (but it IS still there).

>>107887739
try...

>PainterI2VAdvanced
https://github.com/princepainter/ComfyUI-PainterI2Vadvanced
>Wan Motion Scale
https://github.com/shootthesound/comfyUI-LongLook

Anonymous 01/17/26(Sat)02:31:05 No.107887806

>>107887773
can you help me

Anonymous 01/17/26(Sat)02:33:02 No.107887813

>>107887626
>>107887584
ACEStep 1.5, already discussed in the previous thread, is on its way there. This gen is from the most recent iteration with its improvements:
https://files.catbox.moe/jc3fgz.mp3

Now, there's not many kpop gens, but here's one I could find from back in Dec in discord
https://files.catbox.moe/enbzvl.mp3

In terms of potential catchiness ACEStep is already Udio tier, after that it's a matter of good prompts to bring it to be as good as the best Udio gens. Takes more effort or could even take a tune on certain genres, sure, but since it's open source it will always be preferable to a locked down model that you'd have to pay to get more gens.

As for HeartMuLa, I don't think that has the musicality (instrument variety) of ACEStep.

Anonymous 01/17/26(Sat)02:35:43 No.107887819

>>107887813
ETA on 1.5?

Anonymous 01/17/26(Sat)02:36:26 No.107887824

>>107887819
should release around the time z base releases

Anonymous 01/17/26(Sat)02:36:56 No.107887830

the people who spam threads on Reddit and cherry-pick the worst Klein gens against the best Z are chinks?
With really bad prompts, they force Klein to produce crap (photorealistic etc), while Z can do nothing but be realistic
Sure, z is better in terms of realism, but small is nowhere near as bad as it's made out to be there.

Anonymous 01/17/26(Sat)02:38:01 No.107887833

>>107887830
Can you be more respectful?

Anonymous 01/17/26(Sat)02:38:03 No.107887834

What are the crème de la crème NSFW wan loras?

Anonymous 01/17/26(Sat)02:39:25 No.107887842

>add the naked body in image 1 to the body in image 2
okay now we're cooking

Anonymous 01/17/26(Sat)02:39:38 No.107887843

>>107887834
i dont know

Anonymous 01/17/26(Sat)02:40:19 No.107887846

>>107887717
>ipadapters
qrd

Anonymous 01/17/26(Sat)02:42:15 No.107887853

>>107887833
use e-hentai and you'll know what I mean

Anonymous 01/17/26(Sat)02:43:19 No.107887860

>>107887853
???

Anonymous 01/17/26(Sat)02:43:53 No.107887863

File: 55645646546.png (36 KB, 1091x227)

>>107887819
Unexpectedly found an actual release date

Anonymous 01/17/26(Sat)02:45:02 No.107887868

>>107887813
>https://files.catbox.moe/jc3fgz.mp3
Sounds almost ok

>https://files.catbox.moe/enbzvl.mp3
Sounds meh for voice, it has that "metallic" low quality and it's clearly AI, they probably didn't train on non-English songs that much, I think udio sounds richer instrumentally and also way less "robotic":
https://files.catbox.moe/90f0l7.mp3
https://files.catbox.moe/h2qrop.mp3
https://files.catbox.moe/dm8ang.mp3

Anonymous 01/17/26(Sat)02:45:33 No.107887870

>>107887830
the ablublu model?

Anonymous 01/17/26(Sat)02:45:34 No.107887871

>>107887846
https://github.com/cubiq/ComfyUI_IPAdapter_plus

Anonymous 01/17/26(Sat)02:46:22 No.107887875

>>107887863
in 2 more weeks fellas

Anonymous 01/17/26(Sat)02:46:52 No.107887876

>>107887871
but what it do

Anonymous 01/17/26(Sat)02:47:09 No.107887877

>>107887868
fuck that's catchy, any reason they are only 32s?

Anonymous 01/17/26(Sat)02:48:55 No.107887883

>>107887863
>Literally 2 weeks.

You can't make this shit up.

Anonymous 01/17/26(Sat)02:49:21 No.107887886

>>107887877
Udio v1 limitation

Anonymous 01/17/26(Sat)02:49:37 No.107887887

>>107887876
sd1.5/sdxl, transfer styles, combine images, read it

Anonymous 01/17/26(Sat)02:52:42 No.107887905

>>107887626
>mememarks
mememarks also show that GLM Image destroys Z-image turbo, do you also believe that to be the case? keek

Anonymous 01/17/26(Sat)02:56:21 No.107887920

File: FluxKlein9BDistilled_Outp(...).png (1.84 MB, 832x1248)

prompt literally just THERE'S TOO MANY NIGGERS IN HERE

Anonymous 01/17/26(Sat)02:56:34 No.107887923

>>107887868
I actually wonder the size of their model, we don't have enough music models to really compare.

Lumi (¬ᴗ ´¬ ) 01/17/26(Sat)02:57:10 No.107887926

File: corpo-zit-2026-01-17_00037_.png (2.58 MB, 2304x1296)

Anonymous 01/17/26(Sat)02:59:34 No.107887934

>>107887868
I mean, I've heard good Udio songs, so I know what it's capable of but I don't think you're being objective when you say that ACEStep example is clearly AI but then you link Udio songs that sound low quality. I've got insane Udio songs saved to my drive but I disagree with your assessment here. It should also be noted that the ACEStep examples sound very rich in quality, maybe you can tell the difference with quality speakers or headphones. Not quite Udio tier yet in terms of composition, but certainly already better sound quality (though that's probably because they disabled quality downloads).

Here's a decent kpop Udio gen: https://files.catbox.moe/iw5ju4.mp3

Do I think a good ACEStep can do it? Maybe about 90% of it, but not quite there yet with composition. Technically, one thing where Udio really shines is lyrics and adherence to them, E.G.

https://files.catbox.moe/pyxtpi.mp3

ACEStep is still not fully coherent with lyrics and that's a concern recognized by the dev plus something they're still working on, but if you've tried Udio long enough you'd know that it also messes up some songs near the end and you'd essentially have to inpaint (which is coming to ACEStep).

Anonymous 01/17/26(Sat)03:00:56 No.107887938

File: img_00147_.jpg (827 KB, 1520x1152)

Anonymous 01/17/26(Sat)03:04:16 No.107887949

>>107887934
Here's another catchy Udio kpop gen, nice but messes up lyrics somewhere in the center so it's not infallible

https://files.catbox.moe/svtbkq.mp3

Anonymous 01/17/26(Sat)03:07:05 No.107887957

>>107887751
They will do their best to not generate Penis/Vagina/Anus/Nipples bro

Anonymous 01/17/26(Sat)03:09:40 No.107887970

>>107887934
To be frank, the Udio niceness is probably just an RLHF tune away with ACEStep. Once we get those sweet weights, if it's missing anything it's very likely, given the high audio quality, that we'll be able to close the gap with a simple tune on high quality data. ACEStep 1.0 was a meme, we will now have an SD moment for audio, hopefully. There's also that rumored Alibaba model coming, so if they want to give them competition I'm all for it.

Anonymous 01/17/26(Sat)03:10:26 No.107887973

>>107887957
depends, look how much less cucked Klein is compared to Kontext for example, now that they know they have China who doesn't really give a fuck about this mentally ill safety shit, look how quickly they dropped their paradigm

Anonymous 01/17/26(Sat)03:16:10 No.107887992

>>107887552
>So they didn't give a shit about Qwen Image and Qwen Image Edit, but Z-image was enough to spook them?
When QiT beat Kontext, it was a case of a 20b model beating a 12b model, so it was seen as normal. But having a 6b model destroy the ass of a 32b model is a really humiliating experience; that really woke them up, and there you go, you got a nice product at the end. Competition baby!

Anonymous 01/17/26(Sat)03:16:14 No.107887993

File: 1749184370112688.png (2.13 MB, 1056x1408)

>>107887725
thangks :D

Anonymous 01/17/26(Sat)03:16:41 No.107887995

File: sdgsdfwfwg.mp4 (3.79 MB, 720x1280)

I forgot about this squish lora, lol.

Anonymous 01/17/26(Sat)03:19:58 No.107888010

>>107887992
Klein isn't better than Dev though, it's just smaller

Anonymous 01/17/26(Sat)03:21:50 No.107888017

>>107888010
so you think it's equal? that's also impressive you know? a 9b model as good as a 32b model at editing shit

Anonymous 01/17/26(Sat)03:25:35 No.107888023

>>107887663
>>107887673
it even got the heavy makeup look

Anonymous 01/17/26(Sat)03:29:38 No.107888042

>>107888017
no it's worse at editing than Dev too, by a lot. Dev can take up to 14 inputs also. Flex, Pro, and Max are all even better than that but they're API only obviously.

Anonymous 01/17/26(Sat)03:30:05 No.107888044

i desperately need to train klein loras

Anonymous 01/17/26(Sat)03:31:37 No.107888051

>>107887713
I got better results from training with V2 Ostris adapter on actual Turbo, than I did with DeTurbo.

Lumi (¬ᴗ ´¬ ) 01/17/26(Sat)03:35:21 No.107888068

File: still_cant_believe_it.png (2.15 MB, 1920x1080)

>tfw mogged by suno
still can't believe it
https://suno.com/s/cR36Z8K0aBXpaATE
https://youtu.be/MAwRKDLqv9c

Anonymous 01/17/26(Sat)03:37:10 No.107888076

>>107887769
Thanks

Anonymous 01/17/26(Sat)03:43:17 No.107888098

>>107887957
The model doesn't seem to be actively poisoned and is on the level of original SDXL when it comes to nudity. Unless BFL found new tricks to poison their models, NSFW capability should come soon enough; it seems to be easy to train too

Anonymous 01/17/26(Sat)03:46:13 No.107888112

where's the denoise node for klein? it's working fine but the effect is too strong, 0.5 would be perfect

Anonymous 01/17/26(Sat)03:50:33 No.107888129

>>107887713
>>107888051
yeah i also use the adapter. around 30 images, 1800-2000 steps, rank 16, and manually crop all the training data to make sure i capture what i want to replicate. i also make sure to save the lora at a multiple of the image count. So if i have 30 images, i save every 90 steps. My worry is that when saving at, say, 100 steps, the last epoch will only have trained on 10 images instead of the full 30.
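
The save-interval arithmetic, as a quick sketch. It assumes one batch per training step with no gradient accumulation or dataset repeats, and the function names are made up for illustration, not taken from any trainer.

```python
def steps_per_epoch(num_images, batch_size=1):
    """Steps needed to see every image once (ceiling division)."""
    return -(-num_images // batch_size)


def save_interval(num_images, target_interval, batch_size=1):
    """Largest whole-epoch multiple that does not exceed target_interval (at least one epoch)."""
    per_epoch = steps_per_epoch(num_images, batch_size)
    whole_epochs = max(1, target_interval // per_epoch)
    return whole_epochs * per_epoch


def images_into_epoch(step, num_images, batch_size=1):
    """How many images of the current epoch have been seen at `step` (0 means an epoch boundary)."""
    per_epoch = steps_per_epoch(num_images, batch_size)
    return (step % per_epoch) * batch_size
```

With 30 images and batch size 1, a target of 100 rounds down to 90, and a save at step 100 would land 10 images into the next epoch.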

i caption with this system prompt:
>Write a long description of this image. refer to the person as 'female'. do not describe any features she cannot change like her physique, face, skin-color, breast size, etc.
>Start with describing the quality of the photograph, her facial expression and her hair. then describe what she's wearing and her pose. then describe the background. and lastly describe the lighting.

my results are great.
