/g/ - /asg/ - AceStep General - Technology



Anonymous
/asg/ - AceStep General 02/16/26(Mon)15:01:13 No.108164777

File: acestep2.png (2.3 MB, 1344x768)


>What is this?

A local open weights music generator, like Suno and Udio.

>Original repo (includes lora training)
https://github.com/ace-step/ACE-Step-1.5
>ComfyUI guide (but use the SFT model instead of Turbo, with CFG=1 and 50 steps)
https://docs.comfy.org/tutorials/audio/ace-step/ace-step-v1-5
>Suno-like UIs
https://github.com/fspecii/ace-step-ui
https://github.com/roblaughter/ace-step-studio
>Cover and Edit modes
https://github.com/ryanontheinside/ComfyUI_RyanOnTheInside/tree/main/examples/ace1.5
>Cover and reference song tutorial
https://www.youtube.com/watch?v=sv4pNrjRh7s

Share your gens and lora results.

Keywords: music gen, local model, song gen, suno, udio, acestep, ace step, lmg, ldg, dmp

Anonymous
02/16/26(Mon)15:27:56 No.108165006


Britney Spears lora:

https://voca.ro/16blE7la2Ff8
https://voca.ro/1eLHswbE9ZHK
https://voca.ro/1me4VBIkzHfK

To the one anon claiming "lora training on SFT doesn't work": this is a LoRA trained on SFT =)

Anonymous
02/16/26(Mon)16:12:33 No.108165295


based thread! i don't use ACE Step but best wishes!

Anonymous
02/16/26(Mon)16:15:01 No.108165310


>>108164777
Nakadashee AceStep-chan

Anonymous
02/16/26(Mon)16:52:52 No.108165607


File: default.jpg (23 KB, 586x275)


>>108165006
it is... ok, i can hear her singing style from time to time.
my internet has been cutting out for half a day or more lately since i live in a third world country (australia), so yesterday i had time to sit and test the default settings as per the developer instructions. results via gradio were meh.
i must test the comfy nodes, since i got better results with them than via the gradio interface.
one note: manual captions, with redundant stuff removed (like the excess attributes the llm gives, in cases where it does detect the correct instruments), help quite a bit.

enya anon did it really well via overfit, and if he sees this post:
what was your overfit setting?
high lr, low rank, small-medium dataset?

Anonymous
02/16/26(Mon)17:19:32 No.108165764


>>108165607
>enya anon did it really well via overfit, and if he sees this post:
>what was your overfit setting?
>high lr, low rank, small-medium dataset?

I used the default 0.0003 LR for 800~1000 epochs (I can't remember where I stopped). My dataset consisted of 24 songs.

Anonymous
02/16/26(Mon)17:21:37 No.108165776


>>108165764
As for the rank, I used rank 128 I think; it's the maximum my card supports without going OOM.
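
For scale, the settings reported in this exchange can be turned into a rough step count. This is only a sketch: batch size 1 with no gradient accumulation is an assumption (the thread does not state either), and `LoraRun` is a hypothetical container, not part of the ACE-Step trainer.

```python
import math
from dataclasses import dataclass


@dataclass
class LoraRun:
    """Hypothetical container for the settings quoted in the thread."""
    songs: int = 24               # dataset size reported above
    learning_rate: float = 3e-4   # the "default 0.0003 LR"
    rank: int = 128               # largest rank that fit without OOM
    batch_size: int = 1           # ASSUMPTION: not stated in the thread

    def steps(self, epochs: int) -> int:
        """Optimizer steps for a run, assuming no gradient accumulation."""
        return epochs * math.ceil(self.songs / self.batch_size)


run = LoraRun()
low, high = run.steps(800), run.steps(1000)
print(low, high)  # 19200 24000
```

Under those assumptions the run lands between 19,200 and 24,000 optimizer steps; a larger batch size shrinks that proportionally.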

Anonymous
02/16/26(Mon)18:42:02 No.108166294


>>108165776
>>108165764
ty, will try it.

i used 12 songs, same genre and different bands, results are meh and sometimes ok.

comfy gens are better, if i crank lora strength to 2 there is that fm radio but super high can+static noise, yet it does replicate the training set songs at around 80% of the content.

Anonymous
02/16/26(Mon)18:47:34 No.108166323


>>108166294
>fm radio
AM -.- radio

Anonymous
02/16/26(Mon)18:52:56 No.108166359


>>108166294
>comfy gens are better, if i crank lora strength to 2 there is that fm radio but super high can+static noise
I said this in the last thread and I am going to say it again: you are probably undertraining your models. Use a high enough LR, train longer, and use a high rank.
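
For anyone unsure what rank and strength actually change, here is a generic LoRA sketch in plain Python. It is not ACE-Step or ComfyUI code. The layer width of 1024 is a made-up value, and the usual alpha/rank scale factor is left out for brevity. Rank sets how many extra weights get trained (and how much memory they need); strength multiplies the learned update at inference, as LoRA loaders commonly do.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in linear layer."""
    return rank * (d_in + d_out)


def apply_lora(weight, down, up, strength=1.0):
    """Return weight + strength * (up @ down) using plain nested lists.

    weight: d_out x d_in, up: d_out x rank, down: rank x d_in.
    """
    rank = len(down)
    return [
        [
            w + strength * sum(up[i][k] * down[k][j] for k in range(rank))
            for j, w in enumerate(row)
        ]
        for i, row in enumerate(weight)
    ]


# Adapter size grows linearly with rank (1024 is a hypothetical layer width).
small = lora_param_count(1024, 1024, 16)
large = lora_param_count(1024, 1024, 128)

# Tiny rank-1 example: strength 2 doubles the learned update, not the base weight.
base = [[1.0, 0.0], [0.0, 1.0]]
up = [[1.0], [2.0]]
down = [[0.5, 0.0]]
at_one = apply_lora(base, down, up, strength=1.0)
at_two = apply_lora(base, down, up, strength=2.0)
```

An undertrained adapter has a small update, so pushing strength to 2 mostly amplifies whatever little it learned, artifacts included; training longer at an adequate rank is the cleaner fix.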

Anonymous
02/16/26(Mon)18:58:38 No.108166395


>>108166359
Also DO NOT USE THE LLM.
The LLM tends to weaken the LoRA effect; sometimes it even changes the voice/singing style.

Anonymous
02/16/26(Mon)19:05:42 No.108166444


does anyone have access to the suite Sony has produced to compare?

Anonymous
02/16/26(Mon)19:15:27 No.108166524


>>108166444
>suite Sony has produced
What even is that?

Anonymous
02/16/26(Mon)19:19:19 No.108166559


>>108165776
In my experience 128 has been good. I think people need to cook their LoRAs a lot longer though. At 1500 epochs 2000 or more is probably better.

>>108166395
Yeah, do not use the LLM if you are using a LoRA. It straight up makes its own song and the LoRA is just antagonistic to it.
