/g/ - /asg/ - AceStep General - Technology


/asg/ - AceStep General Anonymous 02/08/26(Sun)15:24:01 No.108095075

File: ACE.png (95 KB, 1920x1080)

>What is this?

A local, open-weights music generator, similar to Suno and Udio but run on your own hardware.

>Original repo (includes LoRA training)
https://github.com/ace-step/ACE-Step-1.5
>ComfyUI guide
https://docs.comfy.org/tutorials/audio/ace-step/ace-step-v1-5
>Suno-like UI
https://github.com/fspecii/ace-step-ui

Share your gens.

Keywords: music gen, local model, song gen, suno, udio, acestep, ace step

Anonymous 02/08/26(Sun)15:28:49 No.108095115

Is audio gen lighter than image/text gen or will it still rekt gpulet users?

Anonymous 02/08/26(Sun)15:31:02 No.108095136

>>108095115
There's a huge range of models for image gen. It's a bit heavier than SDXL models but lighter than something like Flux. I'm running the ComfyUI workflow with 4GB VRAM, and generation takes about 8x the song's duration.

Anonymous 02/08/26(Sun)15:31:37 No.108095139

how far are we from giving an ai a full album with corresponding lyrics and letting it generate more songs in the same style?

Anonymous 02/08/26(Sun)15:32:35 No.108095151

>>108095075
>https://github.com/fspecii/ace-step-ui
looks unironically like vibecoded trash.
do we know if comfy is planning to implement more nodes?

Anonymous 02/08/26(Sun)15:34:21 No.108095168

>>108095139
It works if you train a LoRA. People have already successfully trained LoRAs on Michael Jackson, Linkin Park, etc.

Anonymous 02/08/26(Sun)15:35:08 No.108095174

>>108095139
You can do that now with the LoRA training feature. Supposedly the audio also sounds a lot better when you use a LoRA.

Anonymous 02/08/26(Sun)15:45:58 No.108095252

>>108095115
You need a 24GB GPU to comfortably run the biggest LLM text encoder together with the main DiT model. If you disable the LLM, it works even CPU-only for VRAMlets, but the output quality will be shittier.

Anonymous 02/08/26(Sun)15:49:57 No.108095285

>>108095174
>>108095168
do you give them snippets or full length mp3s?

Anonymous 02/08/26(Sun)15:53:25 No.108095306

>>108095285
You can train LoRAs on full-length tracks as long as your GPU doesn't OOM in the process.

Anonymous 02/08/26(Sun)16:04:30 No.108095379

>>108095306
I'll bite, how many minutes of audio for a decent lora?

Anonymous 02/08/26(Sun)16:09:10 No.108095410

>>108095379
Any full album (~11 songs) works

Anonymous 02/08/26(Sun)16:19:14 No.108095498

File: may_i_see_it.jpg (28 KB, 500x378)

>>108095168

Anonymous 02/08/26(Sun)16:24:28 No.108095536

File: acelokr.png (120 KB, 598x527)

https://xcancel.com/bdsqlsz/status/2020432198210613708

Based if true

Anonymous 02/08/26(Sun)16:27:22 No.108095560

>>108095075
Can I use lora with ComfyUI yet?
why is it broken?

Anonymous 02/08/26(Sun)16:31:30 No.108095600

>>108095560
You have to convert the LoRA first. I had Claude write a Python script that converts it to a format Comfy accepts, and it worked perfectly.

You have to convert the keys like:

new_key = k.replace("base_model.model.base_model.model.", "diffusion_model.decoder.")
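A minimal sketch of what such a conversion script might look like, built around the key rename above. It assumes the LoRA is stored as a .safetensors file and that the prefix swap is the only change needed, neither of which is confirmed here. The function names and file paths are made up for illustration.

```python
# Prefixes taken from the rename quoted above.
OLD_PREFIX = "base_model.model.base_model.model."
NEW_PREFIX = "diffusion_model.decoder."


def convert_keys(state_dict):
    """Return a new dict with the old prefix swapped for the Comfy-style one.

    Keys that do not contain the old prefix are copied through unchanged.
    """
    converted = {}
    for k, v in state_dict.items():
        new_key = k.replace(OLD_PREFIX, NEW_PREFIX)
        converted[new_key] = v
    return converted


def convert_file(src_path, dst_path):
    """Load a LoRA, rename its keys, and save it (assumes .safetensors)."""
    # Imported lazily so convert_keys works without safetensors installed.
    from safetensors.torch import load_file, save_file

    tensors = load_file(src_path)
    save_file(convert_keys(tensors), dst_path)
```

Usage would be something like `convert_file("my_lora.safetensors", "my_lora_comfy.safetensors")` with placeholder paths. If Comfy still rejects the file, print the converted keys and compare them against a LoRA that loads correctly.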

Anonymous 02/08/26(Sun)16:51:08 No.108095768

https://voca.ro/1o8PRqN0Gbae

Anonymous 02/08/26(Sun)17:17:02 No.108095955

File: AE86-fifth-stage.png (1.15 MB, 1600x900)

https://voca.ro/15rN76Zadfqu

Anonymous 02/08/26(Sun)17:36:53 No.108096107

>>108095536
Is LoKr another LoRA replacement like LoCon, DoRA, etc.?

Anonymous 02/08/26(Sun)17:39:13 No.108096122

>>108095768
sounds like low quality mp3 but overall good

Anonymous 02/08/26(Sun)17:50:36 No.108096205

>>108096107
Yes
