/g/ - /asg/ - AceStep General - Technology

File: mqdefault[1].jpg (7 KB, 320x180)

/asg/ - AceStep General Anonymous 02/11/26(Wed)04:21:11 No.108117091

I wasn't done talking about ace step edition.

>What is this?

A local open weights music generator, like Suno and Udio.

>Original repo (includes lora training)
https://github.com/ace-step/ACE-Step-1.5
>Comfyui guide
https://docs.comfy.org/tutorials/audio/ace-step/ace-step-v1-5
>Suno-like UI
https://github.com/fspecii/ace-step-ui

Share your gens.

Keywords: music gen, local model, song gen, suno, udio, acestep, ace step

Anonymous 02/11/26(Wed)04:35:06 No.108117140

Has anyone, and I mean anyone, made a good LoKr yet? It feels inferior to LoRA in every way.

Anonymous 02/11/26(Wed)04:43:27 No.108117176

https://youtu.be/IjCOM825wk0

https://youtu.be/R6ksf5GSsrk

https://youtu.be/QzddQoCKKss

Anonymous 02/11/26(Wed)04:47:10 No.108117192

>>108117091
give this to comfy users
>https://github.com/filliptm/ComfyUI-FL-AceStep-Training
it works
heavy on vram, read issues
bit finicky.

test that works:
up to 1200 epochs and no more.
save every 250.
rank/alpha 16/16
learning rate: 0.0005
batch size 1
gradient accumulation 1
leave everything else as it is in configuration node.
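
For reference, the settings listed above can be written down as a small config sketch. The field names here are made up for illustration and are not the actual parameter names of the ComfyUI-FL-AceStep-Training nodes; only the values come from the post.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LoraTestSettings:
    # Values are taken from the post; the field names are hypothetical.
    max_epochs: int = 1200
    save_every_epochs: int = 250
    rank: int = 16
    alpha: int = 16
    learning_rate: float = 5e-4
    batch_size: int = 1
    gradient_accumulation: int = 1

    def checkpoint_epochs(self) -> List[int]:
        """Epochs at which a periodic checkpoint would be written."""
        return list(range(self.save_every_epochs,
                          self.max_epochs + 1,
                          self.save_every_epochs))

    def lora_scale(self) -> float:
        """Conventional LoRA scaling factor, alpha divided by rank."""
        return self.alpha / self.rank


settings = LoraTestSettings()
print(settings.checkpoint_epochs())
print(settings.lora_scale())
```

With a save interval of 250 and a cap of 1200 epochs, the last periodic checkpoint lands at epoch 1000, so whether the final epoch-1200 state is also saved depends on the trainer.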

disconnected llm loader node before training.

use 4 - 10 songs as dataset to test.
adjust max_duration in those nodes - songs get cut off at around 4 minutes.

use 1.7B as llm (it will take few minutes), 4b will take hours..
check manifest.json, almost same results will be given by 4b (retarded?)
llm processing depends on amount of items in dataset.
to test what llm preprocessing does use only 2 songs.
add others to dataset later.
you can modify llm captions by hand.

when loading lora use standard comfy lora loader.
other loaders have minimal or no effect (strange?).

up the strength of lora to 2.0 - 2.5 (it is a must).

add reference audio if you wish via link to ksampler.
denoise 0.85 - 0.95 with euler/simple.
use sft model for cooler results.

will post generation sample later since i am trying to get perfect one right now.
lora is not trained enough, twas a test.
comfy does not pick up lora properly as well - hopefully that will be addressed soon enough.
>>108117140
what do you fags use, that ryan node?
should link tools next time.

Anonymous 02/11/26(Wed)05:06:55 No.108117274

>>108117192
>what do you fags use, that ryan node?
random anime man UI.

https://github.com/sdbds/ACE-Step-1.5-for-windows/tree/qinglong

I actually just took his lokr training script and put it in the official UI. I cannot stand that fake suno AI with the backend hidden.

Anonymous 02/11/26(Wed)05:07:08 No.108117276

>>108117140
i think that guy was just a troll, he was just fucking with people and succeeded very well apparently.

Anonymous 02/11/26(Wed)05:12:09 No.108117297

>>108117276
Do any of them use the quantized models for reduced vram usage? Quantized for LLM and quantized for PT

Anonymous 02/11/26(Wed)05:17:02 No.108117316

>>108117276
>i think that guy was just a troll
He's basically Chinese furk.

Anonymous 02/11/26(Wed)05:37:16 No.108117411

>>108117274
ah. ty. i saw it but i did not open it at all since title itself seems as it was built for special case aka windose.
ty.
>took is lokr training script and put it
good stuff

Anonymous 02/11/26(Wed)05:41:51 No.108117444

>>108117411
>good stuff
Debatable. I've yet to get a good lokr output. Be aware it doesn't work in comfyui without editing lora.py as well.

    if isinstance(model, comfy.model_base.ACEStep15):
        for k in sdk:
            if k.startswith("diffusion_model.decoder.") and k.endswith(".weight"):
                key_lora = k[len("diffusion_model.decoder."):-len(".weight")]
                key_map["base_model.model.{}".format(key_lora)] = k  # Official base model loras
                key_map["lycoris_{}".format(key_lora.replace(".", "_"))] = k
                key_map["lora_unet_{}".format(key_lora.replace(".", "_"))] = k

    return key_map
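
To see what that snippet does, here is a standalone sketch of the same mapping run on a fake key list. It only mirrors the string handling; `build_key_map` and the sample keys are made up for illustration, and the real code runs inside ComfyUI against the actual model state dict.

```python
PREFIX = "diffusion_model.decoder."
SUFFIX = ".weight"


def build_key_map(state_dict_keys):
    """Map three LoRA key naming schemes onto the model's weight keys."""
    key_map = {}
    for k in state_dict_keys:
        if k.startswith(PREFIX) and k.endswith(SUFFIX):
            # Strip the model prefix and the ".weight" suffix.
            key_lora = k[len(PREFIX):-len(SUFFIX)]
            flat = key_lora.replace(".", "_")
            key_map["base_model.model.{}".format(key_lora)] = k
            key_map["lycoris_{}".format(flat)] = k
            key_map["lora_unet_{}".format(flat)] = k
    return key_map


# Hypothetical keys, only for demonstration.
sample = [
    "diffusion_model.decoder.layers.0.attn.to_q.weight",
    "diffusion_model.encoder.x.weight",  # ignored: not under the decoder prefix
]
for lora_key, model_key in build_key_map(sample).items():
    print(lora_key, "->", model_key)
```

Each decoder weight ends up reachable under three names, so a LoRA file using any of those naming schemes can be matched to the same tensor.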

Anonymous 02/11/26(Wed)05:56:00 No.108117519

The gradio is broken as all hell on Windows. Tried preprocessing some flacs, and it's complaining about their names or just can't process them. Never had this issue when the dataset was smaller, so I haven't been able to train anything in the last 5 hours. On the Discord everyone seems to be having issues with training as well.

Anonymous 02/11/26(Wed)05:58:56 No.108117529

>>108117519
Easiest solution is to just use an older commit. Or, ask gemini or something. That seems to be what the repo owners do lol.

Anonymous 02/11/26(Wed)06:00:32 No.108117536

>>108117276
Anime man claimed it was good, maybe his implementation is broken or something but it is possible to get somewhat mediocre results. It should theoretically be better than LoRAs though, so it's probably a miscalculation somewhere. There's a LoKR PR open on the official repo as well.

Anonymous 02/11/26(Wed)06:08:55 No.108117581

>>108117536
They "work" but the default learning rate is 0.001, which is insane. I think there are other issues as well, but having to pick through the parameters is time consuming.

Anonymous 02/11/26(Wed)06:11:18 No.108117600

>>108117581
Wait, just an addendum to that. I tried 128/128 with an LR of 0.0001. I also set factor to 8 instead of -1.

It's actually producing coherent results now. Might be something there.
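
Some background on why LoKr settings behave differently from LoRA settings: LoKr builds the weight update as a Kronecker product of two smaller matrices instead of a plain low-rank product. The sketch below is a pure-Python illustration of that idea with toy sizes. It is not the trainer's code; the 1024 layer width is made up, and treating `factor` as the side length of the smaller block (with the larger block kept as a full matrix) is an assumption for this toy version.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    rows_b, cols_b = len(b), len(b[0])
    out = []
    for row_a in a:
        for i in range(rows_b):
            out.append([x * b[i][j] for x in row_a for j in range(cols_b)])
    return out


def lokr_param_count(dim, factor):
    """Toy count: (factor x factor) block kron (dim/factor x dim/factor) block."""
    assert dim % factor == 0, "factor must divide dim in this toy version"
    return factor * factor + (dim // factor) ** 2


def lora_param_count(dim, rank):
    """Params for a dim x dim update written as (dim x rank) @ (rank x dim)."""
    return 2 * dim * rank


small = [[1, 2], [3, 4]]
large = [[0, 1], [1, 0]]
delta = kron(small, large)  # a 4x4 update built from two 2x2 blocks
for row in delta:
    print(row)

# Hypothetical 1024-wide layer: LoKr with factor 8 vs LoRA with rank 128.
print(lokr_param_count(1024, 8), lora_param_count(1024, 128))
```

The point is only that the two methods spend their parameters very differently, so a learning rate or rank that suits one will not necessarily carry over to the other.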

Anonymous 02/11/26(Wed)06:35:25 No.108117701

>>108117600
Wait never mind. It's still shit.

Anonymous 02/11/26(Wed)06:48:13 No.108117751

Oh god I feel like a retard. Using the LM to generate audio codes basically nullifies the LoRA.

Anonymous 02/11/26(Wed)06:59:52 No.108117799

But yeah lokr is pure garbo unless I'm proven otherwise. LoRA works just fine.

Anonymous 02/11/26(Wed)07:07:32 No.108117838

>>108117444
check and ty

btw those nodes i found for lora will attempt to download models into models/acestep which can be filled with shortcuts if one has models

i got x2 good gens today with lora i trained but i am getting nothing but slop during last hour or so.
even slight change to lyrics has an impact on this model.

and lora is not trained enough.
lower rank/alpha even more and up the lr = quicker overfit i presume, i must try that.

Anonymous 02/11/26(Wed)07:14:57 No.108117863

>>108117838
>>>/wsg/6090567
got semi-decent one just now
bass and drums rhythm is ok
voice still close to ai slop
guitars = horrible synth

that is with lora at 950 epochs via those nodes (around 1500 steps = 250 epochs per those nodes)
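
The step and epoch figures can be cross-checked with simple arithmetic, assuming the usual definition where one optimizer step consumes batch_size x gradient_accumulation samples. With batch size 1 and gradient accumulation 1, 1500 steps over 250 epochs works out to 6 steps per epoch, i.e. a 6-song dataset, which fits the 4 - 10 songs suggested earlier in the thread. The function names below are only for illustration.

```python
import math


def steps_per_epoch(num_samples, batch_size=1, grad_accum=1):
    """Optimizer steps needed to see every sample once."""
    return math.ceil(num_samples / (batch_size * grad_accum))


def total_steps(epochs, num_samples, batch_size=1, grad_accum=1):
    """Total optimizer steps for a given number of epochs."""
    return epochs * steps_per_epoch(num_samples, batch_size, grad_accum)


def implied_dataset_size(steps, epochs, batch_size=1, grad_accum=1):
    """Dataset size implied by a steps/epochs pair (assumes no partial batches)."""
    return steps / epochs * batch_size * grad_accum


print(implied_dataset_size(1500, 250))  # dataset size implied by the post's numbers
print(total_steps(950, 6))              # steps at 950 epochs for that dataset size
```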

Anonymous 02/11/26(Wed)07:24:11 No.108117906

My findings so far
>big fat high rank LoRA = generally better
>LR should be very conservative, ~0.00003-0.00006
>GA or batch seems optional. Large GA followed by supplemental training with no GA might be cool.
>Letting LM generate codes while using LoRA will basically override the LoRA
>On comfy you may need to crank the strength to 1.5 or 2.
>lokr seems to be a meme.
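
On the strength point: in the conventional LoRA formulation the loader adds strength x (alpha / rank) x (up @ down) to the base weight, so cranking strength to 2 simply doubles the learned offset. A minimal pure-Python sketch of that formula with toy 2x2 numbers; this is the textbook merge, not a copy of ComfyUI's loader code.

```python
def matmul(a, b):
    """Plain matrix product for lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def apply_lora(weight, up, down, alpha, rank, strength=1.0):
    """Return weight + strength * (alpha / rank) * (up @ down)."""
    scale = strength * alpha / rank
    delta = matmul(up, down)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


weight = [[1.0, 0.0], [0.0, 1.0]]  # toy base weight
up = [[1.0], [2.0]]                # 2x1, rank 1
down = [[0.5, 0.0]]                # 1x2

print(apply_lora(weight, up, down, alpha=1, rank=1, strength=1.0))
print(apply_lora(weight, up, down, alpha=1, rank=1, strength=2.0))
```

Under this formula an undertrained LoRA has a small offset that needs a high strength to be audible, while an overfit one already dominates at 1.0.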

Anonymous 02/11/26(Wed)07:29:38 No.108117938

>>108117906
>>On comfy you may need to crank the strength to 1.5 or 2.
This is false, it really depends how overfit the lora is.

>>108117906
>>Letting LM generate codes while using LoRA will basically override the LoRA
If you have that off, the melody will be very wild/random, and the main model is only 600M parameters, so...

Anonymous 02/11/26(Wed)07:32:57 No.108117955

>>108117938
Do one with the LM codes with the LoRA and without the LoRA. They are almost the same.

Anonymous 02/11/26(Wed)08:00:36 No.108118108

Anyone have settings that work well for slow, solo piano instrumentals?

Anonymous 02/11/26(Wed)08:19:43 No.108118234

I wish there was a way to force a key change in the middle of songs. Udio excels at this, and it was a common feature back when music wasn't shit (before the mid-2000s). The bias towards modern music unironically ruins the models.

Anonymous 02/11/26(Wed)08:30:43 No.108118310

>>108118234
[chorus - key change] doesn't work?

Anonymous 02/11/26(Wed)09:35:06 No.108118764

>>108117091
Add this to the next OP:

https://github.com/ryanontheinside/ComfyUI_RyanOnTheInside/tree/main/examples/ace1.5

It contains workflows for Cover and Edit modes

>>108118310
Not really, from my tests

Anonymous 02/11/26(Wed)10:09:02 No.108119066

>>108118764
>https://github.com/ryanontheinside/ComfyUI_RyanOnTheInside/tree/main/examples/ace1.5
I couldn't get any of them to work, it only output noise/garbage

Anonymous 02/11/26(Wed)12:07:02 No.108120317

File: comfy fail.jpg (682 KB, 1960x2496)

>>108117192
Trying this workflow now and keep getting this error
did I set it up wrong

Anonymous 02/11/26(Wed)13:01:21 No.108120810

Can anyone confirm if negative prompts work?

Anonymous 02/11/26(Wed)13:47:26 No.108121226

Just use the fucking gradio, comfy is a mess for now.

Anonymous 02/11/26(Wed)14:16:18 No.108121485

Is ace able to make songs in Italian?
https://www.youtube.com/watch?v=-nvX9BKOnTA

Anonymous 02/11/26(Wed)17:01:51 No.108122783

Any workflows for lyric rewriting in existing audio samples?
it's for memes

Anonymous 02/11/26(Wed)17:17:24 No.108122893

>>108121226
I'd rather eat my own shoes.

Anonymous 02/11/26(Wed)18:02:43 No.108123232

>>108120317
Ok I'm back and it seems thread is ded and no one is genning shit because nobody can figure out this training crap

Seems like the llm version they provide is only useable with the clip from the turbo aio safetensors you get from ComfyUI
Then it tries to scan the music directory to auto-label but just fails immediately
>Scanning directory: H:\Music\whatever
>[FL AceStep] Starting auto-labeling...
>Starting auto-labeling...
>Prompt executed in 0.00 seconds
it even tells me the job completed with no problems even though nothing was generated, the output folder is empty
Already tried with every single mix of settings but nothing seems to work

Anonymous 02/11/26(Wed)18:37:14 No.108123488

Ace step with a good LoRA is basically the best shit ever and nobody cares.

Anonymous 02/11/26(Wed)18:46:06 No.108123555

>>108123488
Genning music is fun but it's not the best thing since sliced bread. I'm not spending hours training loras to make slop sound slightly less sloppy.

Anonymous 02/11/26(Wed)19:24:04 No.108123820

>>108123488
Which lora anon? Is there a list somewhere?

Anonymous 02/11/26(Wed)19:42:17 No.108123935

>>108123820
Nobody is ever going to share LoRAs for this. You need to make them.

Anonymous 02/11/26(Wed)19:43:39 No.108123949

>>108123555
>I'm not spending hours
It's like 2 hours max for a LoRA.

Anonymous 02/11/26(Wed)19:47:57 No.108123977

>>108123935
Well, fuck.

Anonymous 02/11/26(Wed)19:54:53 No.108124018

>>108123935
>Nobody is ever going to share LoRAs for this
I wish zoomers knew what Torrents are. If you try to push torrents in this day and age, you'd get dead torrents with no seeds unless it's a popular TV show

Anonymous 02/11/26(Wed)20:31:06 No.108124236

>>108120317
i posted that first link to nodes
i do not get that error and use same setup as you
>>108123232
my models are not from comfy
i got huggingface repo file by file when ace was released
and placed shortcuts to my models into "model>acestep"

and you should train on turbo that is what acestep developers say if you read issues on their github
sft is for full fine-tunes

Anonymous 02/11/26(Wed)20:35:21 No.108124266

>>108124236
>you should train on turbo
Sure, if you want your music to sound like beep boop midi shit.

Anonymous 02/11/26(Wed)20:42:16 No.108124308

>Gradio crashes and training stops because the inbuilt tensor board has too many points and crashes.

Gradio

Anonymous 02/11/26(Wed)21:12:33 No.108124513

It's no suno, but it's okay sometimes.

Anonymous 02/11/26(Wed)21:17:28 No.108124544

>>108124513
It really depends what you're after, with LoRAs it's a beast desu. Right out of the box? Meh

Anonymous 02/11/26(Wed)21:35:02 No.108124678

>>108124513
suno gives me hives, does it occasionally have a good song?

Anonymous 02/11/26(Wed)22:25:12 No.108124974

>>108124266
full model does not have what is required for lora, developers said that themselves, check issues

anyways those nodes i found:
1. updated few hours ago
2. ran training
3. all generated audio via new lora code fail with corrupt audio (will crash your music player)

no idea what is going on atm

Anonymous 02/11/26(Wed)22:31:26 No.108125007

>>108124974
Good thing I use comfy where this isn't a problem

Anonymous 02/11/26(Wed)22:45:32 No.108125076

https://files.catbox.moe/07jvs0.mp3

What's the model?

Anonymous 02/11/26(Wed)22:53:18 No.108125123

>>108125007
>nodes
is comfy

i updated comfy too so it can be either one messing it up
have to test mix of training nodes/comfy old code later since i backup before update

Anonymous 02/11/26(Wed)23:04:21 No.108125190

>>108123935
1.0 has LoRAs on HF. Why should this version be different? People think others would care, but not as much as they think. A model can't be taken down unless an artist files a complaint. Training on their work is fair use, not illegal. Derivatives made with AI fall under covers/fan art; as long as the LoRA is not overbaked to reproduce songs verbatim, I see nothing wrong.

Anonymous 02/11/26(Wed)23:06:34 No.108125200

>>108125190
The difference is these work :^))))

Anonymous 02/11/26(Wed)23:11:44 No.108125240

Modern problems: Having an AI song stuck in your head. Nobody else on Earth will ever know of it.

Anonymous 02/11/26(Wed)23:44:51 No.108125436

>>108125240
That's literally why I'm using ai. I can make up my own songs, but the thing is they all sound like me-songs, imo pretty nice, but I would prefer a wider gamut.

I look forward to dubbing harmonies and solos onto ai and having it layer with a2a.

Anonymous 02/11/26(Wed)23:46:56 No.108125455

together with image gen or video gen, and then I can edit a video. It's highly effective.

Anonymous 02/12/26(Thu)01:02:41 No.108125879

:3

https://vocaroo.com/1oKyOdyQD8Hu

Anonymous 02/12/26(Thu)01:05:56 No.108125905

>>108125879
This is totally the kind of song that would be playing at 1:30 in the morning in a night club while some bitch is screaming in my ear telling me what drink she wants me to buy.

Anonymous 02/12/26(Thu)02:00:20 No.108126146

>>108117192
>>108123232
Ok I tried this shit again, got it to auto label all samples but then failed at Preprocess dataset because it ran out of vram to allocate
12Gb 4070 vramlet btw
seems like training isn't for us poorfags yet

Anonymous 02/12/26(Thu)02:02:28 No.108126152

>>108126146
Ask Claude (Opus) to vibe code a Ramtorch preprocess script to you. I did this and never again ran into OOM issues when preprocessing.

Anonymous 02/12/26(Thu)03:29:47 No.108126466

>>108123232
I wish, I can't get past a numpy.import_array error.

Anonymous 02/12/26(Thu)03:54:17 No.108126585

>>108126146
>>108126466
you guise must always visit github link before using anything from that place
and then
- read home page "README"
- look at issues both opened and closed (open links if you see something interesting to your case)

you have not done that
not enough memory = not enough gas to run the nodes
>>108126152
try that if it works for you
or ask github developer to consider your case

Anonymous 02/12/26(Thu)04:04:34 No.108126630

>>108126585
me again
and to reduce vram usage you could try feeding it snippets of audio samples, not full songs.

60-120 seconds could yield lower vram usage.
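
A rough sketch of how that splitting could be planned: given a track length, compute evenly sized snippet boundaries no longer than 120 seconds. This only computes timestamps; actually cutting the files is left to whatever audio tool is in use, and whether shorter clips really lower VRAM use in a given trainer is something to verify rather than assume.

```python
import math


def plan_snippets(duration_s, max_len_s=120.0):
    """Split a track into equal snippets of at most max_len_s seconds.

    With max_len_s=120, any track longer than 120 s yields snippets
    longer than 60 s, because an equal split into n pieces of a track
    longer than 120 * (n - 1) seconds gives pieces over 120 * (n - 1) / n.
    """
    if duration_s <= 0:
        return []
    count = math.ceil(duration_s / max_len_s)
    length = duration_s / count
    return [(round(i * length, 3), round((i + 1) * length, 3))
            for i in range(count)]


# Hypothetical 250-second song.
for start, end in plan_snippets(250):
    print(start, end)
```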

Anonymous 02/12/26(Thu)04:49:18 No.108126830

https://voca.ro/1eLqt5jUmHWl

What if I just dumped the entire Morrowind soundtrack into a LoRA trainer.

Anonymous 02/12/26(Thu)05:19:14 No.108126995

Without LoRA
https://voca.ro/1cRNtD1kUoHk

With Lora.
https://vocaroo.com/1lmlWsJzUrRo

Anonymous 02/12/26(Thu)06:55:35 No.108127406

Some more of my anime song LoRA. Just random lyrics and titles made by gemini.

>Overtime Fantasy ~The Demon Lord is my Section Chief~
https://voca.ro/1azCh43Ss7Xu

>UFO in the Tatami Room!? ~Please Don't Eat My Homework, Princess!~
https://voca.ro/145KN2PpfONh

>Absolute Territory! ~Please Buy 10 Copies
https://voca.ro/1dFicT2qic5u

Anonymous 02/12/26(Thu)07:35:16 No.108127625

>>108127406
Pretty good, way better than without loras.
How many songs did you use to create a specific lora?

Anonymous 02/12/26(Thu)07:39:46 No.108127648

>>108127625
11 tracks of that particular flavor.

Anonymous 02/12/26(Thu)07:47:34 No.108127692

>>108127648
I'll have to try.

Anonymous 02/12/26(Thu)08:28:16 No.108127909

Imagine having the amazing opportunity of training a computer model to replicate any set of songs from a library of hundreds of years worth of recordings mankind provided, and you decide to train on garbage high pitched tracks that are hardly appreciated by anyone other than mentally ill manchildren

Anonymous 02/12/26(Thu)08:45:50 No.108128002

>>108127909
Absolute loser mindset right here

Anonymous 02/12/26(Thu)09:51:31 No.108128461

We could share loras here
>>>/t/1374659
I'm training my first lora, let's see how long it takes.

Anonymous 02/12/26(Thu)10:24:01 No.108128675

>>108127909
> meanwhile seedream 2.0 and other chinese shit

Anonymous 02/12/26(Thu)10:27:32 No.108128705

can someone make me a lora with this style? https://youtu.be/PFl4QKl0WSE

Anonymous 02/12/26(Thu)11:15:41 No.108129073

I trained a lora on several adult contemporary tracks from the 1990s and 2000s and it produces superior vocals and bangers much more often
https://voca.ro/1c3mmRMujTkn
https://voca.ro/1fBBblQgbPZq
https://voca.ro/19HYVVQOYnY7

Anonymous 02/12/26(Thu)11:50:02 No.108129337

https://voca.ro/1gVJAWf65yyg

Anonymous 02/12/26(Thu)12:19:51 No.108129554

Trained 500 steps and I can barely hear any difference, so I won't bother posting the two versions.
https://vocaroo.com/1KYsjYHyJrj0

Anonymous 02/12/26(Thu)12:23:06 No.108129578

>>108129554
>Trained 500 steps and I can barely hear any difference
Anon, it's not the "number of steps" that matter, it's the loss values
You have to keep training until you consistently see loss values of ~0.18, and you need to ensure the lora rank is high enough
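
One way to read "consistently" is a moving average over recent steps rather than single noisy values. A minimal sketch, assuming the per-step loss can be exported as a list of floats; the 0.18 target is the figure from the post above, while the window size and the sample numbers are arbitrary.

```python
def moving_average(values, window):
    """Trailing moving average; positions before a full window are skipped."""
    if window <= 0:
        raise ValueError("window must be positive")
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        if i >= window - 1:
            out.append(running / window)
    return out


def first_step_below(values, target=0.18, window=50):
    """Index of the first step whose trailing average is <= target, else None."""
    for offset, avg in enumerate(moving_average(values, window)):
        if avg <= target:
            return offset + window - 1
    return None


# Made-up loss curve for demonstration.
losses = [0.5, 0.4, 0.3, 0.2, 0.18, 0.17, 0.16, 0.15]
print(first_step_below(losses, target=0.18, window=3))
```

A single step dipping under the target does not count here; the whole trailing window has to average at or below it.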

Anonymous 02/12/26(Thu)12:37:39 No.108129700

>>108129554
It was epochs, sorry. But thanks, the loss wasn't in that range by a lot.

Anonymous 02/12/26(Thu)14:17:44 No.108130453

By the way, I can confirm what the other anon said the other day: training a big LoRA on a large, well-captioned dataset does seem to improve the model's lyric alignment a bit. It even improves the parts of the song where you are supposed to use tags to indicate instrumental solos etc.

Anonymous 02/12/26(Thu)14:23:24 No.108130509

>>108129554
1000 epochs and rank set to 32. It shows.
https://vocaroo.com/1koKRZmH6ewZ

Anonymous 02/12/26(Thu)15:47:12 No.108131242

:o I woke up this (afternoon) morning and I have Udio at home! WOW!!! I'm Amazing, I have my own UDIO! AND I have a full copy of Hogwarts Legacy. And some cold coffee. I may go to the store and buy a gallon of half & half to celebrate (yes, really, I'm whiter than u). To celebrate, I'll make a black lives happy song 87)

>>108130509
It's neat, is this meant to be Grunge?

You're going to need to alter it a tiny bit on the LLM side, because the 4B was taught to square everything up nice and Pro Tools tight. Real Grunge vocals miss the beat a LOT.

Anonymous 02/12/26(Thu)15:48:13 No.108131253

I also want to remind everyone that Jeffrey Epstein is alive and playing Fortnite on a not yet identified account.

Anonymous 02/12/26(Thu)16:19:55 No.108131503

>>108131242
Dead Kennedys.

Anonymous 02/12/26(Thu)18:27:19 No.108132321

Anyone training LoRAs might want to look at

https://github.com/koda-dernet/Side-Step

It has a bunch of scripts for properly training sft and base along with a lot of other basic improvements that the default training script is lacking.

Anonymous 02/12/26(Thu)18:45:25 No.108132421

>>108132321
>low vram support
vramlets, we can still pretend to be proper human beings...
I'm sold

Anonymous 02/12/26(Thu)19:32:53 No.108132721

>>108132421
Keyword is pretend.

Anonymous 02/12/26(Thu)19:36:12 No.108132740

I never trained a lora before, what do you actually use in the dataset?
I understand for the audio itself, but what about the "captions"? Just all the lyrics? A description of the genres? A description of the instruments used?

Anonymous 02/12/26(Thu)19:37:25 No.108132750

>>108129073
>>108129337
>>108130509
acestep has way more potential than I expected, I wonder if some rich anon would make a finetune of it with actual copyrighted music

Anonymous 02/12/26(Thu)19:38:24 No.108132758

>>108132740
The gradio has a dataset builder. It will give you a json with the correct FORMAT for building your dataset, but I must stress the llm is actually horrendous at auto captioning. You need to go back and manually fix the captions with correct ones that fit your dataset. Trust me, something like gemini can do the job better.

Anonymous 02/12/26(Thu)19:42:31 No.108132780

>>108132758
Can you share an example of a proper caption of a known song?

Anonymous 02/12/26(Thu)19:57:59 No.108132879

File: AS15T__00021_.png (369 KB, 512x512)

New song genning. I have an sd1.4 gen run at the start, to give me a thumbnail (using a trick to get it to run first). sd1.4 is trivial at 512^2, and looks really neat.

Anonymous 02/12/26(Thu)20:01:03 No.108132904

>>108132321
Thanks, apparently the vibe coded release software is not the real in house software.

Anonymous 02/12/26(Thu)20:01:46 No.108132911

>>108132904
and I guess they won't share the in house stuff?

Anonymous 02/12/26(Thu)20:02:05 No.108132912

>>108132758
They actually used Gemini 2.5 to do their tagging for training ace step.

Anonymous 02/12/26(Thu)20:06:43 No.108132940

File: Screenshot from 2026-02-1(...).png (64 KB, 1067x251)

>>108132911
They probably won't. I don't think these song models are actually produced separately as is alleged.

picrel:

like all the ai crew in all of china is like fishing boats coordinating, competing, exploiting, according to bugman rules.

Anonymous 02/12/26(Thu)20:28:56 No.108133071

>>108132912
>They actually used Gemini 2.5 to do their tagging for training ace step.

That doesn't change the fact the dinky little llm in the gradio UI is awful at tagging.

Anonymous 02/12/26(Thu)20:30:22 No.108133081

>>108132904
The model is far too competently trained compared to the gradio interface for there not to be some fuckery going on. I'm just not sure what that is.

Anonymous 02/12/26(Thu)20:32:23 No.108133090

>>108132750
no idea how suno and others managed to avoid that
it would be taken down very quickly

Anonymous 02/12/26(Thu)21:08:13 No.108133247

>>108133090
They didn't. They got sued and basically mafia-forced onto the majors' side:
https://www.yahoo.com/entertainment/music/articles/universal-music-ai-song-generator-112138759.html

Michelin Star AI Chef 02/12/26(Thu)21:15:01 No.108133280

https://vocaroo.com/1n2Fh0SFpdN3

Untrimmed. The chef leaves the eyeballs on the fish.

:^)

Anonymous 02/12/26(Thu)21:24:13 No.108133322

I am liking AceStep so far, but I still hope Alibaba releases the musicgen model they promised.

Anonymous 02/12/26(Thu)21:32:27 No.108133351

>>108133247
The music industry is cartoonishly evil.

Michelin Star AI Chef 02/12/26(Thu)21:32:34 No.108133353

>>108133322
Can suno or udio do this?
>>108133280

Michelin Star AI Chef 02/12/26(Thu)21:39:05 No.108133376

>>108133353
also, audio quality is degraded by vocaroo.

Michelin Star AI Chef 02/12/26(Thu)21:43:42 No.108133392

Does anyone know if the gradio thing generates audio codes? or is that audio code mechanism proprietary?

Michelin Star AI Chef 02/12/26(Thu)22:41:04 No.108133662

https://files.catbox.moe/9hqykk.mp3

8^)

Blank.

Beautiful.

I think it defaults to nonsense-chinese if not given [instrumental] or whatever prompts.

Anonymous 02/12/26(Thu)22:45:10 No.108133685

>>108117519
Figured out what this was, turns out the file paths I automatically named with a script were wrong in the .json.
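
A quick sanity check for this kind of problem is to walk the dataset .json and confirm that every audio path in it actually exists on disk. The sketch below assumes nothing about the exact schema the gradio dataset builder writes; it just collects any string value that ends in a common audio extension. The extension list, the function names, and resolving relative paths against the .json's own folder are all assumptions.

```python
import json
import os

# Assumed set of extensions; adjust to whatever the dataset contains.
AUDIO_EXTS = (".flac", ".wav", ".mp3", ".ogg")


def iter_audio_paths(node):
    """Yield every string in a nested JSON structure that looks like an audio path."""
    if isinstance(node, dict):
        for value in node.values():
            yield from iter_audio_paths(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_audio_paths(item)
    elif isinstance(node, str) and node.lower().endswith(AUDIO_EXTS):
        yield node


def missing_audio_files(json_path):
    """Return the audio paths listed in json_path that do not exist on disk."""
    with open(json_path, "r", encoding="utf-8") as fh:
        data = json.load(fh)
    base = os.path.dirname(os.path.abspath(json_path))
    missing = []
    for path in iter_audio_paths(data):
        full = path if os.path.isabs(path) else os.path.join(base, path)
        if not os.path.isfile(full):
            missing.append(path)
    return missing
```

Calling `missing_audio_files` on the generated .json before preprocessing and printing the result would surface wrong or renamed paths up front instead of as a vague preprocessing failure.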

First Initial D LoRA test. Didn't fully converge how I want yet, but it turned out neat.

Lower quality vocaroo since catbox/literbox are both down, not perfect yet but it's getting there. I'll try a LoKR next.

https://vocaroo.com/1jd1RPwi0YOk
https://vocaroo.com/1jbbWhW53zqp
https://vocaroo.com/19Rwwk5tiXl7
https://vocaroo.com/15yUhXzFwVEG

Anonymous 02/12/26(Thu)23:26:16 No.108133908

Music generation can never reach the breadth and popularity of image generation because it can take seconds to appreciate an image. You're stuck for two minutes minimum if you want to appreciate music. This is compounded by the fact that everyone's taste in music is hyper specific, and one man's favorite genre might elicit disgust in another.

Anonymous 02/12/26(Thu)23:32:35 No.108133925

>>108117176
You can add input audio now? I was using Yue before because ace step didn't have input

Anonymous 02/12/26(Thu)23:34:16 No.108133931

>>108133908
Models don't thrive on popularity. They thrive on how fun they are to use. If ACEStep remains the only viable local solution (unlikely), it'll eventually blow up.

Anonymous 02/12/26(Thu)23:38:36 No.108133951

>>108133925
YuE is significantly inferior to ACEStep 1.5 in architecture, audio output diversity, and speed. It was neat, but still behind commercial models. ACEStep is on par with those, so you shouldn't be using YuE anymore. But in short, yes, ACEStep 1.5 can take in audio input and do covers, audio repainting, and extensions.

Michelin Star AI Chef 02/12/26(Thu)23:40:05 No.108133959

>>108133931
I don't care if it gets popular.

I have what I want. I have Udio at home. I'm set for life.

Michelin Star AI Chef 02/12/26(Thu)23:41:05 No.108133967

>>108133951
That's what the indian bloggers say, anyway.

Anonymous 02/12/26(Thu)23:46:28 No.108133995

>>108133931
>>108133959
it's hard not to test this model and mess with it atm
but (don't have links since i didn't even bookmark them) if i am correct ace is already working on the next model

Michelin Star AI Chef 02/12/26(Thu)23:50:33 No.108134016

yue is inferior, that is all.

Anonymous 02/12/26(Thu)23:51:13 No.108134018

>>108133685
Loss is looking at least 2x better with LoKr and the default Gradio settings. Might actually converge now.

Michelin Star AI Chef 02/13/26(Fri)00:02:43 No.108134094

Let me give you an example of what I mean when I say ace step 1.5 is amazing:

I can gen unlimited *happy* 2016+ style EDM. I don't like new sad or moralizing or BAME crap.

Anonymous 02/13/26(Fri)00:18:04 No.108134180

When training a lora I noticed that cutting off all the instrumental-only parts gives significantly better results. I think the model is learning the lyrics even during an instrumental outro or intro, for example; it doesn't understand well when to stop. What else did you guys find out that increases the quality of the loras?

Anonymous 02/13/26(Fri)00:21:35 No.108134201

>>108134180
Increasing the rank desu. Slide that shit all the way up. There was a writeup on the discord that basically concluded that if there is too much variation in your dataset and your rank is too low, it will average across all of the inputs. A big rank accounts for the variation in data. You just need a very conservatively low lr to go with that.
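
For anyone wondering what the rank slider actually changes, here is a minimal sketch of the standard LoRA update, delta_W = (alpha / rank) * B @ A. The layer sizes and init scale are made up for illustration and are not ACE-Step's real dimensions or the trainer's actual code.

```python
import numpy as np

def lora_delta(d_out, d_in, rank, alpha, seed=0):
    """Build a toy LoRA weight update: (alpha / rank) * B @ A.

    d_out, d_in are hypothetical layer sizes, not ACE-Step's real ones.
    """
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(rank, d_in)) * 0.01   # "down" projection
    b = rng.normal(size=(d_out, rank)) * 0.01  # "up" projection
    return (alpha / rank) * (b @ a)

def lora_param_count(d_out, d_in, rank):
    """Trainable parameters in one LoRA pair (A and B)."""
    return rank * (d_in + d_out)

if __name__ == "__main__":
    for rank in (8, 64, 128):
        delta = lora_delta(256, 256, rank, alpha=2 * rank)
        print(rank, lora_param_count(256, 256, rank), np.linalg.matrix_rank(delta))
```

The update can never have a higher matrix rank than the slider value, which is one way to read the "averages across the inputs" complaint: a low rank leaves fewer independent directions to fit a varied dataset.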

Anonymous 02/13/26(Fri)00:26:03 No.108134219

>>108134018
What lokr trainer are you using and what settings? I couldn't get it to do anything except crab when I tried.

Anonymous 02/13/26(Fri)00:32:18 No.108134245

File: 456454545648.png (140 KB, 1873x527)

>>108134219
LoKr has been added to official Gradio, these are default settings
0.001
64/128

Anonymous 02/13/26(Fri)00:42:12 No.108134283

>>108134245
Huh, I tried these settings on the random anime guy repo and it resulted in basically every song being played over itself at once after 500 steps. If it actually works out for you, I might give it another go.

Anonymous 02/13/26(Fri)00:50:58 No.108134313

>>108134283
It varies by dataset size. I suspect since it learns so fast, might want to use tiny LR decrements. Everyone has been saying LoKR results in more accurate voices/likeness, and with complex genres it also helps. So I think it's generally accepted this is better. Initial D is no easy target, this is with a dataset of 70 songs so let's see how Turbo does.
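
Rough idea of why LoKr behaves differently, as a simplified sketch: the weight update is a Kronecker product of two small factors. Real implementations (LyCORIS-style) add further factorization and scaling options, and the shapes here are hypothetical, so treat this as an illustration only.

```python
import numpy as np

def lokr_delta(a, b, scale=1.0):
    """Toy LoKr update: Kronecker product of two small factor matrices."""
    return scale * np.kron(a, b)

def lokr_param_count(a_shape, b_shape):
    """Trainable parameters when both factors are stored in full."""
    return a_shape[0] * a_shape[1] + b_shape[0] * b_shape[1]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.normal(size=(4, 4))    # hypothetical small factor
    b = rng.normal(size=(16, 16))  # hypothetical larger factor
    delta = lokr_delta(a, b)
    print(delta.shape, lokr_param_count(a.shape, b.shape))
```

With these toy shapes, 272 parameters produce a 64x64 update that is not limited to a small matrix rank, which might be part of why it seems to pick things up quickly and needs a gentler LR.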

Anonymous 02/13/26(Fri)00:54:24 No.108134327

>When there aren't enough songs you like in a given style to make a viable dataset

Songs for this feel?

Anonymous 02/13/26(Fri)01:00:39 No.108134355

Nothing like the very first song you generate after training a lora being a banger

https://voca.ro/1oLuGAhFE5tk

Anonymous 02/13/26(Fri)01:06:49 No.108134377

>>108134355
Not my kind of music but the quality is very good.

Anonymous 02/13/26(Fri)01:08:00 No.108134379

Two made with Yue tonight
https://files.catbox.moe/tz26s7.mp3
App for destruction
https://files.catbox.moe/wltpvl.mp3
Guerilla transmission

Anonymous 02/13/26(Fri)01:15:03 No.108134410

>>108134379
>https://files.catbox.moe/wltpvl.mp3
Was it your intention for the singer to sound like Chris chan mumbling over a poorly mixed instrumental track with a dollar store microphone?

Anonymous 02/13/26(Fri)01:34:54 No.108134482

File: 1745613154461604.png (415 KB, 600x456)

>>108134410
Who you suckas think you're sucking on
I'm the sucking boss
https://files.catbox.moe/544onu.mp3

Anonymous 02/13/26(Fri)01:40:09 No.108134499

Stuck on choosing my next LoRA.

Choices:

1) Lil' Pump LoRA
2) Hercules the animated movie soundtrack LoRA
3) Yakuza Kiryuu Karaoke collection LoRA
4) Various stage musical tracks LoRA
5) Eroge background music LoRA

Anonymous 02/13/26(Fri)01:42:19 No.108134508

Did they fix the sft training? I heard the training was coded to work only with turbo.

Anonymous 02/13/26(Fri)01:43:28 No.108134515

>>108134508
https://github.com/koda-dernet/Side-Step

This guy has fixes for it in his repo. They work as far as I can tell.

Anonymous 02/13/26(Fri)01:46:04 No.108134523

>>108134515
Yeh im using it right now, i was talking about the official repo

Anonymous 02/13/26(Fri)01:47:55 No.108134532

>>108134523
I try not to pay attention to it. It's always breaking itself with 500 random ai generated commits a day.

Anonymous 02/13/26(Fri)02:01:36 No.108134601

I am currently training an Enya lora, very curious about the results, especially since she has a very unique style. Will post results if it turns out alright

Anonymous 02/13/26(Fri)02:03:34 No.108134611

comfy repaint nodes when?
cover/reference nodes when?
AIEEEEEEEEEEEEEEEEEEEEEEEEEEE

Anonymous 02/13/26(Fri)02:07:02 No.108134630

>>108134611
>Hey chat gpt make an audio mask node that takes a vae encoded audio latent and a time range and then blends it back into the latent before decoding it
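
The core of such a node would be something like the sketch below: build a mask over latent frames for the time range, then take the regenerated latent inside the range and the original outside it. This is plain NumPy, not a real ComfyUI node; the function name, the (channels, frames) layout, and the frames-per-second value are all assumptions for illustration.

```python
import numpy as np

def blend_latent_region(original, regenerated, start_s, end_s, frames_per_second):
    """Keep `original` outside [start_s, end_s) and `regenerated` inside it.

    Both latents are assumed to be shaped (channels, frames); the latent
    frame rate is a hypothetical parameter, not a value taken from ACE-Step.
    """
    if original.shape != regenerated.shape:
        raise ValueError("latents must have the same shape")
    n_frames = original.shape[-1]
    # Convert the time range to frame indices, clamped to the latent length.
    start = max(0, int(round(start_s * frames_per_second)))
    end = min(n_frames, int(round(end_s * frames_per_second)))
    mask = np.zeros(n_frames, dtype=original.dtype)
    mask[start:end] = 1.0
    # Broadcasts over the channel axis.
    return mask * regenerated + (1.0 - mask) * original

if __name__ == "__main__":
    orig = np.zeros((8, 100), dtype=np.float32)
    regen = np.ones((8, 100), dtype=np.float32)
    out = blend_latent_region(orig, regen, 2.0, 4.0, frames_per_second=10.0)
    print(out[0, 18:22], out[0, 38:42])
```

A hard 0/1 mask will likely click at the seams, so a real node would probably want a short crossfade at both edges.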

Anonymous 02/13/26(Fri)02:08:46 No.108134636

>>108134630
is cover/reference then just playing with the denoise in this case?

Anonymous 02/13/26(Fri)02:13:01 No.108134660

>>108134636
Pretty sure that's done with audio codes.
You can probably do a jazzy remix of a song more faithfully in comfy by just running the song through a lower denoise.

Michelin Star AI Chef 02/13/26(Fri)02:16:21 No.108134675

the audio codes part is not publicly available. Hopefully someone will reverse engineer it.

Anonymous 02/13/26(Fri)03:25:08 No.108134958

>>108132750
> a finetune
The base is too poisoned.

Michelin Star AI Chef 02/13/26(Fri)03:37:26 No.108134993

comfyui needs to support FLAC thumbnails.

Anonymous 02/13/26(Fri)04:56:41 No.108135311

Once you generate with ace step, are there models to enhance the result like seedvr exists for videos and images, to get rid of that low quality 64kbps mp3 feel?

Anonymous 02/13/26(Fri)05:26:18 No.108135465

>>108135311
I use this
https://github.com/entrepeneur4lyf/Web-Audio-Mastering

Anonymous 02/13/26(Fri)05:29:24 No.108135481

https://voca.ro/11kqJ8NyiVQH
https://voca.ro/13eMGMKVVkkd
https://voca.ro/1kKtMpYi0318

Michelin Star AI Chef 02/13/26(Fri)06:12:28 No.108135684

>>108135465
Neat, I like the reduce stereo width thing.

Anonymous 02/13/26(Fri)06:16:10 No.108135700

>>108117192
he done an update to his nodes

i have done some training tests; it fails to do llm captioning part and create samples (maybe it is just me and my setup).

so if you dont have all code nodes will not work.
i had samples generated with his old code.

if you train full songs nan loss info is displayed but training does happen - it starts to capture style at around 250 epochs and it is done at 500 steps.

dataset 5 full length songs, it takes around 13gb to 14gb to train fluctuates. he rewrote his memory management. songs were max 4.5 minutes in length. vocals and instruments are well captured.
test was retard settings of
- 128/256 rank/alpha
- 0.0001 LR
- 500 warmup

i done another test with 2 minute length snippets it hovers just above 8gb vram. no nan happens everything displays well in console and in nodes.
quality is meh but usable via hunting a good gen.
guitar instruments sound like synths.

those nodes might get good.

Anonymous 02/13/26(Fri)06:17:17 No.108135704

>>108135700
>all code nodes
old node code

Anonymous 02/13/26(Fri)06:42:04 No.108135833

>>108135700
that was me
>>108135704
and that was me

and it is me again
oh fugg
retard settings or not, i did try others as well to see if it would prevent nan (it did not)
nan is real
lora is not usable it is corrupt

seems those nodes can train only maximum of 2 minutes audio samples, they can't process full length songs hence the nan

and nan is real i repeat

Anonymous 02/13/26(Fri)06:43:58 No.108135840

>>108135833
and me one more time, forgot to say
i forgot to hook it up (my wf is a bit messy) hence false excitement; i got default model gens -.-. but they sound so good i did not even notice lora was not active

Michelin Star AI Chef 02/13/26(Fri)06:48:41 No.108135854

sft is necessary for really long amounts of text, but base is the real chad.

Anonymous 02/13/26(Fri)07:06:55 No.108135946

>>108135854
What does base offer over sft?

Anonymous 02/13/26(Fri)07:16:31 No.108135998

>>108135946
NTA but base absolutely nails vocals in cover mode with a single sample for me, sft didn't even come close with the exact same setup.
Instruments were worse though.
Wouldn't be surprised if it absolutely kills it with loras when I can be fucked to train them.

Anonymous 02/13/26(Fri)07:35:44 No.108136091

anyone have any idea how to unlock the turbo steps and go beyond 8? i tried editing the limit in handler.py and it made no difference

Anonymous 02/13/26(Fri)08:05:37 No.108136227

>>108134601
As promised, here is some Enya, kek
https://voca.ro/1dnPwaTL3p7V
https://voca.ro/1a0PJll2cypA

Anonymous 02/13/26(Fri)08:05:56 No.108136228

File: file.png (94 KB, 932x1290)

where the hell are base and sft?

Anonymous 02/13/26(Fri)08:07:28 No.108136237

>>108136227
quite lovely

Anonymous 02/13/26(Fri)08:08:25 No.108136244

>>108136228
i downloaded them following the instructions in the GH, there's 3 turbo versions too (with different shifts)

Michelin Star AI Chef 02/13/26(Fri)08:14:15 No.108136278

>>108135998
https://huggingface.co/ACE-Step/models

Anonymous 02/13/26(Fri)08:14:27 No.108136281

>>108136228
You're in the turbo directory. Look at the author's page, they have each one in its own repo.

Michelin Star AI Chef 02/13/26(Fri)08:16:58 No.108136294

>>108136227
Someone assembled an experimental set of recordings and put it on, well, saying the name would get me b& because now it's like a shit site, but it used to be grand. One of those mega download type sites. Anyway, I'm sure someone made torrents.

Those would make amazing loras. For everyone, not just those interested in that kind of fruity music - because it would vastly expand the tonal palette

also why is dice game capcha always the highest #?

Anonymous 02/13/26(Fri)10:37:44 No.108137327

https://voca.ro/18QToU7m2LXS

Anonymous 02/13/26(Fri)10:52:00 No.108137455

File: Captura de pantalla 2026-(...).png (672 KB, 1134x876)

My second Lora only works at 0.3 strength; any more than that and it collapses. Does the graph tell you anything? I am clueless.

Anonymous 02/13/26(Fri)11:10:20 No.108137593

>>108137455
first time training anything? that graph means its fucking shit

Michelin Star AI Chef 02/13/26(Fri)13:04:09 No.108138399

>>108137455
Did you use an audio crop node or what?

Michelin Star AI Chef 02/13/26(Fri)14:00:13 No.108138803

so far, for me, all my covers have sounded like haunted house music. So, not totally worthless.

Anonymous 02/13/26(Fri)16:16:02 No.108139769

>>108136281
>>108136244
ok thanks

Anonymous 02/13/26(Fri)16:36:09 No.108139921

>>108129073
can you share the lora? I like it

Anonymous 02/13/26(Fri)16:43:52 No.108139986

>>108134245
Got awful results out of this. Either my LR was too low so it didn't learn anything, or the inference code on the meme UI for it is not right, or LoKR is a meme.

Anonymous 02/13/26(Fri)18:35:20 No.108140757

So yeh sft training is broken, only turbo works properly. I tried side step and I get the same garbage results. This model has a lot more potential, but until someone fixes the sft training, we won't be able to reach it. Base non sft is not worth training because it doesn't use the lm.

Anonymous 02/13/26(Fri)18:44:33 No.108140817

>>108140757
Turbo is limited to 8 steps only.

Anonymous 02/13/26(Fri)19:12:07 No.108141020

What are you guys using to download your albums for LoRA tuning?

https://github.com/vitiko98/qobuz-dl

The only just werks way I could find. Seems like the only reliable quick way to do it, but it requires one to be a paypig after trial period.

Anonymous 02/13/26(Fri)19:13:46 No.108141032

What's the recommended mp3 quality for training?

Anonymous 02/13/26(Fri)19:22:19 No.108141073

>>108141032
I recommend FLAC, since it's lossless and the preprocessor accepts that, but if you can't just get the highest quality you can get.

Anonymous 02/13/26(Fri)19:23:25 No.108141079

>>108136227
that is very good

Anonymous 02/13/26(Fri)19:24:10 No.108141085

>>108141073
Got it.
Well I have 320kbps mp3, they should be enough for my tests.

Anonymous 02/13/26(Fri)19:32:52 No.108141137

>>108139986
I tried to warn you about lokr bro

Anonymous 02/13/26(Fri)19:34:18 No.108141145

>>108137593
Nta but there’s literally no way you can infer the final quality of the LoRA from a loss chart outside of it doing something extremely unusual

Anonymous 02/13/26(Fri)22:46:57 No.108142222

>>108140757
Skill issue, I trained several loras on the SFT model and they work fine
Remember to set Timestep shift to 1

Anonymous 02/13/26(Fri)22:50:46 No.108142252

>>108142222
Yes i did set it to 1, do you have any advice? What are you doing differently? I don't think I'm making any mistake.

Anonymous 02/13/26(Fri)23:02:45 No.108142309

>>108142222
I've also had fine outputs from sft. Even with shift at 3.

Anonymous 02/13/26(Fri)23:25:16 No.108142445

>>108142309
>>108142252
I'm about to shoot myself

Anonymous 02/14/26(Sat)00:08:57 No.108142683

lil pump LoRA

https://voca.ro/135cMhiSdc2H

Anonymous 02/14/26(Sat)00:37:54 No.108142852

https://voca.ro/1kqmuHl6YVJN

Anonymous 02/14/26(Sat)00:47:27 No.108142893

Does comfy support offloading like gradio now?

Anonymous 02/14/26(Sat)01:22:25 No.108142982

Has anyone tried training an instrumental lora?

not related, just checking new gradio ui
https://voca.ro/13B13qagi5P7

Anonymous 02/14/26(Sat)01:33:29 No.108143016

>>108142893
comfy has native cpu offloading since always basically, so yes.
>>108142982
man they make so many commits in their project, it's always in constant flux. I have like 4 versions of acestep right now too FUCK

Anonymous 02/14/26(Sat)01:38:58 No.108143034

>>108143016
i am not touching any ace code for a full month
there is code updoot every 20 minutes average
and lot of llm nonsense

Anonymous 02/14/26(Sat)01:55:22 No.108143068

>>108134499
>Eroge background music LoRA
This. And share plz

Anonymous 02/14/26(Sat)02:02:52 No.108143100

Who said vocaroo degrades the audio quality? I uploaded an mp3 then redownloaded it, and it has the same crc.

Michelin Star AI Chef 02/14/26(Sat)02:55:31 No.108143296

>>108134499
a capella music (harmony) lora

Michelin Star AI Chef 02/14/26(Sat)02:56:32 No.108143301

>>108143100
isn't it mono?

Anonymous 02/14/26(Sat)03:07:52 No.108143353

>>108143034
I think I am going to do the same.
https://vocaroo.com/15pvmdQpY5GD

Anonymous 02/14/26(Sat)03:13:32 No.108143378

>>108143301
no, same crc means no file modification so the audio should be the same
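
If anyone wants to repeat the check, a minimal sketch using Python's standard zlib module is below. The helper names are just for illustration. A matching CRC-32 makes byte-identical files very likely but is not a cryptographic guarantee, so use a SHA-256 hash if you want to be thorough.

```python
import zlib

def file_crc32(path, chunk_size=1 << 20):
    """Compute the CRC-32 of a file, reading it in chunks."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)  # running checksum
    return crc & 0xFFFFFFFF

def same_crc(path_a, path_b):
    """True if both files have the same CRC-32."""
    return file_crc32(path_a) == file_crc32(path_b)
```

Run same_crc on the file you uploaded and the file you downloaded back. If the host had re-encoded or downmixed anything, the bytes and therefore the checksum would differ.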

Anonymous 02/14/26(Sat)03:14:06 No.108143380

>>108134499
>Various stage musical tracks LoRA
A Julie Andrews LoRa would be lovely.

Anonymous 02/14/26(Sat)03:15:56 No.108143389

>>108143353
based

Anonymous 02/14/26(Sat)03:18:46 No.108143400

File: 1771021102830.jpg (83 KB, 1178x827)

>>108117091

Anonymous 02/14/26(Sat)03:29:03 No.108143445

>>108143400
god dammit this shit is moving too fast, I can't keep up

Anonymous 02/14/26(Sat)03:41:36 No.108143489

>>108143445
thats their SAAS grift

Anonymous 02/14/26(Sat)03:51:28 No.108143524

https://voca.ro/1kvqKbVHkwg8

I wasn't happy with how the lil pump LoRA didn't really sound like lil pump, so I trained it again with a slightly better set of captions.

Anonymous 02/14/26(Sat)03:51:52 No.108143525

File: 1758710699856385.jpg (680 KB, 1884x1464)

>>108143353
>no metallic sounding voice
how

Anonymous 02/14/26(Sat)04:15:51 No.108143604

Is it normal that lora training preprocessing takes forever and uses the cpu?

Anonymous 02/14/26(Sat)04:16:30 No.108143608

>>108143353
out of lyrics in the world and all of the topics you gotta be retarded
such is your life

Anonymous 02/14/26(Sat)04:17:09 No.108143613

File: 55151125445.png (130 KB, 1901x800)

Someone tell me how a vibecoded training UI (which I patched up with Gemini) is 20x faster than the original on 3090

This dataset was taking me 20 hours per run with the same settings, now it's under 60 mins for the same 70 song dataset? Plus this properly has the Prodigy optimizer. The official Gradio is a disaster.

https://github.com/Estylon/ace-lora-trainer

Anonymous 02/14/26(Sat)04:18:50 No.108143623

>>108143613
I've vibecoded at least three trainers at this point and they all finish in about 1/4th of the time of the official one because they keep forcing torchao on the windows users.

Anonymous 02/14/26(Sat)04:20:00 No.108143630

>>108143604
>preprocessing
in any case it is not good quality,
so use a low parameter model, that one does all the songs in a few minutes max.

someone recommended the music-flamingo model by nvidia, which you have to be able to "install" since python+venv is required.
apparently it can analyze the styles of music and the like.

Anonymous 02/14/26(Sat)04:25:35 No.108143653

>>108143623
I think I'm pretty much done with their garbage UI. It's sad that many will get introduced to ACEStep that way. Getting these LoRAs to work like normal in Comfy at this point is pivotal. They work at like a weight of 2, maybe there's a solution to make them work at a weight of 1?

Anonymous 02/14/26(Sat)04:26:41 No.108143659

>>108143613
>vibecoded UI
should be safe to use,
now i must check it out.
don't use the ones that "vibecoded" hardware interaction unless you want to rma your hardware sometime in the future.

Anonymous 02/14/26(Sat)04:33:11 No.108143686

>>108143659
Only thing is by default LM is loaded which I had to disable, you can't choose path for checkpoint and it auto downloads the model, and it gives some weird errors but those can be fixed with Gemini, then this UI is essentially the ai-toolkit of ACEStep imo.

Anonymous 02/14/26(Sat)04:46:20 No.108143753

https://voca.ro/1nAHSP4V569N

Anonymous 02/14/26(Sat)08:17:55 No.108144746

>>108143653
>Getting these LoRAs to work like normal in Comfy at this point is pivotal. They work at like a weight of 2
Just increase the rank and train longer (achieve a lower loss), dude

Anonymous 02/14/26(Sat)08:25:21 No.108144792

https://voca.ro/1edUWyxS6MXU

Michelin Star AI Chef 02/14/26(Sat)13:27:38 No.108146747

File: qwen2 a tiger skiing resized.png (1.58 MB, 1478x845)

personally, I went with this:

nano ~/ComfyUI/user/__manager/config.ini

idk, that's where it is in my fresh install. I changed:

network_mode = offline
db_mode = local

if you know of anything else to do, let me know.

I don't care about Qwen 2. Here's Qwen 2. It's using prompt enhancement, I think, but it doesn't look special.

Michelin Star AI Chef 02/14/26(Sat)13:43:02 No.108146829

>>108143378
This literally just happened. Great discovery, nobody else on the Internet knows this yet. No point in using catbox, except for workflow flacs.

>>108143613
The official gradio is a vibe coded fake front end. It's not what the chinese used to make ace step. Research papers are a skinsuit for them.
