/wsg/ - Worksafe GIF

Previous thread >>5515011
Dedicated Suno/Udio thread >>5553687
Post AI generated stuff. Song covers, animations, etc.
OC encouraged, but not required.
This thread focuses on audio and video with an audio component.
Let me know if you have more links to add. This thread is a work in progress.

> Voice-to-Voice
RVC walkthrough (somewhat outdated, Colab is dead): https://docs.google.com/document/d/13_l1bd1Osgz7qlAZn-zhklCbHpVRk6bYOuAuB78qmsE/edit
Models, mega links, and mirrors: https://docs.google.com/spreadsheets/d/1tAUaQrEHYgRsm1Lvrnj14HFHDwJWl0Bd9x0QePewNco/edit#gid=0
https://github.com/Mangio621/Mangio-RVC-Fork
https://github.com/Vali-98/XTTS-RVC-UI
https://github.com/voicepaw/so-vits-svc-fork

> Text-To-Speech
https://github.com/collabora/WhisperSpeech
https://github.com/myshell-ai/OpenVoice
https://github.com/yl4579/StyleTTS2
https://github.com/BoltzmannEntropy/xtts2-ui
https://github.com/daswer123/xtts-webui (Warning: Windows version uses prebuilt binaries that anons haven't verified. Use at your own discretion)

> Music
https://github.com/facebookresearch/audiocraft
https://rentry.org/AudioCraftRemix

> Audio Cleanup
https://github.com/Anjok07/ultimatevocalremovergui
https://github.com/resemble-ai/resemble-enhance

> Related boards
>>>/aco/asdg
>>>/aco/csdg
>>>/b/degen
>>>/d/ddg
>>>/e/edg
>>>/g/sdg
>>>/g/lmg
>>>/g/aicg
>>>/h/hdg
>>>/trash/sdg
>>>/u/sdg
>>>/vg/aids
>>>/vt/vtai
>>
ATTENTION
If you're looking for text to song, there is a separate Suno/Udio thread due to high interest relative to other posts
>>5553687
>>
File: 1716263603179845.webm (5.97 MB, 720x576)
taking webms from dud thread
>>
File: 1716263663146564.webm (5.82 MB, 1280x720)
>>
File: 1716263728699116.webm (2.71 MB, 320x240)
>>
post the one where commander shepard calls jacob a subhuman
>>
>>5557210
Nice
>>
File: Squidwock.webm (5.98 MB, 960x540)
>>
>>5557210
The good old times, when the US wasn't filled with nigs and mongrels.
Cursed be the synagogue of satan.
>>
File: ronnie-mcnutt-dance_1.webm (3.34 MB, 1280x720)
Been having some fun with Viggle AI, there doesn't seem to be filtering on who you can upload. I assume you can't upload nudity ofc, but I uploaded a shirtless guy and the Abu Ghraib prisoner without any trouble.
>>
File: lucas-bedroom-dance_2.webm (4.15 MB, 1280x720)
>>5557785
>>
File: abu-ghriddy_3.webm (4.06 MB, 1280x720)
>>5557791
>>
>>5557785
jesus christ that's cursed
>>
>>5557785
if i could make a request for the same dance but with
Budd Dwyer
Ricardo López
Shuaib Aslam
& Gleb Korablev?
>>
File: george-floyd-dance_4.webm (5.09 MB, 1280x720)
I tried to upload George Floyd but it got blocked, said some shit about community guidelines. Super easy to circumvent.
>>
>>5557785
jesus.. funniest thing ive seen
>>
>>5557210
"heroines", not heroes. so tired of people using the wrong word all the time. still saved though.

man, makes me wonder where the fixed version of little mermaid live action is?
>>
https://docs.google.com/document/d/17fjNvJzj8ZGSer7c7OFe_CNfUKbAxEh_OBv94ZdRG5c/edit#heading=h.n8ac32fhltgg

wanted to drop this guide that I found for using UVR, just in case any anon might find it helpful.
>>
>>5557785
fucking LMAOOOOOOOOOOOOOOO holy shit you got me good
>>
>>5557969
Does anybody outside of us memers even give a shit about this guy anymore? He seems memoryholed already
>>
Anybody got those of the zoomer concert entrance?
>>
Anyone have this?
https://desuarchive.org/wsg/thread/5518247/#5538629
>>
>>5559115
thanks for sharing anon, seems really comprehensive
https://voca.ro/11yLUnVg1lU8
>>
>>5557785
wtf it was a prank all along?
>>
Praise be the Omnissiah for Mechanicus II: Necron Boogaloo has been announced.
>>
>>5557785
i.... can't look away
>>
>>5559626
How does it change key from the original like that? Isn't it taking the notes straight from the vocal track?
I've definitely heard some that do not do this that cause the ai vocal to be outside the character's normal range, but hits the notes normally.
>>
>>
>>5560022
I might have left a filter on the pitch control that I shouldn't have. That might be why it sounds flatter than the original. Unless that's not your question, then if it's about the key/purchases the vocals are on, I changed it manually in RVC because without adjustment it was out of marcelines usual singing vocal range.
>>
>>5560367
Key/pitch. Damn phoneposting got me good there
>>
>>5560367
I am going to be honest, I think you should also match the backing with a straight pitch up to the same key as the vocals. It clashes pretty hard here without that.
It is interesting to know that you can adjust the vocal pitch though.
>>
>>5560706
I'll try it and see if it works out.
>>
>>5560706
so idk why i even pitch shifted the vocals to begin with. this sounds fine without it IMO. I also completely removed the pitch filter and it sounds way better.
https://vocaroo.com/16Rm1sCOcqaW

Here's what it sounds like with the instrumental pitch shifted the same as the vocal track (+1/12 octaves or +16%). Pitch shifting to the same pitch f/g 1 didn't work either and sorta made it sound like an attempt at a metal remix of the notes. but this is the funnier result because the instrumental sounds like something outa budokai 1.
https://vocaroo.com/18fxle7SzRmU
>>
File: sides crippled.gif (117 KB, 189x292)
>>5557785
>>
File: pitch.webm (4.17 MB, 437x389)
>>5560981
This is very confusing to hear.
The vocal is closer to a half-step up from the original; the 2nd link with the shifted instrumentation overshot it. By the way, according to audacity, a half-step is approximately a 5.95% change, pic related was what I applied to match the vocal.
As for the filter, I will be honest I can't significantly distinguish the vocals apart between the three in the first minute or so, though I did not listen to the entirety of each.
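For anyone who wants to check the 5.95% figure: a minimal sketch, assuming 12-tone equal temperament, where each half-step multiplies frequency by 2^(1/12). The function names are made up for illustration and are not part of Audacity or RVC.

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency ratio for a shift of `semitones` in 12-tone equal temperament."""
    return 2.0 ** (semitones / 12.0)


def percent_change(semitones: float) -> float:
    """Percent change in frequency for the same shift (what a pitch dialog would show)."""
    return (semitone_ratio(semitones) - 1.0) * 100.0


# One half-step up, one half-step down, a quarter-step, and a full octave.
for steps in (1, -1, 0.5, 12):
    print(f"{steps:+} semitones -> {percent_change(steps):+.2f}%")
```

Note that up and down are not symmetric in percent terms: a half-step down is a smaller percentage than a half-step up, so undoing a shift means using the opposite number of semitones rather than the opposite percentage.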
>>
File: pitch2.webm (2.23 MB, 300x300)
>>5560981
Just for fun, tried shifting your sample down as an example. It is really clear that the vocal track didn't match the backing track at 0:35 when my noob attempt at vocal isolation let the guitar through.
I recommend doing something similar with your isolated vocal track instead.
>>
>>5561300
Wait so was the issue the difference in key between the main vocals and the backing vocals or the main vocals and the instrumental?
>>
>>5561578
Paramore track: unchanged pitch, vocal removed
AI track: pitched half-step down, vocal isolated
The clash is because of the pitched down AI track having a crude isolation that lets some of the AI instrumental backing through, which is off-key now because the AI vocal was off-key and correcting that puts the instrumentation off-key instead.
>>
>>5561608
i can't make any real adjustments right now, but I think what you're getting at is that I need to re-isolate the vocals to ensure all of the instrumentals are pulled from them.

I've been trying changing the pitches of both the instrumental and the isolated vocals in Audacity separately (to test the effect of the pitch shift) by max a half step to .5 or even .25 half steps and every time the pitch adjustment just makes the tracks sound waaay off. I'll try restarting with the original track later and see if i can't get a better result after consulting the guide above.

Thanks for the tips though anon, I'm a giga novice when it comes to sound mixing so really i don't quite understand completely but this has been interesting to learn about. I just sorta always wrote off any mixing issues or mistakes as "imagine you're in karaoke or something and it just wont be perfect lol"
>>
>>5561730
Try listening to the original track alongside the AI one to compare the two. It is really obvious then. In audacity, you can import the two and play them alongside easily.
I assumed you were getting an isolated vocal output already; weird and annoying that it would pre-mix it for you like that when it's only generating a vocal.
Good luck separating the two, I am clueless there as well.
>>
>>5561765
i re-isolated the instrumentals from the original song and overlayed the vocals from my old one. I noticed the old one started a few beats earlier than the original song for some reason so I lined them up and also noticed the key was off a bit on the old one like you were saying. I think I'll double-check like this for my future covers. i had no idea something like this would even happen.

This new one's instrumental sounds exactly the same as the original when I line them up. Is it?
https://voca.ro/1lSvduVJ4K4V
>>
The joke is on me. How many times will I listen to this? Four apparently.
You are baiting or tone-deaf, and I do not care which. Fuck this.
>>
>>5562056
man i genuinely dont hear what youre hearing then lol, my bad ¯\_(ツ)_/¯
>>
>>5562066
>>5562056
the only thing different is the vocals and, of course they are.
>>
>>5557785
kek
but definitely would be way funnier with some samsung tunes
>>
>>5559228
reddit and twitter will bring him up whenever they want to kill cops
>>
File: txf-pepe.webm (4.81 MB, 720x480)
>>5557207
>>
Does anyone have the Frank horrigan one where he sings human? The one on YouTube was deleted...
>>
File: 1712843082351190.webm (3.71 MB, 852x720)
>>
>>5563016
Did you make this with Suno AI? What's the title?
>>
>>5562714

i didnt have it, so i made one. needed to compress the shit outa it to post a webm though lol
https://voca.ro/1lVzDcXAbGfH
https://files.catbox.moe/zf8bxw.mp4
>>
>>5564740
Thanks bro!
>>
>>5563527
i reposted this so i could listen to it on my phone through clover, i actually have no idea
>>
File: Credit to ADCon-wsg-.webm (5.83 MB, 1024x576)
>>
File: frankrap.webm (2.13 MB, 1280x108)
>>5564740
i don't think it sounds entirely correct, maybe it's impossible to make him sing correctly.
this one is more talking than singing.
>>
>>5566130
frank's got such a deep and crusty voice, there's no way it'll sound perfect when the AI's doing it. Best I could do is get close.

That being said, that's a banger
>>
tried several colabs to train a voice model but they disconnect randomly for maximum capacity
any lesser-known colab or local training?
>>
>>5566353
RVC has a built-in trainer. you just gotta do something else for a few hours
>>
whats the preferred text to speech in the OP? is one better than the other or are they all pretty similar in quality/ease of use?
>>
File: 1702981627554548.webm (4.89 MB, 720x720)
>>
File: 1704837844313055.webm (5.75 MB, 720x480)
>>5567812
>>
File: 1711942758384392.webm (1.13 MB, 480x852)
>>5567815
>>
File: 1714144986247885.webm (3.53 MB, 720x720)
>>5567818
>>
File: VinDrinksDiesel.webm (4.54 MB, 960x540)
Why do they call him Vin?
>>
File: GF AC.webm (1022 KB, 1280x626)
>>5559228
wdym? They just made a whole video game for him.
>>
>>5568225
the fuck is this shit
>>
>>5568234
It's the latest woke Assassin's Creed game.

Ubisoft makes a game about ancient Japan and zoom in on the one guy in the whole country who was black. History shows that he did exist but was actually just a servant that they dressed up like a samurai for fun, but of course the devs pretend like that wasn't the case. There's an edit war on Wikipedia to "correct" the record and remove old information.
In addition to lying about history (and the fact that it would be very retarded for the one guy that sticks out the most to be an assassin) you can also play as a stronk womyn because women are totally badass and can do everything a man can do. She's at least Japanese.
Needless to say Japanese gamers are upset with the game and the left don't know how much they can bite back since they aren't as used to shit on asian people as they are shitting on white people.
>>
>>5568244
>the left don't know how much they can bite back since they aren't as used to shit on asian people
guess we'll have to wait for the kikes to come up with the latest instructions for the golems
>>
>>5568234
>>5568244
Don't forget, he is not only black, but also gay.
>>
>>5568326
lulz was that actually confirmed? i think that is just a meme
>>
Trump just got convicted on all counts.

Welcome to the greatest timeline.
>>
>>5560068
It's insane how an AI made this scene 10x more kino!
>>
is it better with the backing vocals
https://vocaroo.com/14LObMgeLBfp

or without?
https://vocaroo.com/1281P8KEL8Qh
>>
>I forgot this thread existed.
Damn, System Shock is a helluva drug.
>>
>>5569042
you should use the Halsey cover of that song instead, it'll sound more natural.
>>
>>5569060
Oh dang yeah thisll be way better. Thanks mate
>>
>>5568612
i heard, America is dead
>>
>>5569042
>>5569060
once again, thanks anon, the whole ensemble fits better for it too. Dont know why I didn't search for a cover last night when I usually do. musta been suffering from poo brain
https://vocaroo.com/1nSvXGUBMdr3
>>
>>5569468
no problem, glad it worked out.
I like matching voices to songs, it's fun.
>>
>>5569468
did you adjust the pitch when you ran it through RVC? it sounds a little off to me. if so, I'd advise against doing that unless it's by +12 or -12, or a rap song.
>>
>>5569450
Why would America be dead just because a slave of the jews got declared guilty?
>>
>>5569668
Busted election system, busted legal system, busted media, people not believing in free speech (swallowing concepts like hate speech, disinformation, misinformation), values not passed on to the youth. Then there's people such as yourself who simply don't get it on their own and will probably not have the mental capacity to understand what I'm saying even when I point it out in plain text (not saying it just because I'm trying to pick a fight, it's simply true). There's still a thing called "America" or "USA" but it's in name only.
>>
>>5569690
everything is fake and gay. always has been.
>>
>>5569573
Yeah I tried it at +0 and it didnt sound right and theres no way +12 wouldnt sound like a squeeker in a COD lobby. So i adjusted it to +2 for most of the song and +3 for the chorus because I didn't feel like the +2 sounded like she was "trying enough" if you get what I mean.
>>
>>5570096
it's... off key, though. if it doesn't sound right at +12 or -12, or in rare cases -24 or +24, then the song isn't a good candidate for the voice you're using.
>>
>>5570096
You're not dealing with just random pitch shifting, the 12 tone scale is music theory. The pitch might allow the voice to sound more like the character, but shifting it causes it to be off key. So it'll be discordant and sound like shit when sung against the instrumental. Basically you have to shift it in intervals of 12, but you might have a slight chance of it not sounding like absolute ass at +7 which is a perfect fifth.
https://en.wikipedia.org/wiki/Interval_(music)
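A toy illustration of the point above, assuming the shift values are semitones: moving a melody by 12 lands on the same note names, while other shifts can push notes outside the key of the backing track. The melody, the key, and the helper names are all made up for the example; this is not how RVC works internally.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
C_MAJOR = ["C", "D", "E", "F", "G", "A", "B"]  # key of the hypothetical instrumental


def transpose(notes, semitones):
    """Shift each note name by `semitones`, wrapping around the 12-note octave."""
    return [NOTE_NAMES[(NOTE_NAMES.index(n) + semitones) % 12] for n in notes]


def notes_outside_key(notes, key_notes):
    """Return the notes that do not belong to the given key."""
    return [n for n in notes if n not in key_notes]


melody = ["C", "E", "G", "A"]  # hypothetical vocal line, in key at +0
for shift in (0, 2, 7, 12, -12):
    shifted = transpose(melody, shift)
    print(shift, shifted, "clashes:", notes_outside_key(shifted, C_MAJOR))
```

In this example +7 happens to stay in key, but that depends on the melody: a B shifted by +7 becomes F#, which is outside C major. That fits the "slight chance" wording above, and in-key notes can still clash with the chords underneath.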
>>
>>5570146
thank you for explaining that. I knew that it didn't work, but not why. I'll have to try +7 sometime.
>>
>>5570146
I see, guess I'll try +7 later because 0 was bad and 12 will be even worse
>>
>>5570146
Ah fuck I had a brain blast and I finally understand what you're saying about key and frequency after someone else explained it by comparing the keys with their hz values. Damn lol
>>
>>
>>5557210
this restores my faith in ai
>>
>>5572024
a masterpiece
>>
>>
>>5570146
idk what i was smoking lol. +0 is good. i guess I was so focused on making it sound exactly like the character I didn't really think of the big picture when it all came together. I'll keep it in mind from now on, thanks anon
>>
>>5567515
TTS doesn't seem to be used a lot by anons here, none of it is on the same level as elevenlabs, but if I recall, a few threads back anons liked xtts due to ease of use (there are 2 front-ends linked in the OP, I don't know which is better)
I don't think there have been a lot of ground-breaking developments in self-hosted TTS lately, so that's probably why there's not a lot of buzz around it.
>>
>>5573990
Same goes with any form of OSS audio lately. Stable Audio 1.0 got leaked but it's dogshit, and that's probably the best we'll get for a while. The struggle is real
>>
>>5573990
>>5574303
tragic, thanks for the info tho
>>
>>5574583
The best fully open source option for TTS at the moment seems to be XTTSv2/StyleTTS2 + RVC, but if there isn't already an RVC model of the voice you're trying to replicate you have to train it yourself.
>>
>>5574589
i'll check it out, thanks anon. i was planning on using it mostly to check the quality of RVC voices I train anyway
>>
File: tinytim-hl2zombie.webm (1.8 MB, 288x360)
>>
>>5574589
welp, looks like ive temporarily bricked my local RVC by fucking with the text to speech programs in the OP.
>>
>>5574973
Did you reuse the same python virtual environment or something? That's not a good idea, venvs are there specifically to avoid dependency conflicts.
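If it helps, here is a minimal sketch of keeping one environment per tool using Python's standard `venv` module. The folder and environment names are placeholders, and whether each tool needs a specific Python version is something to check in its own README.

```python
import venv
from pathlib import Path


def make_env(base_dir, name, with_pip=True):
    """Create an isolated virtual environment at <base_dir>/<name> and return its path."""
    env_dir = Path(base_dir) / name
    # with_pip=True bootstraps pip inside the new environment.
    venv.create(env_dir, with_pip=with_pip)
    return env_dir
```

Usage would be something like `make_env("envs", "rvc")` and `make_env("envs", "xtts")`, then activating each one separately and running `pip install -r requirements.txt` only inside the environment that belongs to that tool, so one install can't overwrite the other's dependencies.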
>>
>>5567515
>>5573990
Don't forget that you can apply your own voice models to elevenlabs audio. Just a matter of finding a model on that site that has a similar enough accent.
>>
>>5575180
i mean, i've learned that now lul

>>5575184
probably the better idea but I was curious
>>
>tfw people keep memeing about America while forgetting about the true imperial collapse
>>
>>5570146
The guy's clearly a troll. Wouldn't bother.
>>
>>5575184
sure wish i did this first lol, woulda saved the headache.
https://voca.ro/1e3MnmkdXHsp
>>
out of all the messing around i've done with this I think this one turned out the best so far.
>>
>>5575587
link for the vader model if anyone else wants to fuck around with it. have fun niggers.
https://huggingface.co/OwlCity/OwlCityRVC/resolve/main/Darth%20Vader%20Ultimate.zip?download=true
>>
I wonder if Marceline-Mate is still here.
>>
>>5576033
well, one of us is here anyway. i haven't had any inspiration for songs that sound good with her voice

have any suggestions?
>>
>>5576033
I'm here. I was the one getting scolded/taught about using off key octaves to get the right voice. I've mostly just been way more busy with other things to spam the thread like before, but my mixtape grows (slowly)
>>
File: marcy_friendly_teeth.webm (3.09 MB, 500x709)
>>5576072
>>5576033
here's a quick little something
https://voca.ro/12XuNzb5XTBM
>>
>>5576163
i was getting decoding errors with this webm on firefox so i re-encoded
>>
>>5563016
damn this is good.
i legit thought it was some AI voice cover of an existing song and been googling the lyrics like a retard.
can't wait for the voices to become lifelike so i can finally create all the random songs and melodies that've been rolling around in my head throughout the years.
future's bright
>>
>>5576163
nice one anon, glad you're still around

my RVC is being fucky right now and probably looping or frozen on something so I cant make anything new. I found this song a while ago and it always made me think of early Marceline.
https://voca.ro/18fgeMfqyt33

Also, I don't know how to upload my trained models into the google sheet so the Vader anon inspired me to share this way. all three models are 300e and pretty good imo
Eclipsa: https://mega.nz/file/qY4XUDjB#SAbKNf1GhyHGL2Nw4b1V5TCetjed70kYf9gpucpAeVs
Huntress Wizard: https://mega.nz/file/KQRDlTLB#WHF_SiEeNX6oor86gQBsMBRlMEPSQxM7Fv7sEq8MDW4
Nicole Watterson: https://mega.nz/file/PRRSnC7R#VjcxY8eykAXZYQt_xy7XdqsH5ZgMMOfeYfHMNPW8vxg
>>
>>5576197
https://voca.ro/1kxL7lqAkBjj
>>
>>5576597
i think RVC isnt seeing my GPU anymore for some reason. Any ideas on how to fix that?

>>5577075
rockin anon, good vibes on that one
>>
>>5577668
okay so that wasnt the issue, stuff may be much more fucked than anticipated
>>
>>5577075
Would you mind doing one for https://www.youtube.com/watch?v=I-ed7GhM3F0 ?
>>
File: marcy_sk8er_boi.webm (2.21 MB, 500x500)
>>5577894
https://voca.ro/1dijnFjloZuY
>>
>>5578095
King shit
>>
https://vocaroo.com/18a5WYI0wBUN

I need to make more high-energy song covers. maybe I'll do a bunch of Avril or Pink for my next lot
>>
>>5568225
A Nigger's Creed
>>
>>5578840
Is the game gonna be that bad?
>>
>>5579960
It's a modern Ubisoft game. Yes, it's going to be that bad.
>>
What's the best TTS tool that does voice cloning rn? I just wanna have my waifu read me audiobooks.
>>
>>5580380
find a free model on Elevenlabs that sounds close and use RVC to change the voice. works great for me
>>
>>5580713
I used to use elevenlabs and it was so convenient cause I am too much of a brainlet to mess with models and all the stuff on the OP but it no longer does cloning for free.
>>
>>5580729
yeah same, but they have some free models you can find in the voicelab>Add generated or cloned voice>voice library that you can use. After you get a good-sounding reading, you can use RVC to change the voice from the Elevenlabs voice to whoever you want
>>
>>5580729
As long as you're not training your own models its fairly easy to use RVC. Problem is that someone has to have cloned your waifu already
>>
File: techpriherm.webm (2.61 MB, 512x512)
>>5580756
>>5580380
Also if you already have the audiobook you don't even need to do any TTS. Male-to-female voice alteration is fairly trivial.
>>
>>5580845
Oh true. Since it's an audio book, half of it is done already. Unless they have multiple readers or they try to make different voices for characters. I've had a few do that and idk if it'd sound good if a female voice is dubbed over a male reader making a female pitch voice lol.
>>
>>5581011
>they try to make different voices for characters.
That still works fine a lot of the time unless the original reader has larger vocal range and the model simply spazzes out when they go up a bunch of octaves and suddenly Emma Watson is making dog whistling noises.
>>
>>5574818
I hate this a lot but also love it
>>
>>5568389
They confirmed that the game's protagonists are lgbtqiaetcetc
didn't specify gay, but highly likely
>>
File: AverageBlackPreacher.gif (2.78 MB, 498x270)
Frens... The power of Christ compelled Suno to make a certified banger. And I can't grooving to it.
https://suno.com/song/3e68a014-37b7-4c85-80e4-dceaa9075dca
>>
>>5581344
*can't stop grooving
>>
>>5581346
*can't stop grooming
>>
>>5568931
I wanna redo the whole fight with the jack of blades vocals, it's honestly pretty fucking cool I just gotta not be a lazy ass and put it together.
>>
>>5559228
there are city statues of him now
>>
>>5559228
of course. he gets mentioned almost every time american internal politics get brought up.
like seriously it's almost like godwin's or mutt's law.
>As a discussion concerning the state of American police or civil rights or any remotely political topic for that matter grows longer, the probability of George Floyd being brought up approaches 1
>>
>>5581793
>>5581802
that's mad
cant wait for this era to end
>>
>>5580729
How's the project going anon?
>>
yeah to anyone wondering rvc is really easy to use, my pc is too shit to run it locally but i use this https://colab.research.google.com/github/hinabl/AICoverGen-Colab/blob/main/Hina_Mod_AICoverGen_colab.ipynb

it's really easy, just look at it and fuck around if you're not a total nigger brain you'll figure it out pretty quick.
>>
>>5582132
if you are running it locally though, just be sure you have the C++ from the windows dev tools and generic python libraries installed. i got a headache trying to install it on a laptop I've got before I figured that out
>>
>>5582132
>you're not a total nigger brain you'll figure it out pretty quick.
I must be a niggerbrained cause I don't know how to get models to work. The list doesn't show up and I don't know where to look for them.
>>
>>5582757
to get a voice in RVC, you want to put the [name].pth in the RVC>assets>weights folder and the folder with the same [name] that has the .index file in the RVC>logs folder.

If you don't have the index file in the logs folder, you can also put the direct path to it in RVC, but there's no point to doing that.

Then just click the inferencing voice drop-down and select the one you want. RVC will select the index file if the folder is named the exact same. slide the re-sampler to 48000 because why not and mess with stuff until it sounds right.
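A rough sketch of the file placement described above, in case copying by hand keeps going wrong. The folder layout (assets/weights and logs/[name]) is taken from this post and not checked against any particular RVC release, and `install_voice` is a made-up helper name.

```python
import shutil
from pathlib import Path


def install_voice(rvc_root, pth_file, index_file):
    """Copy a voice model into the layout described in the post.

    Assumed layout (from the post, not verified against an RVC release):
      <rvc_root>/assets/weights/<name>.pth
      <rvc_root>/logs/<name>/<index file>
    """
    rvc_root, pth_file, index_file = Path(rvc_root), Path(pth_file), Path(index_file)
    name = pth_file.stem  # the logs folder must match the .pth name exactly
    weights_dir = rvc_root / "assets" / "weights"
    index_dir = rvc_root / "logs" / name
    weights_dir.mkdir(parents=True, exist_ok=True)
    index_dir.mkdir(parents=True, exist_ok=True)
    pth_dest = Path(shutil.copy2(pth_file, weights_dir / pth_file.name))
    index_dest = Path(shutil.copy2(index_file, index_dir / index_file.name))
    return pth_dest, index_dest
```

After copying, the voice should show up in the inferencing drop-down once the list is refreshed or the web UI is restarted.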
>>
>>5557215
Is this sora? Where can I download?
>>
>>5582896
I don't know what literally any of this means. Assume I just have a whole load of sample audio. How do I make a pth file. How do I make an index file.
>>
>>5582995
ooh, you're trying to train the voice. unlucky that you need to. i assume it's not a character in the google sheet above.

Go to "Train" in RVC

On the first row, name your model, then select 48k, true, and version 2.

On the second row, enter the path for the folder, leave speaker ID as 0, and click process data.

Third row, leave as is, but select rmvpe_gpu if available. click feature extraction.

fourth row, save every x epochs (I usually do 20 just to be safe), set the training to around 200 epochs, leave batch size, save only the latest .ckpt, don't cache the training sets, and select no for saving at each save point.

Finally, for model G path, put assets/pretrained_v2/f0G40k.pth and for the model D path, put assets/pretrained_v2/f0D40k.pth. GPU indexes are 0, and click one-click training

After a while, RVC will save the voice and you can select it from the dropdown
>>
>>5557210
B-B-BUT AI WILL MEAN THE CONTENT WILL BE LOW QUALITY
NONONONO CHUDS YOU CANT HAVE FUN PLEASE STOP NOOOOOOOOOOOOO
>>
>>5557215
>>5557207
more songs pls
>>
>>5583021
But there's no train function in
>>5582132
>>
>>5583149
ooh, i have no idea how that colab works. i was giving instructions for local/virtual machine RVC
>>
>>5583149
>>5583158
okay, i opened the webui for the colab the other anon posted. There's no training so id recommend downloading RVC yourself if you cant find the voice in the OP.

if you cant run RVC (idk how taxing it is on GPUs) send a mega or huggingface of the choice clips and I can make it for you overnight
>>
>>5583168
I'd so appreciate it if you did, anon. Here's the clips I've got
https://files.catbox.moe/asjdgu.zip
Ripped from this vid
https://youtu.be/zwkWWXVt-vY
I'm sorry if I may've come off as rude earlier. It's just all so confusing cause there's a million tutorials all using different methods and stuff and they all assume you know what the fuck you're doing and I very much don't.
>>
>>5583197
oh hell yeah, curie, good choice. it's wild no one made a model for her. also it's all good. kinda hard to talk and troubleshoot when we cant see the same thing lol. that, and there are a shit ton of outdated tutorials.

I'll get started after work
>>
>>5583217
Tysm anon!
>>
>>5582132
I've been using this all day and now it suddenly hits me with
>Found no NVIDIA driver on your system.
>>
>>5583234
aight, i got a bunch of clips from curie dialogue so its ready to train, but it's gonna take like 3 hours so I'm gonna let it run overnight. I'll post the results tomorrow evening or in the morning if I have the time/energy to do it before work lol
>>
got inspired by the audiobook anon, but I think my "I want my waifu to direct guided meditation" has an actual air of sad around it lol.
https://voca.ro/1fSWPihDAzUd
https://voca.ro/19L7l16IynQ0
https://voca.ro/14dWHta5REFU
>>
>>5583565
Which character's that?
>>
File: 1718015480718870.webm (4.77 MB, 1280x720)
>>
>>5583234
So it's done. I think it's okay, but using it for English loses out on her inflections / a lot of her accent or throaty sounds. (unless I did it wrong, but idk) It's not like elevenlabs where it made the voice from the sound files, but it sounds pretty good when I used a french accent in elevenlabs to cover. I wonder if it'd do it if I try again with maybe double the audio files, but it may be more than the program can handle.

American Accent Elevenlabs Vocals dubbed over: https://voca.ro/12XThtMgaJKC

French Accent Elevenlabs Vocals dubbed over: https://voca.ro/1oboDLM2JOXs

Voice files: https://mega.nz/file/vc4CRKaT#CrNLPnZshH5ekgmbWlARL7vxS2lIEEAZtHREN-iJLyU
>>
>>5584057
Oh my god, anon, thank you so much! I see what you mean, it's not as good as elevenlabs when I tried it out, but it's still pretty fantastic all things considered. Still, thank you, anon. I was pretty bummed out I couldn't do it myself, but you made my day.
>>
>>5584138
>>5584057
i tried cloning her with xtts2 and it came out... ok. this is a sample with no postprocessing and almost no input processing. just posting to have some other examples of other processes
https://voca.ro/1kkgCxtrAPqR
>>
>>5584138
No prob anon, hope it works out for you

>>5584330
This sounds pretty good imo. When I tried xtts2 I wasn't very impressed but this is good. I may have been moreso mad that it ruined my rvc python setup but thats 100% my fault.
>>
>>5584339
are you using separate python versions between RVC and xtts2? how did you install each, or what was making them overlap? i cant remember if xtts2 needed a different version of python than RVC but with venv you shouldnt have issues with them overlapping/conflicting (i dont, anyway)
>>
>>5584353
Yeah I'm not using a venv so when I tried to use both it imploded. I'm kinda a brainlette and was happy enough using just rvc so I never saw the point.
>>
>>5584330
That sounds pretty good too, though the reverb-ish sound ai voices have is more pronounced here. Really cute choice of text, too.
>>
>>5584330
as a second test i trained an rvc model from ~2hrs of her human dialog and applied that over the xtts2 results and that definitely helps with the raspy/reverby ai xtts2 render. you could totally do a whole audio book this way but for something production quality you'd have to do multiple gens and lots of hand picking from best results

https://voca.ro/12UJWWvc6Dhb
>>
>>5584562
When you do rvc models, do you split it into a bunch of small clips or do you leave it as a long mp3? Because mine was about 20 min worth of 80 small clips
>>
>>5584572
i yt-dlp'd the dialog video posted earlier, cut out just the human dialog in ~5 minute long sections, and used the directory with those clips as the training path. there were 22 five minute long clips and 1 two and a half minute long clip. if i can find a good host for it i'll post my training data and checkpoints for e150, e200, and e250 if you want it. it's ~5 gigs tho
>>
>>5584575
>>5584572
but ultimately because of the rvc step 2a (the first "process data" button), it shouldnt matter how the original data is formatted. it should be just as good to leave them separated or concatenated. originally i tried to feed it the entire 2hr long dialog as one audio file and it really really didnt like that so i split it up
>>
>>5584575
>>5584578
that would have been way easier, In the only tutorial video I've watched the guy said to use ~10-second clips so I've been doing that for all of my models lol. been a real time sink

Now I know though, and thank god, that'll be way easier than making 80-100 tiny clips.
>>
>>5584598
The 10ish seconds is because processing takes more vram the longer the clip, so it's generally advisable to keep clips to 30 seconds or less, depending on your GPU.
I made myself a script that calls ffmpeg and auto splits all audio files in a directory into 30 second segments. Even though preprocessing should split it, it's not perfect.
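A rough sketch of the kind of splitter script described above, assuming ffmpeg is on PATH. This is not the anon's actual script: the function names, the extension list, and the output naming are made up for illustration. It uses ffmpeg's segment muxer and re-encodes to 16-bit WAV so cuts land at the requested times.

```python
import subprocess
from pathlib import Path

# Assumed list of extensions to pick up; adjust to whatever your dataset uses.
AUDIO_EXTS = {".wav", ".mp3", ".flac", ".ogg", ".m4a"}


def build_split_cmd(src: Path, out_dir: Path, seconds: int = 30) -> list[str]:
    """Build the ffmpeg command that cuts one file into fixed-length segments."""
    pattern = out_dir / f"{src.stem}_%03d.wav"  # e.g. voice_000.wav, voice_001.wav
    return [
        "ffmpeg", "-hide_banner", "-y",
        "-i", str(src),
        "-vn",                          # ignore cover art / video streams
        "-f", "segment",                # segment muxer
        "-segment_time", str(seconds),  # target length of each piece
        "-c:a", "pcm_s16le",            # re-encode to 16-bit PCM WAV
        str(pattern),
    ]


def split_directory(in_dir: Path, out_dir: Path, seconds: int = 30) -> None:
    """Split every audio file found directly inside in_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(in_dir.iterdir()):
        if src.suffix.lower() in AUDIO_EXTS:
            subprocess.run(build_split_cmd(src, out_dir, seconds), check=True)
```

Usage would be something like split_directory(Path("raw"), Path("dataset")), then pointing the training path at the output directory.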
>>
Dream Machine just dropped
https://lumalabs.ai/dream-machine

how long will it stay up
>>
>>5584672
does it matter if it splits in the middle of a word or sentence? because I've been manually clipping between words when possible and if it doesn't then that would be nice
>>
>>5584708
In the middle of a sentence? No. It might have some detrimental effect if it cuts it off in the middle of a word, but with like an hour of stuff I can't imagine it would be a huge deal.
When processing, it identifies phonemes (every sound that you can make to form a word) so it's already splitting things into fragments of words in a way. Worst case it would identify the hacked up audio as a different phoneme, but there should be plenty of other examples to grab if there's enough audio. I'm sure not gonna spend time manually splitting 2 hours of audio when I can do it in a script.
>>
File: oragnge3d_1.webm (2.1 MB, 1024x1024)
>>5584695
Luma image input
>>
>>5568212
insane. hardcore. beautiful.
>>
quick test of an injured hayden christensen model speaking tweaked jack of blades lines. excuse the second obi wan, I forgot to drag it out.
>>5583149
If you look around on huggingface there are a lot like that one with the training function intact. If you find one on colab that lets you train, don't do it on an account you care about, google is very butthurt over competing ai shit.
>>
>>5584057
>>5584138
I'm gonna try and make a new model with way more data than this one to see if it can get her accent down. I don't know why it wouldn't if rvc can make a good neco arc
>>
>>5585292
your rvc model isn't the problem. your input needs a french accent for it to come through in the conversion. i'll just give you my model since i've already trained more than enough for whatever you need.

>example spoken in french then converted
https://voca.ro/1jPnq7WuQJim

>weights+index (e945, 48k)
https://files.catbox.moe/ssa773.zip
>>
>>5585323
converting a speaker with no french accent, then with a french accent, highlights the difference. this was converted all at once with that rvc model
https://voca.ro/1kd0QQPS3fPi
>>
>>5572024
This isn't AI.
>>
>>5585337
If you say so. I don't understand why the French accent isn't pulled out like the sounds neco arc makes. Though I guess in Curie's case it's moreso what happens between words and letters that has the most impact on the accent, rather than a nya sound added to some words
>>
>>5576197
>i was getting decoding errors with this webm on firefox so i re-encoded
So this shit is why I can't open videos sometimes?
>>
>>5585702
the technical details: the original encode used a jpeg without specifying pix_fmt in ffmpeg. to re-encode i used pix_fmt yuv420p explicitly. it might have had to do with the height not being divisible by 2, but that's usually only a libx264 requirement
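For reference, a sketch of what a re-encode like that could look like when driven from Python. The anon's exact command isn't given, so this is an assumption: it forces yuv420p, rounds width and height down to even numbers to cover the height guess too, encodes with libvpx-vp9, and copies the audio untouched. File names and function names are hypothetical.

```python
import subprocess


def build_reencode_cmd(src: str, dst: str) -> list[str]:
    """Build an ffmpeg command that re-encodes video with an explicit pixel format."""
    return [
        "ffmpeg", "-hide_banner", "-y",
        "-i", src,
        # round both dimensions down to the nearest even number
        "-vf", "scale=trunc(iw/2)*2:trunc(ih/2)*2",
        "-pix_fmt", "yuv420p",   # the explicit pixel format mentioned in the post
        "-c:v", "libvpx-vp9",    # assumed codec; the post doesn't say which was used
        "-c:a", "copy",          # leave the audio stream as-is
        dst,
    ]


def reencode(src: str, dst: str) -> None:
    """Run the re-encode; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_reencode_cmd(src, dst), check=True)
```

Since the command is passed as a list, the parentheses in the scale expression don't need shell quoting.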
>>
>>5585429
if accent A uses phoneme X in a word where accent B uses phoneme Y, then no matter how much you train on accent A, any input spoken in accent B will come out as the model speaker's phoneme Y. afaik there's nothing in rvc to translate phonemes between accents
>>
>>5585749
Interesting. Thanks for spelling it out for me and saving my graphics card another 3-4 hours of suffering lol
>>
>>5585749
To add to this, different languages have different phonemes. Like Japanese doesn't have "th" for instance. So like if you train on a voice that is only speaking Japanese, it will be lacking certain phonemes used in English. In this sense, sometimes an accent can bleed through, but even then it's pretty slight.
I feel like current voice-to-voice tech is very misunderstood. Basically nobody knows that it's phoneme based, and probably not a lot of people even know what a phoneme is.
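A toy illustration of the coverage point, not how RVC works internally: treat the training data and an input line as sets of phoneme labels, then list what the input needs that the training data never contained. The labels and inventories below are tiny made-up samples, purely for illustration.

```python
def missing_phonemes(trained: set[str], needed: set[str]) -> set[str]:
    """Return the phonemes an input line uses that the training data never covered."""
    return needed - trained


# Made-up, heavily truncated inventory for a voice trained only on Japanese speech.
japanese_only_training = {"a", "i", "u", "e", "o", "k", "s", "t", "n", "m", "r"}

# Rough phoneme labels for the English word "this" (illustrative labels only).
english_line = {"th", "i", "s"}

gaps = missing_phonemes(japanese_only_training, english_line)
print(sorted(gaps))
```

Anything in the gap set is a sound the model has no example of in the target voice, which is where the slight bleed-through described above would come from.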
>>
>>5576197
There was a song by Leonard Cohen called 'The Captain', which was probably the second coolest song he did. If Hallelujah hadn't taken off the way it did, The Captain would probably have turned out to be his sleeper hit.
Tom Waits did a slow burner called 'Martha' on the Closing Time album in his younger years, before his larynx turned into tire rubber. As time has gone by, I appreciate it more than Swordfishtrombones, but anyway - food for thought.
>>
File: 1718251636008379.webm (383 KB, 1024x1024)
>>
>>5581344
Holy shit based and blessed
>>
>>
>>5581872
The statues are made to demoralise you, to make you feel powerless against them. Which itself is an admission that you aren't, as otherwise they wouldn't need to instill that belief.
>>
>>5586089
>all these months and suno is still metallic as fuck
shite ai
>>
>>5586084
fucking gold
>>
File: improved AI.webm (5.79 MB, 428x480)
>>
>>5585861
That's fair to say. I never really looked into how it works, so RVC is essentially a scroll that I cast to get my waifu to sing or read. I imagine that's how it is for most people.
>>
File: never-forgetti.webm (5.35 MB, 1280x720)
>>5586363
Soul vs soulless
>>5586370
It's cool that it's easy enough for anyone to use. It just helps to know more about it so that you know what to expect and have the knowledge to train better models.
>>
File: vegeta-youre-the-best.webm (5.93 MB, 1280x720)
>>
File: It's so over.webm (4.47 MB, 720x720)
I'm not ready bros...
>>
File: 1718388831148.webm (4.4 MB, 1280x720)
>>5584695
>>
>>5587865
I was not ready for the AT-AT turning into a fucking mech, kek.
>>
>>5587593
amazing. For sure his theme song lol. what Vegeta model did you use?
>>
>>5588036
> what Vegeta model did you use?
Made it myself just yesterday. It's so-vits-svc, can't speak for compatibility with rvc. I don't mind sharing it if you want but very few people are interested in sovits models.
>>
>>
>>5588098
anon, it's beautiful. I'm in tears.
>>
>>5588086
might as well, you never know.
also do you have an mp3 of the song? i can pull it from the webm but it doesn't hurt to ask
>>
>>5588355
Webm's audio is 192kbps opus, which should be good enough, but here:
https://vocaroo.com/1b4vzGmxpp0s
https://huggingface.co/chameleon-ai/so-vits-svc-models/tree/main/vegeta
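For anyone who would rather pull the audio out of a webm themselves, a minimal sketch assuming ffmpeg is on PATH. Stream-copying into an .opus file keeps the original Opus audio with no extra generation loss; transcoding to mp3 is a second lossy encode and assumes your ffmpeg build includes libmp3lame. File names and function names here are hypothetical.

```python
import subprocess


def build_extract_cmd(src: str, dst: str, to_mp3: bool = False) -> list[str]:
    """Build an ffmpeg command that extracts just the audio track from a video file."""
    cmd = ["ffmpeg", "-hide_banner", "-y", "-i", src, "-vn"]  # -vn drops the video
    if to_mp3:
        # lossy transcode; needs an ffmpeg build with libmp3lame
        cmd += ["-c:a", "libmp3lame", "-b:a", "192k"]
    else:
        # copy the existing audio stream bit-for-bit
        cmd += ["-c:a", "copy"]
    return cmd + [dst]


def extract_audio(src: str, dst: str, to_mp3: bool = False) -> None:
    """Run the extraction; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_extract_cmd(src, dst, to_mp3), check=True)
```

For example, extract_audio("song.webm", "song.opus") for a straight copy, or extract_audio("song.webm", "song.mp3", to_mp3=True) if an mp3 is really needed.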
>>
>>



