Previous thread >>5515011
Dedicated Suno/Udio thread >>5553687

Post AI generated stuff. Song covers, animations, etc.
OC encouraged, but not required.
This thread focuses on audio and video with an audio component.
Let me know if you have more links to add. This thread is a work in progress.

> Voice-to-Voice
RVC walkthrough (somewhat outdated, collab is dead): https://docs.google.com/document/d/13_l1bd1Osgz7qlAZn-zhklCbHpVRk6bYOuAuB78qmsE/edit
Models, mega links, and mirrors: https://docs.google.com/spreadsheets/d/1tAUaQrEHYgRsm1Lvrnj14HFHDwJWl0Bd9x0QePewNco/edit#gid=0
https://github.com/Mangio621/Mangio-RVC-Fork
https://github.com/Vali-98/XTTS-RVC-UI
https://github.com/voicepaw/so-vits-svc-fork

> Text-To-Speech
https://github.com/collabora/WhisperSpeech
https://github.com/myshell-ai/OpenVoice
https://github.com/yl4579/StyleTTS2
https://github.com/BoltzmannEntropy/xtts2-ui
https://github.com/daswer123/xtts-webui (Warning: Windows version uses prebuilt binaries that anons haven't verified. Use at your own discretion)

> Music
https://github.com/facebookresearch/audiocraft
https://rentry.org/AudioCraftRemix

> Audio Cleanup
https://github.com/Anjok07/ultimatevocalremovergui
https://github.com/resemble-ai/resemble-enhance

> Related boards
>>>/aco/asdg
>>>/aco/csdg
>>>/b/degen
>>>/d/ddg
>>>/e/edg
>>>/g/sdg
>>>/g/lmg
>>>/g/aicg
>>>/h/hdg
>>>/trash/sdg
>>>/u/sdg
>>>/vg/aids
>>>/vt/vtai
ATTENTION
If you're looking for text to song, there is a separate suno\udio thread due to high interest relative to other posts
>>5553687
taking webms from dud thread
post the one where commander shepard calls jacob a subhuman
>>5557210Nice
>>5557210The good old times, when the US wasn't filled with nigs and mongrels.Cursed be the synagogue of satan.
Been having some fun with Viggle AI, there doesn't seem to be filtering on who you can upload. I assume you can't upload nudity ofc, but I uploaded a shirtless guy and the Abu Ghraib prisoner without any trouble.
>>5557785
>>5557791
>>5557785jesus christ that's cursed
>>5557785if i could make a request for the same dance but withBudd DwyerRicardo LópezShuaib Aslam& Gleb Korablev?
I tried to upload George Floyd but it got blocked, said some shit about community guidelines. Super easy to circumvent.
>>5557785jesus.. funniest thing ive seen
>>5557210"heroines", not heroes. so tired of people using the wrong world all the time. still saved though.man, makes me wonder where the fixed version of little mermaid live action is?
https://docs.google.com/document/d/17fjNvJzj8ZGSer7c7OFe_CNfUKbAxEh_OBv94ZdRG5c/edit#heading=h.n8ac32fhltgg
wanted to drop this guide that I found for using UVR, just in case any anon might find it helpful.
>>5557785fucking LMAOOOOOOOOOOOOOOO holy shit you got me good
>>5557969Does anybody outside of us memers even give a shit about this guy anymore? He seems memoryholed already
Anybody got those of the zoomer concert entrance?
Anyone have this?https://desuarchive.org/wsg/thread/5518247/#5538629
>>5559115
thanks for sharing anon, seems really comprehensive
https://voca.ro/11yLUnVg1lU8
>>5557785wtf it was a prank all along?
Praise be the Omnissiah for Mechanicus II: Necron Boogaloo has been announced.
>>5557785i.... can't look away
>>5559626
How does it change key from the original like that? Isn't it taking the notes straight from the vocal track?
I've definitely heard some that do not do this that cause the ai vocal to be outside the character's normal range, but hits the notes normally.
>>5560022I might have left a filter on the pitch control that I shouldn't have. That might be why it sounds flatter than the original. Unless that's not your question, then if it's about the key/purchases the vocals are on, I changed it manually in RVC because without adjustment it was out of marcelines usual singing vocal range.
>>5560367Key/pitch. Damn phoneposting got me good there
>>5560367I am going to be honest, I think you should also match the backing with a straight pitch up to the same key as the vocals. It clashes pretty hard here without that.It is interesting to know that you can adjust the vocal pitch though.
>>5560706I'll try it and see if it works out.
>>5560706
so idk why i even pitch shifted the vocals to begin with. this sounds fine without it IMO. I also completely removed the pitch filter and it sounds way better.
https://vocaroo.com/16Rm1sCOcqaW
Here's what it sounds like with the instrumental pitch shifted the same as the vocal track (+1/12 octaves or +16%). Pitch shifting to the same pitch f/g 1 didn't work either and sorta made it sound like an attempt at a metal remix of the notes. but this is the funnier result because the instrumental sounds like something outa budokai 1.
https://vocaroo.com/18fxle7SzRmU
>>5560981
This is very confusing to hear.
The vocal is closer to a half-step up from the original; the 2nd link with the shifted instrumentation overshot it. By the way, according to audacity, a half-step is approximately a 5.95% change, pic related was what I applied to match the vocal.
As for the filter, I will be honest I can't significantly distinguish the vocals apart between the three in the first minute or so, though I did not listen to the entirety of each.
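For anyone who wants to check the half-step figure without opening audacity, here is a minimal sketch, assuming standard 12-tone equal temperament (each half-step multiplies frequency by 2^(1/12)). The function names are made up for illustration.

```python
import math

def semitone_ratio(semitones: float) -> float:
    """Frequency ratio for a shift of `semitones` in 12-tone equal temperament."""
    return 2.0 ** (semitones / 12.0)

def percent_change(semitones: float) -> float:
    """Percent change in frequency for a shift of `semitones`."""
    return (semitone_ratio(semitones) - 1.0) * 100.0

def semitones_for_percent(percent: float) -> float:
    """Inverse: how many semitones a given percent change corresponds to."""
    return 12.0 * math.log2(1.0 + percent / 100.0)

# One half-step comes out to roughly the 5.95% quoted above.
for n in (1, 2, 7, 12):
    print(f"{n:+d} semitones -> {percent_change(n):+.2f}%")

# A +16% change is well over two half-steps, which fits the "overshot" remark.
print(f"+16% -> {semitones_for_percent(16):.2f} semitones")
```

So if the vocal is about one half-step up, the instrumental wants roughly +5.95%, not +16%.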
>>5560981Just for fun, tried shifting your sample down as an example. It is really clear that the vocal track didn't match the backing track at 0:35 when my noob attempt at vocal isolation let the guitar through.I recommend doing something similar with your isolated vocal track instead.
>>5561300Wait so was the issue the difference in key between the main vocals and the backing vocals or the main vocals and the instrumental?
>>5561578Paramore track: unchanged pitch, vocal removedAI track: pitched half-step down, vocal isolatedThe clash is because of the pitched down AI track having a crude isolation that lets some of the AI instrumental backing through, which is off-key now because the AI vocal was off-key and correcting that puts the instrumentation off-key instead.
>>5561608i can't make any real adjustments right now, but I think what you're getting at is that I need to re-isolate the vocals to ensure all of the instrumentals are pulled from them. I've been trying changing the pitches of both the instrumental and the isolated vocals in Audacity separately (to test the effect of the pitch shift) by max a half step to .5 or even .25 half steps and every time the pitch adjustment just makes the tracks sound waaay off. I'll try restarting with the original track later and see if i can't get a better result after consulting the guide above. Thanks for the tips though anon, I'm a giga novice when it comes to sound mixing so really i don't quite understand completely but this has been interesting to learn about. I just sorta always wrote off any mixing issues or mistakes as "imagine you're in karaoke or something and it just wont be perfect lol"
>>5561730Try listening to the original track alongside the AI one to compare the two. It is really obvious then. In audacity, you can import the two and play them alongside easily.I assumed you were getting an isolated vocal output already; weird and annoying that it would pre-mix it for you like that when it's only generating a vocal.Good luck separating the two, I am clueless there as well.
>>5561765
i re-isolated the instrumentals from the original song and overlayed the vocals from my old one. I noticed the old one started a few beats earlier than the original song for some reason so I lined them up and also noticed the key was off a bit on the old one like you were saying. I think I'll double-check like this for my future covers. i had no idea something like this would even happen.
This one is a new one, and its instrumental sounds exactly the same as the original when I line them up. Is it?
https://voca.ro/1lSvduVJ4K4V
The joke is on me. How many times will I listen to this? Four apparently.You are baiting or tone-deaf, and I do not care which. Fuck this.
>>5562056man i genuinely dont hear what youre hearing then lol, my bad ¯\_(ツ)_/¯
>>5562066>>5562056the only thing different is the vocals and, of course they are.
>>5557785kekbut definitely would be way funnier with some samsung tunes
>>5559228reddit and twitter will bring him up whenever they want to kill cops
>>5557207
Does anyone have the Frank horrigan one where he sings human? The one on YouTube was deleted...
>>5563016
Did you make this with Suno AI? What's the title?
>>5562714i didnt have it, so i made one. needed to compress the shit outa it to post a webm though lolhttps://voca.ro/1lVzDcXAbGfHhttps://files.catbox.moe/zf8bxw.mp4
>>5564740Thanks bro!
>>5563527i reposted this so i could listen to it on my phone through clover, i actually have no idea
>>5564740i don't think it sounds entirely correct, maybe it's impossible to make him sing correctly.this one is more talking than singing.
>>5566130frank's got such a deep and crusty voice, there's no way it'll sound perfect when the AI's doing it. Best I could do is get close. That being said, that's a banger
tried several colabs to train a voice model but they disconnect randomly for maximum capacity
any lesser-known colab or local training?
>>5566353RVC has a built-in trainer. you just gotta do something else for a few hours
whats the preferred text to speech in the OP? is one better than the other or are they all pretty similar in quality/ease of use?
>>5567812
>>5567815
>>5567818
Why do they call him Vin?
>>5559228wdym? They just made a whole video game for him.
>>5568225the fuck is this shit
>>5568234It's the latest woke Assassin's Creed game.Ubisoft makes a game about ancient Japan and zoom in on the one guy in the whole country who was black. History shows that he did exist but was actually just a servant that they dressed up like a samurai for fun, but of course the devs pretend like that wasn't the case. There's an edit war on Wikipedia to "correct" the record and remove old information.In addition to lying about history (and the fact that it would be very retarded for the one guy that sticks out the most to be an assassin) you can also play as a stronk womyn because women are totally badass and can do everything a man can do. She's at least Japanese.Needless to say Japanese gamers are upset with the game and the left don't know how much they can bite back since they aren't as used to shit on asian people as they are shitting on white people.
>>5568244>the left don't know how much they can bite back since they aren't as used to shit on asian peopleguess we'll have to wait for the kikes to come up with the latest instructions for the golems
>>5568234>>5568244Don't forget, he is not only black, but also gay.
>>5568326lulz was that actually confirmed? i think that is just a meme
Trump just got convicted on all counts. Welcome to the greatest timeline.
>>5560068It's insane how an AI made this scene 10x more kino!
is it better with the backing vocalshttps://vocaroo.com/14LObMgeLBfpor without?https://vocaroo.com/1281P8KEL8Qh
>I forgot this thread existed.Damn, System Shock is a helluva drug.
>>5569042you should use the Halsey cover of that song instead, it'll sound more natural.
>>5569060Oh dang yeah thisll be way better. Thanks mate
>>5568612i heard, America is dead
>>5569042>>5569060once again, thanks anon, the whole ensemble fits better for it too. Dont know why I didn't search for a cover last night when I usually do. musta been suffering from poo brain https://vocaroo.com/1nSvXGUBMdr3
>>5569468no problem, glad it worked out.I like matching voices to songs, it's fun.
>>5569468did you adjust the pitch when you ran it through RVC? it sounds a little off to me. if so, I'd advise against doing that unless it's by +12 or -12, or a rap song.
>>5569450Why would America be dead just because a slave of the jews got declared guilty?
>>5569668Busted election system, busted legal system, busted media, people not believing in free speech (swallowing concepts like hate speech, disinformation, misinformation), values not passed on to the youth. Then there's people such as yourself who simply don't get it on their own and will probably not have the mental capacity to understand what I'm saying even when I point it out in plain text (not saying it just because I'm trying to pick a fight, it's simply true). There's still a thing called "America" or "USA" but it's in name only.
>>5569690everything is fake and gay. always has been.
>>5569573Yeah I tried it at +0 and it didnt sound right and theres no way +12 wouldnt sound like a squeeker in a COD lobby. So i adjusted it to +2 for most of the song and +3 for the chorus because I didn't feel like the +2 sounded like she was "trying enough" if you get what I mean.
>>5570096it's... off key, though. if it doesn't sound right at +12 or -12, or in rare cases -24 or +24, then the song isn't a good candidate for the voice you're using.
>>5570096
You're not dealing with just random pitch shifting, the 12 tone scale is music theory. The pitch might allow the voice to sound more like the character, but shifting it causes it to be off key. So it'll be discordant and sound like shit when sung against the instrumental. Basically you have to shift it in intervals of 12, but you might have a slight chance of it not sounding like absolute ass at +7 which is a perfect fifth.
https://en.wikipedia.org/wiki/Interval_(music)
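To make the "intervals of 12" point concrete, here is a small sketch, assuming the usual 12 note names and treating the pitch value as a whole-number semitone shift. The melody and helper names are just made-up examples, not anything from RVC.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def transpose(note: str, semitones: int) -> str:
    """Return the note name reached by moving `semitones` half-steps."""
    return NOTE_NAMES[(NOTE_NAMES.index(note) + semitones) % 12]

def keeps_same_notes(semitones: int) -> bool:
    """A shift lands on the same note names only if it is a whole number of octaves."""
    return semitones % 12 == 0

melody = ["A", "C", "E"]  # hypothetical vocal line, notes of an A minor chord

for shift in (0, 2, 7, 12, -12):
    shifted = [transpose(n, shift) for n in melody]
    verdict = "same notes" if keeps_same_notes(shift) else "different notes"
    print(f"{shift:+3d}: {shifted} ({verdict})")
```

+12 and -12 give back A, C, E in a different octave, so the vocal still agrees with an unshifted instrumental. +2 turns the line into B, D, F#, which is why it clashes. +7 gives E, G, B: still different notes, just ones a fifth away, so whether it works is something to judge by ear.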
>>5570146thank you for explaining that. I knew that it didn't work, but not why. I'll have to try +7 sometime.
>>5570146I see, guess I'll try +7 later because 0 was bad and 12 will be even worse
>>5570146Ah fuck I had a brain blast and I finally understand what you're saying about key and frequency after someone else explained it by comparing the keys with their hz values. Damn lol
>>5557210this restores my faith in ai
>>5572024a masterpiece
>>5570146idk what i was smoking lol. +0 is good. i guess I was so focused on making it sound exactly like the character I didn't really think of the big picture when it all came together. I'll keep it in mind from now on, thanks anon
>>5567515
TTS doesn't seem to be used a lot by anons here, none of it is on the same level as elevenlabs, but if I recall, a few threads back anons liked xtts due to ease of use (there are 2 front-ends linked in the OP, I don't know which is better)
I don't think there have been a lot of ground-breaking developments in self-hosted TTS lately, so that's probably why there's not a lot of buzz around it.
>>5573990Same goes with any form of OSS audio lately. Stable Audio 1.0 got leaked but it's dogshit, and that's probably the best we'll get for a while. The struggle is real
>>5573990>>5574303tragic, thanks for the info tho
>>5574583The best fully open source option for TTS at the moment seems to be XTTSv2/StyleTTS2 + RVC, but if there isn't already an RVC model of the voice you're trying to replicate you have to train it yourself.
>>5574589i'll check it out, thanks anon. i was planning on using it mostly to check the quality of RVC voices I train anyway
>>5574589welp, looks like ive temporarily bricked my local RVC by fucking with the text to speech programs in the OP.
>>5574973Did you reuse the same python virtual environment or something? That's not a good idea, venvs are there specifically to avoid dependency conflicts.
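A minimal sketch of the separate-venv idea, using Python's standard `venv` module. The folder layout and tool names are assumptions for illustration; each project's own README decides which Python version and requirements file it actually needs.

```python
import venv
from pathlib import Path

def make_envs(root: Path, tools: list[str], with_pip: bool = True) -> dict[str, Path]:
    """Create one isolated virtual environment per tool under `root`.

    Each tool gets <root>/<tool>/.venv, so installing one tool's
    dependencies can't overwrite the packages another tool relies on.
    """
    envs: dict[str, Path] = {}
    for tool in tools:
        env_dir = root / tool / ".venv"
        venv.create(env_dir, with_pip=with_pip)  # stdlib call, no extra installs
        envs[tool] = env_dir
    return envs

# Example usage (hypothetical paths):
#   envs = make_envs(Path.home() / "ai-audio", ["rvc", "xtts"])
# Then activate the matching env before installing or running each tool,
# e.g. `source ~/ai-audio/rvc/.venv/bin/activate` on Linux/macOS or
# `ai-audio\rvc\.venv\Scripts\activate` on Windows.
```

The point is only that RVC and the TTS front-ends never share a site-packages folder.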
>>5567515>>5573990Don't forget that you can apply your own voice models to elevenlabs audio. Just a matter of finding a model on that site that has a similar enough accent.
>>5575180i mean, i've learned that now lul>>5575184probably the better idea but I was curious
>tfw people keep memeing about America while forgetting about the true imperial collapse
>>5570146The guy's clearly a troll. Wouldn't bother.
>>5575184sure wish i did this first lol, woulda saved the headache. https://voca.ro/1e3MnmkdXHsp
out of all the messing around i've done with this I think this one turned out the best so far.
>>5575587link for the vader model if anyone else wants to fuck around with it. have fun niggers. https://huggingface.co/OwlCity/OwlCityRVC/resolve/main/Darth%20Vader%20Ultimate.zip?download=true
I wonder if Marceline-Mate is still here.
>>5576033well, one of us is here anyway. i haven't had any inspiration for songs that sound good with her voicehave any suggestions?
>>5576033
I'm here. I was the one getting scolded/taught about using off key octaves to get the right voice. I've mostly just been way too busy with other things to spam the thread like before, but my mixtape grows (slowly)
>>5576072>>5576033here's a quick little somethinghttps://voca.ro/12XuNzb5XTBM
>>5576163i was getting decoding errors with this webm on firefox so i re-encoded
>>5563016damn this is good. i legit thought it was some AI voice cover of an existing song and been googling the lyrics like a retard.can't wait for the voices to become lifelike so i can finally create all the random songs and melodies that've been rolling around in my head throughout the years. future's bright
>>5576163
nice one anon, glad you're still around
my RVC is being fucky right now and probably looping or frozen on something so I cant make anything new. I found this song a while ago and it always made me think of early Marceline.
https://voca.ro/18fgeMfqyt33
Also, I don't know how to upload my trained models into the google sheet so the Vader anon inspired me to share this way. all three models are 300e and pretty good imo
Eclipsa: https://mega.nz/file/qY4XUDjB#SAbKNf1GhyHGL2Nw4b1V5TCetjed70kYf9gpucpAeVs
Huntress Wizard: https://mega.nz/file/KQRDlTLB#WHF_SiEeNX6oor86gQBsMBRlMEPSQxM7Fv7sEq8MDW4
Nicole Watterson: https://mega.nz/file/PRRSnC7R#VjcxY8eykAXZYQt_xy7XdqsH5ZgMMOfeYfHMNPW8vxg
>>5576197https://voca.ro/1kxL7lqAkBjj
>>5576597i think RVC isnt seeing my GPU anymore for some reason. Any ideas on how to fix that?>>5577075rockin anon, good vibes on that one
>>5577668okay so that wasnt the issue, stuff may be much more fucked than anticipated
>>5577075Would you mind doing one for https://www.youtube.com/watch?v=I-ed7GhM3F0 ?
>>5577894https://voca.ro/1dijnFjloZuY
>>5578095King shit
https://vocaroo.com/18a5WYI0wBUNI need to make more high-energy song covers. maybe I'll do a bunch of Avril or Pink for my next lot
>>5568225A Nigger's Creed
>>5578840Is the game gonna be that bad?
>>5579960It's a modern Ubisoft game. Yes, it's going to be that bad.
What's the best TTS tool that does voice cloning rn? I just wanna have my waifu read me audiobooks.
>>5580380find a free model on Elevenlabs that sounds close and use RVC to change the voice. works great for me
>>5580713I used to use elevenlabs and it was so convenient cause I am too much of a brainlet to mess with models and all the stuff on the OP but it no longer does cloning for free.
>>5580729yeah same, but they have some free models you can find in the voicelab>Add generated or cloned voice>voice library that you can use. After you get a good-sounding reading, you can use RVC to change the voice from the elven labs voice to whoever you want
>>5580729As long as you're not training your own models its fairly easy to use RVC. Problem is that someone has to have cloned your waifu already
>>5580756>>5580380Also if you already have the audiobook you don't even need to do any TTS. Male-to-female voice alteration is fairly trivial.
>>5580845
Oh true. Since it's an audio book, half of it is done already. Unless they have multiple readers or they try to make different voices for characters. I've had a few do that and idk if it'd sound good if a female voice is dubbed over a male reader making a female pitch voice lol.
>>5581011>they try to make different voices for characters. That still works fine a lot of the time unless the original reader has larger vocal range and the model simply spazzes out when they go up a bunch of octaves and suddenly Emma Watson is making dog whistling noises.
>>5574818I hate this a lot but also love it
>>5568389They confirmed that the game's protagonists are lgbtqiaetcetcdidn't specify gay, but highly likely
Frens... The power of Christ compelled Suno to make a certified banger. And I can't grooving to it.https://suno.com/song/3e68a014-37b7-4c85-80e4-dceaa9075dca
>>5581344*can't stop grooving
>>5581346*can't stop grooming
>>5568931I wanna redo the whole fight with the jack of blades vocals, it's honestly pretty fucking cool I just gotta not be a lazy ass and put it together.
>>5559228there are city statues of him now
>>5559228of course. he gets mentioned almost every time time american internal politics get brought up.like seriously it's almost like godwin's or mutt's law.>As a discussion concerning the state of American police or civil rights or any remotely political topic for that matter grows longer, the probability of George Floyd being brought up approaches 1
>>5581793>>5581802that's madcant wait for this era to end
>>5580729How's the project going anon?
yeah to anyone wondering rvc is really easy to use, my pc is too shit to run it locally but i use this https://colab.research.google.com/github/hinabl/AICoverGen-Colab/blob/main/Hina_Mod_AICoverGen_colab.ipynb it's really easy, just look at it and fuck around if you're not a total nigger brain you'll figure it out pretty quick.
>>5582132if you are running it locally though, just be sure you have the C++ from the windows dev tools and generic python libraries installed. i got a headache trying to install it on a laptop I've got before I figured that out
>>5582132>you're not a total nigger brain you'll figure it out pretty quick.I must be a niggerbrained cause I don't know how to get models to work. The list doesn't show up and I don't know where to look for them.
>>5582757to get a voice in RVC, you want to put the [name].pth in the RVC>assets>weights folder and the folder with the same [name] that has the .index file in the RVC>logs folder. If you don't have the index file in the logs folder, you can also put the direct path to it in RVC, but there's no point to doing that. Then just click the inferencing voice drop-down and select the one you want. RVC will select the index file if the folder is named the exact same. slide the re-sampler to 48000 because why not and mess with stuff until it sounds right.
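If the dropdown stays empty, a quick sanity check of the folder layout described above can help. This is only a sketch: the `assets/weights` and `logs` paths are taken from the post, while the function names and the "any `.index` file in the folder" rule are assumptions, not something read out of RVC's code.

```python
from pathlib import Path

def expected_paths(rvc_root: str, name: str) -> dict[str, Path]:
    """Where the post above says a voice called `name` should live."""
    root = Path(rvc_root)
    return {
        "weights": root / "assets" / "weights" / f"{name}.pth",
        "index_dir": root / "logs" / name,
    }

def check_model(rvc_root: str, name: str) -> list[str]:
    """Return a list of human-readable problems; empty means the layout looks right."""
    paths = expected_paths(rvc_root, name)
    problems = []
    if not paths["weights"].is_file():
        problems.append(f"missing weights file: {paths['weights']}")
    if not paths["index_dir"].is_dir():
        problems.append(f"missing logs folder: {paths['index_dir']}")
    elif not any(paths["index_dir"].glob("*.index")):
        problems.append(f"no .index file inside: {paths['index_dir']}")
    return problems

# Example usage (hypothetical install path and voice name):
#   for line in check_model("C:/RVC", "curie"):
#       print(line)
```

The folder name and the `.pth` name have to match exactly for the index to be picked up automatically, which is the thing this check is really looking at.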
>>5557215Is this sora? Where can I download?
>>5582896I don't know what literally any of this means. Assume I just have a whole load of sample audio. How do I make a pth file. How do I make an index file.
>>5582995
ooh, you're trying to train the voice. unlucky that you need to. i assume it's not a character in the google sheet above. Go to "Train" in RVC.
On the first row, name your model, then select 48k, true, and version 2.
On the second row, enter the path for the folder, leave speaker ID as 0, and click process data.
Third row, leave as is, but select rmvpe_gpu if available. click feature extraction.
Fourth row, save every x epochs (I usually do 20 just to be safe), set the training to around 200 epochs, leave batch size, save only the latest .ckpt, don't cache the training sets, and select no for saving each save point.
Finally, for the model G path, put assets/pretrained_v2/f0G40k.pth and for the model D path, put assets/pretrained_v2/f0D40k.pth. GPU indexes are 0, and click one-click training.
After a while, RVC will save the voice and you can select it from the dropdown
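One thing worth double-checking in the steps above: the first row picks 48k, but the pretrained G/D paths have 40k in the filename. Below is a small sketch that flags that kind of mismatch. It only parses the number out of the filename; whether the trainer actually needs them to match, and whether your install ships 48k pretrained files, are assumptions to verify against your own `assets/pretrained_v2` folder.

```python
import re
from typing import Optional

def rate_from_pretrained(path: str) -> Optional[int]:
    """Pull the sample rate in kHz out of names like f0G40k.pth / f0D48k.pth."""
    match = re.search(r"f0[GD](\d+)k\.pth$", path)
    return int(match.group(1)) if match else None

def check_rates(target_khz: int, g_path: str, d_path: str) -> list[str]:
    """Return warnings for pretrained files whose rate differs from the target."""
    warnings = []
    for label, path in (("G", g_path), ("D", d_path)):
        rate = rate_from_pretrained(path)
        if rate is not None and rate != target_khz:
            warnings.append(
                f"model {label} path is {rate}k but target sample rate is {target_khz}k: {path}"
            )
    return warnings

# The exact settings from the post: 48k target with 40k pretrained paths.
for line in check_rates(
    48,
    "assets/pretrained_v2/f0G40k.pth",
    "assets/pretrained_v2/f0D40k.pth",
):
    print(line)
```

If both warnings show up, either drop the target to 40k or point the two paths at the matching pretrained files, whichever your setup supports.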
>>5557210B-B-BUT AI WILL MEAN THE CONTENT WILL BE LOW QUALITYNONONONO CHUDS YOU CANT HAVE FUN PLEASE STOP NOOOOOOOOOOOOO
>>5557215>>5557207more songs pls
>>5583021But there's no train function in>>5582132
>>5583149
ooh, i have no idea how that colab works. i was giving instructions for local/virtual machine RVC
>>5583149>>5583158okay, i opened the webui for the colab the other anon posted. There's no training so id recommend downloading RVC yourself if you cant find the voice in the OP. if you cant run RVC (idk how taxing it is on GPUs) send a mega or huggingface of the choice clips and I can make it for you overnight
>>5583168I'd so appreciate it if you did, anon. Here's the clips I've got https://files.catbox.moe/asjdgu.zipRipped from this vid https://youtu.be/zwkWWXVt-vYI'm sorry if I may've come off as rude earlier. It's just all so confusing cause there's a million tutorials all using different methods and stuff and they all assume you know the fuck you're doing and I very much don't.
>>5583197oh hell yeah, curie, good choice. it's wild no one made a model for her. also it's all good. kinda hard to talk and troubleshoot when we cant see the same thing lol. that, and there are a shit ton of outdated tutorials. I'll get started after work
>>5583217Tysm anon!
>>>5582132I've been using this all day and now it suddenly hits me with>Found no NVIDIA driver on your system.
>>5583234aight, i got a bunch of clips from curie dialogue so its ready to train, but it's gonna take like 3 hours so I'm gonna let it run overnight. I'll post the results tomorrow evening or in the morning if I have the time/energy to do it before work lol
got inspired by the audiobook anon, but I think my "I want my waifu to direct guided meditation" has an actual air of sad around it lol.https://voca.ro/1fSWPihDAzUdhttps://voca.ro/19L7l16IynQ0https://voca.ro/14dWHta5REFU
>>5583565Which character's that?
>>5583234
So it's done. I think it's okay, but using it for English loses out on her inflections / a lot of her accent or throaty sounds. (unless I did it wrong, but idk) It's not like elevenlabs where it made the voice from the sound files, but it sounds pretty good when I used a french accent in elevenlabs to cover. I wonder if it'd do it if I tried again with maybe double the audio files, but it may be more than the program can handle.
American Accent Elevenlabs Vocals dubbed over: https://voca.ro/12XThtMgaJKC
French Accent Elevenlabs Vocals dubbed over: https://voca.ro/1oboDLM2JOXs
Voice files: https://mega.nz/file/vc4CRKaT#CrNLPnZshH5ekgmbWlARL7vxS2lIEEAZtHREN-iJLyU
>>5584057Oh my god, anon, thank you so much! I see what you mean, it's not as good as elevenlabs when I tried it out, but it's still pretty fantastic all things considered. Still, thank you, anon. I was pretty bummed out I couldn't do it myself, but you made my day.
>>5584138>>5584057i tried cloning her with xtts2 and it came out... ok. this is a sample with no postprocessing and almost no input processing. just posting to have some other examples of other processeshttps://voca.ro/1kkgCxtrAPqR
>>5584138No prob anon, hope it works out for you>>5584330This sounds pretty good imo. When I tried xtts2 I wasn't very impressed but this is good. I may have been moreso mad that it ruined my rvc python setup but thats 100% my fault.
>>5584339are you using separate python versions between RVC and xtts2? how did you install each, or what was making them overlap? i cant remember if xtts2 needed a different version of python than RVC but with venv you shouldnt have issues with them overlapping/conflicting (i dont, anyway)
>>5584353Yeah I'm not using a venv so when I tried to use both it imploded. I'm kinda a brainlette and was happy enough using just rvc so I never saw the point.
>>5584330That sounds pretty good too, though the reverb-ish sound ai voices have is more pronounced here. Really cute choice of text, too.
>>5584330as a second test i trained an rvc model from ~2hrs of her human dialog and applied that over the xtts2 results and that definitely helps with the raspy/reverby ai xtts2 render. you could totally do a whole audio book this way but for something production quality you'd have to do multiple gens and lots of hand picking from best resultshttps://voca.ro/12UJWWvc6Dhb
>>5584562When you do rvc models, do you split it into a bunch of small clips or do you leave it as a long mp3? Because mine was about 20 min worth of 80 small clips
>>5584572i yt-dlp'd the dialog video posted earlier, cut out just the human dialog in ~5 minute long sections, and used the directory with those clips as the training path. there were 22 five minute long clips and 1 two and a half minute long clip. if i can find a good host for it i'll post my training data and checkpoints for e150, e200, and e250 if you want it. it's ~5 gigs tho
>>5584575>>5584572but ultimately because of the rvc step 2a (the first "process data" button), it shouldnt matter how the original data is formatted. it should be just as good to leave them separated or concatenated. originally i tried to feed it the entire 2hr long dialog as one audio file and it really really didnt like that so i split it up
>>5584575>>5584578that would have been way easier, In the only tutorial video I've watched the guy said to use ~10-second clips so I've been doing that for all of my models lol. been a real time sinkNow I know though, and thank god, that'll be way easier than making 80-100 tiny clips.
>>5584598
The 10ish seconds is because processing takes more vram the longer the clip, so it's generally advisable to keep it under 30 seconds or less depending on your GPU.
I made myself a script that calls ffmpeg and auto splits all audio files in a directory into 30 second segments. Even though preprocessing should split it, it's not perfect.
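Not the script from the post above, just a guess at what something like it could look like: a sketch that builds an ffmpeg command using the `segment` muxer and runs it on every audio file in a folder. The extension list, the output naming pattern, and writing 16-bit WAV are all assumptions; it needs ffmpeg on your PATH.

```python
import subprocess
from pathlib import Path

AUDIO_EXTS = {".wav", ".mp3", ".flac", ".ogg", ".m4a"}  # assumed set, adjust as needed

def split_command(src: Path, out_dir: Path, seconds: int = 30) -> list[str]:
    """Build the ffmpeg call that cuts `src` into fixed-length WAV segments."""
    pattern = out_dir / f"{src.stem}_%03d.wav"  # e.g. clip_000.wav, clip_001.wav
    return [
        "ffmpeg", "-hide_banner",
        "-i", str(src),
        "-vn",                       # ignore any video/cover-art stream
        "-f", "segment",             # ffmpeg's segment muxer
        "-segment_time", str(seconds),
        "-c:a", "pcm_s16le",         # re-encode to plain 16-bit WAV
        str(pattern),
    ]

def split_directory(in_dir: Path, out_dir: Path, seconds: int = 30) -> int:
    """Split every audio file in `in_dir`; returns how many files were processed."""
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for src in sorted(in_dir.iterdir()):
        if src.suffix.lower() in AUDIO_EXTS:
            subprocess.run(split_command(src, out_dir, seconds), check=True)
            count += 1
    return count

# Example usage (hypothetical folders):
#   split_directory(Path("raw_dialog"), Path("training_clips"), seconds=30)
```

The cuts land wherever the timer says, so some will fall mid-word; point the training path at the output folder afterwards.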
Dream Machine just droppedhttps://lumalabs.ai/dream-machinehow long will it stay up
>>5584672does it matter if it splits in the middle of a word or sentence? because I've been manually clipping between words when possible and if it doesn't then that would be nice
>>5584708
In the middle of a sentence? No. It might have some detrimental effect if it cuts it off in the middle of a word, but with like an hour of stuff I can't imagine it would be a huge deal.
When processing, it identifies phonemes (every sound that you can make to form a word) so it's already splitting things into fragments of words in a way. Worst case it would identify the hacked up audio as a different phoneme, but there should be plenty other examples to grab if there's enough audio. I'm sure not gonna spend time manually splitting 2 hours of audio when I can do it in a script.
>>5584695Luma image input
>>5568212insane. hardcore. beautiful.
quick test of an injured hayden christensen model speaking tweaked jack of blades lines. excuse the second obi wan, I forgot to drag it out.
>>5583149
If you look around on huggingface there are a lot like that one with the training function intact. If you find one on colab that lets you train, don't do it on an account you care about, google is very butthurt over competing ai shit.
>>5584057>>5584138I'm gonna try and make a new model with way more data than this one to see if it can get her accent down. I don't know why it wouldn't if rvc can make a good neco arc
>>5585292
your rvc model isn't the problem. your input needs a french accent for it to come through in the conversion. i'll just give you my model since i've already trained more than enough for whatever you need.
>example spoken in french then converted
https://voca.ro/1jPnq7WuQJim
>weights+index (e945, 48k)
https://files.catbox.moe/ssa773.zip
>>5585323converting speaker doing no french accent, then a french accent highlights the difference. this was converted all at once with that rvc modelhttps://voca.ro/1kd0QQPS3fPi
>>5572024This isn't AI.
>>5585337
If you say so. I don't understand why the French accent isn't pulled out like the sounds neco arc makes. Though I guess in curie's case it's moreso between words and letters that have the most impact on the accent rather than a nya sound to some words
>>5576197>i was getting decoding errors with this webm on firefox so i re-encodedSo this shit is why I can't open videos sometimes?
>>5585702the technical details were using a jpeg without specifying pix_fmt in ffmpeg. to re-encode i used pix_fmt yuv420p explicitly. it might have had to do with the height not being divisible by 2, but that's usually only a libx264 requirement
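A rough sketch of a re-encode along the lines described above: force `yuv420p` and, optionally, round the frame size down to even numbers. The flags are standard ffmpeg options, but this is not the exact command the anon ran; the file names are placeholders and the video codec is left to ffmpeg's default for the output container.

```python
import subprocess

def reencode_command(src: str, dst: str, force_even: bool = True) -> list[str]:
    """Build an ffmpeg call that re-encodes video with an explicit pixel format."""
    cmd = ["ffmpeg", "-hide_banner", "-i", src]
    if force_even:
        # Round width and height down to the nearest even number.
        cmd += ["-vf", "scale=trunc(iw/2)*2:trunc(ih/2)*2"]
    cmd += [
        "-pix_fmt", "yuv420p",  # the explicit pixel format mentioned in the post
        "-c:a", "copy",         # keep the audio stream untouched
        dst,
    ]
    return cmd

def reencode(src: str, dst: str, force_even: bool = True) -> None:
    """Run the command; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(reencode_command(src, dst, force_even), check=True)

# Example usage (placeholder file names):
#   reencode("broken.webm", "fixed.webm")
print(" ".join(reencode_command("broken.webm", "fixed.webm")))
```

When the video is a single jpeg, ffmpeg can otherwise pick a pixel format that some browsers refuse to decode, which would explain the firefox errors.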
>>5585429if accent A uses phoneme X in a word while accent B replaces for phoneme Y, no matter how much you train on accent A any input with accent B will use the model speaker's phoneme Y. afaik there's nothing in rvc to translate phonemes between accents
>>5585749Interesting. Thanks for spelling it out for me and saving my graphics card another 3-4 hours of suffering lol
>>5585749To add to this, different languages have different phonemes. Like Japanese doesn't have "th" for instance. So like if you train on a voice that is only speaking Japanese, it will be lacking certain phonemes used in English. In this sense, sometimes an accent can bleed through, but even then it's pretty slight. I feel like current voice-to-voice tech is very misunderstood. Basically nobody knows that it's phoneme based, and probably not a lot of people even know what a phoneme is.
>>5576197There was a song by Leonard Cohen called 'The captain", which was probably the second coolest song he did. If Hallelujah didn't take off the way it did, the captain would have probably turned out to be his sleeper hit.Tom Waits did a slow burner in his younger years before his larynx turned into tire rubber called 'Martha' in the Closing Time album. As time has gone by, I appreciate it more than Swordfish Trombone, but anyway - food for thought.
>>5581344Holy shit based and blessed
>>5581872The statues are made to demoralise you, it's to make you feel powerless against them. Which itself is an admission that you aren't as otherwise they wouldn't need to instill that belief.
>>5586089>all these months and suno is still metallic as fuckshite ai
>>5586084fucking gold
>>5585861
That's fair to say. I never really looked into how it works, so RVC is essentially a scroll that I cast to get my waifu to sing or read. I imagine that's how it is for most people.
>>5586363Soul vs soulless>>5586370It's cool that it's easy enough for anyone to use. It just helps to know more about it so that you know what to expect and have the knowledge to train better models.
I'm not ready bros...
>>5584695
>>5587865I was not ready for the AT-AT turning into a fucking mech, kek.
>>5587593
amazing. For sure his theme song lol. what Vegeta model did you use?
>>5588036> what Vegeta model did you use?Made it myself just yesterday. It's so-vits-svc, can't speak for compatibility with rvc. I don't mind sharing it if you want but very few people are interested in sovits models.
>>5588098anon, it's beautiful. I'm in tears.
>>5588086might as well, you never know. also do you have an mp3 of the song? i can pull it from the webm but it doesn't hurt to ask
>>5588355
Webm's audio is 192kbps opus, which should be good enough, but here:
https://vocaroo.com/1b4vzGmxpp0s
https://huggingface.co/chameleon-ai/so-vits-svc-models/tree/main/vegeta