Welcome to the Pony Voice Preservation Project!
youtu.be/730zGRwbQuE

The Pony Preservation Project is a collaborative effort by /mlp/ to build and curate pony datasets for as many applications in AI as possible.

Technology has progressed such that a trained neural network can generate convincing voice clips, drawings and text for any person or character using existing audio recordings, artwork and fanfics as a reference. As you can surely imagine, AI pony voices, drawings and text have endless applications for pony content creation.

AI is incredibly versatile, basically anything that can be boiled down to a simple dataset can be used for training to create more of it. AI-generated images, fanfics, wAIfu chatbots and even animation are possible, and are being worked on here.

Any anon is free to join, and there are many active tasks that would suit any level of technical expertise. If you’re interested in helping out, take a look at the quick start guide linked below and ask in the thread for any further detail you need.

EQG and G5 are not welcome.

>Quick start guide:
docs.google.com/document/d/1PDkSrKKiHzzpUTKzBldZeKngvjeBUjyTtGCOv2GWwa0/edit
Introduction to the PPP, links to text-to-speech tools, and how (You) can help with active tasks.

>The main Doc:
docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit
An in-depth repository of tutorials, resources and archives.

>Online speech generation:
haysay.ai
alpha.15.dev

>Active tasks:
Research into animation AI
Research into pony image generation

>Latest developments:
pastebin.com/4p00iUZM

>The PoneAI drive, an archive for AI pony voice content:
drive.google.com/drive/folders/1E21zJQWC5XVQWy2mt42bUiJ_XbqTJXCp

>Clipper’s Master Files, the central location for MLP voice data:
mega.nz/folder/jkwimSTa#_xk0VnR30C8Ljsy4RCGSig
mega.nz/folder/gVYUEZrI#6dQHH3P2cFYWm3UkQveHxQ
drive.google.com/drive/folders/1MuM9Nb_LwnVxInIPFNvzD_hv3zOZhpwx

>Cool, where is the discord/forum/whatever unifying place for this project?
You're looking at it.

Last Thread:
https://desuarchive.org/mlp/thread/43073987/#43073987
FAQs:
If your question isn’t listed here, take a look in the quick start guide and main doc to see if it’s already answered there. Use the tabs on the left for easy navigation.
Quick: docs.google.com/document/d/1PDkSrKKiHzzpUTKzBldZeKngvjeBUjyTtGCOv2GWwa0/edit
Main: docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit

>Where can I find the AI text-to-speech tools and how do I use them?
A list of TTS tools: docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit#heading=h.yuhl8zjiwmwq
How to get the best out of them: docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit#heading=h.mnnpknmj1hcy

>Where can I find content made with the voice AI?
In the PoneAI drive: drive.google.com/drive/folders/1E21zJQWC5XVQWy2mt42bUiJ_XbqTJXCp
And the PPP Mega Compilation: docs.google.com/spreadsheets/d/1T2TE3OBs681Vphfas7Jgi5rvugdH6wnXVtUVYiZyJF8/edit

>I want to know more about the PPP, but I can’t be arsed to read the doc.
See the live PPP panel shows presented at /mlp/con for a more condensed overview.
2020: pony.tube/w/5fUkuT3245pL8ZoWXUnXJ4
2021: pony.tube/w/a5yfTV4Ynq7tRveZH7AA8f
2022: pony.tube/w/mV3xgbdtrXqjoPAwEXZCw5
2023: pony.tube/w/fVZShksjBbu6uT51DtvWWz

>How can I help with the PPP?
Build datasets, train AIs, and use the AI to make more pony content. Take a look at the quick start guide for current active tasks, or start your own in the thread if you have an idea. There’s always more data to collect and more AIs to train.

>Did you know that such and such voiced this other thing that could be used for voice data?
It is best to keep to official audio only unless there is very little of it available. If you know of a good source of audio for characters with few (or just fewer) lines, please post it in the thread. 5.1 is generally required unless you have a source already clean of background noise. Preferably post a sample or link. The easier you make it, the more likely it will be done.

>What about fan-imitations of official voices?
No.

>Will you guys be doing a [insert language here] version of the AI?
Probably not, but you're welcome to. You can however get most of the way there by using phonetic transcriptions of other languages as input for the AI.

>What about [insert OC here]'s voice?
It is often quite difficult to find good quality audio data for OCs. If you happen to know any, post them in the thread and we’ll take a look.

>I have an idea!
Great. Post it in the thread and we'll discuss it.

>Do you have a Code of Conduct?
Of course: 15.ai/code

>Is this project open source? Who is in charge of this?
pony.tube/w/mqJyvdgrpbWgZduz2cs1Cm

PPP Redubs:
pony.tube/w/p/aR2dpAFn5KhnqPYiRxFQ97

Stream Premieres:
pony.tube/w/6cKnjJEZSCi3gsvrbATXnC
pony.tube/w/oNeBFMPiQKh93ePqTz1ns8
I wish I could make ai ponies instead of wagecucking
Anything new?
>>43127156
Well, I'm kind of doing something, but like I mentioned last thread, things are going super slow.
At least the Chinese AI video stuff Anons showed a month or two ago can make really nice quality ponies (even if the movement itself is still derped).
>https://u.pone.rs/tyfahhli.mp3
>https://u.pone.rs/wzzujnzi.mp3
some Vul stuff found in the wild
>https://u.pone.rs/bedglhrn.mp3
/create/ cover of Mrs Robinson
>https://u.pone.rs/dfezfvrk.mp4
Right now it seems all of the cooler video AIs are limited to making 5 seconds (or 20 seconds at best) of footage. Technically, someone with insane patience could generate a bazillion clips and stitch them all together to create a coherent AI episode.
However, seeing how stuff improves on a yearly basis, I feel like once we get an open source model that can make a whole minute of decent quality AI video, the interest in making one should come back among Anons (but only if the GPUs stop costing an arm, a leg and both kidneys).
>>43128959
Oh, and I meant to also crosspost this new green screen AI model from the AI art thread (a guy trained a custom Stable Diffusion model to take in a green screen image sequence and output true transparency PNGs, where even wearing green clothes and holding reflective glass still results in almost industry-standard footage masking).
>>43098665
>>43102567
https://www.youtube.com/watch?v=3Ploi723hg4
https://github.com/nikopueringer/CorridorKey
https://github.com/edenaion/EZ-CorridorKey
>>43127073
just let it go, man
it's over
one last bump for good measure
When was the last time we did a redub?
>>43130637
Should I just dm the namefags on fucking discord or something?
What the fuck happened to this thread?
Man, I just spent 3 hours trying to get the RVC UI to run and I finally just gave up. Why do python and gradio have to be so difficult?
>>43131240
Which GitHub are you trying to install? Also, what's your GPU? If you're trying to get the newest stuff, the requirements.txt will fuck your stuff up on the basis that it's trying to get the newest modules, which are not always compatible with each other and/or the hardware you have.
Do we have a panel for marecon?
>>43131768
As far as I'm aware, nobody has made one so far. If you want to, it would be cool to see someone make something for it.
>>43131240
>https://u.pone.rs/hegtqssw.txt
hey bud, I was looking at my own conda installation instructions and it does look like an absolute clusterfuck of patch notes written on top of each other (as seen per the pip freeze above, it's a mess and then some).
One of the things that seems highlighted is to make sure the environment is set up as follows:
conda create -n "_name_of_your_env_" python=3.10.3 ipython==9.11.0 --yes
Followed by this sequence:
typing-extensions==4.5.0
pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1+cu116 "tensorflow[and-cuda]" --extra-index-url https://download.pytorch.org/whl/ --no-cache-dir --force-reinstall
conda install curl=8.9.1
pip install omegaconf==2.0.6
pip install abc
pip install omegaconf==2.0.6 --force-reinstall
#(I think installing the abc module breaks the omegaconf module?)
pip install --upgrade setuptools==49.6.0 pip --user --force-reinstall
C:\Users\User001\anaconda3\envs\RVC_vul_5\python.exe -m pip install --upgrade setuptools==49.6.0 pip --user --force-reinstall
pip install PyYAML==5.1.2
conda install requirements.txt -c conda-forge
And for some reason there is also this fucking note, because apparently getting PyYAML to work is an adventure on its own:
-------------------------
#requirement error PyYAML (>=5.1.*)
git clone https://github.com/omry/omegaconf/ --branch v2.0.6 --depth 1
##these instructions
cd omegaconf
#go to the file \omegaconf\requirements\base.txt
#change the PyYAML requirement from PyYAML (>=5.1.*) to PyYAML (>=5.1).
#create and install the module
python setup.py sdist
pip install dist/omegaconf-2.0.6.tar.gz
#exit directory
cd ..
-------------------------
Sorry if the above terminal installation steps are confusing. I got RVC working a few years ago and dare not touch that part of the console with a ten foot pole in case it breaks. Please do post if you need more help, I will lurk here and in the AI art thread mostly.
>>43131366
>>43132275
Thanks so much for being willing to help out. I’m trying to install the webUI from https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI on a Dell Poweredge R730 server (cpu only, no gpu) to see how well it performs for voice conversion. I’ll take another stab at it tonight and let you know how it goes. I did eventually get the webui running and I could connect to it from another machine, but the UI immediately displays an error whenever I click any buttons or dropdowns. As far as I can tell, the callbacks defined in the gradio components are not getting called at all (I can insert a print statement in them and it doesn’t print). Something is broken with gradio but I have no stack trace to work with, so it’s very hard to troubleshoot. I’ll try reinstalling it from scratch using the pip freeze; maybe some other module I currently have installed is incompatible with the version of gradio I got.
>>43132364
>(cpu only, no gpu)
hmm, I'm not sure if the RVC code has a default switch to CPU if no GPU is present. It may be worth looking into the main startup Python code and commenting out the GPU device lines to put in their place something like this:

import torch
torch.cuda.is_available = lambda: False
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

No idea if this will work, I just took it from the first semi-decent looking link on the front page: https://community.esri.com/t5/arcgis-image-analyst-questions/how-force-pytorch-to-use-cpu-instead-of-gpu/td-p/1046738
Also, the only RVC I've used was the one from Vul's github and something called RVC1006Nvidia for doing the voice training.
>>43131085
>What the fuck happened to this thread?
It achieved everything it set out to do, vastly exceeding expectations in many cases. Now AI progress has generally plateaued and will probably stay that way until the next big breakthrough, at which time the datasets will be there and ready to go.
>>43129728
ngl I actually like this thread as an occasionally recurring general, I miss it when it's not around. Shame it's a bit too slow to stay in the catalogue until bump limit, yeah, but I imagine with enough advancements in AI tech it could swing back around eventually. Local video gen becoming way more accessible will probably give the PPP a good boost.
>>43132275
>>43132364
>>43132480
Welp, I reinstalled everything from scratch using a different python version, going through python dependency hell again, but I still got the same nonsense "connection" error. There are no logs and there is no stacktrace, even if I set debug=True in the app.queue call in infer-web.py. The webapp and my browser can talk to each other over the default port 7865 and both have internet access, so I don't see how this can actually be a connection error, unless gradio is trying to connect to a nonexistent website for some silly reason. Upgrading fastapi did not help, and upgrading gradio created a package incompatibility.
I looked into the cpu vs cuda thing, and it looks like it gracefully switches to cpu if you don't have a gpu. See line ~175 in configs/config.py.
I'm not going to try to get the RVC web client working any more, unless someone points out a quick fix that I've somehow completely missed. Instead, I will tear out all the gradio stuff and get the model loaded on a simple flask server that I can make API calls to for now, for the simple testing I wanted to do. I would have tried Vul's older RVC GUI, but I see pyqt code in there, so I think it's a natively-running UI; I require a web ui... unless I install xserver, which I suppose I could do.
>>43133460
Darn, I wish I knew how to fix that, but my usual approach to fixing python fuckery is to lurk on ever weirder coder websites and start throwing crap at the wall to see what sticks.
>>43131768>>43131867I'd be excited to see whatever gets pulled up for it. I dunno if there's anything super notable that's been done lately, but even a panel showing off pony AI voice covers for music or something would be fun. No pressure though, PPP's kind of in slow mode at the moment, at least to some extent.
>>43133659
+1 to that
>>43133460
>>43133540
Success! A simple Flask server worked for my purposes. The server takes ~6 seconds to convert a 1-second audio file that it hasn't seen before (if it *has* seen it before, then it's about 3 seconds). I've also noticed that the RVC codebase does not keep the index file in memory after converting a file; it dumps and reloads it again on the next conversion. That's significant, because those index files can be hundreds of megabytes. I bet I could cut down the conversion time even more if I kept the index file cached.
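For anyone curious, the index-caching idea can be sketched with a tiny memoization wrapper. Everything below is hypothetical illustration, not the actual RVC API; `load_index_from_disk` is a stand-in for however RVC reads its faiss index files:

```python
from functools import lru_cache

# Hypothetical stand-in for RVC's index loading; the real codebase
# reloads the index from disk on every conversion instead of keeping
# it in memory.
def load_index_from_disk(path):
    # pretend this is an expensive read of a multi-hundred-MB file
    return {"path": path, "vectors": list(range(1000))}

@lru_cache(maxsize=2)  # keep the last couple of indexes in memory
def get_index(path):
    return load_index_from_disk(path)

# First call pays the disk cost; repeat calls for the same path are free.
idx1 = get_index("twilight.index")
idx2 = get_index("twilight.index")
assert idx1 is idx2  # same cached object, no reload
```

With `maxsize=2` you bound memory use to a couple of indexes, which matters when each one is hundreds of megabytes.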
>>43136969
>A simple Flask server worked for my purposes
I am curious to see what you are cooking up, Anon
>>43137346I do have some projects in mind, but right now I'm in an exploration/discovery phase, so I'll keep the details under wraps until I'm sure my ideas will work from a technical standpoint.
bump
>>43136753
Oh, I actually thought about that. It was more so for EquestrAI being shown off to everyone, but you could totally do one with text and then have another Anon trying to make images based on whatever's going on to go with the stream. That could be really fun, just a giant AI ensemble of all of the tech we've got, to showcase it all at once. Could throw in voicegen for dialogue as well even.
>>43139521horse
what a busy weekend
Is Applio any good?
>>43139521
>>43140304
There is still /mlp/con I guess.
>>43144988
Seeing Anons come up with stuff like the /chag/ AI light novel and the bonziPONY desktop pony ideas brings me joy; despite lots of people pushing the idea that anything with AI is slop, there are still plenty of people out there bringing ponies to life in their own way.
Boop
>>43144988
Oh absolutely. Was the AI light novel shown off at Marecon like Bonzipony was? And if so, what block was it at? I'm not familiar with the light novel thing. Unless you mean the VN as in EquestrAI then yeah I absolutely agree.
>>43144405
For sure! And if we're talking EquestrAI, the new Godot version should be out soon. They'll be able to do a panel for it there easy, and since it's fully open source there could even be a collab with adding voice gen or something to it. It'd be really neat to have some more rep for the AI side of the board there, not that we haven't had plenty of panels for it already.
Making progress on the IPA Translator app. To be honest, though, I've lost some motivation for this little project because I've come to learn that there are many nontrivial issues with phonemic/phonetic transcription, so no solution I come up with will be perfect. There are literally infinitely many sounds that the human voice can create, so you have to pick a level of acceptable granularity. Accents have a huge impact on phonetic transcription, and you even have to pick a level of granularity on accent, too. For this project, I'm following Wiktionary's English pronunciation appendix for "General American":
https://en.wiktionary.org/wiki/Appendix:English_pronunciation
though I'm not at all sure that's the right choice. Maybe I should presume Canadian English because many of the voice actresses are Canadian? Or maybe let users switch between American and Canadian modes? And then Rarity, of course, speaks with more of a Trans-Atlantic accent. Auto-translating this stuff is going to be sketchy, and you almost need to be trained in phonetics to do a correct transcription. At the same time, though, all this messiness highlights the need for cleaner training data. I'm sure that most of the training the PPP has done so far has involved some degree of auto-transcription from plaintext to IPA or ARPAbet, and who knows how good a job those tools did.
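For a flavor of what that ARPAbet-to-IPA step looks like, here's a toy sketch for General American. The phone table is heavily abbreviated and the stress handling is simplified (the primary stress mark is placed directly before the stressed vowel rather than at the syllable boundary, which a real converter would need to handle properly):

```python
# Toy ARPAbet -> IPA converter (abbreviated phone table, simplified
# stress placement). Illustration only, not a complete mapping.
ARPA_TO_IPA = {
    "AA": "ɑ", "AE": "æ", "AH": "ʌ", "EH": "ɛ", "IY": "i",
    "OW": "oʊ", "UW": "u", "ER": "ɝ",
    "HH": "h", "L": "l", "R": "ɹ", "S": "s", "T": "t",
    "K": "k", "M": "m", "N": "n",
}

def arpabet_to_ipa(arpa: str) -> str:
    out = []
    for phone in arpa.split():
        stress = ""
        if phone[-1].isdigit():      # ARPAbet vowels carry stress digits
            if phone[-1] == "1":
                stress = "ˈ"         # primary stress
            phone = phone[:-1]
        out.append(stress + ARPA_TO_IPA.get(phone, phone))
    return "".join(out)

print(arpabet_to_ipa("HH EH1 L OW0"))  # hˈɛloʊ
```

Even this toy version shows where the granularity decisions bite: the table bakes in one accent's vowel qualities, and anything not in the dictionary just passes through untranslated.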
>>43145658
>For sure! And if we're talking EquestrAI, the new Godot version should be out soon.
Nonny...
Last update (download here): >>43144346
And what was added at first (don't download here): >>43140874, >>43140879
Why do you believe voice AI is that stagnant? Afraid of copyright?
>>43146737
I feel like a lot of intellectual force got pushed into LLMs and other text bots, as well as whatever other shiny things look nice to investors (like "this AI can replace your admin/lawyer/doctor wagies" sound bites).
I have a gut feeling the reason RVC and the other stuff from the past years peaked is that it got to the "good enough" level where the clips already sound like the source 90% of the time, but getting the extra 10% to make the lines truly sound like they are spoken by a human is just a bit too difficult without some miraculous eureka moment.
>>43145865
Speaking of innovation, it may sound dumb, but isn't there a way to train a separate speech-to-IPA model on a smaller dataset, so that all it does is learn how to convert spoken words into IPA, and then use that new model to create a proper IPA dataset from all the available audio datasets from the past for the creation of this new TTS model?
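That bootstrap idea breaks into two stages, which can be sketched like this. Everything here is hypothetical: `transcribe_to_ipa` is a stub standing in for whatever speech-to-IPA model would actually get trained (e.g. a CTC phoneme recognizer), and the file names are made up:

```python
# Stage 1: a speech-to-IPA model trained on a small, carefully
# hand-transcribed subset. Stubbed with a lookup table for illustration.
HAND_LABELED = {"clip_001.flac": "hˈɛloʊ", "clip_002.flac": "pˈoʊni"}

def transcribe_to_ipa(clip_path: str) -> str:
    # hypothetical model inference; returns "" for clips it can't handle
    return HAND_LABELED.get(clip_path, "")

# Stage 2: run the recognizer over the full audio archive to produce
# the (audio, IPA) pairs a new TTS model would then train on.
def build_ipa_dataset(clip_paths):
    return [(p, transcribe_to_ipa(p)) for p in clip_paths]

dataset = build_ipa_dataset(["clip_001.flac", "clip_002.flac"])
print(dataset[0])  # ('clip_001.flac', 'hˈɛloʊ')
```

The appeal of the two-stage split is that only Stage 1 needs expensive hand labeling; Stage 2 is just batch inference over data the project already has.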
>9
>>43145865
i do look forward to seeing what you cook up, as using the IPA has the potential of making a semi-universal TTS that can actually use the original character voices and isn't just converting somebody else's voice to kind of sound like /int/ ponies
>>43146337
dayum appul got flank
>>43146737
Why is this thread constrained to just voice?
>>43149017
It used to be all AI-related things, however at one point the AI art & text/LLM models spun off into their own things, which in turn makes the PPP semi-redundant since there are dedicated threads for them, leaving the PPP with other AI stuff that sadly doesn't get as much spotlight (like population simulation, or the ability for TTS models trained on a single-language voice dataset to talk perfectly in multiple other languages).
Also, getting into why the PPP threads are a bit dead now: using art/LLMs for just personal use is much easier, while making stuff with voices always requires the extra steps of making the recording, converting it, then editing it into a larger script.
So you already have a situation where there are several steps that discourage people from making stuff even before they look into making anything voice related, BUT then you have another layer of dealing with Python bullshit beforehand (and as seen in the talks above in this thread, most people do not have enough autism to deal with that bullshit).
tldr its not good in the hood
>>43149621
>it used to be all ai related things
Sort of, but we already hyper-focused on voice way back then.
>It's dedicated to save our beloved pony's voices
>This project is the first part of the "Pony Preservation Project" dealing with the voice.
Then with 15, which is what we are mainly known for in general by those who know of us, we sort of pigeonholed into that even more.
>>43149621
nta, but i'm glad to have been here when this thread was /the/ cutting edge of ai development.
>cake
>science fiction is here
watching it all slowly come to life thanks to a bunch of anonymous strangers on the internet working >for free is something I will never forget
>>43145865
https://arxiv.org/abs/2603.29217
>Advancing LLM-based phoneme-to-grapheme for multilingual speech recognition
Here's something related to your work
Oy