/mlp/ - Pony

File: altOP.jpg (1.26 MB, 2119x1500)
Welcome to the Pony Voice Preservation Project!
youtu.be/730zGRwbQuE

The Pony Preservation Project is a collaborative effort by /mlp/ to build and curate pony datasets for as many applications in AI as possible.

Technology has progressed such that a trained neural network can generate convincing voice clips, drawings and text for any person or character using existing audio recordings, artwork and fanfics as a reference. As you can surely imagine, AI pony voices, drawings and text have endless applications for pony content creation.

AI is incredibly versatile, basically anything that can be boiled down to a simple dataset can be used for training to create more of it. AI-generated images, fanfics, wAIfu chatbots and even animation are possible, and are being worked on here.

Any anon is free to join, and there are many active tasks that would suit any level of technical expertise. If you’re interested in helping out, take a look at the quick start guide linked below and ask in the thread for any further detail you need.

EQG and G5 are not welcome.

>Quick start guide:
docs.google.com/document/d/1PDkSrKKiHzzpUTKzBldZeKngvjeBUjyTtGCOv2GWwa0/edit
Introduction to the PPP, links to text-to-speech tools, and how (You) can help with active tasks.

>The main Doc:
docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit
An in-depth repository of tutorials, resources and archives.

>Online speech generation
haysay.ai
alpha.15.dev

>Active tasks:
Research into animation AI
Research into pony image generation

>Latest developments:
pastebin.com/4p00iUZM

>The PoneAI drive, an archive for AI pony voice content:
drive.google.com/drive/folders/1E21zJQWC5XVQWy2mt42bUiJ_XbqTJXCp

>Clipper’s Master Files, the central location for MLP voice data:
mega.nz/folder/jkwimSTa#_xk0VnR30C8Ljsy4RCGSig
mega.nz/folder/gVYUEZrI#6dQHH3P2cFYWm3UkQveHxQ
drive.google.com/drive/folders/1MuM9Nb_LwnVxInIPFNvzD_hv3zOZhpwx

>Cool, where is the discord/forum/whatever unifying place for this project?
You're looking at it.

Last Thread: https://desuarchive.org/mlp/thread/43073987/#43073987
>>
FAQs:
If your question isn’t listed here, take a look in the quick start guide and main doc to see if it’s already answered there. Use the tabs on the left for easy navigation.
Quick: docs.google.com/document/d/1PDkSrKKiHzzpUTKzBldZeKngvjeBUjyTtGCOv2GWwa0/edit
Main: docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit

>Where can I find the AI text-to-speech tools and how do I use them?
A list of TTS tools: docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit#heading=h.yuhl8zjiwmwq
How to get the best out of them: docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit#heading=h.mnnpknmj1hcy

>Where can I find content made with the voice AI?
In the PoneAI drive: drive.google.com/drive/folders/1E21zJQWC5XVQWy2mt42bUiJ_XbqTJXCp
And the PPP Mega Compilation: docs.google.com/spreadsheets/d/1T2TE3OBs681Vphfas7Jgi5rvugdH6wnXVtUVYiZyJF8/edit

>I want to know more about the PPP, but I can’t be arsed to read the doc.
See the live PPP panel shows presented on /mlp/con for a more condensed overview.
2020 pony.tube/w/5fUkuT3245pL8ZoWXUnXJ4
2021 pony.tube/w/a5yfTV4Ynq7tRveZH7AA8f
2022 pony.tube/w/mV3xgbdtrXqjoPAwEXZCw5
2023 pony.tube/w/fVZShksjBbu6uT51DtvWWz

>How can I help with the PPP?
Build datasets, train AIs, and use the AI to make more pony content. Take a look at the quick start guide for current active tasks, or start your own in the thread if you have an idea. There’s always more data to collect and more AIs to train.

>Did you know that such and such voiced this other thing that could be used for voice data?
It is best to keep to official audio only unless there is very little of it available. If you know of a good source of audio for characters with few (or just fewer) lines, please post it in the thread. 5.1 is generally required unless you have a source already clean of background noise. Preferably post a sample or link. The easier you make it, the more likely it will be done.

>What about fan-imitations of official voices?
No.

>Will you guys be doing a [insert language here] version of the AI?
Probably not, but you're welcome to. You can however get most of the way there by using phonetic transcriptions of other languages as input for the AI.

>What about [insert OC here]'s voice?
It is often quite difficult to find good quality audio data for OCs. If you happen to know any, post them in the thread and we’ll take a look.

>I have an idea!
Great. Post it in the thread and we'll discuss it.

>Do you have a Code of Conduct?
Of course: 15.ai/code

>Is this project open source? Who is in charge of this?
pony.tube/w/mqJyvdgrpbWgZduz2cs1Cm

PPP Redubs:
pony.tube/w/p/aR2dpAFn5KhnqPYiRxFQ97

Stream Premieres:
pony.tube/w/6cKnjJEZSCi3gsvrbATXnC
pony.tube/w/oNeBFMPiQKh93ePqTz1ns8
>>
I wish I could make ai ponies instead of wagecucking
>>
Anything new?
>>
>>43127156
Well, I'm kind of doing something, but as I mentioned last thread, things are going super slow.
At least the Chinese AI video stuff Anons showed a month or two ago is able to make really nice quality ponies (even if the movement itself is still derped)
>>
>https://u.pone.rs/tyfahhli.mp3
>https://u.pone.rs/wzzujnzi.mp3
some Vul stuff found in the wild
>https://u.pone.rs/bedglhrn.mp3
/create/ cover of Mrs Robinson
>>
>https://u.pone.rs/dfezfvrk.mp4
right now it seems all of the cooler video AIs are limited to making 5 seconds (or 20 seconds at best) of footage. Technically, someone with insane patience could generate a bazillion clips and stitch them all together to create a coherent AI episode.
However, seeing how this stuff improves on a yearly basis, I feel like once we get an open source model that can make a whole minute of decent quality AI video, interest in making one should come back among Anons (but only if GPUs stop costing an arm, a leg and both kidneys).
>>
>>43128959
Oh, and I meant to also crosspost this new green screen AI model from the AI art thread (a guy trained a custom Stable Diffusion model to take in a green screen image sequence and output true transparency PNGs, where even wearing green clothes and holding reflective glass still results in almost industry-standard footage masking)
>>43098665
>>43102567
https://www.youtube.com/watch?v=3Ploi723hg4
https://github.com/nikopueringer/CorridorKey
https://github.com/edenaion/EZ-CorridorKey
>>
>>43127073
just let it go, man
it's over
>>
one last bump for good measure
>>
When was the last time we did a redub?
>>
>>43130637
Should I just dm the namefags on fucking discord or something?
What the fuck happened to this thread?
>>
Man, I just spent 3 hours trying to get the RVC UI to run and I finally just gave up. Why do python and gradio have to be so difficult?
>>
>>43131240
Which GitHub repo are you trying to install, and what's your GPU? If you're trying to get the newest stuff, the requirements.txt can wreck your setup because it pulls the newest modules, which are not always compatible with each other and/or the hardware you have.
>>
Do we have a panel for marecon?
>>
>>43131768
As far as I'm aware, nobody has made one so far. If you want to, go for it; it would be cool to see people making something for it.
>>
File: Terri Softmare 2964722.png (580 KB, 2310x2147)
>>43131240
>https://u.pone.rs/hegtqssw.txt
hey bud, I was looking at my own conda installation instructions and it does look like an absolute clusterfuck of patch notes written on top of each other (as seen in the pip freeze above, it's a mess and then some).
One of the things that seems to be highlighted is to make sure the environment is set up as follows:
conda create -n "_name_of_your_env_" python=3.10.3 ipython==9.11.0 --yes

Followed by this sequence:
typing-extensions==4.5.0
pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1+cu116 "tensorflow[and-cuda]" --extra-index-url https://download.pytorch.org/whl/ --no-cache-dir --force-reinstall
conda install curl=8.9.1
pip install omegaconf==2.0.6
pip install abc
pip install omegaconf==2.0.6 --force-reinstall
#(I think installing abc module breaks the omegaconf module?)
pip install --upgrade setuptools==49.6.0 pip --user --force-reinstall
C:\Users\User001\anaconda3\envs\RVC_vul_5\python.exe -m pip install --upgrade setuptools==49.6.0 pip --user --force-reinstall
pip install PyYAML==5.1.2
conda install requirements.txt -c conda-forge

And for some reason there is also this fucking note, because apparently getting PyYAML to work is an adventure on its own:
-------------------------
#requirement error PyYAML (>=5.1.*)
git clone https://github.com/omry/omegaconf/ --branch v2.0.6 --depth 1
##this instructions
cd omegaconf
#go to the file \omegaconf\requirements\base.txt
#change the PyYAML requirement from PyYAML (>=5.1.*) to PyYAML (>=5.1).
#create and install the module
python setup.py sdist
pip install dist/omegaconf-2.0.6.tar.gz
#exit directory
cd ..
-------------------------
Sorry if the above terminal installation steps are confusing; I got RVC working a few years ago and dare not touch that part of the console with a ten foot pole in case it breaks. Please do post if you need more help, I will lurk here and in the AI art thread mostly
>>
>>43131366
>>43132275
Thanks so much for being willing to help out. I’m trying to install the webUI from https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI on a Dell Poweredge R730 server (cpu only, no gpu) to see how well it performs for voice conversion.
I’ll take another stab at it tonight and let you know how it goes. I did eventually get the webui running and I could connect to it from another machine, but the UI immediately displays an error whenever I click any buttons or dropdowns. As far as I can tell, the callbacks defined in the gradio components are not getting called at all (I can insert a print statement in them and it doesn’t print). Something is broken with gradio but I have no stack trace to work with, so it’s very hard to troubleshoot. I’ll try reinstalling it from scratch using the pip freeze; maybe some other module I currently have installed is incompatible with the version of gradio I got.
>>
>>43132364
>(cpu only, no gpu)
hmm, I'm not sure if the RVC code has a default switch to CPU if no GPU is present. It may be worth looking into the main startup Python code and commenting out the GPU device lines, putting something like this in their place:

import torch
# Monkey-patch the availability check so everything downstream reports no GPU
torch.cuda.is_available = lambda: False
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')  # now always 'cpu'

No idea if this will work, I just took it from the first semi-decent looking link on the front page https://community.esri.com/t5/arcgis-image-analyst-questions/how-force-pytorch-to-use-cpu-instead-of-gpu/td-p/1046738
Also, the only RVC I've used was the one from Vul's GitHub, plus something called RVC1006Nvidia for doing the voice training.
>>
>>43131085
>What the fuck happened to this thread?
It achieved everything it set out to do, vastly exceeding expectations in many cases. Now AI progress has generally plateaued and will probably stay that way until the next big breakthrough, at which time the datasets will be there and ready to go.
>>
>>43129728
ngl I actually like this thread as an occasionally recurring general; I miss it when it's not around. Shame it's a bit too slow to stay in the catalogue until bump limit, yeah, but I imagine with enough advancements in AI tech it could swing back around eventually. Local video gen becoming way more accessible will probably give the PPP a good boost.
>>
>>43132275
>>43132364
>>43132480
Welp, I reinstalled everything from scratch using a different Python version, going through dependency hell again, but I still got the same nonsensical "connection" error. There are no logs and no stack trace, even if I set debug=True in the app.queue call in infer-web.py. The webapp and my browser can talk to each other over the default port 7865 and both have internet access, so I don't see how this can actually be a connection error, unless gradio is trying to reach a nonexistent website for some silly reason. Upgrading fastapi did not help; upgrading gradio created a package incompatibility.

I looked into the cpu vs cuda thing, and it looks like it gracefully switches to cpu if you don't have a gpu. See line ~175 in configs/config.py.

I'm not going to keep trying to get the RVC web client working, unless someone points out a quick fix that I've somehow completely missed. Instead, I will tear out all the gradio stuff and load the model on a simple Flask server that I can make API calls to, for the simple testing I wanted to do. I would have tried Vul's older RVC GUI, but I see PyQt code in there, so I think it's a natively-running UI; I require a web UI... unless I install an X server, which I suppose I could do.
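For reference, a bare-bones version of that Flask approach could look something like this. To be clear, load_model and convert here are hypothetical placeholders for whatever RVC's actual checkpoint loading and inference calls turn out to be, not real RVC functions; the only real point is loading the model once at startup and exposing a single POST endpoint:

```python
import io

from flask import Flask, request, send_file

app = Flask(__name__)

def load_model():
    # Hypothetical stand-in: load the RVC checkpoint once at startup
    # instead of once per request.
    return object()

MODEL = load_model()

def convert(model, audio_bytes: bytes) -> bytes:
    # Hypothetical stand-in for the actual RVC inference call;
    # this placeholder just echoes the input audio back.
    return audio_bytes

@app.route("/convert", methods=["POST"])
def convert_endpoint():
    # Expect a multipart upload with the source clip under "audio"
    audio = request.files["audio"].read()
    out = convert(MODEL, audio)
    return send_file(io.BytesIO(out), mimetype="audio/wav")
```

Then any machine on the network can hit it with something like `curl -F audio=@clip.wav http://server:5000/convert -o out.wav`, no gradio involved.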
>>
>>43133460
Darn, I wish I knew how to fix that, but my usual approach to fixing Python fuckery is to lurk on ever weirder coder websites and start throwing crap at the wall to see what sticks.
>>
>>43131768
>>43131867
I'd be excited to see whatever gets pulled up for it. I dunno if there's anything super notable that's been done lately, but even a panel showing off pony AI voice covers for music or something would be fun. No pressure though, PPP's kind of in slow mode at the moment, at least to some extent.
>>
>>43133659
+1 to that
>>
>>43133460
>>43133540
Success! A simple Flask server worked for my purposes. The server takes ~6 seconds to convert a 1-second audio file that it hasn't seen before (if it *has* seen it before, then it's about 3 seconds).

I've also noticed that the RVC codebase does not keep the index file in memory after converting a file; it dumps and reloads it on the next conversion. That's significant, because those index files can be hundreds of megabytes. I bet I could cut the conversion time down even more by keeping the index file cached.
>>
>>43136969
>A simple Flask server worked for my purposes
I am curious to see what you are cooking up Anon
>>
>>43137346
I do have some projects in mind, but right now I'm in an exploration/discovery phase, so I'll keep the details under wraps until I'm sure my ideas will work from a technical standpoint.
>>
bump
>>
>>43136753
Oh, I actually thought about that; it was more so for EquestrAI being shown off to everyone, but you could totally do one with text, and then have another Anon trying to make images based on whatever's going on to go with the stream. That could be really fun, just a giant AI ensemble of all the tech we've got, showcasing it all at once. Could even throw in voice gen for dialogue as well.
>>
>>43139521
horse
>>
what a busy weekend
>>
Is Applio any good?
>>
>>43139521
>>
>>43139521
>>
>>43140304
There is still /mlp/con I guess.
>>
>>43144405
Seeing Anons come up with stuff like the /chag/ AI light novel and the bonziPONY desktop pony ideas brings me joy; despite lots of people pushing the idea that anything involving AI is slop, there are still plenty of people out there bringing ponies to life in their own way.
>>
Boop
>>
>>43144988
Oh absolutely. Was the AI light novel shown off at Marecon like Bonzipony was? And if so, what block was it at? I'm not familiar with the light novel thing. Unless you mean the VN as in EquestrAI then yeah I absolutely agree.

>>43144405
For sure! And if we're talking EquestrAI, the new Godot version should be out soon. They'll be able to do a panel for it there easy, and since it's fully open source there could even be a collab with adding voice gen or something to it. It'd be really neat to have some more rep for the AI side of the board there, not that we haven't had plenty of panels for it already.
>>
File: UI preview.png (346 KB, 1928x1848)
Making progress on the IPA Translator app. To be honest, though, I've lost some motivation for this little project because I've come to learn that there are many nontrivial issues with phonemic/phonetic transcription, so no solution I come up with will be perfect. There are literally infinitely many sounds that the human voice can create, so you have to pick a level of acceptable granularity. Accents have a huge impact on phonetic transcription, and you have to pick a level of granularity on accent, too. For this project, I'm following Wiktionary's English pronunciation appendix for "General American",
https://en.wiktionary.org/wiki/Appendix:English_pronunciation
though I'm not at all sure that's the right choice. Maybe I should presume Canadian English because many of the voice actresses are Canadian? Or maybe let users switch between American and Canadian modes? And then Rarity, of course, speaks with more of a Trans-Atlantic accent. Auto-translating this stuff is going to be sketchy, and you almost need to be trained in phonetics to do a correct transcription. At the same time, though, all this messiness highlights the need for cleaner training data. I'm sure that most of the training the PPP has done so far has involved some degree of auto-transcription from plaintext to IPA or ARPABET, and who knows how good a job those tools did.
>>
>>43145658
>For sure! And if we're talking EquestrAI, the new Godot version should be out soon.
Nonny...
Last update (download here): >>43144346
And what was added at first (don't download here): >>43140874, >>43140879
>>
File: bump.gif (332 KB, 675x675)
>>
Why do you believe voice AI is that stagnant? Afraid of copyright?
>>
>>43146737
I feel like a lot of intellectual force got pushed into LLMs and other text bots, as well as whatever other shiny things look nice to investors (like "this AI can replace your admin/lawyer/doctor wagies" sound bites).
I have a gut feeling that RVC and the other stuff from the past few years peaked because it got to the "good enough" level where the clips already sound like the source 90% of the time, but getting the extra 10% to make the lines truly sound like they're spoken by a human is just a bit too difficult without some miraculous eureka moment.
>>43145865
Speaking of innovation, it may sound dumb, but isn't there a way to train a separate speech-to-IPA model on a smaller dataset, whose only job is learning to convert spoken words into IPA, and then use that new model to build proper IPA datasets out of all the available audio datasets from the past, for the creation of this new-new TTS model?
>>
>9
>>
>>43145865
i do look forward to seeing what you cook up, as using IPA has the potential of making a semi-universal TTS that can actually use the original character voices and isn't just converting somebody else's voice to kind of sound like /int/ ponies
>>
>>43146337
dayum appul got flank
>>
>>43146737
Why is this thread constrained to just voice?
>>
>>43149017
it used to be all AI-related things, however at one point the AI art & text/LLM models spun off into their own threads, which in turn made the PPP semi-redundant. Since there are dedicated threads for those, the PPP is left with the other AI stuff that sadly doesn't get as much spotlight (like population simulation, or the ability of TTS models trained on a single-language voice dataset to speak perfectly in multiple other languages).
Also, as to why the PPP threads are a bit dead now: using art/LLMs for personal use is much easier, while making stuff with voices always requires the extra steps of making the recording, converting it, then editing it into a larger script.
So you already have a situation where several steps discourage people from making stuff even before they look into making anything voice-related, BUT then you have another layer of dealing with Python bullshit beforehand (and as seen in the talks above in this thread, most people do not have enough autism to deal with that bullshit).

tldr its not good in the hood
>>
>>43149621
>it used to be all ai related things
Sort of, but we were already hyper-focused on voice way back then.
>It's dedicated to save our beloved pony's voices
>This project is the first part of the "Pony Preservation Project" dealing with the voice.
Then with 15, which is what we are mainly known for by those who know of us, we sort of pigeonholed ourselves into that even more.
>>
>>43149621
nta, but i'm glad to have been here when this thread was /the/ cutting edge of ai development.
>cake
>science fiction is here
watching it all slowly come to life thanks to a bunch of anonymous strangers on the internet working >for free is something I will never forget
>>
>>43145865
https://arxiv.org/abs/2603.29217
>Advancing LLM-based phoneme-to-grapheme for multilingual speech recognition
Here's something related to your work
>>
Oy


