/mlp/ - Pony Preservation Project (Thread 158) - Pony



File: altOP.jpg (1.26 MB, 2119x1500)

Pony Preservation Project (Thread 158) Anonymous 02/28/26(Sat)14:50:08 No.43073987

Welcome to the Pony Voice Preservation Project!
youtu.be/730zGRwbQuE

The Pony Preservation Project is a collaborative effort by /mlp/ to build and curate pony datasets for as many applications in AI as possible.

Technology has progressed such that a trained neural network can generate convincing voice clips, drawings and text for any person or character using existing audio recordings, artwork and fanfics as a reference. As you can surely imagine, AI pony voices, drawings and text have endless applications for pony content creation.

AI is incredibly versatile: basically anything that can be boiled down to a simple dataset can be used for training to create more of it. AI-generated images, fanfics, wAIfu chatbots and even animation are possible, and are being worked on here.

Any anon is free to join, and there are many active tasks that would suit any level of technical expertise. If you’re interested in helping out, take a look at the quick start guide linked below and ask in the thread for any further detail you need.

EQG and G5 are not welcome.

>Quick start guide:
docs.google.com/document/d/1PDkSrKKiHzzpUTKzBldZeKngvjeBUjyTtGCOv2GWwa0/edit
Introduction to the PPP, links to text-to-speech tools, and how (You) can help with active tasks.

>The main Doc:
docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit
An in-depth repository of tutorials, resources and archives.

>Online speech generation
haysay.ai
alpha.15.dev

>Active tasks:
Research into animation AI
Research into pony image generation

>Latest developments:
pastebin.com/4p00iUZM

>The PoneAI drive, an archive for AI pony voice content:
drive.google.com/drive/folders/1E21zJQWC5XVQWy2mt42bUiJ_XbqTJXCp

>Clipper’s Master Files, the central location for MLP voice data:
mega.nz/folder/jkwimSTa#_xk0VnR30C8Ljsy4RCGSig
mega.nz/folder/gVYUEZrI#6dQHH3P2cFYWm3UkQveHxQ
drive.google.com/drive/folders/1MuM9Nb_LwnVxInIPFNvzD_hv3zOZhpwx

>Cool, where is the discord/forum/whatever unifying place for this project?
You're looking at it.

Last Thread: https://desuarchive.org/mlp/thread/42972267/42972267

Anonymous 02/28/26(Sat)14:50:38 No.43073988

FAQs:
If your question isn’t listed here, take a look in the quick start guide and main doc to see if it’s already answered there. Use the tabs on the left for easy navigation.
Quick: docs.google.com/document/d/1PDkSrKKiHzzpUTKzBldZeKngvjeBUjyTtGCOv2GWwa0/edit
Main: docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit

>Where can I find the AI text-to-speech tools and how do I use them?
A list of TTS tools: docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit#heading=h.yuhl8zjiwmwq
How to get the best out of them: docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit#heading=h.mnnpknmj1hcy

>Where can I find content made with the voice AI?
In the PoneAI drive: drive.google.com/drive/folders/1E21zJQWC5XVQWy2mt42bUiJ_XbqTJXCp
And the PPP Mega Compilation: docs.google.com/spreadsheets/d/1T2TE3OBs681Vphfas7Jgi5rvugdH6wnXVtUVYiZyJF8/edit

>I want to know more about the PPP, but I can’t be arsed to read the doc.
See the live PPP panel shows presented on /mlp/con for a more condensed overview.
2020 pony.tube/w/5fUkuT3245pL8ZoWXUnXJ4
2021 pony.tube/w/a5yfTV4Ynq7tRveZH7AA8f
2022 pony.tube/w/mV3xgbdtrXqjoPAwEXZCw5
2023 pony.tube/w/fVZShksjBbu6uT51DtvWWz

>How can I help with the PPP?
Build datasets, train AIs, and use the AI to make more pony content. Take a look at the quick start guide for current active tasks, or start your own in the thread if you have an idea. There’s always more data to collect and more AIs to train.

>Did you know that such and such voiced this other thing that could be used for voice data?
It is best to keep to official audio only unless there is very little of it available. If you know of a good source of audio for characters with few (or just fewer) lines, please post it in the thread. 5.1 is generally required unless you have a source already clean of background noise. Preferably post a sample or link. The easier you make it, the more likely it will be done.

>What about fan-imitations of official voices?
No.

>Will you guys be doing a [insert language here] version of the AI?
Probably not, but you're welcome to. You can however get most of the way there by using phonetic transcriptions of other languages as input for the AI.

>What about [insert OC here]'s voice?
It is often quite difficult to find good quality audio data for OCs. If you happen to know any, post them in the thread and we’ll take a look.

>I have an idea!
Great. Post it in the thread and we'll discuss it.

>Do you have a Code of Conduct?
Of course: 15.ai/code

>Is this project open source? Who is in charge of this?
pony.tube/w/mqJyvdgrpbWgZduz2cs1Cm

PPP Redubs:
pony.tube/w/p/aR2dpAFn5KhnqPYiRxFQ97

Stream Premieres:
pony.tube/w/6cKnjJEZSCi3gsvrbATXnC
pony.tube/w/oNeBFMPiQKh93ePqTz1ns8

Anonymous 02/28/26(Sat)16:26:24 No.43074165

>>43073987
I have an idea/proposal I wanted to bounce off of you all.
The "Sliced Dialog" folder has plain-English transcriptions of the dialog. This is somewhat limiting because:
1. It doesn't capture non-word sounds or disordered speech very well, like grunts and breathing.
2. Translating the text to IPA for training has challenges:
i. If a simple text-to-IPA translation is performed during model training, you can end up with bad input data for homographs.
ii. Generic text-to-IPA translators might not translate pony-specific words very well, like "Equestria" and "Fluttershy".

Would it be worth it to translate the entirety of the training set into extended IPA for better training data? I could work on a UI to assist with the translation (e.g. providing an automated first guess and tools for tweaking the result before saving it to a file). If so, is there any other information we should record while going through the dataset again? Like, additional emotion tags or whatnot?

Also, for new developers interested in diving into the world of TTS and voice cloning, here's an awesome new resource I'd recommend:
https://www.youtube.com/playlist?list=PL-wATfeyAMNorsfMFg0ISfD0rPDpMHA4R

Anonymous 02/28/26(Sat)21:08:35 No.43074676

>>43074165
Anybody else out there? bumping.

For an automated first guess at translating plain English to IPA, I'm thinking something along these lines:
1. Normalize the plain text (e.g. "2" -> "two", "Mr." -> "mister", etc.)
2. Perform a part-of-speech analysis of the sentence. Tag each word with its part of speech.
3. Look up each word in the espeak dictionary. espeak has direct IPA translations for a bunch of words.
4. Also look up the word in the CMU dictionary. This dictionary has MANY words but is in arpabet, not IPA. Translate the arpabet to IPA.
5. If a word has more than one possible IPA translation, then use the part of speech tag to narrow down the possibilities. If you *still* have more than one possible IPA translation, then pick one at random (but prefer espeak-derived translations over CMUDict translations).
6. If we couldn't find the word in either dictionary, then fall back on either espeak heuristic rules or an ML model to generate an IPA translation.
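
The steps above could be sketched roughly like this. It's a toy, not a real implementation: the two tiny lexicons, the abbreviation table, the hand-supplied part-of-speech tags and every function name are made up for illustration. Real espeak and CMUdict lookups and a real tagger would replace them. It also takes the first remaining candidate instead of a random one, so the output is repeatable.

```python
import re

# Toy stand-ins for the real dictionaries (hypothetical data).
# Each word maps to a list of (part_of_speech or None, ipa) candidates.
ESPEAK_LEXICON = {
    "read": [("VBD", "ɹɛd"), ("VB", "ɹiːd")],
    "the": [(None, "ðə")],
}
# Assumed to be already converted from ARPAbet to IPA (step 4).
CMU_LEXICON_IPA = {
    "read": [(None, "ɹiːd")],
    "book": [(None, "bʊk")],
    "i": [(None, "aɪ")],
}
ABBREVIATIONS = {"mr.": "mister", "2": "two"}


def normalize(text):
    """Step 1: lowercase, expand known abbreviations/digits, strip punctuation."""
    words = []
    for token in text.lower().split():
        token = ABBREVIATIONS.get(token, token)
        token = re.sub(r"[^a-z']", "", token)
        if token:
            words.append(token)
    return words


def candidates(word):
    """Steps 3-4: collect candidates, espeak entries first so they win ties."""
    return ESPEAK_LEXICON.get(word, []) + CMU_LEXICON_IPA.get(word, [])


def fallback_g2p(word):
    """Step 6 placeholder: a real system would call espeak rules or an ML model."""
    return "<" + word + ">"


def to_ipa(words, pos_tags):
    """Step 5: narrow candidates by POS tag (step 2 output), else take the first."""
    out = []
    for word, tag in zip(words, pos_tags):
        cands = candidates(word)
        if not cands:
            out.append(fallback_g2p(word))
            continue
        matching = [ipa for pos, ipa in cands if pos == tag]
        out.append(matching[0] if matching else cands[0][1])
    return out


words = normalize("I read the book")
print(to_ipa(words, ["PRP", "VBD", "DT", "NN"]))  # past tense "read"
print(to_ipa(words, ["PRP", "VB", "DT", "NN"]))   # present tense "read"
```

The two calls differ only in the tag for "read", which is the homograph case that a plain dictionary lookup gets wrong.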

g2p-en already does something pretty similar, so I might just fork g2p-en and include the CMU Dictionary and pony-specific word translations.
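
For the CMU Dictionary and pony-word part, here's a rough sketch of an ARPAbet-to-IPA converter with a pony override table. This is not g2p-en's code. The pony pronunciations are my own hand-written guesses, the function names are invented, and putting the stress mark directly in front of the stressed vowel is a simplification (proper IPA puts it at the start of the syllable, which needs syllabification).

```python
# CMUdict phoneme set mapped to IPA (stress digits handled separately).
ARPABET_TO_IPA = {
    "AA": "ɑ", "AE": "æ", "AH": "ʌ", "AO": "ɔ", "AW": "aʊ", "AY": "aɪ",
    "B": "b", "CH": "tʃ", "D": "d", "DH": "ð", "EH": "ɛ", "ER": "ɝ",
    "EY": "eɪ", "F": "f", "G": "ɡ", "HH": "h", "IH": "ɪ", "IY": "i",
    "JH": "dʒ", "K": "k", "L": "l", "M": "m", "N": "n", "NG": "ŋ",
    "OW": "oʊ", "OY": "ɔɪ", "P": "p", "R": "ɹ", "S": "s", "SH": "ʃ",
    "T": "t", "TH": "θ", "UH": "ʊ", "UW": "u", "V": "v", "W": "w",
    "Y": "j", "Z": "z", "ZH": "ʒ",
}

# Hand-written guesses at pony-specific pronunciations (hypothetical entries).
PONY_LEXICON = {
    "equestria": "IH0 K W EH1 S T R IY0 AH0",
    "fluttershy": "F L AH1 T ER0 SH AY2",
}


def arpabet_to_ipa(pron):
    """Convert a space-separated ARPAbet string to an IPA string."""
    out = []
    for symbol in pron.split():
        stress = symbol[-1] if symbol[-1].isdigit() else ""
        base = symbol.rstrip("012")
        if base == "AH" and stress == "0":
            ipa = "ə"   # unstressed AH is a schwa
        elif base == "ER" and stress == "0":
            ipa = "ɚ"   # unstressed ER is an r-colored schwa
        else:
            ipa = ARPABET_TO_IPA[base]
        mark = {"1": "ˈ", "2": "ˌ"}.get(stress, "")
        out.append(mark + ipa)
    return "".join(out)


def lookup(word, cmudict):
    """Pony overrides win over CMUdict; returns None for unknown words.

    `cmudict` is assumed to be a plain dict of word -> ARPAbet string.
    """
    key = word.lower()
    pron = PONY_LEXICON.get(key) or cmudict.get(key)
    return arpabet_to_ipa(pron) if pron else None


print(lookup("Fluttershy", {}))                     # flˈʌtɚʃˌaɪ
print(lookup("pony", {"pony": "P OW1 N IY0"}))      # pˈoʊni
```

Checking the override table first means a wrong or missing CMUdict entry for a show-specific name never reaches the training data.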

Anonymous 03/01/26(Sun)02:54:44 No.43075209

>10

Anonymous 03/01/26(Sun)05:42:36 No.43075341

lol u ded

Anonymous 03/01/26(Sun)11:35:21 No.43075887

bump

Anonymous 03/01/26(Sun)13:07:48 No.43076031

It's over for pony voices...

Anonymous 03/01/26(Sun)16:22:41 No.43076407

Would there happen to be any image datasets by the PPP? I'm not very keen with audio, but I've wanted to try toying with vision models and so far have downloaded around 25GB of filtered art from derpi

Anonymous 03/01/26(Sun)18:43:48 No.43076734

>>43076407
The main doc has a link to a Google Drive for the animation files. The download also contains SWF backgrounds and stills. It's all packaged up into .7z files, though, so you have to download everything even if you don't need the animation files (over 100GB).

Anonymous 03/02/26(Mon)02:44:34 No.43077284

Instead of a soulless bump, let me repost a cover I made recently of Mrs. Robinson with custom pony lyrics.
>https://u.pone.rs/iltnpniy.flac

I posted it in >>43032059, but maybe someone here missed it and would like to hear it.

Anonymous 03/02/26(Mon)10:30:27 No.43077747

.

Anonymous 03/02/26(Mon)14:44:06 No.43078261

>>43077747

Anonymous 03/02/26(Mon)14:47:33 No.43078270

Never got into that ai stuff

Anonymous 03/02/26(Mon)17:15:18 No.43078637

Does this https://www.youtube.com/@PonyVerse/videos sound AI-generated?
One of the vocalists appears multiple times and sounds similar, but not too similar.

Anonymous 03/02/26(Mon)18:32:41 No.43078884

>>43078637
Yes, PonyVerse is AI music.

Anonymous 03/02/26(Mon)20:43:42 No.43079116

>>43078884
ofc in the comments he says it's 0% ai...
