/mlp/ - Pony

File: altOP.jpg (28 KB, 250x176)
Welcome to the Pony Voice Preservation Project!
youtu.be/730zGRwbQuE

The Pony Preservation Project is a collaborative effort by /mlp/ to build and curate pony datasets for as many applications in AI as possible.

Technology has progressed such that a trained neural network can generate convincing voice clips, drawings and text for any person or character using existing audio recordings, artwork and fanfics as a reference. As you can surely imagine, AI pony voices, drawings and text have endless applications for pony content creation.

AI is incredibly versatile, basically anything that can be boiled down to a simple dataset can be used for training to create more of it. AI-generated images, fanfics, wAIfu chatbots and even animation are possible, and are being worked on here.

Any anon is free to join, and there are many active tasks that would suit any level of technical expertise. If you’re interested in helping out, take a look at the quick start guide linked below and ask in the thread for any further detail you need.

EQG and G5 are not welcome.

>Quick start guide:
docs.google.com/document/d/1PDkSrKKiHzzpUTKzBldZeKngvjeBUjyTtGCOv2GWwa0/edit
Introduction to the PPP, links to text-to-speech tools, and how (You) can help with active tasks.

>The main Doc:
docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit
An in-depth repository of tutorials, resources and archives.

>Online speech generation
haysay.ai
alpha.15.dev

>Active tasks:
Research into animation AI
Research into pony image generation

>Latest developments:
pastebin.com/4p00iUZM

>The PoneAI drive, an archive for AI pony voice content:
drive.google.com/drive/folders/1E21zJQWC5XVQWy2mt42bUiJ_XbqTJXCp

>Clipper’s Master Files, the central location for MLP voice data:
mega.nz/folder/jkwimSTa#_xk0VnR30C8Ljsy4RCGSig
mega.nz/folder/gVYUEZrI#6dQHH3P2cFYWm3UkQveHxQ
drive.google.com/drive/folders/1MuM9Nb_LwnVxInIPFNvzD_hv3zOZhpwx

>Cool, where is the discord/forum/whatever unifying place for this project?
You're looking at it.

Last Thread: >>42429020
>>
Are we back?
>>
>>42695963
Probably not
That guy that tries to wipe the board every morning changed tactics to posting old generals instead of his usual frogs or hyperfat trixies
>>
>>42696433
I took a look and didn't see anything wrong with the OP itself, even if it was made for nefarious reasons.
We could just use it normally and see if we survive or not.
>>
>>42695963
What actually happened?
>>
>>42696473
The general's been slow lately, and with all the recent raids and slides, it kept dying over and over.
I guess anons figured it wasn't worth reviving under those conditions.
Even if activity's low, I still think keeping the general around is worth it for the OP's resources alone.
>>
>>42696793
Pretty much this.
>>
>>42695963
For now.
>>
maybe we should merge with one of the other ai generals
>>
>>42698295
Such as?
>>
>>42698295
They both already have a ton of stuff in their OPs, so I don't think we'd fit. /chag/ is too fast, and we'd just get buried. AI art has too much drama, and PPP is distinct enough that I don't think it would work with either of them.
>>
>10
>>
Hello /mlp/. I've not lurked here in a couple years. Last time I was here, I remember someone was trying to use AI to generate flash animations. Did anything come of that?
>>
>>42699973
There's still no AI for flash models working with actual vector graphics, just the same old ones generating inconsistent slop.
>>
File: BUMP.png (115 KB, 287x453)
>>
>>42696433
big sad if that's true, anyhow, here is part 2 of the OP

FAQs:
If your question isn’t listed here, take a look in the quick start guide and main doc to see if it’s already answered there. Use the tabs on the left for easy navigation.
Quick: docs.google.com/document/d/1PDkSrKKiHzzpUTKzBldZeKngvjeBUjyTtGCOv2GWwa0/edit
Main: docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit

>Where can I find the AI text-to-speech tools and how do I use them?
A list of TTS tools: docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit#heading=h.yuhl8zjiwmwq
How to get the best out of them: docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit#heading=h.mnnpknmj1hcy

>Where can I find content made with the voice AI?
In the PoneAI drive: drive.google.com/drive/folders/1E21zJQWC5XVQWy2mt42bUiJ_XbqTJXCp
And the PPP Mega Compilation: docs.google.com/spreadsheets/d/1T2TE3OBs681Vphfas7Jgi5rvugdH6wnXVtUVYiZyJF8/edit

>I want to know more about the PPP, but I can’t be arsed to read the doc.
See the live PPP panel shows presented on /mlp/con for a more condensed overview.
2020 pony.tube/w/5fUkuT3245pL8ZoWXUnXJ4
2021 pony.tube/w/a5yfTV4Ynq7tRveZH7AA8f
2022 pony.tube/w/mV3xgbdtrXqjoPAwEXZCw5
2023 pony.tube/w/fVZShksjBbu6uT51DtvWWz

>How can I help with the PPP?
Build datasets, train AIs, and use the AI to make more pony content. Take a look at the quick start guide for current active tasks, or start your own in the thread if you have an idea. There’s always more data to collect and more AIs to train.

>Did you know that such and such voiced this other thing that could be used for voice data?
It is best to keep to official audio only unless there is very little of it available. If you know of a good source of audio for characters with few (or just fewer) lines, please post it in the thread. 5.1 is generally required unless you have a source already clean of background noise. Preferably post a sample or link. The easier you make it, the more likely it will be done.

>What about fan-imitations of official voices?
No.

>Will you guys be doing a [insert language here] version of the AI?
Probably not, but you're welcome to. You can however get most of the way there by using phonetic transcriptions of other languages as input for the AI.

>What about [insert OC here]'s voice?
It is often quite difficult to find good quality audio data for OCs. If you happen to know any, post them in the thread and we’ll take a look.

>I have an idea!
Great. Post it in the thread and we'll discuss it.

>Do you have a Code of Conduct?
Of course: 15.ai/code

>Is this project open source? Who is in charge of this?
pony.tube/w/mqJyvdgrpbWgZduz2cs1Cm

PPP Redubs:
pony.tube/w/p/aR2dpAFn5KhnqPYiRxFQ97

Stream Premieres:
pony.tube/w/6cKnjJEZSCi3gsvrbATXnC
pony.tube/w/oNeBFMPiQKh93ePqTz1ns8
>>
Anchor.
>>
>>42703371
Nice to see the PPP making a comeback. If anybody is interested, I am starting to make a dataset focused on creating text parodies based on input lyrics. So far I have saved 1000+ songs based on the work of /v/ the Musical, but I would love some input and help from people in here as well, both to collect more pony-based text parodies and to get general opinions on how such a dataset should be formatted for the most flexibility.
The end result will be uploaded to huggingface (+ mega and other archival sites for good measure), and the purpose is to use the dataset in the future to train text-based models to better understand the process of making lyrics based around the theme of ponies.
So far the idea is to save both the original lyrics and the parodies as text, and use the original song filename as the primary ID inside a master JSON file that would also store some meta information about it, e.g.
(JSON master file format)
base_filename, og_language, parody_count
Jose_Gonzalez_Far_Away, ENG, 1

(Folder dataset format)
master_index.json
lyrics_dataset/
Jose_Gonzalez_Far_Away/
Jose_Gonzalez_Far_Away_00.txt
Jose_Gonzalez_Far_Away_01.txt
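
A minimal sketch of how master_index.json could be generated from the folder layout above, rather than maintained by hand. The function names are hypothetical, and two things are assumptions not spelled out in this post: that the `_00` file is always the source song, and that `og_language` defaults to ENG until it is read from the files themselves.

```python
import json
from pathlib import Path


def build_master_index(dataset_dir):
    """Scan lyrics_dataset/ and return one index row per song folder."""
    rows = []
    for song_dir in sorted(Path(dataset_dir).iterdir()):
        if not song_dir.is_dir():
            continue
        # Assumption: <name>_00.txt is the source, every other file is a parody.
        txt_files = sorted(song_dir.glob(f"{song_dir.name}_*.txt"))
        parody_count = sum(1 for f in txt_files if not f.stem.endswith("_00"))
        rows.append({
            "base_filename": song_dir.name,
            "og_language": "ENG",  # assumption: placeholder default
            "parody_count": parody_count,
        })
    return rows


def write_master_index(dataset_dir, index_path):
    """Write the index rows as JSON and return them."""
    rows = build_master_index(dataset_dir)
    Path(index_path).write_text(json.dumps(rows, indent=2), encoding="utf-8")
    return rows
```

Calling `write_master_index("lyrics_dataset", "master_index.json")` would produce a JSON list with the same three columns as the example row above.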

The text files would be formatted as below, to mimic a tuple-like Key-Value labeling format that is both machine readable and easily understandable by people editing/reading the files (consistency in the format will make the dataset clearer to use and easier to adapt to whatever future model training needs):
_Type=Source/Parody (self explanatory)
_Language=ENG (99% of original songs would be in English, but it may be a good idea to add this in case of expanding into non-English stuff in the future)
_Song_Name= (this is different from the filename; it may not be that useful in general training, however I don't think including it will hurt the dataset since it can be filtered out)
_Similarity_Score=N_A (how much the parody lyrics differ from the original, e.g. the parody is 0.82 the same as the source text; this part will need some automated figuring out in the future)
_Lyrics:
(self explanatory)
_Outline= (space for meta information that, while it may not be directly present in the lyrics, still influences the concept used above)
_Prompt=Make a parody about... (this section would be written as a reverse-engineered idea of what somebody would casually ask the text model in order to generate the parody lyrics)
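
To check that the format really is machine readable, here is a rough parser sketch. The function name and the `KNOWN_KEYS` set are hypothetical. It assumes every field except `_Lyrics:` fits on one line, and that no lyric line starts with one of the known `_Key=` prefixes.

```python
# Single-line fields from the format above; _Lyrics: is handled separately.
KNOWN_KEYS = {"Type", "Language", "Song_Name", "Similarity_Score", "Outline", "Prompt"}


def parse_lyrics_file(text):
    """Parse one dataset text file into a dict with a multi-line 'Lyrics' entry."""
    record = {}
    lyrics_lines = []
    in_lyrics = False
    for line in text.splitlines():
        if line.strip() == "_Lyrics:":
            in_lyrics = True
            continue
        if line.startswith("_") and "=" in line:
            key, value = line[1:].split("=", 1)
            if key in KNOWN_KEYS:
                record[key] = value.strip()
                in_lyrics = False  # a known field ends the lyrics block
                continue
        if in_lyrics:
            lyrics_lines.append(line)
    record["Lyrics"] = "\n".join(lyrics_lines).strip()
    return record
```

Blank lines between verses are kept inside `Lyrics`, so the verse structure survives a round trip.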

I have a very big request for all of (you)s. Searching for, finding and correctly formatting the pony parody songs is a pretty big step, and I do not expect anyone to drop whatever project they are working on. However, if anybody is willing to help (either by sending a song link with the name of the original song, lyrics transcriptions, or even a fully formatted txt to drop into the dataset), please do contact me by email at anonTmpDataset@proton.me .
>>
>>42704020
>comeback
>was on page 9 without replies for almost 5 hours
it will get archived by tomorrow and we won't see another thread for 3 weeks
>>
>>42704020
Here is a small sample of what the dataset of original file + parody file would look like.
(actual filename: Jose_Gonzalez_Far_Away_00)
_Type=Source
_Language=ENG
_Song_Name=José González - Far Away
_Similarity_Score=N_A
_Lyrics:
Step in front of a runaway train
Just to feel alive again
Pushing forward through the night
Aching chest and blurry sight

It's so far, so far away
It's so far, so far away
_Outline=Source Song
_Prompt=Source Song

(actual filename: Jose_Gonzalez_Far_Away_01)
_Type=Parody
_Language=ENG
_Song_Name=05_Red Dead Depression
_Similarity_Score=N_A
_Lyrics:
Have you seen the new Red Dead game?
Trailers driving me insane
Once again I've been denied
PC port is out of sight

It's so far, so far away
It's so far, so far away
_Outline=Narrator is upset that the new game Red Dead Redemption 2 was not ported to the PC market, as it was first published in October 2018 on PlayStation 4 and Xbox One consoles. The PC port was delayed to November 2019.
_Prompt=Make a parody about the Red Dead Redemption 2 not being available on PC.
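
One possible way to fill in `_Similarity_Score` automatically, as a sketch only: compare the two lyric texts word by word with Python's standard `difflib`. The function name is hypothetical and the word-level ratio is just one candidate metric. It says nothing about rhyme or meter, only how many words stay in place.

```python
import difflib
import re


def similarity_score(source_lyrics, parody_lyrics):
    """Word-level similarity between two lyric texts, from 0.0 to 1.0."""
    def tokenize(text):
        # Lowercase words only, so punctuation and casing do not count as changes.
        return re.findall(r"[a-z0-9']+", text.lower())

    matcher = difflib.SequenceMatcher(
        None, tokenize(source_lyrics), tokenize(parody_lyrics), autojunk=False
    )
    return round(matcher.ratio(), 2)
```

For a pair like the one above, the shared chorus pulls the score up while the rewritten verse pulls it down.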
>>42704028
There has been lots of stuff going on on the board for the past few months, including all the flooding spam and "1 post by OP" threads. However, I feel like there has been enough doomposting, since that's the easy route.
It's time to return to what made the PPP great in the first place, making poni waifus real.
>>
>https://www.youtube.com/watch?v=qGe_fq68x-Q
>96GB Gpu
>cost between 500~1000$
If it weren't limited to Huawei motherboards/servers, it could be a strong alternative to all the Nvidia/AMD nonsense.
>>
>https://www.youtube.com/watch?v=rJ9LjgBddLg
>https://www.youtube.com/watch?v=MH8P0KTWSAI
Let's enjoy some nicer pony songs.
And now that the anni G4 block has ended, it's time to work on new songs.
>>
>>42704071
Hard to do that when there's hours between replies, more than enough time for a sliderfag to push the thread off overnight.
If we want this thread to survive, people need to be more active. But that's hard when half of us have already fled to the other AI threads.
>>
>>42704892
A single schizo is keeping ten threads alive on the catalog with just bumps.
Surely we can keep one alive.
>>
>>42702639
Is it still being worked on or has it been abandoned?
>>
>>42704585
Do they have like wine for hardware
>>
>>42705275
From what I understand, it's kind of in limbo.
>>
>>42705281
From the sound of it, it's 100% their own closed Chinese software. I would imagine that if it was openly available in stores, proper Linux nerds could figure something out. Sadly, the only way to get this card is if you know someone who knows someone deep in the China tech market.
>>
File: BUMP.png (151 KB, 279x352)
>>
>>42698295
Was this general not for developing and progressing the technology? That doesn't quite fit with the generals that are just for posting slop output.
At the same time, all of the development work seems to be dead and buried and nobody's done fuck all for ages to the point the traditional /ppp/ panels and clipper episodes haven't taken place at any con since 2023. Is anything even being done anymore?
>>
>>42708277
>Was this general not for developing and progressing the technology? That doesn't quite fit with the generals that are just for posting slop output.
You could see it that way, but it was also a lot more fun and active when people were just posting funny shitpost audios and such.
It's sad that we're now known as the "serious business" general that bumps 80% of the time.
Even if you go by that standard, the others are developing more projects than we are nowadays.
Guess we just have to hope for a big boom in audiogen development.
>>
>>42708309
I feel like there is a lot of demonetization coming from the board (and the internet in general), where anything related to ai gets labeled as slop, making people disinterested in producing anything, which in turn lowers the interest even further
>>
>>42699595
>>
>>42704585
Not an expert but from my understanding it requires a special mobo, has no software support and is slow
>>
>>42708721
Well, most of the time something is ai it is in fact slop. Some can't distinguish between doing what was previously impossible and "automating" what people used to put effort into, which added to (if not made up entirely) its perceived value. Dumb people have always existed, and they are often louder than the rest.
>>
>>42710548
yeah, I would add to this the general loss of novelty, since any script kiddie can follow a chatgpt tutorial to hook up whatever services there are over an API and mass produce the expressionless slop.
I still hope either the people from the two Sweetiebot projects (or someone new) will open-source their robot design so people can start making irl waifus.
>>
>>42708721
>demonetization
I don't think we're disabling 15's payments or anything like that. What are you on about?
>>
>>42711005
*de-motivation
sorry, being phoneposter with an auto correction is unforgivable crime (it will happen again)
>>
Made this a while ago:
https://colab.research.google.com/drive/1B2Omo99ww6Lc6Cd0VO2O1JwhZF6OqqEZ#scrollTo=NGJttFFEoSoh

(Fixed version of the Custom Ngrok thingy by BFDIAnon, I could train a pony model on this.)
>>
>>42711550
oh boy, I haven't seen people mentioning the ngrok since 2021 when talknet came along. Do we even have the archived models for them around? or will this version require newly trained models?
>>
>>42711587
BFDIAnon trained some models with this version of it in 2020-2021 with BFDIA characters. The OG ngrok models do not work with this version.
>>
There are only BFDI models with this version. (Keep in mind, it is a fork of https://github.com/CookiePPP/cookietts/tree/master/CookieTTS)

Although I found a larger model using the regular CookieTTS repo: https://drive.google.com/drive/folders/1Zf4gATA55FJCE6bL49mrVBPyhXzFG3sK

It contains Pone voices too.
>>
File: 1712668382688381.jpg (154 KB, 1024x1024)
>>42703371
>https://files.catbox.moe/26muq9.mp3
>Rob Berry - Another Day - Celestia [rvc]
Not gonna lie boys, I think I cooked this one too much and got lost in the sauce.
>>
Board is getting spammed again.
>>
>>42709898
>>
>>42713758
>>
>>42715962
>>
>>42710724
>https://www.youtube.com/watch?v=FM8yNkWad1w
>2 meme paper
There are new papers on teaching the simulated body of a character to mimic motion capture data (by the looks of it, by having an internal ai judge how close each limb gets to a keyframed pose) AND it looks like it's not limited to just human or humanoid shaped characters.
It could be pretty useful for anyone with a 3d printer & robotics know-how to train robot mares before creating them irl.
>>
https://youtu.be/b3yxk0Bc0Ec?si=gKYfd7NeHK7kVgJO

okay so im really high as fuck and im watching these videos again and it got me thinking. what if there was an ai enhanced audio generation system where all it does is take sound bites from the words of the character you want and reorganize them into your own words? Like, as in, take several sentences of dialog from a character, chop them up into pieces, each clip less than the length of a single word, and use that box of legos you just made to construct your own sentences. so instead of using ai to do all of the heavy lifting, you could use real recorded audio of Tabitha and have the ai organize it into your own sentence. Literally. With some mixing perhaps so it doesnt sound choppy. idk.

>does this concept already exist? am i just wasting time being retarded?
>is it a bad idea? if so why?
>i am not a computer nerd by the way, im just here for the ponies.
>>
File: 1760708172504561.png (3.69 MB, 4293x2431)
>>42717757
>im high as fuck rn
oh yeah this post is gonna be retar-
>use real clips integrated with ai voice models to get a more accurate output
holy shit anon...
i know this isn't what you initially proposed but it gave me an idea

why not collect as many voice clips as possible of every word a character says, have the dialogue generated, then use the generated sentence to pick from the clips of every collected word we have (any words not said by the pony can just be made with speech2speech and put into the list to make up for the missing ones). string together the word clips to form a sentence and create an audio file out of that, then put that audio file through a voice model for the pony. it will take those clips with words that are being spoken in different ways (think of old gmod videos where they take different words said at different volumes and piece them into one sentence) and change the way it's spoken to be more natural sounding
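
The lookup step of that idea could be sketched roughly like this. Everything here is hypothetical: the function names, and the idea that `clip_index` is a plain dict from a lowercase word to the path of one recorded clip. It only plans which clips to use and which words are missing; the actual audio stitching and the voice model pass are not shown.

```python
import re


def plan_sentence(sentence, clip_index):
    """Return (word, clip_path_or_None) for each word in the sentence."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return [(word, clip_index.get(word)) for word in words]


def missing_words(plan):
    """Words with no recorded clip, i.e. candidates for speech2speech fill-in."""
    return [word for word, clip in plan if clip is None]
```

The more words `missing_words` returns for a sentence, the more of the final audio would come from generated fill-ins instead of real clips.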

I know what i just described obviously doesn't work but i feel like I'm onto something at least, as if there's a little division of labor that could be done to take these raw text 2 speech models and turn it into something way better by just making a little system for it
though im sure it would be MORE resource intensive this way i do find it interesting nonetheless
>>
>>42717757
This exists! It's called "unit selection", and was an early method for generating high quality speech before everything became statistical.
The drawback is that you need a very large dataset for your target speaker. Otherwise, when you want to generate a word that isn't exactly in the dataset, you have to piece it together from many different sounds, and it's more choppy.
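
To make the "piece it together" part concrete, here is a toy sketch. It is not how Multisyn or any real unit selection engine works: real systems score candidate units with target and join costs, while this just greedily grabs the longest contiguous run it can find, on the assumption that fewer joins means less choppiness. The function name and the phoneme-like symbols in the usage note are made up for illustration.

```python
def greedy_unit_selection(target, corpus):
    """Cover `target` with the longest contiguous runs found in `corpus`.

    target: list of units (e.g. phonemes).
    corpus: dict mapping utterance id -> list of units in that recording.
    Returns a list of (utterance_id, start, length). For a unit that occurs
    nowhere in the corpus, the entry is (None, target_position, 1).
    """
    selection = []
    pos = 0
    while pos < len(target):
        best = (None, 0, 0)  # (utterance_id, start, length)
        for utt_id, units in corpus.items():
            for start in range(len(units)):
                length = 0
                # Extend the match while target and recording keep agreeing.
                while (pos + length < len(target)
                       and start + length < len(units)
                       and units[start + length] == target[pos + length]):
                    length += 1
                if length > best[2]:
                    best = (utt_id, start, length)
        if best[2] == 0:
            selection.append((None, pos, 1))  # gap: no recording of this unit
            pos += 1
        else:
            selection.append(best)
            pos += best[2]
    return selection
```

With a corpus like `{"a": ["HH", "AH", "L", "OW"], "b": ["W", "ER", "L", "D"]}`, the target `["HH", "AH", "L", "D"]` gets three units from recording "a" and one from "b". A smaller dataset means shorter runs and more joins, which is the choppiness described above.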
I messed around with this a while back (https://desuarchive.org/mlp/thread/35756787/#35778690), and generated these Navy Seals samples:
>Rarity: https://u.smutty.horse/lwqvzknbyfn.mp3
>Applejack: https://u.smutty.horse/lwqvzkneoda.mp3
>Rainbow: https://u.smutty.horse/lwqvzkpilwk.mp3
>Twilight: https://u.smutty.horse/lwqvzknbjne.mp3
>Fluttershy: https://u.smutty.horse/lwqvzkordve.mp3
>Pinkie: https://u.smutty.horse/lwqvzkoypoq.mp3
I used Multisyn, which is a very old program. I'm not sure if anyone is still working on this today, since ML can clone voices from just a few seconds of audio. But I still like this technique because it feels like a YTPMV shitpost (same technique, after all), but also has moments where it sounds more natural than any TTS because it uses the real samples.

>>42718046
This should work, it's similar to people putting TTS output through RVC to make it sound better. Though I'm not sure exactly how good it can be, probably depends on how good the pieced together sentence is
>>
>>42717757 >>42718046 >>42718095
I swear I saw two more such programs back in 2020, that were using TF2 clips and another for the Bethesda games, that after typing a word/sentence would search the character clip index and point out the clips with same/similar words and phonemes.
>>
>>42717757
Even if you handle all the inconsistent letter to sound relationships (pronunciation) it would still sound unnatural and unexpressive. There are tts models that can run super fast on old hardware, they are also unexpressive but at least they sound fine.

>>42718095
neat


