/mlp/ - Pony


File: altOP.jpg (1.26 MB, 2119x1500)
Welcome to the Pony Voice Preservation Project!
youtu.be/730zGRwbQuE

The Pony Preservation Project is a collaborative effort by /mlp/ to build and curate pony datasets for as many applications in AI as possible.

Technology has progressed such that a trained neural network can generate convincing voice clips, drawings and text for any person or character using existing audio recordings, artwork and fanfics as a reference. As you can surely imagine, AI pony voices, drawings and text have endless applications for pony content creation.

AI is incredibly versatile, basically anything that can be boiled down to a simple dataset can be used for training to create more of it. AI-generated images, fanfics, wAIfu chatbots and even animation are possible, and are being worked on here.

Any anon is free to join, and there are many active tasks that would suit any level of technical expertise. If you’re interested in helping out, take a look at the quick start guide linked below and ask in the thread for any further detail you need.

EQG and G5 are not welcome.

>Quick start guide:
docs.google.com/document/d/1PDkSrKKiHzzpUTKzBldZeKngvjeBUjyTtGCOv2GWwa0/edit
Introduction to the PPP, links to text-to-speech tools, and how (You) can help with active tasks.

>The main Doc:
docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit
An in-depth repository of tutorials, resources and archives.

>Online speech generation
haysay.ai
alpha.15.dev

>Active tasks:
Research into animation AI
Research into pony image generation

>Latest developments:
pastebin.com/4p00iUZM

>The PoneAI drive, an archive for AI pony voice content:
drive.google.com/drive/folders/1E21zJQWC5XVQWy2mt42bUiJ_XbqTJXCp

>Clipper’s Master Files, the central location for MLP voice data:
mega.nz/folder/jkwimSTa#_xk0VnR30C8Ljsy4RCGSig
mega.nz/folder/gVYUEZrI#6dQHH3P2cFYWm3UkQveHxQ
drive.google.com/drive/folders/1MuM9Nb_LwnVxInIPFNvzD_hv3zOZhpwx

>Cool, where is the discord/forum/whatever unifying place for this project?
You're looking at it.

Last Thread: >>42429020
>>
New vibe voice out, anyone try it?
>>
which one
>>
>>42973521
>>42972932
>>
File: 1674108535444236.png (940 KB, 846x868)
oh shid, are we back?
>https://huggingface.co/Edge-Quant/YuE-s2-1B-general-Q4_K_M-GGUF
Does anybody know how or where to get a ui to run this thing? It's a two-month-old quantized version of the YuE music making model (the original one was 5GB, this one is a little bit over 1GB).
>>
new tts with voice cloning
https://huggingface.co/Qwen/Qwen3-TTS-12Hz-0.6B-Base
https://huggingface.co/Qwen/Qwen3-TTS-12Hz-1.7B-Base
>>
>https://github.com/kyutai-labs/pocket-tts
>https://kyutai.org/blog/2026-01-13-pocket-tts
>kyutai finally released their Pocket TTS last week
>runs on CPU and has voice cloning
>supposedly is faster than real-time audio generating
Finally these French fucks have finished pussyfooting around. I haven't tested it, but I imagine it would be pretty useful as a narrator/character voice in the text gens
>>
>>42973675
>100M-parameter
i really doubt it can handle pony voices. Even the timbres, let alone speech patterns. 1.6B vibevoice struggled and 1.6B isn't even the whole network
>>
What exactly caused the death of PPP? It was pretty awesome when generative AI was a novelty and we were way ahead of the curve
>>
>>42973700
>when generative AI was a novelty
It stopped being a novelty and voice models stagnated so much with no news that most anons that aren't hardcore into it just stopped bothering.
I know many blamed it on the death of 15, but he came back and that just gave a minor boost for a short while, because his voices are at the same quality as the rest, aka not that great and not evolving much in years.
>>
>>42973722
Makes sense that it's because development has kinda slowed a little. This shit takes a lot of work; there are some promising local video models coming up at the moment but as far as pony development goes we'll have to see if anything picks up for it.

It's exciting nonetheless, truth be told.
>>
>>42973700
I blame the dumbasses that code + make the models requiring ALL the ram and ALL the newest modules without testing whether their new code works at all
>-t guy who is trying to install torch 2.5.1+cu118 that should work on gpu but doesn't just because fuck me I guess
My personal hope is that we will get another Deepseek situation where Chinese devs figure out how to run a good quality model on a potato pc.
>>
>>42973693
hmm, I wanted to try it, as I still remember their website working pretty well a few months back, but their main website isn't loading the model, and their hugging face space is giving me random trash noise as output.
>>
>>42973521
>>42973523
Vibevoice asr
>>
>>42973920
>generating structured transcriptions containing Who (Speaker), When (Timestamps), and What (Content)
That's pretty neat, especially since it supposedly works with hour-long audio. Does anybody know what a text-to-text version of that would be called (like a model that loads a novel and separates the lines presented by the narrator from what's spoken by the characters)?
>>
>>42974102
Seems like a given there are plenty of examples of screenplay versions of texts which segment like that
>>
File: may i see it.jpg (7 KB, 258x195)
>>42974109
my googlefu seems pretty weak, since all I see is people shilling chatgpt/grok/claude with extra instructions and not a properly trained novel-to-screenplay model. I could take the easy way like that, however I don't trust these services not to hallucinate some bullshit in between the lines (and local LLMs that are anywhere near as good require two 5090s to even start up the bloody model).
>>
>>42974152
I wonder what use a dedicated screenplay conversion ai could serve PPP, we already have the transcripts from the wiki, and more significantly the leaked official scripts including all their revisions

I suppose it could be helpful in translating fanfics into animation-ready format that remains consistent with the show
>>
>>42974173
>translating fanfics
that's exactly what I was thinking. While tts models that are local, lightweight and great sounding all at once don't exist just yet, I thought it would be neat to have something that separates the text into per-character lines and feeds those lines to the tts model for simple audiobook production.
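A rough sketch of that splitting step, assuming the fic wraps dialogue in straight or curly double quotes. It only separates narration from spoken lines; working out which character is speaking is left out, since that needs an LM or extra heuristics. The name `split_narration_and_dialogue` is made up for illustration.

```python
import re

# Matches text inside straight ("...") or curly (“...”) double quotes.
QUOTE_RE = re.compile(r'"([^"]+)"|“([^”]+)”')


def split_narration_and_dialogue(paragraph):
    """Return a list of (kind, text) tuples; kind is 'narrator' or 'dialogue'."""
    segments = []
    pos = 0
    for match in QUOTE_RE.finditer(paragraph):
        # Anything between the previous quote and this one is narration.
        before = paragraph[pos:match.start()].strip()
        if before:
            segments.append(("narrator", before))
        spoken = (match.group(1) or match.group(2)).strip()
        if spoken:
            segments.append(("dialogue", spoken))
        pos = match.end()
    # Trailing narration after the last quote, if any.
    tail = paragraph[pos:].strip()
    if tail:
        segments.append(("narrator", tail))
    return segments


if __name__ == "__main__":
    for kind, text in split_narration_and_dialogue(
        '"Hello," said Twilight. "How are you?"'
    ):
        print(kind, "|", text)
```

Each tuple could then go to a different voice: narrator segments to one model, dialogue segments to whichever character voice gets picked.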
>>
>>42973675
Just testing this one, their example voices are OK, better than MS Sam but obviously not very emotive. I've tried using their cloning setting but it looks like it's tied to their huggingface account to make sure it's not used for "the wrong thinking" purposes.
I see the github has a training text file with links and a tutorial on how to run it, but it comes with a lackluster ui (and it looks like the training script only works on Linux or the win10 WSL). That's a shame, because I would love to see a tts model with cloning that is as simple as "press button, get voice", as this would be a dream setup to work with.
>>
>>42973700
Everyone involved that actually did things stopped doing things.
>>
>>42973700
AI voice tech is only one infinity stone out of the six you need to make AI content.
The race now is to see who can make the first twenty minute MLP episode, not who can make the best AI voice.
This is a shoddy example and in 3D, but see I made this with a custom voice that I crafted using Voicelab in Voicemod: https://tantabus.ai/images/67885
I'm excited for them to release their prompt to voice feature that they are working on which will be huge for making unique voices for OCs.
>>
>https://github.com/index-tts/index-tts
>https://www.youtube.com/watch?v=u2fGWqaOizk
>https://huggingface.co/spaces/IndexTeam/IndexTTS-2-Demo
The Chinese are at it again. If not for the fact that it requires at minimum 10GB of vram to work with, I would be almost willing to shill it to other Anons (also, on a semi-related note, holy shit, HF are really grinching on the demo space, I only managed to make two clips before it cut me off).
Btw, thank you HydrusBeta for keeping haysay alive. I may not use it as often as I did in the past, but it sure does help to have it available 23/7 when it's really needed.
>>42974561
Life happens, and especially with everything costing at minimum double, to get a bare bones pc to even just try the local models is disheartening (and with the upcoming "hardware as the cloud service" bs the nvidia and other companies are talking about, future looks grim).
>>42974705
>https://files.catbox.moe/8tiu8f.webm
Just before new year a dedicated Anon in the ai image thread made this one-minute ai-dubbed music video. It was stitched together from three dozen edited short clips (and who knows how many failed clips were involved in pre-production), so it is possible to brute force an episode if somebody was willing to lock themselves in a super-neet basement and do nothing but generate short 5~10s clips for an entire year for the raw footage, which would need another year to remaster into an actual episode.
With how all the ai video models progressed from just animating 6 frames to a neat 10s, I have high hopes the tech will develop enough for making a proper pony episode by the time PPP has its decennial anniversary.
>>
File: Creepy Clip webm.webm (1.27 MB, 720x400)
>>42974762
Oh yeah I made that too. It took me a week to get 28 decent scenes. Like 3-4 days to get the clips and 3-4 days to process upscaling with a rtx 4090.
That's the 30 fps version from Tantabus because I was limited by filesize. The 60 fps version is on civit: https://civitai.com/images/116777810
I used so-vits-svc for the voice because it was the best Diamond Tiara voice I could find.
It really hurts to see my smut get more likes than what this got, but what can you do.
>>
>>42974786
THE FOUNTAIN MARE LIVES
>>
File: Yeah Ok.webm (1.48 MB, 720x400)
>>42974804
There originally was a fountain mare but I had to remove it because wan 2.2 is insistent that statues are people too. In the future I'll have to use a corpo model like grok to animate scenes like that.
I'm tempted to make a directors cut with only the worst clips.
>>
>https://files.catbox.moe/otlaa2.mp3
>>
>>42973662
>1.7B-Base(d)
(3) Twilight: https://vocaroo.com/14WL8R4v7Vod
(2) Rainbow: https://vocaroo.com/1j10R0C5gWhS
(2) Pinkie: https://vocaroo.com/12OXKO0D4rRl
(1) Rarity: https://vocaroo.com/16e7dSGEv9Jk
(3) Applejack: https://vocaroo.com/1dyLZxLnNww3
(2) Fluttershy: https://vocaroo.com/1oThBeZkNePL

Number of rerolls between parentheses. It sucks that providing a reference text works better than without it, so I have to use whisper to automatically caption the reference audio.

Overall, to me, this feels like a sidegrade to Vibevoice 1.5B when it comes to voice cloning: Qwen cloned Pinkie's voice a lot better, but it performed worse with whispers (ASMR (not showing that, though...)). It also felt like some output voices were a bit deeper than they should be. I still think Vibevoice has more emotion, but that might change if Qwen gets to add "voice control" to their voice-cloning model.
>>
>>42975272
Oh yeah, my results are pretty quiet for some reason, so turn your volume up if you don't hear anything.
>>
>>42975272
gptsovits not going anywhere
>>
>>42975272
>>42973662
uhhh, does this tts have any webui or do you just use raw console input to operate it?
>>
File: zebra question.png (30 KB, 551x537)
Yo, would Anons here be interested in doing a little creation challenge, making ai generated songs based on a weekly/biweekly theme?
>>
>>42975509
According to their github, it has a Web UI: https://github.com/QwenLM/Qwen3-TTS?tab=readme-ov-file#launch-local-web-ui-demo
But, I just used raw console inputs to generate those examples
>>
>>42975272
>>42975995
Interesting. Like the other Anon said, I will probably stick with gpt-sovits just because I already have it installed, however it could be interesting to test the voice cloning for characters with just one or two lines.
>>
>>42974246
I'm working on this, it's still raw but shouldn't take long to finish
>https://files.catbox.moe/za2c79.mp4
The video is not sped up. LM used is qwen3-4b via llamacpp, but you could plug in anything. Don't mind that the labels don't match the voices, I've hard-coded rarity as narrator and trixie as everyone else while testing but there are finetuned models for most characters out there (therealvul/GPT-SoVITS-v2).
>>
File: user is being sarcastic.png (109 KB, 1101x491)
>>42976577
IMO that's more or less what I was looking for. Even if it was limited to narrator + one voice-over character it would still be pretty great for making diy audiobooks (as an autist that used the basic bitch microsoft tts to listen to 10k-word fics, the only thing stopping me from using anything else is the fact that everything is capped at 10s clips and constantly glitches out on random sentences).
I still have hopes that in a few years somebody will make the ideal version of a screenplay format converter script, one that is also able to add appropriate background sound effects and musical incidentals/leitmotifs to play as the characters talk.
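On the 10s cap: one common workaround is to feed the tts short chunks made of whole sentences instead of raw text. A minimal sketch, assuming a character budget is a good enough stand-in for clip length (the 200-character default and the name `chunk_sentences` are just illustrative, not tied to any particular tts):

```python
import re


def chunk_sentences(text, max_chars=200):
    """Group whole sentences into chunks of at most max_chars where possible.

    A single sentence longer than max_chars is kept as its own chunk
    rather than being cut mid-sentence.
    """
    # Split after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in chunk_sentences("One. Two. Three.", max_chars=9):
        print(repr(chunk))
```

Splitting on sentence boundaries also makes it easier to regenerate just the one sentence that glitched instead of the whole clip.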
>>
>>42976780
Any half-decent lm can do annotations like this but i'm afraid it would take much longer to process than just determining who said what. And the model needs to be bigger.
>>
File: 1625040047080.png (365 KB, 1105x720)
Cool, PPP is back. Here are some random bits of pony ai covers found on the net from a few months back. Whoever made them was clearly not doing the best job when it comes to reverb and pitch editing, but still, it's kind of neat.
>https://files.catbox.moe/sd7c0h.mp3
Applejack - Flowers - Miley Cyrus
>https://files.catbox.moe/o3g4e7.mp3
Applejack - When She Loved Me (Sings AI)
>https://files.catbox.moe/gv9v2m.mp3
Applejack - You Belong With Me - Taylor Swift
>https://files.catbox.moe/t03xcv.mp3
Fluttershy - A Thousand Years
>https://files.catbox.moe/zced52.mp3
Fluttershy - Clarity - Zedd ft Foxes
>https://files.catbox.moe/ogtfkm.mp3
Fluttershy - Everytime We Touch (ai cover) BOB
>https://files.catbox.moe/9wkk1n.mp3
Fluttershy - I Need Your Love Calvin Harris ft Ellie Goulding
>https://files.catbox.moe/m4brvv.mp3
Fluttershy - Orange show speedway by Lizzy McAlpine
>https://files.catbox.moe/ct0476.mp3
Fluttershy - Roar - JaqeyMaq
>https://files.catbox.moe/ujwr02.mp3
Fluttershy - Yellow - Coldplay

>https://files.catbox.moe/f9a46y.mp3
Glimmer - Bawa Dia Kembali - Mahalini
>https://files.catbox.moe/1xjn4m.mp3
Luna - I will Always Love You (Ai Cover) Norsewarrior
>https://files.catbox.moe/ltmlzy.mp3
Pinkie Pie - Baby One More Time (Sings AI) BOB
>https://files.catbox.moe/xu8vt2.mp3
Pinkie Pie - Oops I Did It Again (ai cover)
>https://files.catbox.moe/0gx18i.mp3
Pinkie Pie - Sweet But Psycho - Ava Max
>https://files.catbox.moe/4glpe6.mp3
Rarity - Clarity By Sings (AI)
>https://files.catbox.moe/yr14y7.mp3
Rarity - Sings Selena Gomez I love You Like A Love Song
>https://files.catbox.moe/u0s3qe.mp3
Super Shy - FS PP RD

>https://files.catbox.moe/qi9msp.mp3
Twilight Sparkle - Fireworks Katy Perry
>https://files.catbox.moe/bvp47t.mp3
Twilight Sparkle - Isyana Sarasvati - Tetap Dalam Jiwa (AI)
>https://files.catbox.moe/ow9sew.mp3
Twilight Sparkle - Speak Now Taylors version
>https://files.catbox.moe/n2ejhz.mp3
Twilight Sparkle - Taylor Swift - Daylight (AI Cover)
>https://files.catbox.moe/bw69vv.mp3
Twilight Sparkle - The One That Got Away
>https://files.catbox.moe/djzm3f.mp3
Twilight Sparkle - Cool Kids Echosmith
>https://files.catbox.moe/0vdzy6.mp3
Twilight Sparkle - Cupid
>https://files.catbox.moe/wgkj8h.mp3
Twilight Sparkle - It Ain't Me - Selena Gomez kygo ai
>https://files.catbox.moe/fzef8e.mp3
Twilight Sparkle - Sayap-Sayap CINTA - TS AI
>https://files.catbox.moe/4o8ab5.mp3
Twilight Sparkle - Teenage Dream
>https://files.catbox.moe/uo0e8i.mp3
Zecora - Dig A Little Deeper
>>
File: 1690919332741037.png (65 KB, 457x450)
>>42977481
>>https://files.catbox.moe/qi9msp.mp3
>Twilight Sparkle - Fireworks Katy Perry
it's so fucking peak bros
imagine 2012 fags seeing this, puts a damn smile on my face...
>>
>>42977481
>>https://files.catbox.moe/ogtfkm.mp3
>>
File: maregasm.png (23 KB, 129x122)
>>42977481
>https://files.catbox.moe/sd7c0h.mp3
>https://files.catbox.moe/o3g4e7.mp3
>https://files.catbox.moe/ogtfkm.mp3
>https://files.catbox.moe/ltmlzy.mp3
>https://pomf2.lain.la/f/om249yy.mp3
>https://files.catbox.moe/qi9msp.mp3
>https://pomf2.lain.la/f/p7l4s8jj.mp3
>>
>>42975760
I guess this kind of thing is massively lacking in organisation these days.
>>
File: full.gif (233 KB, 1200x992)
>>42978009
HOLY FUUUUUUUCK
>>
>https://files.catbox.moe/f6rfv6.mp3
>AJ explains to Applebloom how to use the new oven
>>
File: 7417638.png (2.89 MB, 1378x1600)
I'm working on a song cycle as an assignment for composition and thought it would be funny if it was secretly pony. Here's the first movement sung by Celestia (Luna didn't sound good).
https://files.catbox.moe/nw03w6.mp4
>>
>>42979412
Kino art song
I've not heard one in the fandom
>>
>>42979412
I don't know if I quite agree with the notation for the speaking section in mm 45
It could be more clear
In fact I would move the fermata to the next measure and have it on "mortifying!"
I don't think this would cause confusion for the pianist because they are already holding the chord, and for the singer you could give a marking that further indicates its a spoken line/the tone of voice
Bravo anon
>>
>>42979412
dunno, for something that's made from scratch it's pretty good, however im not really feeling the flow of melody (I dont have enough technical know-how to explain it other than it parts of it just don't vibe with my ear).
The pitch of voice also not always comes through correctly but thats just something all male to pony voice conversion models have problems with.
>>
>>42977593
>imagine 2012 fags seeing this
They'd go nuts over that.
>>
>>42977481
This reminded me of this video that I still have on my computer kek: https://files.catbox.moe/pgw3mx.mp4
>>
>>42980821
And this one, have some mareio: https://files.catbox.moe/5hrzo9.mp4
>>
>>42979722
>>42979761
>>42980100
Thanks for the kind words, Anons. Also, nice dubs you two.

>>42979722
>I've not heard one in the fandom
Yeah, not many people compose contemporary classical music in the fandom. Especially not this kind. Closest I've seen is LYRAnonymous' Rhapsody and Serenade.
>https://youtu.be/6Qmr56JGhyY

>>42979761
>It could be more clear
Yes, I know what you mean. I was thinking about how to notate this thing for some time, and I think you might be right that it would make more sense to be on "mortifying!" instead. I will talk to some singers and my professor again regarding this.
>a marking that further indicates its a spoken line
There will be performance notes at the start of the score that differentiate between spoken word with defined pitch with rhythm, spoken word with undefined pitch with rhythm, and free spoken word. MM 45 and 46 fall under the third category (Text below the staff with just rests). Since this will be a 4 movement song cycle and only the first movement exists in its entirety, I didn't bother making it yet. There might be other things I'd need to specify after it's all done. And I honestly didn't think there'd be anyone here who would even think about this, so I found it redundant to make it just for this video for this thread. You surprised me, Anon.

>>42980100
>im not really feeling the flow of melody
Yeah, Messiaen-esque and contemporary melodies/harmonies might sound strange to the unfamiliar. It's just a quirk of the style.
>The pitch of voice also not always comes through correctly
I did pitch adjustments to most things, but it's true that I didn't put a massive amount of effort into it, plus I had little time. The song will be sung by a real singer from my conservatory (hopefully), so this mostly exists just for the general idea of what it should sound like. I am also a pianist, not a singer, so that affects the quality too.
>>
>>42980821
that link is broken anon, or at least catbox isnt loading up the video, could you reupload it plz?
>>
>mared
>>
>>42981162
Oh, sorry. I tried again with catbox, but it kept saying it was corrupt, and pone.rs got stuck at 12%. Pomf saving the day:
https://pomf2.lain.la/f/ks6fr95y.mp4
https://pomf2.lain.la/f/e6vk61cm.mp4
https://pomf2.lain.la/f/wefpvs2.mp4
https://pomf2.lain.la/f/squtqz0t.mp4
>>
>>42972267
Where's the rest of the OP?
>>
Scootaloo Sings Down Under Australia Day 2026 ai cover Special Enjoy Everypony

https://files.catbox.moe/jxk9q4.mp3
>>
>>42984802
kino
>>
>>42984802
Neato
>>
>https://files.catbox.moe/fcoy6e.mp3
Fluttershy lagtrain
>>
>>42985945
Nice
>>
>>42982172
What's missing?
>>
>>42986732
I think he meant the second OP post:
FAQs:
If your question isn’t listed here, take a look in the quick start guide and main doc to see if it’s already answered there. Use the tabs on the left for easy navigation.
Quick: docs.google.com/document/d/1PDkSrKKiHzzpUTKzBldZeKngvjeBUjyTtGCOv2GWwa0/edit
Main: docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit

>Where can I find the AI text-to-speech tools and how do I use them?
A list of TTS tools: docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit#heading=h.yuhl8zjiwmwq
How to get the best out of them: docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit#heading=h.mnnpknmj1hcy

>Where can I find content made with the voice AI?
In the PoneAI drive: drive.google.com/drive/folders/1E21zJQWC5XVQWy2mt42bUiJ_XbqTJXCp
And the PPP Mega Compilation: docs.google.com/spreadsheets/d/1T2TE3OBs681Vphfas7Jgi5rvugdH6wnXVtUVYiZyJF8/edit

>I want to know more about the PPP, but I can’t be arsed to read the doc.
See the live PPP panel shows presented on /mlp/con for a more condensed overview.
2020 pony.tube/w/5fUkuT3245pL8ZoWXUnXJ4
2021 pony.tube/w/a5yfTV4Ynq7tRveZH7AA8f
2022 pony.tube/w/mV3xgbdtrXqjoPAwEXZCw5
2023 pony.tube/w/fVZShksjBbu6uT51DtvWWz

>How can I help with the PPP?
Build datasets, train AIs, and use the AI to make more pony content. Take a look at the quick start guide for current active tasks, or start your own in the thread if you have an idea. There’s always more data to collect and more AIs to train.

>Did you know that such and such voiced this other thing that could be used for voice data?
It is best to keep to official audio only unless there is very little of it available. If you know of a good source of audio for characters with few (or just fewer) lines, please post it in the thread. 5.1 is generally required unless you have a source already clean of background noise. Preferably post a sample or link. The easier you make it, the more likely it will be done.

>What about fan-imitations of official voices?
No.

>Will you guys be doing a [insert language here] version of the AI?
Probably not, but you're welcome to. You can however get most of the way there by using phonetic transcriptions of other languages as input for the AI.

>What about [insert OC here]'s voice?
It is often quite difficult to find good quality audio data for OCs. If you happen to know any, post them in the thread and we’ll take a look.

>I have an idea!
Great. Post it in the thread and we'll discuss it.

>Do you have a Code of Conduct?
Of course: 15.ai/code

>Is this project open source? Who is in charge of this?
pony.tube/w/mqJyvdgrpbWgZduz2cs1Cm

PPP Redubs:
pony.tube/w/p/aR2dpAFn5KhnqPYiRxFQ97

Stream Premieres:
pony.tube/w/6cKnjJEZSCi3gsvrbATXnC
pony.tube/w/oNeBFMPiQKh93ePqTz1ns8
>>
>>42972267
Neat
>>
File: 1689480786753675.png (1.19 MB, 1500x2792)
>https://pomf2.lain.la/f/odkytnn.mp3
Luna - Don't Blame the Moon - IO Perty
I'm not exactly 100% happy with this one, especially the duet parts, as neither rvc nor sovits knows how to handle those bits. I may try to salvage it if I happen to have a free weekend.
>>
>>42987023
Yeah that's important.
>>
Pony Friendship Hunters - Free (Sung by Twilight Sparkle & Rainbow Dash)
>https://pomf2.lain.la/f/dv60hne7.mp3
>>
>>42989243
very nice Anon
>>
>>42989243
Is this based on any other song or 'original'?
>>
>>42990737
if I told you you would sack me
>>
>>42977481
damn, thats one heck of a nostalgia ride
>>
>>42991006
Go be a retard somewhere else.
>>
File: app.png (990 KB, 1473x1564)
>>42976577
Finished the tts app. >>42992572
>>
File: Supercharged_TradingCard.jpg (390 KB, 1080x1920)
>>42992585
oh hell yeah, new tts. What model is this based on Anon?
>9GB 40m download
welp, looks like its going to be a big file
>>
>>42992629
good old gpt-sovits
>big file
because it contains all the models from therealvul's repo.
>>
>>42992643
>gpt-sovits
oh, cool then, I know I can run it without much of a problem then. How easy/hard would you say editing the character list and adding some other custom character would be (since there are a few models that were trained by other folks)?
>>
>>42992675
wait, is it the old gpt-sovits v2 or some other version?
>>
>>42992681
>>42992675
v2.
>adding some other custom character
You'd have to:
>create a folder in models for the new character and organize it like the rest.
>add a folder to ref_audios_inference_server with the reference audio
>add an entry to the dict in inference_server.py
>>
File: error poni tts.png (29 KB, 685x403)
>>42992585 >>42992715
>x is shitting itself in half a second from opening
welp, time to run that python exe with VxKex on
>Loading libraries
>avutil-58.dll is missing
>avutil-57.dll is missing
>avutil-56.dll is missing
>NFOutWorkItem2 could not be located in dynamic library kxdx
>get pic related console error before auto shutdown
fug, I was looking forward to testing that
>>
>>42992832
Interesting, do you have ffmpeg? Try installing it.
>>
>>42992832
Could you add these files to the dir with the bat and try again? https://drive.google.com/drive/u/2/folders/16zjFqobYW7aQrU0w8QFAIYRPFKoRg6g5
It's my ffmpeg, if it runs i'll include it in the main app
>>
File: ffmpeg -version.png (46 KB, 678x504)
>>42992847
I do have it, otherwise the audacity wouldn't work for me, here is pic related
>>42992874
>avutil-58.dll is missing
>avutil-57.dll is missing
>avutil-56.dll is missing
>NFOutWorkItem2 could not be located in dynamic library kxdx
and console message error:
Loading libraries, may take a while...
Initialized with max_loaded_models = 10
Exception in thread Thread-2:
Traceback (most recent call last):
File "threading.py", line 980, in _bootstrap_inner
>>
>>42992884
Try installing https://aka.ms/vs/17/release/vc_redist.x64.exe
Also try replacing the ffmpeg folder with ffmpeg-master-latest-win64-gpl-shared.zip from https://github.com/BtbN/FFmpeg-Builds/releases renamed to ffmpeg
>>
File: error again.png (38 KB, 682x401)
>>42992901
same error again
>>
>https://www.youtube.com/watch?v=WKFQdij7oXw
this random FS spanish rvc cover just came to my attention, so I thought somebody else would also appreciate it
>>
File: 111.png (35 KB, 1192x338)
>>42993208
So it fails to establish a socket connection. Do you have an antivirus? Try disabling it, and run the .bat as admin.
If that doesn't help, replace line 150 in PVT\runtime\Lib\site-packages\flet_runtime\sync_local_socket_connection.py with this, like in picrel:
import errno
try:
    self.__connection, client_address = self.__sock.accept()
except OSError as e:
    print(e.errno)
    print(errno.errorcode.get(e.errno))
then run again. it should tell exactly what the issue is.
I'll go sleep now.
>>
File: error poni tts 02.png (32 KB, 679x403)
>>42993360
I don't have an antivirus, so I just ran it as admin, still giving me the same error.
Debugging sure is fun
>>
Up.
>>
>>42993410
>https://www.python.org/downloads/release/python-390/
>The installer now also actively disallows installation on Windows 7. Python 3.9 is incompatible with this unsupported version of Windows.
it seems like this may be the issue. 3.9 is not even a new version, it was released in 2020.
You can confirm that socket communication is broken by activating the python env
>runtime\python.exe
and executing this
>import socket;s=socket.socket(socket.AF_INET, socket.SOCK_STREAM);s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1);s.bind(('localhost', 8080));s.listen(1);print(s.fileno())
if instead of a number it prints an error then it's windows 7
>>
>>42994605
this is a bit more complete
>import socket;s=socket.socket(socket.AF_INET, socket.SOCK_STREAM);s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1);s.bind(('localhost', 8080));s.listen(1);print(s.fileno());s.accept()
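For anyone who wants the same check in a more readable form, here is a sketch of it as a small script. Two things differ from the one-liners above and are my own assumptions: it binds to 127.0.0.1 on port 0 (so the OS picks any free port instead of 8080), and it skips the `accept()` call, since that blocks until something connects.

```python
import socket


def check_local_socket(host="127.0.0.1", port=0):
    """Bind and listen on a local TCP socket.

    Returns (fileno, bound_port) on success; raises OSError if the
    socket cannot be set up. Port 0 lets the OS choose a free port.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind((host, port))
        s.listen(1)
        return s.fileno(), s.getsockname()[1]
    finally:
        s.close()


if __name__ == "__main__":
    try:
        fileno, bound_port = check_local_socket()
        print("socket ok, fileno", fileno, "port", bound_port)
    except OSError as e:
        print("socket setup failed:", e)
```

Run it with the bundled interpreter (`runtime\python.exe script.py`) so it tests the same environment the app uses.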
>>
>>42994605
I do use Windows 7, however I am usually able to use any ai tool by installing conda env with python 3.10.4
Anyhow, I've run the above and got the "204" number.
Hmm, could you give me the exact pip freeze on what you have on your end, so to check if maybe there is something missing or some module is being pain in ass and needs downgrading on my end?
>>
>>42994620
also getting 204 with this one
>>
>>42994622
pip freeze is already in requirements.txt
maybe try with a new venv? but before that try reinstalling flet
>pip uninstall flet
>pip install flet==0.19.0
>>
>Celestia sings There is nothing we can do Napoleon song
>https://pomf2.lain.la/f/krdyl2ua.mp3
I was trying to get this song >https://www.youtube.com/watch?v=F0Gkr4MBEO0 with Rarity or Fluttershy voice but the ai model was throwing too many weird pitch errors.
>>
File: error poni tts 03.png (31 KB, 686x418)
>>42994708
WARNING: The script flet.exe is installed in 'Q:\_AIfromC\_AItts\PoniVoiceTool\PVT\runtime\Scripts' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
flet-cli 0.28.3 requires flet==0.28.3, but you have flet 0.19.0 which is incompatible.
flet-cli 0.28.3 requires watchdog<5.0.0,>=4.0.0, but you have watchdog 3.0.0 which is incompatible.
flet-desktop 0.28.3 requires flet==0.28.3, but you have flet 0.19.0 which is incompatible.
flet-web 0.28.3 requires flet==0.28.3, but you have flet 0.19.0 which is incompatible.
Successfully installed flet-0.19.0

Due to python being dogshit at showing what dependencies are actually needed to install shit, I just force-upgraded flet to the lowest version flet-cli was willing to accept

>runtime\python.exe -m pip install flet==0.25.2 watchdog==4.0.0 flet-cli flet-desktop flet-web
I still get the message popup about
>avutil-58.dll is missing
>avutil-57.dll is missing
>avutil-56.dll is missing
>NFOutWorkItem2 could not be located in dynamic library kxdx
So I guess a part of the problem got fixed?
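To see at a glance which versions actually ended up in the runtime after all that, a small sketch using the standard library's `importlib.metadata` (Python 3.8+). The package list is just the names from the pip output above; the helper name `installed_versions` is made up.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    packages = ["flet", "flet-cli", "flet-desktop", "flet-web", "watchdog"]
    for name, version in installed_versions(packages).items():
        print(f"{name}: {version or 'not installed'}")
```

Running it with `runtime\python.exe` rather than the system python matters here, otherwise it reports whatever is installed globally instead of what the app sees.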
>>
>>42994899
>NFOutWorkItem2 could not be located in dynamic library kxdx
kek, googling this currently shows multiple links to this thread and nothing else
>avutil-58.dll is missing
>avutil-57.dll is missing
>avutil-56.dll is missing
I have no idea what they are. try replacing the runtime with one included in "the integrated package" for windows here https://github.com/RVC-Boss/GPT-SoVITS which is where I got it originally. Then on top of that install app's requirements.
>>
>>42992629
This looks like an eldritch monster.
>>
Pinkie Pie - Conga
>https://pomf2.lain.la/f/u3gn2od.mp3
>>
>>42972267
new tts, wip
https://files.catbox.moe/e8gnuu.mp3
https://files.catbox.moe/v89nuo.mp3
https://files.catbox.moe/wm0k98.mp3
>>
File: 1700972356673997.jpg (338 KB, 2048x1448)
>>42995291
great song for ponka
>>
>>42995502
Will it stop sounding like ass when finished?
>>
>>42995291
nice!
>>
>>42972267
>>The PoneAI drive, an archive for AI pony voice content:
>drive.google.com/drive/folders/1E21zJQWC5XVQWy2mt42bUiJ_XbqTJXCp

Is this not getting updated anymore? I've never really looked at this general before but I wanted to listen to some random pony voice clips. I see that the newest stuff on here is from 2022.
>>
>>42997350
Yeah, who's even in charge of it?
>>
>>42997731
An anon. I used to manage it alongside said anon but stopped once content started slowing down.
>>
>>42997874
Yeah, thanks for your work, Anon. I would love to see it come back in some format, maybe an automated booru-style site where uploaders can add tags for characters, song styles and whatnot.
>>
>>42997874
So does any of this music get saved anywhere anymore?
>>42995291
>>42989243
>>42977481

And what about voice clips? Don't tell me you Anons are creating this stuff and letting it die with the threads.
>>
Are we at the point we can have pony-narrated audiobooks yet?
>>
>>42998940
see >>42992585
https://drive.google.com/drive/folders/16Ss26VBmgzcSuTGzhaHqRuyVRceTf-YB
If you can get this working, that will be the easy way. Otherwise there is the old way of just doing each line manually in any TTS and putting them together, just like we did five years ago.
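For the "putting them together" part of the old way, a rough sketch using only Python's standard `wave` module. It assumes every line was already exported from whatever TTS you use as an uncompressed PCM WAV with the same sample rate, channel count and bit depth; the function name and the gap length are made up for illustration.

```python
import wave


def concat_wavs(clip_paths, out_path, gap_seconds=0.3):
    """Join WAV clips in order with a short silence between them.

    Assumes all clips share the same channels, sample width and rate.
    """
    params = None
    chunks = []
    for path in clip_paths:
        with wave.open(path, "rb") as clip:
            fmt = (clip.getnchannels(), clip.getsampwidth(), clip.getframerate())
            if params is None:
                params = fmt
            elif fmt != params:
                raise ValueError(f"{path} has format {fmt}, expected {params}")
            chunks.append(clip.readframes(clip.getnframes()))
    if params is None:
        raise ValueError("no clips given")

    channels, width, rate = params
    # 8-bit WAV is unsigned (silence = 0x80); wider samples are signed (0x00).
    silent_byte = b"\x80" if width == 1 else b"\x00"
    silence = silent_byte * (int(rate * gap_seconds) * channels * width)

    with wave.open(out_path, "wb") as out:
        out.setnchannels(channels)
        out.setsampwidth(width)
        out.setframerate(rate)
        out.writeframes(silence.join(chunks))
    return out_path
```

Usage would be something like `concat_wavs(["line_001.wav", "line_002.wav"], "chapter_01.wav")`, where the file names are placeholders for your own numbered line exports.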
>>
>>42998982
If you try it, please report back whether you managed to launch it or not, because I am not sure if I packaged it properly.
>>
>>42999237
Heyo, I'm this Anon >>42994899 and I had a stupidly busy weekend + Monday, so the soonest I can try to mess around with it will be Tuesday (I'm thinking of just setting up a conda env on my end to see if that will be a good workaround).
>>
>>42995502
reee crapbox
>>
>>42972267
Hello all, I'm currently working on a theoretical project of a real-life rainbow dash. That's all I can reveal because the haters will sabotage me. Does anyone know if there's currently a solution for live, conversational talk with a Rainbow Dash voiced & personalitied chatbot that TTS's convincingly in realtime?
>>
>>42999922
Uh, might wanna look into local LLMs and TTSes?
>>
Sorry, I don't want to be an annoying faggot but can anyone answer my previous question? Is there a new place where pony voice clips and music is being uploaded to, if at all? I see that /create/ has this Google spreadsheet for a bunch of songs but I don't know how regularly that one gets updated either.
https://docs.google.com/spreadsheets/d/1GrDEeCc9B7mAkQWAx_eoRdw4bF_W-5bsaAm4otvjfWE/edit?gid=0#gid=0
>>
>>43000439
>Is there a new place where pony voice clips and music is being uploaded to, if at all?
No, no known active cataloguing efforts at this point in time.
>>
>>43000443
Damn. I guess I'll just save catbox and vocaroo links whenever I see something I like then. I don't know how many Anons are still actively working within this thread, but thanks for everything you guys have done up till now.
>>
>>42995502
having trouble getting these to load, anon
>>
>>42999317
I had some people with win10 test it and it works out of the box for them, so it's probably not a python env issue. Those avutil DLLs belong to ffmpeg. The included version contains avutil-60.dll, but yours reports that older versions are missing. I'd test whether the included ffmpeg even runs, and mess with the PATH variables.
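A quick way to run that check, as a sketch only: it reports which ffmpeg executable PATH resolves to, whether `ffmpeg -version` exits cleanly, and which avutil-*.dll files sit in the same folder. The function names are made up here, and it assumes the DLLs live next to the executable, which may not hold for every ffmpeg build.

```python
import shutil
import subprocess
from pathlib import Path


def find_avutil_dlls(folder):
    """List avutil-*.dll file names found directly inside folder."""
    return sorted(p.name for p in Path(folder).glob("avutil-*.dll"))


def describe_ffmpeg(exe_name="ffmpeg"):
    """Report which ffmpeg PATH resolves to and which avutil DLLs sit beside it."""
    exe = shutil.which(exe_name)
    if exe is None:
        return {"path": None, "avutil": [], "runs": False}
    try:
        result = subprocess.run([exe, "-version"], capture_output=True, timeout=15)
        runs = result.returncode == 0
    except (OSError, subprocess.TimeoutExpired):
        runs = False
    return {"path": exe, "avutil": find_avutil_dlls(Path(exe).parent), "runs": runs}


if __name__ == "__main__":
    print(describe_ffmpeg())
```

If the reported path is not the ffmpeg bundled with the package, some other install is being picked up first, and reordering PATH would be the thing to try.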
>>
Anyone mess with MOVA at all yet? Wondering how it works for pony videos. The previous pony videos in this thread were so damn cool to see.

>>42999922 (checked)
>Making real life AI pony
>That's all I can reveal because the haters will sabotage me
Certainly won't get sabotaged here, you're in damn good company, kek. Unless you got some group of schizos on your tail, that I understand.
>>
>>42977481
Holy shit this post is blessed. I hope someone's backing these up. Does anyone have any more Pinkie covers? She's my fave, but truth be told I'll take any more in general because I love them. A Thousand Years by Fluttershy was so pretty. There's a wonder to some of these covers that's really touching me; the combo of ponies and 2010s pop sends you right back to 2012 and it's like being hugged by a warm blanket.
>>
>>42999922
>https://github.com/EDCD/EDDI
As a starting point I would recommend the GitHub Elite Dangerous TTS AI companion (it's one of only a handful out there). It's pretty much something along the lines of connecting SillyTavern with an online LLM and TTS, plus some inputs from the game, to have the chatbot act as an assistant/companion that chills out with you.
>https://www.youtube.com/watch?v=n_m0fqp8xwQ
There is also the new Nvidia idea, PersonaPlex, where instead of having a model pipeline of "speech to text -> LLM -> TTS" they kind of have one model that does all of the talking and listening at the same time. Sadly it's mostly trained on approved HR phone administration calls, so the presented output sucks ass at roleplaying characters.
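For reference, the classic "speech to text -> LLM -> TTS" pipeline can be sketched as three pluggable stages. Everything here is hypothetical glue: the class name and the three callables are placeholders you would back with whatever speech recognizer, LLM and pony TTS you actually run, not the API of any tool mentioned in this thread.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class VoicePipeline:
    transcribe: Callable[[bytes], str]             # speech -> text
    reply: Callable[[List[Tuple[str, str]]], str]  # chat history -> reply text
    synthesize: Callable[[str], bytes]             # reply text -> audio
    history: List[Tuple[str, str]] = field(default_factory=list)

    def turn(self, audio_in: bytes) -> bytes:
        """Run one conversational turn: listen, think, speak."""
        heard = self.transcribe(audio_in)
        self.history.append(("user", heard))
        answer = self.reply(self.history)
        self.history.append(("assistant", answer))
        return self.synthesize(answer)
```

Each stage has to finish before the next one starts, which is where the latency of this approach comes from.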
>>
>10
>>
bump
>>
>>43001298
>>
>>42999922
The only two that come to mind are:
1. https://huggingface.co/nvidia/personaplex-7b-v1
2. https://github.com/FlashLabs-AI-Corp/FlashLabs-Chroma
The first one, an anon got it to run on his 3090 after some tweaking: https://desuarchive.org/g/thread/107886414/#107888720
Result: https://vocaroo.com/1nLUHmvzYpUb
No idea how the second one performs.
>>
>>43002522


