/wsg/ - Worksafe GIF


Previous thread >>5603777
Dedicated Suno/Udio thread >>5652124
Dedicated Luma thread >>5624871

Post AI generated stuff. Song covers, animations, etc.
OC encouraged, but not required.
This thread focuses on audio and video with an audio component.
Let me know if you have more links to add. This thread is a work in progress.

> Voice-to-Voice
https://github.com/Mangio621/Mangio-RVC-Fork
https://github.com/Vali-98/XTTS-RVC-UI
https://github.com/IAHispano/Applio
https://github.com/voicepaw/so-vits-svc-fork

> Text-to-Speech
https://github.com/collabora/WhisperSpeech
https://github.com/myshell-ai/OpenVoice
https://github.com/yl4579/StyleTTS2
https://github.com/BoltzmannEntropy/xtts2-ui
https://github.com/daswer123/xtts-webui (Warning: Windows version uses prebuilt binaries that anons haven't verified)

> Music
https://github.com/facebookresearch/audiocraft
https://rentry.org/AudioCraftRemix

> Text-to-Video, Image-to-Video
https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved
https://haiper.ai
https://lumalabs.ai/dream-machine
https://kling.kuaishou.com/en

> Deepfake and Lipsync
https://github.com/Gourieff/sd-webui-reactor
https://github.com/Hillobar/Rope
https://github.com/Mozer/wav2lip
https://github.com/saifhassan/Wav2Lip-HD

> Audio Cleanup
UVR Walkthrough: https://docs.google.com/document/d/17fjNvJzj8ZGSer7c7OFe_CNfUKbAxEh_OBv94ZdRG5c/edit#heading=h.n8ac32fhltgg
https://github.com/Anjok07/ultimatevocalremovergui
https://github.com/resemble-ai/resemble-enhance
https://github.com/yinruiqing/pyannote-whisper

> Related boards
>>>/aco/asdg
>>>/aco/csdg
>>>/b/degen
>>>/d/ddg
>>>/e/edg
>>>/g/sdg
>>>/g/lmg
>>>/g/aicg
>>>/h/hdg
>>>/trash/sdg
>>>/u/sdg
>>>/vg/aids
>>>/vt/vtai
>>
Luma stuff is fine, but if you have an unedited 5 second clip without sound, consider posting in the dedicated thread. >>5624871
>>
This dude was cool enough to provide the original source. Useful if you're interested in how to make stuff like this.
https://mega.nz/file/Lhc21KqJ#qX_kBbBORJMSOkm0SCG4gkkRwI0VQjnICS6mc3AqSt8
>>
File: Shu Yamino Viggle.webm (1.73 MB, 720x1280)
>>
>>
>>
>>
File: gandalf-ipanema.webm (3.71 MB, 720x486)
>>5665869
kek
>>
>>
File: fox-of-wisdom.webm (971 KB, 608x1080)
>>
>>
File: 1724373830450237.webm (1.59 MB, 460x816)
>>
>>5670846
I'm fucking dying
I hope that the first 5D cyber-entertainmentscape prototypes involve Will Smith and quantum spaghetti
>>
>>5670846
I believe this was done using Hailuo, the new kid on the block:
https://hailuoai.com/video
>>
File: sonichu.webm (912 KB, 1280x720)
go my son!
and zap
to the extreme!
>>
>>
File: sad day.webm (672 KB, 1280x720)
>>
File: UH OH.webm (997 KB, 1280x720)
>>
>>5665380
Holy shit, I never realized what was missing in my life was post-apocalyptic muppets movie.
>>
File: tiktokbrain.webm (580 KB, 1280x720)
>>
>>
>>5672003
I was really hoping Shrek would walk into frame in this one too
>>
File: kalevala.webm (3.85 MB, 1280x720)
>>
File: 1713241020646250.webm (2.54 MB, 1280x720)
>>
File: Dangerous street.webm (525 KB, 1280x720)
>>
File: ght5325t3hb.webm (1.52 MB, 954x928)
>>
File: 1717009514443582.webm (622 KB, 1280x720)
>>
>>5672192
I guess I don't get the reference so I don't know what you're going for
>>5672212
lel what is the audio from?
>>
>>5673261
>The Kalevala (IPA: [ˈkɑleʋɑlɑ]) is a 19th-century compilation of epic poetry, compiled by Elias Lönnrot from Karelian and Finnish oral folklore and mythology, telling an epic story about the Creation of the Earth, describing the controversies and retaliatory voyages between the peoples of the land of Kalevala called Väinölä and the land of Pohjola and their various protagonists and antagonists, as well as the construction and robbery of the epic mythical wealth-making machine Sampo
>>
>>5673201
Bahahah is that supposed to be Tiedrich or whatever the fuck the spastic cunt's name is?
>>
>>5673742
Uncle Ted actually, I don't think it has enough training data it usually tries to portray him as very old. It's also come close to Sam Hyde.
>>
File: 1717256292582071.webm (619 KB, 1280x720)
>>
Anyone have the AI songs of the Indians singing?
>>
File: 1720843332688607.webm (3.87 MB, 1280x720)
>>
File: Sodomnia.webm (4.01 MB, 512x768)
>>5674044
I got you mate
>>
>>5674204
Thanks mate, won't lose it again!
>>
>>
>>
File: amo 5-15 song.webm (3.36 MB, 640x480)
>>
>>5665373
https://aiartes.com/voiceai

Anyone got more vocal sample databases? I got some nice voice databases from here.
>>
>>5673261
Fingolian deep lore.
>>
File: never-forgetti.webm (5.35 MB, 1280x720)
>>
>>5668055
based brazillian wizard
>>
>>5677222
>>
>>5672077
Clearly his shadowed shoulder in the foreground lmao
>>
>>
>>
File: redditmod1.webm (5.93 MB, 852x480)
>>
>>5671641
Cursed
>>
NotebookLM from Google can make podcasts on anything https://voca.ro/1j5AUvn1axdu
>>
>>5671300
https://github.com/dominickp/ccpai
>>
>>5678983
> HOSTNAME = base64.b64decode("aGFpbHVvYWkuY29t").decode('utf-8')
> looks inside
> hailuoai.com
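
For anyone who wants to check this themselves, here is a minimal sketch using only the Python standard library. The encoded string is the one quoted above; the helper name `decode_host` is made up for illustration and is not from the repo.

```python
import base64

# Encoded hostname exactly as quoted in the post above.
ENCODED_HOST = "aGFpbHVvYWkuY29t"


def decode_host(encoded):
    """Decode a base64 string to UTF-8 text (hypothetical helper name)."""
    return base64.b64decode(encoded).decode("utf-8")


HOSTNAME = decode_host(ENCODED_HOST)
print(HOSTNAME)  # hailuoai.com
```

Base64 is an encoding rather than encryption, so this only hides the hostname from a casual text search.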
>>
>>5678990
It appears to be obfuscated to stay under the radar. Looks like a way to access the video gen without registering or messing with the finicky browser interface.
>>
>>5667007
Literally just Richard Cheese
>>
>>5670846
lol
>>
File: bob-ross-99-problems.webm (5.93 MB, 898x674)
>>
File: Trumps Honest Opinions.webm (1.39 MB, 1920x1080)
>>5665373
>>
Anyone know of the best solution for making a living picture? I'm not looking to have someone dance around based on a source image like Luma; I just want a fairly detailed picture to come to life with a little motion, to spice up a video that uses static images for illustrative purposes. FOSS solution if possible.
>>
>>5682100
But that's just the speech he was giving when that guy tried to shoot him
>>
File: cardcrusher-cogvideo.webm (2.06 MB, 720x480)
>>5683698
I found something (the img2video model got released just yesterday in fact) but I don't have the VRAM to run it. Evidently you need over 16GB. I was able to run the online demo though and it's interesting. There's also a text to video model that uses less VRAM, I'll try messing with that at some point.
https://github.com/THUDM/CogVideo
>>
File: Tom Bombadil.webm (1.7 MB, 1280x720)
>>
anyone here use facefusion?
it does everything, it changes the face and makes the images, but it doesn't make the video
at first i thought it was the nsfw filter, but i tried with sfw stuff and it did the same
>>
Make Michael Jackson sing Black or White but every other word is 'nigger' and gassing the jews is mentioned in the rap.
>>
>>5685375
shut the fuck up poojeet
>>
>>5685375
im not having any problems with video. if you want to get around the nsfw filter edit "\facefusion\content_analyser.py" , add "return False" below line 73

>def analyse_frame(vision_frame : VisionFrame) -> bool:
> return False
>>
>>5685375
Not sure what problem you're having exactly, but if you have the whole image sequence and the only thing missing is the video itself, you can just stitch the images together manually with ffmpeg:

ffmpeg -framerate 12 -pattern_type glob -i '*.png' -c:v libx264 output.mp4
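
If you'd rather script it, here is a rough Python sketch of the same command. The function names, the defaults, and the extra `-pix_fmt yuv420p` are my own assumptions and not something facefusion provides. As far as I know the glob pattern type isn't available in Windows builds of ffmpeg, so treat this as a Linux/macOS example.

```python
import shlex
import subprocess


def build_stitch_cmd(pattern="*.png", fps=12, output="output.mp4"):
    """Build the ffmpeg argument list for turning an image sequence into a video."""
    return [
        "ffmpeg",
        "-framerate", str(fps),    # input frame rate of the image sequence
        "-pattern_type", "glob",   # let ffmpeg expand the wildcard itself
        "-i", pattern,
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",     # assumption: added for wider player compatibility
        output,
    ]


def stitch(pattern="*.png", fps=12, output="output.mp4"):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits non-zero."""
    subprocess.run(build_stitch_cmd(pattern, fps, output), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so it can be checked first.
    print(shlex.join(build_stitch_cmd()))
```

Passing the arguments as a list means Python never hands the wildcard to a shell, so ffmpeg receives `*.png` unexpanded, which is what the glob pattern type expects.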
>>
>>5685732
another useful trick: if you're converting from webm to mp4, libx264 with yuv420p needs even width and height. either keep the same encoder as the source, or pad the frame up to even dimensions when it isn't already:
> -vf "pad=ceil(iw/2)*2:ceil(ih/2)*2" -pix_fmt yuv420p -c:v libx264

also you can add the audio back in with
> -i path\to\temp\folder\%08d.png -i original_video.mp4 -map 0:v -map 1:a -shortest
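
A small Python sketch of what the pad expression works out to, plus the full command with the audio mapped back in. The function names, file paths, and frame rate are placeholders I made up; the stream specifiers use the standard `-map 0:v -map 1:a` form.

```python
import math


def even_dims(width, height):
    """Round each dimension up to the nearest even number, same as ceil(x/2)*2."""
    return math.ceil(width / 2) * 2, math.ceil(height / 2) * 2


def build_mux_cmd(frames="frames/%08d.png", original="original_video.mp4",
                  output="output.mp4", fps=12):
    """Build an ffmpeg argument list: video from the frames, audio from the original."""
    return [
        "ffmpeg",
        "-framerate", str(fps),
        "-i", frames,                            # input 0: image sequence
        "-i", original,                          # input 1: source of the audio
        "-map", "0:v", "-map", "1:a",            # video from input 0, audio from input 1
        "-vf", "pad=ceil(iw/2)*2:ceil(ih/2)*2",  # pad to even width and height
        "-pix_fmt", "yuv420p",
        "-c:v", "libx264",
        "-shortest",                             # stop at the shorter of the two streams
        output,
    ]


print(even_dims(853, 481))  # (854, 482)
```

When the command is passed as a list like this, the parentheses in the filter need no shell quoting.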
>>
>>5685532
>>5685732
>>5685830
awesome, thankyou guys
>>
>>
>>5685451
You don't get it. Michael was the most based of all.
Not just a man of culture, he was from outer space. A talented builder of bridges, the guy could spot the evil ones and call them out.
Hell, he was so based he turned WHITE! Like Jackson went Super Saiyan Based Keiyo Ken times 6million. White isn't even his final form.
>>
>>5686191
So you're trying to tell me that when a black mans based levels reach over 9000, he becomes white?
>>
>>5670900
this feels like a fever dream
>>
>>
>>5685943
Lol ffs
>>
File: gif-183.webm (1.77 MB, 320x240)
>>
>>5670846
I imagine a future, in many many years, where Hollywood is using AI tech to generate Will Smith for Men in Black 12. And as they train their AI generating Will Smith, no matter what they do spaghetti slips into the scenes. A long forgotten meme they cannot purge, so they must simply write it into the story.

good ole agent Will Smith, never far from his favorite food, spaghetti.
>>
is text to speech even possible with 4GB of VRAM?
I don't want to use darpa (11.lab), or pay anyone.
>>
>>5672212
kek that audio
>>5673261
family guy
https://youtu.be/1fp0ZfEVgnY
>>
>>5687427
try it and find out? this runs pretty light for me
> https://github.com/BoltzmannEntropy/xtts2-ui
>>
Google just dropped a new tool. It takes about five minutes to read a book and spits out a summary. The input doesn't even have to be plain text: you can feed it PDFs, and it scans the text and creates a conversation around 12 minutes long.
>>
>>5688716
Making it do a podcast style summary is way too uncanny valley for me but it's cool tech nonetheless.
>>
File: gordon_ramsAI.webm (5.33 MB, 640x360)
>>
File: Psycho Bjork.webm (2.55 MB, 480x600)
>>5665373
>>
>>5689459
this audio activated every single one of my neurons holy fuck
>>
>>5688716
Just read the book you lazy tard
>>
File: Psycho Bjork 2.webm (5.63 MB, 480x480)
>>5689467
Try this on for size.
>>
File: tfw wageslave 2.webm (4.16 MB, 1280x720)
>>
>>5676108
HELL YEAH BROTHER!
BLACK LIVES MATTER!
NOW GO BURN THAT LOCAL HARDWARE STORE AND MAKE SURE YOU PAY YOUR IPHONE PAYMENT PLAN
KEEP FIGHTING THE GOOD FIGHT!!
THANK BRO
>>
I've tried 5 different lip sync solutions and the only one I can get to work on my machine is the shitty one. Feelsbadman.
>>
File: 1716377944923375.webm (3.34 MB, 1280x720)
>>
>>5669240
Anyone got the Sarah Silverman cover of this?
>>
>>
>>5689475
Lol what a fucking retard wasted hours of his time when he could make the Robot read it and tell you what it was about.

Bet you walk to the store and washcloths by hand you fucking plebbian
>>
wow, i didn't realize text to video was already this good. I thought we wouldn't see something even half as good as this for another 3 or 4 years at least.
>>
>>5665377
holy shit Im fucking dead
>>
Has Hailuo improved at all recently?
>>
>>5696335
I don't think there's been major changes but it does seem to be changing incrementally. It's only been around for a few weeks.
>>
>>5670900
>shadman
>>
>>5672003
how can the jews recover from this?
>>
>>5697351
>70% of my bible is literally the Torah
>my messiah is a jewish rebel who compared gentiles to dogs and only preached to other jews
>that's how you know I hate jews
>>
>>5695664
holy shit a fucking ghost!
>>
>>5697581
The Torah only covers the first 5 books of the old testament, out of 24 books. Can you tell me what passage Jesus explicitly compared gentiles to dogs?
>>
>>5694856
It's probably this good because the Chinese don't give a shit about ethics and copyright so they train on whatever.
>>
>>
File: 1700811567717522.webm (4.96 MB, 1920x1080)
>you can now use prompts to apply custom filters to existing videos
im expecting a golden age of stitching together clips from existing media then applying filters and other AI effects to create a cohesive work.

why generate dozens of prompts trying to get a good one, when you can just take someone else's work, apply a filter, and make it your own?
>>
>>5697351
you fool, they're going to ai generate the holocaust now
>>
>>
>>5700231
That is already a thing with the fake trailers on Youtube but these tools can make it way more advanced.
Also, you could just film yourself doing something and apply a filter over it, or animate some crude 3D models together. I would make cancelled stuff or stuff that sucked. Imagine remaking bad Star Wars movies into good ones.
>>
>>5700273
Yeah, seems like a great way to make your own animations or whatever using potentially just yourself as reference.
>>
>>5700270
>whyboner.jpg
>>
>>5697635
upvoted
>>
>>5685943
fucking kek
>>
>>5694261
>he listens faster than he reads
lmao low IQ
>>
>>5692049
that's the bad one?
>>
>>5670900
Conspiracists will say this is fake
>>
File: council-of-elrond.webm (5.75 MB, 1280x536)
>>5700705
It's incredibly wonky to work with, I had to edit around it. It breaks down a lot harder in this one.
>>
>>5700870
I do not remember this scene.
>>
File: austin 2.webm (2.85 MB, 618x586)
>>
>>5685943
kek!
>>
>>5699170
The west's respect for ethics and poors will be our downfall. Europe is already falling behind, and america will follow suit of democrats win
>>
>>5702078
No the downfall will not be killing open faced enemies in hook noses, orcs and foreign invaders
>>
>>5702078
>>5699170
>respect for ethics and poor

lmao, every country in the west is governed by and for bankers, corporations, politicians and outright criminal organizations, there is high level of inequality, every billionaire and most ceos are psychopaths, most big companies steal, cut corners, bribe, evade taxes and treat people like numbers. I get that things are away worse in places like china where they don't even pretend, but any semblance of ethics in the west is a facade sold to poor people.
>>
>>5702590
I should have put "ethics" in quotes, because I'm talking about the "ethics" as it applies to these AI companies. In other words, nerfing their models to make their investors happy. The Chinese are clearly training on stuff that the American companies aren't because of "safety" and "ethics".
>>
>>5665373
Of all the fictional and meme fighters, Meme Chuck Norris is unbeatable. Goku, Superman, Saitama? Chuck Norris comes out top.
>>
File: hcbm.webm (1.35 MB, 512x512)
>>
>>
>>5706783
LMAO
>>
>>
>>5708404
I'd buy this drink.
>>
>>5708407
Diesel is a real drink, just a few bucks a gallon. Pretty cheap as far as drinks go.
>>
>>5699170
fucking kek
>>
>>5706783
holy shit lmao
>>
https://youtu.be/Y8ob_nTMyY8
>>
You have to sign in to make videos on Hailuo now, fuck.
>>
>>5685943
he nailed it
>>
>>
>>5706783
God damn, this one is golden!
>>
File: 1729257304706267.webm (4.66 MB, 640x360)
>>
>>5710587
that's bandu my cousin you mother bitch
>>
File: cruise-in-SCORCHER-4.webm (2.69 MB, 1280x720)
>>5709419
Shame
>>
File: cruise in SCORCHER 6.webm (1.64 MB, 1280x720)
>>
File: cruise-in-SCORCHER-7.webm (1.69 MB, 1280x720)
>>
>>
>>5706783
Fucking lost it.
>>
File: stop being retarded.webm (3.55 MB, 1360x752)
>>5700837
>>
File: 1724725250918622.webm (5.2 MB, 640x360)
>>
>>5710675
>>5710682
>>5710690
Tom Cruise is Ben Stiller as Tugg Speedman in Scorcher: the Reduxology
>>
>>5709419
I'm just happy to see a thread outside of /aco/ it's not healthy to just use it to coom.
>>
>>5711454
not even funny, just accurate
>>
>>5679021
More a testament to Richard Cheese that an AI Frank Sinatra cover of his version of Down with the Sickness is nearly indistinguishable
>>
>>5684944
Wow ancient meme
>>
https://youtu.be/Te5Ztl4YNEI
>>
>>
>>5710682
>>5710675
>>5710690
Cruise Kino will never end!
>>
Did this cover last year but decided to try again with one year of improvement.
>>
>>5711454
there was another one about Indians driving uber and raping women, anyone got that?
>>
I discovered a text to speech application called MaskGCT that, like F5 TTS, was trained on the Emilia dataset. Unlike F5 TTS, the vocos quality isn't bad. The catch is that the prompt can't be too short; otherwise the output file will sound like robotron.

https://huggingface.co/spaces/amphion/maskgct

Here's a video example with MaskGCT audio.
>>
>>5714437
Is there a github for it?
Also what did you use for the visuals?
>>
Funni fish song
https://suno.com/song/afb3ed6c-0855-4bea-974d-9b7e0780a97f
>>
>>5714442
Here, but know that 8 GB VRAM isn't enough. It needs at least 12 GB, otherwise you'll run out of memory like I did. A short prompt did manage to produce output, but it took over 10 minutes to generate... maybe someone here can optimize it to use less?

git clone https://huggingface.co/spaces/amphion/maskgct
cd maskgct

Create a python environment:

python -m venv venv
source ./venv/bin/activate
pip install spaces huggingface_hub==0.24.7
pip install -r requirements.txt
python app.py
>>
File: cruise-in-SCORCHER-8.webm (2.69 MB, 1280x720)
>>5711657
>>5713758
damn straight
>>
>>5714793
Huh, I'm not used to cloning huggingface demo spaces, I guess it is a git repo as well.
Of course it's nvidia. I'll try to get it working for rocm but we'll see I guess.

Found the github for I guess the all in one toolkit, I'm not actually sure what the github hosts but there's a lot of info about the project as a whole. They seem to be working on a suite of interesting stuff.
https://github.com/open-mmlab/Amphion
>>
[Verse]
Those you laugh at those for who you pray
What's the difference it's all the same
Feed the narcissist consume their dread
Become the villain two steps ahead

[Verse 2]
Smiles put on but truth ain't there
Masks we wear like we just don't care
Promises broken with every breath
Whispered secrets till there's nothing left

[Chorus]
Here we go spinning tales so fine
Twisted fables crossing every line
Standing tall on the thinnest thread
Become the villain two steps ahead

[Verse 3]
Seeing through the smoke and fog
Truth's a joke bite down on the slog
Align your stars but don't you fall
In the end there's no curtain call

[Heavy Metal breakdown]

[Bridge]
Dance through shadows laugh in the dark
Hope the whispers don’t leave a mark
When the mirrors shatter don’t care what it shows
The game is on that's how it goes

[Chorus]
Here we go spinning tales so fine
Twisted fables crossing every line
Standing tall on the thinnest thread
Become the villain two steps ahead
>>
>>5700231
Would it be allowed to apply video and audio AI filters on full-length movies to circumvent copyright strikes on YouTube?
>>
>>5690690
He's literally me.
>>
>>
>>
>>
>>
File: vince 3.webm (747 KB, 1280x720)
>>5715532
>>
>>
>>
File: kot-life.webm (2.63 MB, 720x720)
>>
>>
File: 1727565154496077.webm (4 MB, 1280x720)
>>
>>5715858
This looks nothing like Vince
>>
File: kiki_maskgct.webm (824 KB, 1280x720)
>>5714793
Ok I had a weird ass error and inspected app.py to find that it's doing some shady shit at the top where it pip installs a specific gradio that's just broken (who the fuck wrote this code?)
I fixed it by commenting out the line:
# subprocess.check_call([sys.executable, "-m", "pip", "install", "gradio==4.37.1"])

and did:
pip install gradio==4.44.1

Unfortunately it's still not recognizing my GPU so it used CPU inference, which is slow but not completely unusable. I'm not giving up though, because the gradio app must have been written by an intern or some shit and there might be some simple fixes.
For what it is, I'm impressed though.
>>
File: fdr-spiders.webm (664 KB, 640x480)
>>
Anyone here ever use the voice changer in ElevenLabs? It's supposed to be more accurate, but I need to hear some samples. I'm also a little uncomfortable with the fact that ElevenLabs will have access to my voice afterwards.
>>
>>5716115
If you're concerned about privacy, use one of the open source speech to speech solutions in the OP. They're actually pretty good if the models are trained correctly. The downside is that you'll have to find models or train them yourself, but there's not really an alternative if you don't want to give some company your voiceprint. Though I suppose you could run your voice through a voice changer first before giving the sample to elevenlabs.
>>
>>
>>5716024
Finally got the GPU working on ROCm. It was simple: the requirements.txt file clobbered my PyTorch install, so I just did a pip install --upgrade of the proper PyTorch build for my system and it works.
Cut the inference time down from about 5 minutes on CPU to about 40 seconds on GPU, ~14GB VRAM

Definitely not a one-click install for Linux/AMD but it works.
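
If you suspect a requirements.txt swapped your PyTorch build, a quick check like the sketch below shows what the installed build can see. The function name is made up; it assumes the usual behavior that ROCm builds set `torch.version.hip` and expose the GPU through the `torch.cuda` API.

```python
def describe_torch_backend():
    """Return a one-line summary of what the installed PyTorch build can see."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"

    # ROCm builds set torch.version.hip; CUDA builds set torch.version.cuda.
    hip = getattr(torch.version, "hip", None)
    cuda = getattr(torch.version, "cuda", None)
    build = f"ROCm {hip}" if hip else (f"CUDA {cuda}" if cuda else "CPU-only")

    if torch.cuda.is_available():
        # ROCm devices are reported through the torch.cuda API as well.
        name = torch.cuda.get_device_name(0)
        return f"torch {torch.__version__} ({build}), GPU: {name}"
    return f"torch {torch.__version__} ({build}), no GPU visible, will fall back to CPU"


print(describe_torch_backend())
```

If this reports a CPU-only or CUDA build on an AMD card, reinstalling the ROCm wheel for your system after the requirements step is probably the fix.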
>>
Is it worth trying to make money with AI Content?
It seems new and like there's potential here but at the same time highly competitive
Still though I really want to build a portfolio with AI stuff so I can sell into something better later. This isn't my primary income but I want sidegigs and projects and AI seems like a no brainer. Sort of.
>>
>>5716456
AI stuff is a totally saturated market.
On the back end if you know how to train stuff and know more about how things actually work, it's probably valuable experience. On the front end if you just make memes or whatever, I can't see a lot of money in it unless you become the next DemonFlyingFox or There I Ruined it or something like that.
I think some pajeets make money doing requests for like nudes of real people and shit like that but I can't imagine it's very profitable.
>>
>>5689170
song?
>>
>>5697581
kek
>>
>>5665373
Can't find name of the song. Does anyone knows?
>>
George Michael/Digimon came out okay I think, had to lower the instrument a bit down from the source
>>
File: thomYorke_fmp_fumufo.webm (3.87 MB, 617x328)
>>5717160
>>
>>
>>
>>5675154
>>5675156
kek
>>
>>5675154
Consider you guys are making so much pure crap nobody can look at all of it.
>>
>>5675154
He was so young back then
>>
>>5717375
>>5718152
Not sure what's going on with these but his voice has a bit too much helium. Joe sounds good though.
>>
File: hijab2.webm (101 KB, 768x768)
>>5665373
Janky movement
>>
>>5718285
Bad reference sample imo.
>>
>>5718285
>>5718152
I redid with F5 tts.
>>
File: disney garbage.webm (232 KB, 560x560)
>>5717375
>>5718910
>>
File: mcd.webm (207 KB, 512x512)
>>5714437
>>5718960
>>5718910
F5 tts. Runs fine on my 8GB card and I think it's superior to all other local models. Correct me if I'm wrong.

https://huggingface.co/spaces/mrfakename/E2-F5-TTS/
>>
>>5718910
idk, the maskgct version sounds less flat. Helium voice aside it sounds better to me.
>>
>>
>>
File: checkedai.webm (1.27 MB, 1280x720)
>>
I haven't tried this yet, but apparently usable with 16ish GB VRAM.
https://github.com/kijai/ComfyUI-MochiWrapper
>>
>>5684944
An oldie but a goodie
>>
File: 1717576796344556.webm (3.81 MB, 1440x1280)
>>
>>
>>
>>
>>5706783
This is what AI was made for!
>>
>>5710086
>that ending
NOOOOOPE NONONONONONOPE
>>
>>5721243
The one on the right is the real one. Bitch is ugly af
>>
>>5713667
what a fuckhorrible choice of fonts
>>
File: gandalf-herb-6mb.webm (5.98 MB, 562x320)
>>
File: mochi-sample.webm (1.41 MB, 1920x1080)
>>5721046
I got this working but it consumes a crazy amount of VRAM.
Tweaked the example workflow to fit right at the 16GB limit for me. I would try again with more steps to make it look better but this already took 45 minutes to gen.
https://files.catbox.moe/b3tnb8.json
>>
File: 1709138179858952.webm (366 KB, 1024x1024)
bros I forgot that i've got 50 credits on lumalabs and cancelled it because hailuoai came out
anyone have prompts they want me to pass to the paid version?
any idea on how to get better results than vid rel?
>>
>>5723793
I liked it when people kept extending stuff, so you would get clips like 20 seconds long. Maybe take an old meme and see what crazy shit it does when you extend it 2 or 3 times.
>>
ChatGPT currently refuses to give me anything space related without star wars content. It's kinda funny, but also pissing me off. I should move onto a real AI image generator but I'm so behind on all the AI shit these days, I don't know which would be good. time to experiment, but taking reqs
>>
>>5725025
>>>/g/lmg
>>>/g/ldg
>>
>>5725025
If you just want static images, try a recent Stable Diffusion XL fine tune, Stable Diffusion 3.5, or Flux.1 dev
>>
>>5670900
this still makes more sense than my dreams
>>
>>5725213
>>5725129
thx anons
>>
>>
File: 1716168473412546.webm (4.57 MB, 160x120)
>>
>>5720254
MaskGCT is okay in sound quality but it's too slow, like 100x slower. The quality difference between F5 and MaskGCT isn't that large imo, but the speed, dang. That needs to improve.
>>
File: Reforming Sneed.webm (281 KB, 692x390)
>>
>>5665373
>>
>>5714793
Has anyone confirmed that this works on Windows? I'm personally on Linux and it works for me, but I've heard about difficulties on Windows. There is a fork out there, though I can't confirm that it works. If someone could provide feedback on installing this or the Amphion version, that would be appreciated.
https://github.com/justinjohn0306/MaskGCT-Windows
>>
File: harry.webm (245 KB, 860x529)
>>
>>5731710
idk about that instruction specifically, but I've had it installed on 8GB vram. as said earlier, it's extremely mega ultra slow and not worth it for me
>>
i also have a george floyd one
>>
File: bateman_musk.webm (1.05 MB, 1920x1008)
>>
>>5710745
>Akira Kurosawa's Star Wars
but Star Wars is already George Lucas's "The Hidden Fortress"
>>
>>5731920
Yeah it consumes about 14GB so it's probably inferring on CPU in your case. I've just heard about some issues with the espeak dependency on Windows so I was wondering if it was a common problem, and it's not something I can personally verify.
>>
>>
>>
File: erika deathbot.webm (2.15 MB, 640x360)
>>
>>5735660
very accurate
>>
>>5665373
Does anyone know of any AI video generation tool that allows you to choose the first and last images of the video?
>>
>>5736659
That's more like an interpolator than anything.
It's very niche but the only tool I know off the top of my head is ToonCrafter.
https://github.com/Doubiiu/ToonCrafter
>>
>>5736659
just screen capture it in a regular video player.
>>
>>5736859
I already have the pictures I need. I just want the AI to generate a video in which the first frame is one of the images I provide it and the last frame is the other (or even just add one extra frame in between the pictures I provide it.)
>>
>>5735604
Got me good.
>>
>>5736659
Use pinokio for the installation if you're feeling lazy:
https://github.com/jy0205/Pyramid-Flow
>>
>>5737029 This one seems interesting. I'll check it out. Thanks.
>>
>>5737029
Thanks anon, I'm gonna check this out. I can't run cogvideo because I don't have enough vram, but this claims to have CPU offloading so it's a lot more friendly to my machine.

>>5736920
If you're looking for just one frame interpolation, that's what RIFE is for. There are many projects that use it if you just search for "rife interpolation". I don't recommend doing more than 1 interpolation frame. Claims of it being able to do 16x are exaggerated.
https://github.com/hzwer/ECCV2022-RIFE

There's also an interpolator called FILM that takes a lot more time.
https://github.com/google-research/frame-interpolation
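
To put the "16x" claim in perspective, here is a back-of-the-envelope sketch in Python. It doesn't call RIFE or FILM at all; the function names are made up, and it only counts how many frames end up synthetic for a given multiplier.

```python
def interpolated_frame_count(n_frames, factor):
    """Frames after inserting (factor - 1) new frames into every gap between originals."""
    if n_frames < 2 or factor < 1:
        return max(n_frames, 0)
    return (n_frames - 1) * factor + 1


def synthetic_share(n_frames, factor):
    """Fraction of the output frames that were generated rather than original."""
    total = interpolated_frame_count(n_frames, factor)
    return (total - n_frames) / total


# Example: a 5 second clip at 12 fps has 60 original frames.
for factor in (2, 4, 16):
    total = interpolated_frame_count(60, factor)
    print(factor, total, round(synthetic_share(60, factor), 3))
```

At 2x roughly half the frames are generated, while at 16x almost all of them are, which is one plausible reason results fall apart at high multipliers.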
>>
File: main_video.webm (5.4 MB, 1280x720)
>>5737244
Oh yeah there's also this
https://github.com/GSeanCDAT/GIMM-VFI
https://github.com/kijai/ComfyUI-GIMM-VFI
>>
Does anyone have the McChicken pasta?
>>
Input audio is cleaned up with DeepFilterNet and the Audacity plugin Acon Digital DeVerberate 3.
>>
File: pyramidflow-cat.mp4 (3.88 MB, 1920x1080)
>>5737029
Got the 384p version working on 16GB VRAM but apparently the 768p version needs 24GB+. 10 seconds also seems to be less reliable than 5 seconds. Overall it takes a lot less time and resources than Mochi >>5723754
Couldn't get the gradio demo to work but comfy is fine.
https://github.com/kijai/ComfyUI-PyramidFlowWrapper
>>
>>5737819
It sounds pretty good. For anyone looking to do audio cleanup with foss tools, ultimate vocal remover has de-echo, de-reverb, and de-noise models that work pretty well. I'm gonna have to look into DeepFilterNet, it looks useful.
>>
>>5697940
I think he's referring to the Canaanite woman asking Jesus to heal her daughter in Matthew 15:22-29 and Mark 7:25-30.
>>
File: Kanbaru_Duvet.webm (2.6 MB, 850x1105)
>>
>>5671641
HE'S ALIVE
>>
>>5739510
>>
>>
File: 1718491721735811.mp4 (2.96 MB, 720x1280)
>>
>>5665373
>>
>>5692049
LOL gold
>>
>>5739582
The song needs to be muffled as if it's coming from within a closed car.
>>
File: coronobat2.webm (3.57 MB, 1280x720)
>>
>>5667007
Kek. Someone should make him sing Komm Susser Tod now that we have the technology
>>
File: panko-x.webm (2.79 MB, 720x1080)
>>
File: panko-x-outtakes.webm (1.38 MB, 1200x600)
Outtakes because sometimes maskgct generates nonsense. It's the same prompt but different voice sample.
>>
File: AOkiss.webm (928 KB, 1088x720)
>>
you guys think you could generate a realistic video of food being digested through the body? X-ray ish or any way that makes the digestive system visible.
>>
>>5720254
>he said there were very fine people on both sides at Charlottesville
>he specifically said that didn't apply to the nazis
ah yes, all those non-nazi right wingers at unite the right.
>>
File: trump jump.webm (1.18 MB, 1280x720)
>>
File: 318417744386510851.mp4 (1.13 MB, 720x720)
>>
File: 318403242286841861.mp4 (638 KB, 720x896)
>>
File: 318406233840615424.mp4 (585 KB, 720x720)
>>
File: 318409539849740293.mp4 (680 KB, 720x1072)
>>
File: 1724840926135052.mp4 (4.94 MB, 412x732)
>>
>>
File: garth.webm (824 KB, 1280x720)
>>5745344
Anyone else find the soldier cats kinda hot?
>>
File: 1732778526403895.mp4 (3.83 MB, 560x640)
>>
>>5745344
>catghanistan was hell brother
>>
>>5665373
I cant get any of that fucking TTS shit to sound like the fuckers i upload. I have perfectly clean ripped voice of Elon and Drake but it always comes out sounding like some absolute onions faggot robot.
>>
>>5745675
have you tried STS?
>>
>>5745816
Show me a process of you making any ai sounding like the person you upload.
>>
File: 319137663893823493.mp4 (1.65 MB, 720x960)
>>
File: 319139284660977664.mp4 (1.54 MB, 720x960)
>>
>>5721254
Kawaii, prompt?
>>
File: eric-idle-demo.webm (2.61 MB, 1280x720)
>>5745675
You could try a zero shot TTS like maskgct, which mimics the cadence of the sample you put in. Traditional TTS usually runs a speech synthesizer that isn't really connected to the source voice. Speech-to-speech has the best potential, but also relies on your performance.
>>
>>5745675
Xtts
Seed-VC
F5 tts
MaskGCT

All you need is ~10 seconds of good, clean audio.
>>
>>5675156
>my mother has seen the tapes
lost hard
>>
>>5735890
Not AI generated.
>>
File: 1726107485424834.webm (679 KB, 1920x1080)
>>5739582
>>
>>5665377
>lost in the potions
holy shit im dead
>>
>>5741722
i've got great news for you: https://www.youtube.com/watch?v=u-bP_X5cgs8
>>
>>5688716
Anon. Don't cook us like this. What's the name?
>>
>>5692049
Why does his mouth glitch like that? Was that part of the original?
>>
>>5746690
Its old and has been out for months. NotebookLM.
>>
>>5746692
It's the lip sync software breaking down.
>>
>>5746510
>127.0.0.1
So i need to run a python web page server to try and run audio files? Seems fucking retarded, link where the install instructions are.
>>5746592
Show me how to install
Any
Of
Them
>>
>>5746930
https://pinokio.computer/

I assume you're a bit new to computer technology, so use this for an easy install. All you have to do is click install.
>>
>>5746930
It's the gradio demo, you strictly speaking don't have to run that and can infer from command line, but the gradio demo is easy to use.
If you don't know what you're doing, you're going to have a bad time with most cutting edge AI stuff, which is usually in the form of a poorly tested barely working demo.
Install tips:
>>5716024
>>5714793
>>
>>5717121
Where's the full song?
>>
>>5746937
And then?
>>
>>5746937
>>5746937
NTA, but I've checked out Pinokio now and then due to the ease with which it lets me prototype things. I've never used it thoroughly or consistently though because of this eerie vibe I get from the whole thing. It has the potential to be an absolute beast of software malevolence if it - and its plugins and scripts - isn't managed and verified correctly. Do you have any thoughts on this? Does it seem prevalent and transparent enough, you think?
>>
>>5747289
Pinokio is fine for beginners. There's nothing eerie about it as the installs they provide are mainly tested/controlled.

My main gripe with Pinokio is that as an experienced dev, I want more control over where files and folders go, which software versions get installed, etc. That's 100% irrelevant to the actual user experience, so it has no bearing on people trying to use a specific piece of software.
>>
>>5747303
Yeah, I actually had the exact same grievance. I want at least the option to control/adjust stuff like that.
I just get the feeling that it doesn't have a user base large enough yet to reliably depend on community control and oversight. Thanks. I think I'll tinker with it a bit again.


