/g/ - Technology

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>106491545 & >>106481874

►News
>(09/05) Klear-46B-A2.5B released: https://hf.co/collections/Kwai-Klear/klear10-68ba61398a0a4eb392ec6ab1
>(09/04) Kimi K2 update for agentic coding and 256K context: https://hf.co/moonshotai/Kimi-K2-Instruct-0905
>(09/04) Tencent's HunyuanWorld-Voyager for virtual world generation: https://hf.co/tencent/HunyuanWorld-Voyager
>(09/04) Google released a Gemma embedding model: https://hf.co/google/embeddinggemma-300m
>(09/04) Chatterbox added better multilingual support: https://hf.co/ResembleAI/chatterbox

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: 1709421810542.png (124 KB, 654x779)
►Recent Highlights from the Previous Thread: >>106491545

--Klear-46B model training methodology and benchmark performance analysis:
>106492824 >106492846 >106492855 >106492872 >106492877 >106492882 >106492885 >106492903 >106493017 >106493058 >106493088
--AI-generated loli podcast creation using VV voice cloning and GLM text generation:
>106495961 >106495966 >106496018 >106496034 >106496055 >106496061 >106496121 >106496139 >106496144 >106496157 >106496189 >106496197 >106496208
--German supercomputing expansion and copyright law compliance challenges:
>106493305 >106493329 >106493355 >106493378 >106493423 >106493481 >106494001 >106493977 >106493529
--Qwen Max model updates and community collaboration efforts:
>106491646 >106492302 >106492366 >106492394 >106492411 >106492421 >106492430 >106492428
--Balancing data quality and diversity in machine learning training:
>106492910 >106492929
--VibeVoice-Large's capabilities and controversy:
>106494251 >106494424 >106494708 >106494778 >106495648 >106494801 >106494950 >106495166 >106495298 >106495187 >106495273 >106495566 >106495590 >106495612 >106495637 >106495639 >106495671 >106495689 >106495101
--Challenges with managing R1's censorship and card-based context switching:
>106493572 >106495514
--Temperature settings tradeoff between tool call accuracy and answer quality in local LLMs:
>106491720 >106491751 >106491761 >106491845 >106491888 >106491989
--Gender bias in doctor riddle from Qwen3-Max-Preview:
>106493573 >106494565 >106494593 >106496265
--Qwen3-Max-Preview (Instruct) outperforms peers in benchmark tests:
>106492622 >106492630 >106492638
--VibeVoice model optimization challenges for single-voice applications:
>106496609 >106496636 >106496646
--Analyzing Qwen3 Max's distinctive generation style:
>106493524
--Miku (free space):
>106493154 >106493503 >106493190 >106494251 >106497578

►Recent Highlight Posts from the Previous Thread: >>106491549

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
Mikulove
>>
File: 6000.png (141 KB, 600x522)
You have them, right?
>>
>>106497597
geez Peter, TWO mics?
>>
>A loli whispers in my ear
THANK YOU MICROSOFT
AHAHAHAHA
>>
I hate microsoft.
Xi please release the same or better model.
Please your parade was very impressive.
>>
>>106497909
the only good chinese model is wan
>>
>>106497916
>the only good chinese model is wan
The only good local model is wan? are you baiting me?
>>
>>106497909
here's your chinese tts bro
https://www.youtube.com/watch?v=mnfLp9O96ak
>>
>>106497927
model, deepseek / kimi are nowhere near claude / gpt / gemini
>>
where do I get good voice clone clips
>>
>>106497909
The model is from MSRA in Beijing with a full Chinese team
It's by all means a Chinese model
>>
what are some absolutely necessary loli voices I should be cloning right now?
>>
>>106498009
Aya Hirano
>>
holy shit there are no good online tools for making multiple cuts to an mp3 file lmfao
>>
Where can I get the removed VibeVoice large?
>>
>>106498156
click it then press delete, then go to your recycle-bin and empty it
>>
>>106497992
youtube
>>
File: 1745161289919011.jpg (2.45 MB, 2880x2504)
>>106498148
ask your favorite llm about ffmpeg
>>
Looking for suggestions for an uncensored lite local model for use on my phone. Purely informational.
>>
>>106498210
this, gpt5's codex / claude code automate so much stuff for me, I just ask it to make some script to do something and it takes like a minute
>>
>>106498210
nah, ffmpeg does a lot, but if you need to cut up an audio file a whole bunch of times to remove voices from other characters or sounds, you really need a gui to plan the cuts and not type a billion things into a terminal constantly. found some site called soundtrap that can do what I want. if I get into this enough then I'll just download audacity or something
>>
>>106497992
just make your own in audacity. Most good TTS only want 20-30 seconds, so focus on quality above all. The audio should be clean with no noise. This is where most people fuck up: the voice they are trying to clone doesn't have good audio sources (music, sound effects, static, background noises, traffic etc).

People do share models but you'll eternally be using taylor swift, peter griffin, etc.
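The 20-30 second / clean-audio advice is easy to sanity-check before you feed a clip to a cloner. A minimal stdlib sketch; the 16-bit mono PCM assumption and the thresholds are mine, not anything VibeVoice documents:

```python
# Sanity-check a TTS reference clip: duration in the 20-30 s sweet spot and
# peak level well below full scale (a peak pinned near 1.0 hints at clipping).
# Assumes a 16-bit mono PCM WAV; thresholds are arbitrary, not from any TTS docs.
import array
import wave

def check_clip(path, min_s=20.0, max_s=30.0):
    with wave.open(path, "rb") as w:
        frames = w.getnframes()
        rate = w.getframerate()
        samples = array.array("h", w.readframes(frames))  # 16-bit signed PCM
    seconds = frames / rate
    peak = max(abs(s) for s in samples) / 32768.0
    return {
        "seconds": round(seconds, 2),
        "peak": round(peak, 3),
        "length_ok": min_s <= seconds <= max_s,
    }
```

A clip that fails the length check or sits at peak 1.0 is usually worth re-cutting rather than fighting the model over.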
>>
>>106498228
tell it to make you a tool for doing that. GPT5 can one shot it
>>
File: 1742096286363791.jpg (512 KB, 2058x2562)
>>106497992
>Pirate TV show/movie
>Extract audio with ffmpeg
>Trim down to bits you need
>?????
>Profit
>>
Is there a local version of nano banana anyone has made? the ones I've seen on hugging face went down quickly
>>
>>106498265
nano banana is not a local model, its a google model
>>
File: shitty website.png (46 KB, 1992x505)
>>106498240
yeah I figured
>>106498251
am I missing something. is ffmpeg easier to use than I thought for something like this?
>>
>>106498148
>>106498210
>>106498226
kek i remember some time ago when i was cutting up the audio for some other tts, i was too lazy to install kdenlive or some other shit so i asked deepseek for ffmpeg. idk what happened but shit didn't work (think i installed it wrong or something) so i just asked it to instead make a powershell script which just worked lol XD literally just put the mp3/mp4 in the folder, give it which second to cut from and to, and it does it. fucking awesome how jank you can get with llms, it's really a lot of fun
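That kind of throwaway cut script is a few lines of Python too. A sketch that just builds one ffmpeg command per (start, end) pair; the file names and times below are made-up examples, and actually running the commands assumes ffmpeg is on your PATH:

```python
# Build (and optionally run) one ffmpeg cut per (start, end) segment.
# "-c copy" stream-copies instead of re-encoding, so it's fast, but cuts
# snap to frame boundaries; drop it if you need sample-exact cuts.
import subprocess

def cut_commands(src, segments, out_prefix="cut"):
    cmds = []
    for i, (start, end) in enumerate(segments):
        cmds.append([
            "ffmpeg", "-i", src,
            "-ss", str(start), "-to", str(end),
            "-c", "copy",
            f"{out_prefix}_{i:02d}.mp3",
        ])
    return cmds

def run_cuts(src, segments, out_prefix="cut"):
    for cmd in cut_commands(src, segments, out_prefix):
        subprocess.run(cmd, check=True)

# e.g. run_cuts("episode.mp3", [(12.5, 30), (95, 110)])
```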
>>
>>106498271
Actually using audacity would probably be easier. I'm so used to the CLI interface that I sometimes forget guis exist.
>>
>>106498271
ffmpeg is for nerds who love command lines. If you want usable stuff, use audacity, or maybe even da vinci resolve which will do audio fine for free.
>>
>>106498265
the best image model out right now that can be run on your gaming PC is qwen image, you can run it if you have 16gb of vram.

here is a guide from a man who is definitely not a pedo
https://www.youtube.com/watch?v=0yB_F-NIzkc&t=303s
>>
>>106498269
thank you anon, that's disappointing. is there anything really comparable i can use locally?
>>
>>106498319
sorry, didn't see this reply. lol this guy looks sus
just want a good model to edit wallpapers with
>>
>>106498334
qwen image / qwen edit?
>>
Sonoma Sky/Dusk Alpha are likely the next LLaMA or a new Meta line of models (possibly proprietary)
>>
>>106498390
lol no, it's grok, just ask it, and it's shit
>>
new kimi is great btw, like actually better than sonnet imo
>>
hmm it gets pretty schizo at 1.3. tried 1.4 and higher but I dunno.
https://vocaroo.com/1dlL1nEjQeny
said voice clone file
https://vocaroo.com/1orutFZaUpJb
>>
>>106498428
10 steps are far too few, try like 50
>>
>>106498412
It's shit. I accidentally used v3.1 instead of the new kimi for one of my tests and I actually had a much better time with that before I noticed.
>>
File: .png (393 KB, 1468x578)
Can anyone give me an estimate of how many t/s I could get with pic related on a 5090?
If 3090 + 96 GB + SSD can run R1 at .88t/s how much of an increase would it be with 512 GB of DDR5 over 5.0 x16 + 96 GB of ddr5 + 32GB of vram?
>>
I spent 3 hours looking at comfy ui and all this crap because you told me it was easier on us vramlets and I finally got the comfy-UI VibeVoice thingy running and when I try to generate I get stuck on this
>Downloading VibeVoice model: VibeVoice-Large...
>Fetching 17 files: 0%| | 0/17 [00:00<?, ?it/s]
an hour later still stuck there, I force stop comfyUi and restart it and it still gets stuck on that
>>
>>106498668
>ddr5
so the same speed as using it on regular ddr5?
>>
>>106498668
It's going over PCI-E 5.0 X16 so the hypothetical maximum bandwidth going over that connection is 128GB/s
>>
>>106498676
Download from modelscope into ComfyUI/models/tts/VibeVoice-Large.
>>
Redpill me on nanobanana
>>
>>106498704
>ComfyUI/models/tts/VibeVoice-Large.
*ComfyUI/models/tts/vibevoice/VibeVoice-Large
>>
>>106498707
SOTA but it's likely actually genie3 creating a virtual reality where the prompt is real and taking a virtual photo of that
>>
>>106498319
do you think it can be done with 12gb?
>>
>>106498721
That sounds incredibly convoluted for what's essentially Photoshop: Gemini Version but it's cool how it works
>>
>>106498668
I dunno, r1 was kinda hard to run and I haven't tried it since jan cuz I hate its writing style.

I have a build of 5090, 2x 5060's for 64vram/160 ddr5 (4000mhz). On linux that got me 5 tokens a second on 4k context full glm q4 with some mmap and maybe using 48gb of vram (hard to balance MoE layers, I suck). Presumably if I bought a proper 256gb ddr5 (6000 mhz) kit, I'd be getting more tokens per second, maybe 8 or so even with 8k context I wanna say.

That's a 200gb model. 400gb deepseek is gonna cut shit in half unless you invest in tons of vram
>>
how much potable water is being drunk because of this.
how many forests are being burned because of this technology.

People accepted computers because their energy output is low.
Now that is gone.
>>
>>106498787
almost nothing, and water is not destroyed, that is not a thing, it just condenses back into water after being turned to steam
>>
>>106498787
if burning electricity for cars is considered green, then burning electricity for ai is even cleaner (and doesn't cause tonnes of rubber and plastic contamination from tires)
>>
>>106498787
dying of thirst
computers drank it all
me go too far
>>
>>106498787
>People accepted computers because their energy output is low.
No, they accepted them because the utility is high. Now it's even higher.
>>
when are we going to get tts.cpp and vibevoice GOOFS
fuck this python nonsense, couldn't be bothered to set any of it up all over again for every new shitty web UI and whatever that gets released
>>
>tfw mom is mad at you again for using up all the house water to talk to the ai
>>
>>106498787
leftist detected
>>
>>106498787
I set fire to the amazon (both the river and the rainforest) just to ahh ahh mistress, and I'd do it again.
>>
>>106498787
You have identified the issue but not the cause. We have water shortages because people reproduce endlessly until we reach a breaking point. The main use of water is to grow FOOD.

Most electric plants and datacenters consume a lot of water, but that water is cycled through the plant and returned to the environment shortly after, making their numbers on a graph look high while actual consumption is very low compared to other uses.

They do use a lot of power though. They need to chill out a bit on large training runs and pointlessly making tiny improvements.
>>
>>106498428
Where can I get the model?
>>
>>106498787
i’d burn the entire amazon if it meant i get to rp with my robot loli
>>
>>106498959
https://www.modelscope.cn/models/microsoft/VibeVoice-Large/files
>>
>>106498704
>>106498676
ok I got it from the torrent in the last thread, took a whole friggin hour to download it
Honestly yeah I see a pretty massive improvement: previously it took me 3 minutes to generate 15 seconds of speech with the Large model, now I generated 40 seconds of speech in 80 seconds
MASSIVE improvement
>>
Here's the reason why VibeVoice large was taken down: https://voca.ro/1bCzVodtGtHZ

(had to use a voice clone of porn moaning to get it reliably to do this. The base clip sounded this fake and inauthentic too so maybe someone can do better)
>>
Fucked up how a picture is worth a thousand words yet LLMs are way more resource intensive than diffusion models
>>
>>106499113
I feel unsafe right now. Like, my whole life is in danger.
>>
>>106499132
Now I'm basically raping you : https://voca.ro/15bXrL5GeAS9
>>
>>106498787
Energy consumption is how civilisation advances
>>
File: 1286404019585.png (12 KB, 468x425)
>>106499149
Take care when using the EXCLAMATION MARK(!) IN VIBEVOICE! I find it hilarious when it instantly goes to 11 with mic clipping and distortion
>>
>>106499173
Example: https://vocaroo.com/127ZooPcK7mj
>>
Don't you dare!!
>>
>>106499113
can it do japanese?
>>
>>106499220
Yeah real. English sucks for sex. Though I guess it could be worse.
>>
>>106497859
It's the Chinese Family Guy knockoff.
>>
>>106499189
Kek you weren't kidding, that really went apeshit.
Was that just an exclamation mark or was it allcaps too?
>>
>>106499018
Some of the results I've been doing. I stole that degenerate's Gwen voice >>106498428 and just ran it through an AI voice cleaner, since all those cartoon sound effects and background noise ruin the sample,
and a violet sample I had prepared before
https://vocaroo.com/1llO81h1n7kR
also you need to find more even-keeled samples, that sample will only produce a hopped up angry yelling Gwen
also from what I've seen the sample options mostly produce garbage, turn it off. the steps seem to be fine at 30 at most; I didn't see massive improvement beyond that point and it only slows down with diminishing returns
>>
>>106499220
No. I put in some jav moans as the model. Even I can tell it's bad. I generated this several times and I never got the same passion or breathiness, grunts etc that the English voice could. https://voca.ro/1aruRYcd92sp

>>106499364
It contextually just sorta figures it out. There's no prompting or anything, but you can say "I'm gonna sing a sad song about etc" and it will try to do it, kind of. Voice models seem to help push it in various directions too. I bet it could sing better if I just put in a song.
>>
>>106499389
Which one?
>>
>>106499415
>fakingu yes
>>
>>106499415
Jeesas
How about Chinese?
>>
>>106499425
which one what?
>>
>>106499415
>https://voca.ro/1aruRYcd92sp
lmao, that's funny though
>>
>>106499442
AI voice cleaner.
>>
>>106492238
>>106497310
If you're using AMD, be aware that the default for --flash-attn is now "auto", which means to enable it if the backend supports it.
On master the AMD FlashAttention performance can be quite bad though, so try "-fa off" and re-try after https://github.com/ggml-org/llama.cpp/pull/15769 has been merged (if you have an old AMD GPU).

>>106498834
bark.cpp https://github.com/PABannier/bark.cpp already exists though the last commit was 10 months ago.
>>
>>106499448
cleanvoice, literally the first one I found while googling lol
You have to create an account and it has limited uses. you know what, we're in /lmg/
someone please point me to the best local voice cleaner model please
>>
>>106497597
>https://www.theverge.com/anthropic/773087/anthropic-to-pay-1-5-billion-to-authors-in-landmark-ai-settlement
>Anthropic to pay $1.5 billion to authors in landmark AI settlement
>$3000 per book
Pack it up boys, it's over.
>>
>>106499449
>bark.cpp https://github.com/PABannier/bark.cpp already exists though the last commit was 10 months ago.
That's bark model specific though, and I think VV will be more difficult to implement support for since it's actually a diffusion model + a Qwen LLM.
>>
>>106499477
seeing comments cheering it i think people deserve the humiliation ritual that is the modern world
>>
>>106499477
they should be releasing claude 1.2 instead
>>
File: 1754165881938544.png (352 KB, 640x480)
>>106499488
>seeing comments cheering it i think people deserve the humiliation ritual that is the modern world
this, why the fuck do they want to make their own jail, humanity was a mistake
>>
>>106499477
hey where's my 3,000 dollars? I've been typing bullshit onto the internet for years. When someone tells the ai not to act like an uninformed angry idiot, that's MY DATA they're using.
>>
>>106499477
looool
>>
>>106499121
That's... the natural implication of that phrase
You need 1000x the resources to generate the words for 1 picture
>>
>>106499477
holy fuck dude, this is actually horrible, the US really wants to lose the AI race to the chinks or what?
>>
>>106499497
Humanity can accomplish amazing things, it's humans that are the problem. Once you realize that anything under 120 IQ can barely be considered sapient you'll know universal suffrage and internet access were a mistake.
>>
>>106499477
>1% of the company's worth
oh no
>>
>>106499497
The market is a thing that allows me to buy things. But when it goes away i probably wont need it.
>>
>>106499518
we have invested hundreds of billions on ai and hundreds of billions more on hardware to run it. We have invested 50 times more on ai than on nuclear fusion.

That settlement is token shit to say we did the right thing. And you are correct in thinking that if we actually acted with integrity and morality, other countries would surge ahead of us as we shot ourselves in the foot. If you think this tiny crumb is gonna slow us down you're kind of dumb. If anything it shows the worst that could happen and emboldens lawbreaking as a known expense. A slap on the wrist is the worst that can happen.
>>
>>106499477
That's Anthropic's problem. Should have given Orange Jew a few appeasement gifts.

>>106499518
They have multiple groups of jews infighting for money while chinks can act as one. They don't care so much about the long term; as long as it's instantly profitable it's okay. The market will fix it. EU on the other hand put a safety IoT cock cage on and is begging to be dommed by both.
>>
>>106499449
>bark.cpp https://github.com/PABannier/bark.cpp already exists
Nice, thanks.
I also saw that koboldcpp added support for something called TTS.cpp although in my (very limited) experience it's really slow on PC and the developer seems to be a macfag because that's the primary platform.
>>
>>106499574
>If you think this tiny crumb is gonna slow us down you're kind of dumb.
I don't think you realize how serious this is. All emerging companies will need billions of dollars to obtain the data necessary to train their models. This will destroy everything; only large companies will be able to afford it. The US killed itself in that race, they didn't just shoot themselves in the foot this time.
>>
>>106499521
>Once you realize that anything under 120 IQ can barely be considered sapient
fact, and I say this because I have 121 IQ kek
>>
Has anyone tried VV with some Japanese dlsite voice works? I'm curious how it would handle going from Japanese to English.
>>
>>106499521
Most humans that report 120+ IQ are benchmaxxed.
>>
>>106499521
>Once you realize that anything under 120 IQ can barely be considered sapient you'll know universal sufferage and internet access was a mistake
I felt that way after seeing how few gamers and reviewers mention how utterly broken the AI is in the new shinobi (where you can stand next to many enemies and not ever take a single bit of damage)
people are worthless
>>
>>106498702
4t/s for GLM-4.5-FP8...
>>
>>106499521
>Once you realize that anything under 120 IQ can barely be considered sapient you'll know universal sufferage and internet access was a mistake.
and the average IQ will go down and down due to the fact that africans are the only ones making a shit ton of babies, this world is fucked, I pity the future generations
>>
>>106499415
>https://voca.ro/1aruRYcd92sp
Kek that's actually not bad though, it just seems to have lost the context of what it's doing.
>>
>>106499735
I mean, it is a 5x increase, but on the other hand, 3000 USD + your own RDIMMs is get-another-GPU territory.
>>
>>106499121
LLMs need a much better world model than image models. If you fuck up just one word, it can completely break a sentence or turn it into nonsense, but nobody cares if some blurry background detail on an image is a bit deformed. Or even some foreground details in many cases.
>>
>>106499702
I guess this answers my question >>106499415
>>
File: 1755910635224541.png (721 KB, 832x1127)
>Kimi turned out to be censored
>Deepseek is still autistic
>GLM wasn't much of anything
What's /g/ using for ERP these days after the rose-colored glasses of that 'new model prose' have worn off?
>>
>>106499780
R1
>>
>>106499780
Literally nothing
>>
So does vibevoice have stuff like [laugh] or it's words only?
>>
File: gwen tentaclous.jpg (912 KB, 3238x1366)
>>106499389
Ok enough fun with this thing for today, I'm really impressed by the inflections and effects that it gives the scripts, really surprising model all around
https://vocaroo.com/19GSroXyQYlT
>>
>>106499831
>really surprising model all around
that's why they wanted to shut the model down, it's too good for local
>>
>>106499831
>and don't get me started
Did an LLM write this script?
>>
>>106499863
no but my imagination is pretty stunted anone-kun
Right now I'm just trying to come up with funny throwaway scripts to test this sheez
>>
>>106499875
This would be more of a storytime than an RP but here is some human slop I wrote for the Open Assistant dataset:

>In the land of South Korea K-pop used to reign supreme. Anyone listening to other genres of music was ridiculed and even openly discriminated against. But no one had it as bad as the fans of Japanese idols. Gangs of K-pop mutant mecha warriors roamed the streets and when they found an idol fan they would be lucky to get away with just a beating. Their favorite thing to do with idol fans was to use their built-in speakers to blast K-pop at such volumes that it would rupture the idol fans' eardrums so they would never be able to hear the voice of their favorite idols again. Sometimes they would switch it up by spewing acid from their mutant bodies for the same effect.

>A lone blacksmith knew that the K-pop warriors would be coming for him next. He had made a small figurine of a vaguely humanoid monster with sharp claws and pointed horns. With all of his heart he prayed to Hatsune Miku, begging her to bring his statue to life so that it may protect idol fans from their terrible persecution - and his prayer was answered. Hatsune Miku descended from the heavens and with her divine powers she brought the statue to life. She named the monster Pulgasari, the eater of iron and steel.

>And Pulgasari did indeed become stronger and bigger as he consumed more and more metal. To grant him even bigger powers Hatsune Miku brought the radioactive remains of the Fukushima reactor core to Korea so that Pulgasari may feast on them. And as the radiation entered Pulgasari's body he began to mutate, growing stronger and stronger by the second. The blacksmith knew that with Pulgasari on their side the time for rebellion had come and so he rallied his fellow idol fans to finally rise up en masse.
>>
File: 8LhyPbK.gif (465 KB, 500x281)
>>106499189
>you- FUCKING NIGGER!
>>
>>106499113
that's bredd good actually :-D
>>
>>106499875
>>106499907
>It wasn't long until the K-pop warriors realized that something was wrong: a giant, radioactive monster was marching towards their headquarters and thousands of rebel idol fans were following it. Thanks to their mechanical bodies the K-pop warriors were able to quickly concentrate their forces and a battle of epic proportions ensued. The K-pop warriors reasoned that they would only need to take down Pulgasari and their victory would be assured, but their strategy ended up backfiring. With each felled mecha warrior that Pulgasari consumed his wounds would close and he emerged even stronger than he had been before. Eventually the K-pop warriors realized their mistake but it was too late; Pulgasari had killed too many of them and they were being overrun.

>The battle ended with a crushing defeat for the K-pop warriors and their headquarters was occupied by the idol fans. But Pulgasari's hunger for metal did not stop. He continued to feast on the corpses of the defeated mecha warriors and then went on to eat any kind of metal he could find. Hatsune Miku, realizing that Pulgasari's hunger would never be satisfied, quickly hid herself in a bell just as Pulgasari was eating it. Pulgasari devoured her and as he realized what he had done he turned to stone while letting out a heart-wrenching scream. Touched by Hatsune Miku's heroic sacrifice the fans of different music genres established an uneasy peace. Whether this peace would last only time would tell, but if the menace of K-pop mutant mecha warriors were to ever rear its ugly head again, then Pulgasari would be back to stop them.
>>
>>106499149
holy shit man she needs to calm down
>>
>>106499780
giantess woman.
her ass is your new home.
>>
>>106496501
>>106496504
Thank you, absolute legends. Now it not only doesn't OOM, but works faster in some scenarios where it wasn't OOMing.
>>
>>106499839
Is it still available somewhere?
>>
>>106499967
no
>>
>>106499967
yes
>>
>>106499967
maybe
>>
>>106499967
https://www.modelscope.cn/organization/microsoft
>>
>>106499907
>>106499916
damn anon, that's a lot of shit
took 12 whole minutes to generate that, be sure to listen to the end :)
https://vocaroo.com/12ef4CDQg9pZ
I made clones of Sarah and Ellie from tlou and Violet from the incredibles, I especially like how you can hear paper shuffling at some points and Sarah flubs a line once
>>
>>106499916
>>106500081
I just noticed it was your script that flubbed the line, but it generated as a genuine mistake of someone reading too fast, incredible
>>
>>106500081
Cool, thank you.
The intonation is still off for e.g. "Hatsune Miku" or more generally for emphasizing the intended emotions of the story but for something that is machine-generated this is very impressive.
If someone were to leave me a voicemail using this I don't think I could reliably tell that it's not a human.

>>106500089
Yeah, I wrote this at like 1 am.
>>
Now that the wave has reached the plateau of XXXB/30~50A MoE models, how are we going to run the next upcoming 70b~100b active parameter SOTA MoEs? Even CPUMAXXing and Macs start being slow as shit at those active parameter sizes.
>>
File: four arms.png (2.2 MB, 2120x1416)
>>
>>106500288
the trend is towards lower active param count, not higher
>>
>>106500288
>Now that he wave reached the plateau of XXXB/30~50A MoE models
We still haven't hit that, biggest niggers on the block are <40BA, and trending downward.
>>
>>106499814
words only. You can type haha and it will kind of do an actual laugh but I couldn't get it to do more than that. Maybe if you put laughing in the voice clone... I didn't try

https://voca.ro/1iU4VFpN4gXK
>>
>>106500404
What voice did you sample to get this croaking harlot?
>>
>>106500288
- better quants
- different experts quanted differently
- wait for amd's giant multi-channel apus
>>
File: 1755058637386232.jpg (234 KB, 998x1321)
I'm thinking of getting a 5060 16gb later this year, but I'm worried about a price hike. I'm using a dinosaur 2060.
It looks like a good, enduring buy. Even that nip blog says it's a very good entry-level card.
It's a shame AAA gaming is so shitty these days that the only thing you'd want a good card for is 'playing' with AI.
>>
>>106500501
simply don't play AAA and mod nightmarishly inefficient graphics enhancements into other games
i'm looking at a 5070 ti myself and cringing at the $
>>
>>106500404
I plugged a clip with other sound effects and it kept using them but also repeating big chunks
>>
>>106500501
If you care about ai the only things you might consider are the 5070 ti super with 24gb vram (750-1000) or the intel b60 dual (48gb, $1200) or b60 single (24gb, $500) that will come out in the next 6 months or so. You won't regret the 5060 though. Great compatibility and there are ways to run qwen, wan, and shitty llm's on it (and it can run glm air 100b if you buy 96gb ram too). Plenty of fun stuff to do, and while 16gb kinda sucks, this is also the best bang for buck to get into ai as a fun lil hobby.

What you're paying for with nvidia is compatibility. If you go the intel route, plan on running old ai and primarily having it for llm's, with image, tts, video, etc having spotty support or being nonexistent.

Also, it could be a year before buying these cards is actually viable... no one knows what's going on. Also ignore all reviews saying the 5060 sucks. The one thing it hugely improves on is ai performance, where it easily doubles previous generations.

>>106500441
[spoiler]Isabella valentine, sissy secretary or airhead university work great for femdom smut[/spoiler]
>>
>>106500604
make sure it's less than 30 seconds. It bugs out on long audio
>>
>>106500623
It is less than that. Might be because the prompt was similar to what was said in the audio or because I was testing with 3 or 5 steps and high cfg.
>>
>>106499958
nta

enjoy your daily gooning time
>>
>>106500615
NTA, but fuck, I knew I knew that voice...
>>
>>106499967
be quick

model:
https://huggingface.co/sheliak/VibeVoice-Large_Mirror/tree/main

github:
https://github.com/great-wind/MicroSoft_VibeVoice

or Comfy-UI solutions
>>
>>106500501
>5060
>It looks like a good, enduring buy.

Anon, I...
>>
>>106499780
Pyg
>>
>>106500676
I can't even get a 5070ti if I want to. They only have 16gb variants where I live.
>>
VibeVoice is only EN and CN, right?
>>
>>106500752
Japanese works even though they said it doesn't support it. Not sure about other languages
>>
>>106499415
this is kino
>>
>>106500670
which one should I get?
>>
There's a 0.5B version for streaming? Too bad they'll never release it
>>
>>106499449
Hey, I did get a small tks boost with -fa off!
>>
>>106499415
>kami-sama
lul
>>
>>106500911
With how bad the 1.5B is, you don't need a 0.5B noise generator
>>
>VibeVoice
I have no use for voice cloning myself, but after a little searching I found this: https://github.com/wildminder/ComfyUI-VibeVoice which claims to support quantization; Q4 or Q8 should be manageable for the 7B even on potato hardware.
>>
>>106499415
>jav moans
dlsite has plenty of "ASMR" content with free samples, you could try using one of those.
>>
>>106500928
I think the machine-translated sex talk was probably part of the reason it didn't work that well.
>>
>>106500879

if you are not familiar with Comfy-UI, take this:
https://github.com/great-wind/MicroSoft_VibeVoice

This worked for me:
# create and atcivate VENC to isolate your installation
conda create -n vibevoice python=3.12
conda activate vibevoice

# REF: installation instruction from github page
git clone https://github.com/great-wind/MicroSoft_VibeVoice.git
cd MicroSoft_VibeVoice/
pip install -e .
# flash attention can take long time to install
pip install flash-attn --no-build-isolation

# run with gradio interface after you downloaded the model to Microsoft_VibeVoice-Large
# from https://huggingface.co/sheliak/VibeVoice-Large_Mirror/tree/main
# you'll need all JSON files as well
# provide correct path instead of /path/to/model/folder/Microsoft_VibeVoice-Large/
python demo/gradio_demo.py --model_path /path/to/model/folder/Microsoft_VibeVoice-Large/

# or you can check out the smaller VibeVoice-1.5b
# from https://huggingface.co/microsoft/VibeVoice-1.5B/tree/main
# you'll need all JSON files as well
# provide correct path instead of /path/to/model/folder/Microsoft_VibeVoice-1.5b/
python demo/gradio_demo.py --model_path /path/to/model/folder/Microsoft_VibeVoice-1.5b/
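If you'd rather script the model download than click through the browser, something like this should work (using huggingface-cli is my assumption, not from the post):

```shell
pip install -U "huggingface_hub[cli]"
# grabs the safetensors shards plus all the JSON config files in one go
huggingface-cli download sheliak/VibeVoice-Large_Mirror \
  --local-dir ./Microsoft_VibeVoice-Large
```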
>>
>>106501145
>atcivate VENC

(me) lol
>>
>>106501145
btw, docker did not work for me 'cause I'm retarded

Anyway, all this implies you already have Nvidia CUDA stuff installed
>>
>>106501145
Holy slop, please.
>>
I've been looking to build a multi-GPU setup, but I'm a bit stuck on what motherboard + CPU combo I should be looking for.
I'm looking for something ATX or E-ATX that'll take DDR5 memory and has multiple full-bandwidth PCIe slots that are at least PCIe 4.0.
Any suggestions?
>>
>>106501160
If you can't solve this issue on your own... You don't really want to build anything. Or perhaps you should learn some technical skills first?
>>
>>106501145
I meant the safetensor, I was just downloading it since I read they were trying to delete it. I like hoarding stuff. The rest is too complicated for me :^)
>>
>>106501160
Get any motherboard that supports x8/x8 mode for 2 GPUs. It's nothing special.
>>
>>106501169
this, /g/ isn't the right place for such questions
>>
>>106497597
embedded 'p?
>>
>>106501172

Large (7b) >>>> 1.5b
>>
>>106501160
>I've been looking to build a multi GPU setup

useless unless you want to run two separate LLM instances

NUMA is slow as shit
>>
https://vocaroo.com/14QgXnYa9n9R
Not bad.
>>
>>106501178
The issue with that is that I'm looking at a triple-GPU setup, so I figured I'd get a workstation/HEDT motherboard that can properly support it.
Plus, motherboards that do x8/x8 mode are already $500+, so why consider those over a more dedicated platform?
>>106501239
How slow are we talking here? I can accept a 20% penalty, but if we're talking 50%+ below full bandwidth, then I need to know.
Also, isn't every AI server in existence using multi-GPU setups?
>>
>>106501160
For my expensive multi-GPU machine I bought this (octa-channel DDR4, 7x PCIe 4.0 x16): https://www.asrockrack.com/general/productdetail.asp?Model=ROMED8-2T#Specifications
For my cheap machine I bought a second-hand Xeon system off eBay for 300€ (64 GB of quad-channel DDR4, x16/x8/x8/x8 PCIe 3.0).
For DDR5 there are, I think, no cheap options, and keep in mind that as of right now there is no inference code available that is well optimized for NUMA systems.
>>
>>106501257
>>106501290
>no inference code available that is well-optimized for NUMA systems.
What I meant is that there is no well-optimized code for CPU inference.
If the GPUs are on different NUMA nodes you also get more latency for data transfers unless you use something like NVLink.
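Two quick ways to see what you're dealing with on a given box (assumes numactl and the NVIDIA driver are installed):

```shell
numactl --hardware   # NUMA node count, RAM per node, inter-node distances
nvidia-smi topo -m   # which CPU/NUMA node each GPU hangs off, plus GPU-to-GPU link types
```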
>>
>>106501257
>How slow are we talking here?
I have an HP Z840 with two Xeons and 512+512 GB DDR4 memory

I get a maximum of 4 t/s with DeepSeek-R1-0528-Q2_K_L and --cpu-moe if:
- the model is cached entirely in NUMA node 0
- llama-cli is run on CPU0
- --threads matches the number of PHYSICAL cores of that single CPU0

You can run two LLM instances on the two CPUs if they are kept physically separate in NUMA

As you can see, I have to isolate the memory and the cores to get the maximum.
All my attempts to get a bust by using the second CPU only slowed things down considerably.
If the model does not fit entirely in a single NUMA node, it sucks big time too
# Run the command
CUDA_VISIBLE_DEVICES="0," \
numactl --physcpubind=8-15 --membind=1 \
"$HOME/LLAMA_CPP/$commit/llama.cpp/build/bin/llama-cli" \
--model "$model" $model_parameters \
--threads 8 \
--ctx-size $cxt_size \
--cache-type-k q4_0 \
--flash-attn \
--n-gpu-layers 99 \
--no-warmup \
--batch-size 8192 \
--ubatch-size 2048 \
--threads-batch 8 \
--jinja \
$log_option \
--prompt-cache "$cache_file" \
--file "$tmp_file" \
--cpu-moe
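A small helper for picking the --threads value; it assumes the Linux lscpu output format, so verify on your own box:

```shell
# physical cores on one socket = what --threads should match
CORES_PER_SOCKET=$(lscpu | awk -F: '/^Core\(s\) per socket/ {gsub(/ /,"",$2); print $2}')
echo "use --threads $CORES_PER_SOCKET"
```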
>>
>>106501342
>All my attempts to get a bust
Heh.
>>
>>106501342
this is how I cache the model into a specific NUMA node on system start

model_dir="/path/to/model/Kimi-K2-Instruct-UD-Q2_K_XL"

#echo "Pre-caching Kimi-K2-Instruct-UD-Q2_K_XL"
for i in 1 2 3 4 5 6 7 8; do
    numactl --cpunodebind=0 --membind=0 \
        dd if="$model_dir/Kimi-K2-Instruct-UD-Q2_K_XL-0000${i}-of-00008.gguf" of=/dev/null bs=1M
done


This way, I can re-run an LLM in 20 seconds
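You can check that the pages actually landed on node 0 with numastat (ships alongside numactl); the grep pattern is just my guess at the interesting rows:

```shell
# per-node page-cache usage; the cached model should show up as FilePages on node 0
numastat -m | grep -E 'Node|FilePages'
```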
>>
hey frens, is there a rentry for text-to-3D?
I want to look into the tech that makes those turn-around images, because I think I could use it to fill gaps in my photogrammetry photo sets. It really hurts to get a great scan where one missing photo leaves an ugly low-detail smear
>>
>>106501367
Would be easier to just go back to the location and take additional photos, then edit them to match color and lighting. It doesn't need to be perfect.
But as a professional you would know this already...
>>
>>106501160
>multi GPU
If you're _definitely_ going to stop at 2 gpus, then
- go for an AM5 motherboard
- that has a pair of x16 pcie slots
- that can run in x16+x0 and x8+x8.

The overboard option is:
- Gigabyte mz33-cp1
- 9004/9005-series epyc
- (256 MB L3 cache = 8 chiplets)

https://www.amd.com/en/products/specifications/server-processor.html
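Whichever board you pick, it's worth confirming after install that each slot actually trained at the advertised width (my go-to check, assuming Linux with pciutils):

```shell
# LnkSta shows the negotiated speed/width per GPU, e.g. "Speed 16GT/s, Width x8"
sudo lspci -vv | grep -E 'VGA compatible|3D controller|LnkSta:'
```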
>>
>>106501386
I just want to try it and see if it works. I do a lot of photography and I found some really old sets with problems. I can't take a trip to fix them, even if I could find the same tree stump somewhere in a forest in a faraway state
>>
>>106501342
>HP Z840 with two Xeons and 512+512 GB DDR4
8x 64 GB LRDIMMs per socket?


