>>106501257
>How slow are we talking here?
I have an HP Z840 with two Xeons and 512+512 GB of DDR4 memory
I get a maximum of 4 t/s with DeepSeek-R1-0528-Q2_K_L and --cpu-moe if
the model is cached entirely in NUMA node 0,
llama-cli is run on CPU0, and
--threads matches the number of PHYSICAL cores of that single CPU.
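For reference, a quick way to see how many PHYSICAL cores each socket has and how much memory sits in each NUMA node (plain lscpu/numactl, nothing llama.cpp-specific):

# Cores per socket, threads per core, NUMA layout
lscpu | grep -E 'Socket|Core|Thread|NUMA'
# Memory size and free memory per node, plus which CPUs belong to which node
numactl --hardware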
You can run two instances of the LLM, one per CPU, if they are kept physically separate across the NUMA nodes (rough sketch at the end of this post).
As you can see, I have to isolate the memory and the cores to get the maximum speed.
All my attempts to get a boost by using the second CPU only slowed things down considerably.
If the model does not fit entirely in a single NUMA node, it sucks big time too.
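If you want to be sure the cached copy of the model actually sits in the node you will run from (node 0 in my description above, node 1 in the command below), one trick is to drop the page cache and pre-read the file under the same membind. As far as I can tell the membind policy also steers the page-cache allocations, but I haven't verified that on every kernel, so treat this as a sketch:

# How big is the model? Compare against the per-node sizes from numactl --hardware
du -h "$model"
# Drop the old page cache, then read the file with allocations bound to node 0
sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
numactl --membind=0 --cpunodebind=0 cat "$model" > /dev/null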
# Run llama-cli pinned to one physical CPU (cores 8-15 here) with its memory
# bound to the matching NUMA node; --cpu-moe keeps the MoE expert weights on
# the CPU while --n-gpu-layers 99 offloads the rest to the GPU
CUDA_VISIBLE_DEVICES="0," \
numactl --physcpubind=8-15 --membind=1 \
"$HOME/LLAMA_CPP/$commit/llama.cpp/build/bin/llama-cli" \
--model "$model" $model_parameters \
--threads 8 \
--ctx-size $cxt_size \
--cache-type-k q4_0 \
--flash-attn \
--n-gpu-layers 99 \
--no-warmup \
--batch-size 8192 \
--ubatch-size 2048 \
--threads-batch 8 \
--jinja \
$log_option \
--prompt-cache "$cache_file" \
--file "$tmp_file" \
--cpu-moe
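And since I mentioned two instances: a rough sketch of what that looks like with llama-server instead of llama-cli (core ranges and ports are just examples for a 2x8-core box, check numactl --hardware for your real CPU lists; I haven't benchmarked this exact form). --no-mmap makes each instance copy the weights into its own node's RAM instead of both mapping a single page-cache copy:

# Instance on the first CPU: cores 0-7, memory forced to node 0
numactl --physcpubind=0-7 --membind=0 \
"$HOME/LLAMA_CPP/$commit/llama.cpp/build/bin/llama-server" \
--model "$model" --threads 8 --no-mmap --port 8080 &
# Instance on the second CPU: cores 8-15, memory forced to node 1
numactl --physcpubind=8-15 --membind=1 \
"$HOME/LLAMA_CPP/$commit/llama.cpp/build/bin/llama-server" \
--model "$model" --threads 8 --no-mmap --port 8081 &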