/g/ - Technology




File: four arms.png (2.2 MB, 2120x1416)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>106497597 & >>106491545

►News
>(09/05) Klear-46B-A2.5B released: https://hf.co/collections/Kwai-Klear/klear10-68ba61398a0a4eb392ec6ab1
>(09/04) Kimi K2 update for agentic coding and 256K context: https://hf.co/moonshotai/Kimi-K2-Instruct-0905
>(09/04) Tencent's HunyuanWorld-Voyager for virtual world generation: https://hf.co/tencent/HunyuanWorld-Voyager
>(09/04) Google released a Gemma embedding model: https://hf.co/google/embeddinggemma-300m
>(09/04) Chatterbox added better multilingual support: https://hf.co/ResembleAI/chatterbox

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
►Recent Highlights from the Previous Thread: >>106497597

--Multi-GPU server hardware choices for DDR5 and NUMA optimization:
>106501160 >106501257 >106501342 >106501360 >106501442 >106501465 >106501290 >106501297 >106501417
--Token speed estimates for LLMs using GIGABYTE CXL memory card vs VRAM configurations:
>106498668 >106498678 >106498702 >106499735 >106499745 >106498766
--Optimizing VibeVoice-Large model for efficient speech generation and voice sample cleanup:
>106498676 >106498704 >106498714 >106499018 >106499389 >106499448 >106499466 >106499831 >106499967 >106500073 >106500670 >106500879 >106501145 >106501158 >106501172 >106501230 >106499863 >106499875 >106499907 >106499916 >106500081 >106500089 >106500140 >106503518
--Model recommendations for average gaming hardware with VRAM constraints:
>106502406 >106502445 >106502478 >106502521 >106502528 >106502551 >106502813 >106502914 >106502932 >106502986
--Interpretation of llama_backend_print_memory output for GPU/CPU memory usage:
>106501583 >106501653 >106501677 >106501706 >106501727 >106501822 >106501932
--DDR5 vs DDR4 tradeoffs for CPUmaxx systems with GPU support:
>106503602 >106503731 >106503756 >106503762 >106503824 >106503854 >106504044
--VibeVoice model optimization and download link:
>106498428 >106498434 >106498959 >106499005
--Anthropic's $1.5B AI settlement criticized for insufficient compensation and stifling innovation:
>106499477 >106499488 >106499521 >106499499 >106499518 >106499574 >106499693 >106502081
--AMD FlashAttention workarounds and text-to-speech project updates:
>106499449 >106499480 >106499614 >106500912
--VibeVoice TTS compatibility with quantized 7b models on low-resource hardware:
>106501006 >106501612
--Miku (free space):
>106498210 >106500301 >106503405 >106503587

►Recent Highlight Posts from the Previous Thread: >>106497599

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
File: file.png (31 KB, 882x151)
>update debian 13 to debian 13.1
>picrel
>>
>>106504130
No, sometimes it picks up things from context and decides what direction and inflection to take from there, but the biggest factor is the original sample voice it clones: if it has a lot of angry yelling and annoyance, that will be reflected in the result. So maybe have several emotion samples for the same voice set up as different voices?
>>
>>106504276
Thank you Recap Recap Miku
>>
>>106503862
So much technical support is behind closed doors on Discord. It makes no sense; the platform was never meant for that, and now many issues will never be discussed in the open, helping no one.
>>
>>106504377
just reinstall cuda 12.8
>>
>>106504448
It's an actual tragedy.
>>
>>106504377
Not like this, John.
>>
File: file.png (44 KB, 894x263)
>>106504456
i am using cuda 12.8
>>
>>106504448
Zoomers seem to love it, probably because it's simpler than using an actual support forum.
Yet in high traffic ones it's almost unusable.
>>
>>106504485
yes get rid of it and reinstall 12.8
>>
>>106504495
>Yet in high traffic ones it's almost unusable.
so much activity is probably a big part of the attraction to those with dysfunctional attention spans
>>
>>106504448
yeah it fucking sucks
>>
>>106504448
what kills me are the projects doing that shit, choosing to use discord
>>
>>106504514
It's just memes and mundane chat spam, it makes finding useful discussions hard, especially with discord's fuzzy search.
>>
>>106504538
And then there's the moderators.
>>
>>106504520
If they're pros, I think they're just treating it as a free Slack. Except Slack is optimized for internal discussion with teams of people there to do a job, not thousands of overexcited zoomies.
I actually wonder if discord datasets exist and are used in LLM training.
>>
The tokens might be a statistical representation of language but the boners are real.
>>
File: comfymikus.png (787 KB, 1024x1280)
Comfy Mikus
>>
>>106504832
Cum on Miku's feet*
>>
>>106504448
I mean, there will be a big loss of data for stuff like that from around 2015 onwards, but since I'm fairly adept technically it doesn't bother me, and you can still find help elsewhere, just in much less volume.
>>106504582
I am guessing it is valuable to some extent for data about zoomers and younger folks but I wonder how valuable it is when that demographic itself are the most affected by LLM and internet culture regurgitation and mind numbing retardation in general. No question though, the RP logs probably are equal if not 2nd to the RP forum scrapes given CAI seemed to have trained on them for their 2022 bot that everyone yearns for.
>>
>>106504702
It sends a shiver down my spine. Something primal...
>>
>>106504508
i fixed it by updating my chroot as well
>>
Does my Alice sound Alicey enough?
https://vocaroo.com/1jAce1dHRBYD
I think it sounds really muffled because I ran it through a voice cleaning model to get rid of music, but maybe it enhances the '50s mic aesthetic
>>
>>106504242
The only software optimization left is 1-2B active routed expert models.
>>
>>106504377
Bro you installed pytorch without cuda support
>>
>>106505064
It sounds great, especially
>a den of hedonous virgins
>>
File: 1754225681328634.gif (140 KB, 379x440)
>>106504242
Retardo, without the current software optimization you'd need a datacenter to run these models
>>
>>106505094
heathenous ackshully
>>
>>106504964
>that demographic itself are the most affected by LLM and internet culture regurgitation and mind numbing retardation in general
If you mean the constant virtue signaling, it's mostly a fake persona they all share in public places because they're just terrified of being judged by their friends (who are obviously also online).

>No question though, the RP logs probably are equal if not 2nd to the RP forum scrapes given CAI seemed to have trained on them for their 2022 bot that everyone yearns for.
CAI was so good, and it showed that outside of the big public servers you probably have plenty of small ones where a lot more is discussed freely.
>>
Did I miss a guide or link in the OP teaching me about using TTS in SillyTavern? Could someone kindly point me in the right direction if there is one?
>>
File: 1756766130320791.jpg (81 KB, 1000x707)
I await my magnum v5.
>>
File: 1745332932509593.png (3.23 MB, 914x1802)
>>106504274
For those of you who need the VibeVoice Large Weights

https://huggingface.co/aoi-ot/VibeVoice-Large
>>
really sick of comfyui. is there any other interface for vibevoice?
>>
>>106505235
Why is her ass pointed at me? I am offended.
>>
>>106505276
the original gradio interface it comes packaged with?
>>
Does anyone have Q4 of VibeVoice?
>>
>>106505316
Buy a computer rajesh
>>
>>106505408
>still obsessed with trannies, politics and indians
>>
>>106505293
https://github.com/microsoft/VibeVoice
this? There is nothing in it
>>
>>106505316
It's not prequanted; people load the regular fp16 model in 4 bits. Try https://github.com/wildminder/ComfyUI-VibeVoice
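For anyone wondering what "loading fp16 in 4 bits" actually means: the loader quantizes each weight tensor on the fly as it's read (in practice via something like bitsandbytes NF4; what follows is a toy absmax int4 round-trip in pure Python to illustrate the idea, not the node's actual code):

```python
def quantize_int4_absmax(weights):
    """Toy absmax quantization: scale floats into the signed int4 range [-7, 7]."""
    scale = (max(abs(w) for w in weights) / 7.0) or 1.0  # avoid div-by-zero for all-zero tensors
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate fp values; per-weight error is bounded by scale / 2."""
    return [v * scale for v in q]

w = [1.0, -0.5, 0.25, 0.0]
q, s = quantize_int4_absmax(w)   # q == [7, -4, 2, 0]
w_hat = dequantize(q, s)
```

You pay a small, bounded reconstruction error per weight in exchange for ~4x less memory, which is why the fp16 checkpoint fits on much smaller cards.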
>>
>>106505422
man please, is there any other UI? I hate how bloated that piece of shit UI is
>>
2 million context window
for free
and you keep using your local slop
smhtbhfamalam
>>
>>106505422
Ok thanks, I'll take a look. I want it to be as small as possible because I'll be running it alongside LLM.
>>
>>106505288
so you can j-j-jam it in
>>
>>106505421
find one of the forks before MS wiped it
>>
>>106505288
she is nervous and preparing her stink glands
>>
>>106505444
These local clowns dont know what they are missing out on
>>
File: 7643.jpg (153 KB, 1080x820)
153 KB
153 KB JPG
>>106505444
>>
>>106505463
no UI just example scripts. these jeets just shipped it cli wtf
>>
>>106505444
have fun sending your life history to google
>>
>>106505493
Google?
>>
TTS-occupied thread.
>>
>>106505490
I wonder if these constant hype videos still work
>>
>>106505432
take the inference code from the node and use it wherever you want
>>
>>106505507
wouldn't keep making them if they didn't
>>
>>106505507
every slopwatcher is desensitized. so basically that thumbnail is appropriate and almost falls in the
>oh he's not overly ecstatic, maybe it's a cool niche channel with good content
category
>>
>>106505530
that's true
>>
>>106505432
Be the change you want to see dumbo
>>
>>106505564
I'm going to steal this
https://github.com/wildminder/ComfyUI-VibeVoice
And implement it externally. It should be pretty straightforward.
>>
File: 1756309867017273.png (1.05 MB, 774x1024)
>>106505572
if you do it without the comfy backend I would like to make a plugin for anistudio off of it. hot reloading is solved in dev so I'd like to make a few examples of models not supported in ggml yet
>>
>>106505491
CLI is all you need.
>>
Imagination is all you need.
>>
>>106505620
wrong thread
>>
>>106505628
local miku general
>>
>>106505596
I'm not a "real" dev but I strongly suspect I can hack something together. Still, please don't hold your breath...
Been working on lots of stuff lately.
>>
>>106505641
you and me both. my uncle died yesterday so a lot of time has been spent with the family. shit sucks but at least work was done despite the depression. wan support was added to sdcpp recently so I think it's almost time to get the memory management and node interface in. it's been a lot of cmake garbage juggling for the past while and I'm sick of it
>>
goybros our response?
https://github.com/microsoft/VibeVoice/issues/97
>>
>>106505758
I would argue (and I do) that the wizardlm debacle was more ridicilious. Some say it is still undergoing toxicity testing to this day
>>
>>106505757
nobody cares it took a glacial ice age to inference and image edit on your vramlet card

>>106505758
the furk? Isn't that an /ldg/ meme man?
>>
>>106505834
but 1 minute and 10s is pretty good for a 17b model with cfg and at 20 steps.. with a 110w pl
>>
>>106505848
this is the llm thread. most people here are vramlet at 48gb. just go seek attention at the diffusion threads, there are four at this point and you chose this one instead. you are fucking retarded. tts is fine because there isn't anywhere else to discuss it
>>
>>106505881
>tfw 32gb vramlet
My cope is that qwen3 30b is good enough.
>>
>>106505920
yeah... iktf
>>
                                                                                                  Mistral Large 3
>>
>>106505834
>/ldg/ meme man
do not being ridicilious
>>
>>106505952
>>
>>106505952
ugh i need it so bad
>>
>>106505952
DO NOT RELEASE!
>>
>>106505952
>"w-what the fuck is this? A DENSE 120B MODEL? HOW WILL MY MOESISSY RIG EVEN RUN IT?"
and just like that benchmaxxing moe chinks lost
>>
While looking into lossy text compression I found https://www.rwkv.com/ and have fallen into a little bit of a rabbit hole
>10/10 logo
>the official AI of Linux Foundation
>100% attention-free
>weird enough architecture that it needs its own software stack
>supposedly 400+ derivative projects
>no buzz whatsoever about it
Their models are tiny (~3B and less for main offerings) so probably not useful for anything, but I am curious about supposed speed benefits and wanted to run some performance benchmarks against similarly sized transformers-based models.
But it's fucking python.
There are goofs on hugging face by literally-whos, but I suspect they are just crude conversions that lose all the architecture buffs.
Should I give up on Arch and install Debian, or would it not help much?
>>
>>106506094
RWKV is a meme model.
>>
>>106506094
if u install debian you should use debian 12 because debian 13 has no official support for cuda yet (you have to modify some things because of glibc...)
also >rwkv
sweet summer child
>>
>>106506094
this thing has been trying to become something for years now, all the models are a shit
>>
>>106506094
>RWKV (pronounced RwaKuv)
I hate maths people.
>>
>>106506094
>rwkv
just wait until you hear about retnet and you'll be all caught up when it comes to memes people thought would totally replace transformers soon back in 2023
>>
>>106506112
RWKV models are installed on every Windows machine, making them the most successful models
>>
>>106506122
just two more years bro
>>
>>106506129
https://blog.rwkv.com/p/rwkvcpp-shipping-to-half-a-billion
holy shit its real
>its apache so its ok
cuckie
>>
the new kimi 0905 is fire.
just prompted a few medical questions on openrouter and benchmaxxed against gpt5/gemini2.5pro/opus4.1/qwenmax. (openrouter system prompt off, no websearch). there were always 1-2 good additional points in the kimi answers the other models didn't bring up. I'd accuse moonshot of prompt-enhancing my query or stealthily using web search with the api call, but kimi responds so fucking fast, there's no way that's happening. So yeah, idk wtf's going on.
>>
File: neat.jpg (119 KB, 1500x1155)
>>106506112
I love meme models
>>106506129
So I should install Windows 11 instead of Debian, huh.
>>
>>106506145
some intern probably added rwkv model loading to some copilot function to fuck around with it for an afternoon and they shipped the binaries by accident
>>
File: file.png (1.26 MB, 1024x1024)
qwen takes 70-80s per image on 3060
nice
>>
>>106506168
>So I should install Windows 11 instead of Debian, huh.
It only supports up to RWKV5 and they're up to 7 now.
>>
>>106506171
no, it's because it's green: https://blog.rwkv.com/p/the-worlds-greenest-ai-model-rwkvs
>>
>>106506168
>Windows 11
make sure to turn on recall! its a super helpful feature that is of course extremely secure and would never be misused by anyone.
>>
>>106506187
they said it's local* so it's ok
*local at time of recording and storage in plain text, they never promised not to upload it as part of telemetry
>>
>>106506197
I mean it's very smart, make the user use their compute and electricity to process the data then send yourself the compressed telemetry result, probably gives massive savings
>>
Nemo will finally rest.
>>
>>106504276
lmao at the image that's brilliant
>>
>>106505098
We still need a datacenter to train models, which is the main innovation bottleneck.
>>
>>106505288
It's your new home
>>
>>106506228
yeah image of hatsune miku holding a naughty sign
>>
>>106504832
She will comfort you
>>
>>106505288
She's going to shit and piss herself and make you watch
>>
>>106506316
Hatsune Miku: Comfort Girl
>>
>>106504377
you have to use uv
>>
File: 1750478146142274.png (11 KB, 475x214)
I'm pretty sure I used Qwen3 Max thinking less than a day ago. Did they disable it?
>>
What's the difference between Mistral Nemo 12B and ReWiz Nemo 12B?
>>
>>106506353
You are not crazy, they did disable that, though I think that wasn't actually Max doing the thinking when that was on.
>>
i like the best friend remix, it's cute
>>
File: Gz_ok2fbkAANfa3.jpg (1.65 MB, 1920x1080)
>>106506388
yee she's very sweet
>>
>>106506324
More like Cumfart Girl, amirite?
>>
Which one?
https://github.com/Enemyx-net/VibeVoice-ComfyUI
https://github.com/wildminder/ComfyUI-VibeVoice
>>
>>106506439
I fucking hate comfyui developers. just try one, it shouldn't matter
>>
>>106506439
I've seen the second one mentioned here before.
>>
>>106506525
The second one was last updated 3 days ago; the first one, 4 hours ago.
>>
>>106506554
using a node system just to go from text to speech seems like overkill to me
>>
>>106506566
I don't want to use it but the official webui has no step and attention mode control, and can't add new voices without restarting.
>>
Either OR is hosting a whole bunch of faulty K2-0905 deployments or this model is just bad for being a 1T monster. GLM4.5, R1-0528 and even V3.1 are all more enjoyable and smarter.
>>
>>106506566
It's nice if you want to plug it into a bigger workflow it's just that comfy in particular is lacking 80% of the features for a proper node workflow.
>>
>>106506596
Local doesn't have this problem.
>>
>>106506607
Can you confirm the full unqanted K2-0905 is fine or are you just shitposting?
>>
>>106506591
Just modify it to let you type in a file path in a text box for the voice and hardcode your desired step count/attention mode.
>>
>>106506611
i use the official API and it's fine ;)
just sayin'
>>
>>106506611
I can confirm that you wouldn't have doubts about whether or not you're getting duped by openrouter if you were testing unquanted K2 locally.
>>
>>106506607
I guess I'll try those. I haven't gotten around to downloading them, but I very much hope that Q6 is significantly better than what OR is serving me, because this is seriously not worth it otherwise.
>>
>>106506596
Turn off OR system prompt
>>
>openrouter
lule
>>
yes saar please use my api saar no quantized saar very good like microsoft azure saaar
>>
>>106505444
>2 million context window
>for free
Where? how slop is it? How safetymaxxed?
>>
>>106506751
Yes.
>>
>>106506732
why would you even quantize a model that is natively in 4-bit? probably because their shit backend doesn't support mxfp4
>>
>>106506751
New grok models on openrouter
it's not slop or safety maxed, just collecting all ur data
>>
>>106506784
>New grok models on openrouter
Ah i see it thanks
>Collecting all ur data
What isnt doing that?
>>
>>106506765
They say it was a "mistake" buy who knows
Also check this out lol
https://x.com/andersonbcdefg/status/1955348480643477570
>>
File: 1747371153899119.png (83 KB, 705x472)
>>106505444
llama 4 scout best model have 10m context sir
shit in your face sir
>>
>>106506849
Llama 4 Reasoner when?
>>
>>106506799
The swiss 70B model that has the output quality of a 8B llama model :D
>>
>>106499389
>>106498240
>>106499448
>>106499466
>>106503518
You can clean up vocals with these.
bandit v2. This will separate vocals from background music and sound effects. Because the GitHub page has no instructions to guide you, you'll need something like Microsoft Copilot to help you.
https://github.com/kwatcharasupat/bandit-v2
Resemble Enhance. This removes background noises like the wind. Use the gradio app version for better effect.
https://github.com/resemble-ai/resemble-enhance
Also use this modded gradio app. This will only do denoising.
https://github.com/resemble-ai/resemble-enhance/issues/69
Acon Digital DeVerberate 3 plugin for audacity. This reduces reverb.
https://rutracker.org/forum/viewtopic.php?t=6118812
Moises ai pro plan does a better job at isolating vocals from background music and sound effects than bandit v2 but it costs $300, i bought it during a black friday sale for $150.
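As a sketch of how the Resemble Enhance step slots into a cleanup pipeline from the command line (the directory names are placeholders, and the flag spelling is from memory of the pip package; check `resemble-enhance --help` since it may differ by version):

```shell
# install the enhancer (pulls in torch)
pip install resemble-enhance

# denoise + enhance every wav in ./raw, writing results to ./clean
resemble-enhance ./raw ./clean

# or only strip background noise, skipping the enhancement pass
resemble-enhance ./raw ./clean --denoise_only
```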
>>
>>106506856
just a couple more war rooms and a few more billion spent on randos who didn't accomplish anything at apple but are totally worth hiring for a hundred million a piece
then we can make the true llama4
>>
>>106506849
Bloody benchoid
I will redeem amazon free to run this beautiful basterd bitch
>>
>>106506888
>you'll need something like Microsoft Copilot to help you.
local models lost
>>
>>106506896
llama 4.20 next april will be so lit
>>
>>106506860
didnt know about that have you tried the 70b quant or the 8b?
>>
File: 30474 - SoyBooru.png (118 KB, 337x390)
Are you feeling kiwi today? (Qwen®) (More models coming soon™) (Two weeks)
>>
>>106506711
How?
>>
>>106506888
>Use the gradio app version for better effect.
>Also use this modded gradio app.
FUCKING
>>
>grok code fast performance improves at 70k+ prompt tokens
ts some kv cache magic or what's happening here. Can't do shit in a fresh session. Bloat that bitch up with some pseudo context and suddenly it's god mode

>>106506916
The one on the site. Ain't no way I'm gonna run that locally. Absolute waste of time.

>>106506933
Click the three dots on the model tab at the top of the chat page. Then disable "use openrouter system prompt". You always gotta check settings and make sure no frauds like on
>>106506732
are serving you.
>>
https://litter.catbox.moe/49eylpj3rj8ry1nz.wav
>>
Is there an indian LLM?
1.5 billion indians and no indian LLM?
saars?
>>
>>106507037
we must refuse
>>
>>106507059
You are permanently fixated on Indians.
>>
File: Gemini-2.0-Flash-001.png (951 KB, 1344x756)
>>106507059
Gemini.
>>
>>106507072
gm sir
>>
UPDATE
indexTTS2 is still not released
END UPDATE
>>
>>106507130
Who? I can't hear you over the sound of my Microsoft-sponsored ASMR.
>>
is GIGAVOICE better than the soviets, or just easier to get going with?
>>
OpenAI just released a very interesting paper
https://cdn.openai.com/pdf/d04913be-3f6f-4d2b-b283-ff432ef4aaa5/why-language-models-hallucinate.pdf
>Why Language Models Hallucinate
In short: LLMs hallucinate because we've inadvertently designed the training and evaluation process to reward confident, even if incorrect, answers, rather than honest admissions of uncertainty. Fixing this requires a shift in how we grade these systems to steer them towards more trustworthy behavior.

The Solution:

Explicitly stating "confidence targets" in evaluation instructions, where mistakes are penalized and admitting uncertainty (IDK) might receive 0 points, but guessing incorrectly receives a negative score. This encourages "behavioral calibration," where the model only answers if it's sufficiently confident.
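That grading scheme fits in a few lines. Assume a confidence target t where a correct answer scores +1, "I don't know" scores 0, and a wrong answer costs t/(1-t) points; then guessing only has positive expected value when the model's internal confidence exceeds t. This is an illustrative reading of the summary above, not code from the paper:

```python
def wrong_penalty(t):
    """Penalty for an incorrect answer under confidence target t (0 < t < 1)."""
    return t / (1.0 - t)

def expected_score(p, t):
    """Expected score of answering when the model believes it is right with probability p."""
    return p * 1.0 - (1.0 - p) * wrong_penalty(t)

def calibrated_action(p, t):
    """Answer only when guessing beats the 0 points guaranteed by abstaining."""
    return "answer" if expected_score(p, t) > 0 else "IDK"
```

At t = 0.5 a wrong answer costs exactly one point, so the break-even confidence is 50%; raising t pushes the model toward abstaining more often.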
>>
>>106507142
vibevoice is a generational leap above whatever was before
>>
>>106507149
we need to train them with Socratic dialogs so we have philosopher kings to rule over us.
>>
>>106506596
use mikupad or turn off all ST's formatting, you will see a massive difference
>>
>>106507149
You can really tell that all the talent left because nobody intelligent would write something this retarded.
>>
>>106507149
this is just more safety slopping
models hallucinate because coming up with new stuff that was not in the training set is their inherent and desirable property.
>>106507158
That would be cool actually.
>>
        {%- if loop.first %}
{{- "[Round " ~ (ns.rounds) ~ "] USER:" }}
{%- else %}
{{- " [Round " ~ (ns.rounds) ~ "] USER:"}}

It appears there's no way to use LongCat-Flash-Chat with most frontends except with chat completion mode.

SillyTavern's aggressively useless STScript allegedly can increment variables but even macros don't work right/consistently in instruction templates so I'm not even going to try it.
>>
>>106507149
Aw sweet, maybe now more people will understand things we've known about for ages.
>>
>>106507149
base models are more honest (at least internally) about their certainty per token
>we've inadvertently designed the training and evaluation process to reward confident, even if incorrect, answers
this has been a known problem with how they approach instruct training, another paper just stating the obvious
>>
>>106505490
Those men in the picture should have gay HIV sex with one another.
>>
>>106507180
>LongCat-Flash-Chat
>>
>>106507258
fujo go home
>>
>>106507268
Do fujo's dream of HIV sex?
>>
>>106507258
Based fujo.
>>
This is the 1.5B model, generated using the demo. Pretty amazing stuff...
https://litter.catbox.moe/0a895dbvq9a21sya.wav
If you need sound sources try this website
https://www.soundboard.com/category
>>
>>106507258
ahh ahh fujo...
>>
>>106507263
>To mitigate potential contamination from existing open-source benchmarks and enhance evaluation confidence, we meticulously constructed two new benchmarks: Meeseeks (Wang et al., 2025a) and VitaBench.
>>
>>106507409
>Meeseeks
This can't be real.
>>
File: miggy.jpg (190 KB, 992x1487)
>>106507288
>eat poster for dinner
Do it, I dare you
>>
>They think vibevoice is good
https://voca.ro/1ovxYUlilVV4
>>
File: IMG_8543.jpg (1.84 MB, 4030x2197)
>>
>>106507499
>https://voca.ro/1ovxYUlilVV4
for a second i thought you did this but in a different voice
https://vocaroo.com/1mUGlhbVCFvm
>>
>>106506094
I gave up on python for now and got the goofs for the RWKV ~3B model, and as I expected, speed seems to be in the same ballpark as Qwen3 4B.
I guess real benefits should manifest at longer context.
I'll need a more proper testing rig for this one instead of just typing "write a poem about beauty of young girl's navel" in the chat and looking at console logs.
But I'm sleepy so maybe tomorrow.
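A minimal harness for that kind of comparison: time a generation callable and report tokens/second at growing prompt lengths. The `generate(prompt, n_predict) -> tokens_produced` interface is an assumption; in practice it would wrap llama-cpp-python or an HTTP call to a local llama.cpp/RWKV server:

```python
import time

def measure_tps(generate, prompt, n_predict):
    """Run one generation and return (tokens_produced, tokens_per_second)."""
    t0 = time.perf_counter()
    produced = generate(prompt, n_predict)
    dt = max(time.perf_counter() - t0, 1e-9)  # guard against a zero-width clock tick
    return produced, produced / dt

def sweep(generate, base_prompt, context_sizes, n_predict=128):
    """Repeat the measurement at several prompt lengths to see speed vs. context."""
    results = {}
    for n_ctx in context_sizes:
        prompt = base_prompt * (n_ctx // max(len(base_prompt), 1) + 1)
        results[n_ctx] = measure_tps(generate, prompt[:n_ctx], n_predict)[1]
    return results
```

Run the same sweep against the RWKV goofs and a similarly sized transformer; if RWKV's recurrent architecture delivers, its tokens/second should stay roughly flat as context grows while the transformer's degrades.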
>>
>>106507499
>>106507514
From where are you getting voice examples? That's pretty hard. Plus most clips downloadable clips are often pretty noisy or have background music.
>>
>>106507539
>most clips downloadable clips are often pretty noisy or have background music.
this might help
https://github.com/Anjok07/ultimatevocalremovergui
>>
>>106507499
that sounds perfectly fine to me, what problems do you have with it exactly?
>>
>>106507512
I like these Bakas and Dipsy
>>
>>106507548
Thanks. I'm also figuring out why inference_from_file.py won't use CUDA. CUDA should be enabled by default...
>>
>>106507514
Good idea. Here's my try: https://voca.ro/13jnfkR0QnPf
>>
>>106507149
I thought they hallucinate because when a completion is poorly represented in the dataset, the model generates a probability set that's likely to pick a bad token.
Post training with more "I don't know" answers might help, yeah. Though it would have to pull the weights pretty strongly to overcome all the other non confident possibilities.
>>
>>106507553
That was the latest gptsovits, not vibevoice. Eat barely 4GB of VRAM and took 2s to generate that. I still can't understand the hype over M$ new toy.
>>
>>106507589
this one sounds pretty fucking bad on the other hand
>>106507596
ah, that explains why it completely shat the bed when it got to grotesqueries; sovits's vocabulary is utterly piss poor and almost a complete deal breaker for me
>>
File: bakas.mp4 (2.98 MB, 1280x672)
>>106507512
>>
Don't you dare!
>>
>>106507616
adorable
>>
>>106507613
I used whisper on your sample to get the text so it's understandable. The pronunciation is easy to fix if you pass the arpabet transcription directly (I integrated it in my api)
>>
>>106507654
I'm not gonna deal with all that when VibeVoice sounds five times more natural and can say pretty much every word I've thrown at it without fiddling around with syntax and shit. And again, I just drop a minute-long sample to clone instead of going through the tiresome and lengthy training process for each new voice with Sovits.
Sovits can sound fine when you carefully cherry-pick the good generations, but even then it tends to sound stilted; Sovits is like 2 generations behind the curve here.
>>
>>106507704
Well we will see if they drop their training script first, then I might give it a go. Not being able to finetune it is doa for me
>>
>>106507725
I don't see that ever happening.
>>
>https://github.com/ggml-org/llama.cpp/pull/15327
So does this really mean we can finally use models like how they do in image gen, where you just download the base model and the loras you want?
>>
>>106507725
I can see that happening.
>>
>>106507763
>where you just download the base model and the loras you want?
I can already do that. You could for a while.
aLoRA is about changing those during runtime, right?
>>
>>106507800
>I can already do that
So why don't any of us just do that then? Why do people still upload and download the merged weights?
>>
>>106507824
Because the average retard user can't even get a single-click installer working without copious amounts of hand-holding. Trying to explain loras and usage would be asking too much and would hurt download stats. Easier to just provide a plug-and-play model.
>>
>>106507824
Sorry, meant to write
>I'm pretty sure we can already do that
Also, back in the day, there used to be a couple of LoRas out and about.
Notably, SuperCOT and SuperHOT.
Hell, in the PR itself there's a normal LoRa alongside the aLoRa.
For some reason, we all just decided to distribute pre-merged weights instead of just the LoRa, no idea why.
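Nothing technically stops the LoRA distribution model today: llama.cpp can convert a PEFT adapter to GGUF once and stack it on the base at load time. Paths below are placeholders, and exact flag names may vary by build (check `llama-server --help`):

```shell
# one-time: convert a PEFT adapter to GGUF (script ships with llama.cpp)
python convert_lora_to_gguf.py ./my-adapter --base ./base-model-dir

# serve the base model with the adapter applied at full strength
llama-server -m base-Q8_0.gguf --lora my-adapter.gguf

# or blend it in at partial strength
llama-server -m base-Q8_0.gguf --lora-scaled my-adapter.gguf 0.5
```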
>>
>>106507835
But local image gen has more users and they deal with needing to mess with loras fine.
>>
>>106507704
https://litter.catbox.moe/rehari2tvedhwccm.wav
>>
>>106507616
Ty, saved.
>>
>>106507863
Loras are basically required for image gen, so the frontends make them a major component and easy to add/set. They're less important for giant text models that can't be so easily changed, so they're basically an afterthought. It means messing with adding them to the scary command-line arguments instead of a file selection field on a web interface.
>>
>>106507866
kek
>>
>>106507866
wtf i love sovits now
>>
>>106507866
Depending whether that's sovits or not you either proved his point with how lifeless that sounds or vibe sucks
>>
>>106508067
Sir, this is 1.5B VibeVoice.
>>
>>106508067
It sounds stilted as all fuck. My money is on it being soviet.
>>
>>106508090
How long did that take to generate? I have the large weights downloaded from the torrent link posted a couple threads ago, but I want to know whether it's worth using the big version or the smaller one (haven't tested either yet)
>>
How are you guys using VV? Any rentry for retards to setup with ST?
>>
>>106504274
Where the hell are you finding the money for all the gpus to run this shit
>>
>>106508142
Steady employment.
>>
>>106508138
>>106501145
>>
>>106508142
Money just appears in my bank account every month. It's crazy.
>>
>>106508142
GPUs? Poor people like us CPUmaxx
>>
>>106508212
I hope you at least tell your parents thank you
>>
File: 21522 - SoyBooru.png (46 KB, 457x694)
Was 'berry the worst marketing campaign that started the downfall of OpenAI and killed the hype?
>>
multi-token prediction status?
>>
>>106508193
I mean, I could afford around 10k worth if I really wanted to, but there are better things I could do with 10k
>>
>>106508399
lazy ggergachod won't do the needful, kindly ask ikawrakow
Or wait until cloud models can code it for us, we better not be hitting the wall.
>>
>>106508121
1.5B is comparable to SDXL image gen speed at 1024x1024.
Just use the large model if you can.
>>
>>106508415
Such as?
>>
can a rtx pro 6000 gen vibevoice large in real time?
>>
File: moatboy at google hq.png (1.83 MB, 1024x1024)
1.83 MB
1.83 MB PNG
Any Google insiders here? How are Gemini 3 and the Gemma update working out? Did you hit the wall too like the rest?
>>
https://voca.ro/14hcU3N3ZLxZ
>>
>>106508142
Having money saved up from previous work. I have a 3090 but spending on more ram seems a lot more worth it now compared to adding an extra gpu due to all the fuckhuge models coming out recently.
>>
>>106508444
Gemini? We're now simulating the next generation of LLMs within our Genie 3 world model that uses its capabilities to manifest a SOTA llm writing responses within its virtual world.
>>
>>106508461
How safe is it?
>>
>>106508447
Is that your voice?
>>
>>106508461
How good are they at math and coding?
>>
>>106508480
>>106508506
The only questions investors care about
>>
>>106507866
This is just a recording, you can't fool me!!
>>
>>106508425
Student loans, this year's Roth contribution, emergency fund, saving for a house (hopefully the market crashes), paying a lawyer to research a business idea I've been sitting on
>>
File: afis.jpg (18 KB, 516x532)
18 KB
18 KB JPG
My AFIS roleplay just got voices and they are GOOD! VibeVoice really is impressive. There's probably an easy way to integrate it with ST as well.
>>
>>106508552
Post an example.
>>
>>106508552
Large is pretty sensitive to how good the input voice data is. You want clear, smooth studio quality and it will get close to it, it has a massive range in type and age of voice.
>>
English is cool and all, but I'm not moving off sovits unless someone can demonstrate jap abilities better than what i get with my custom trained model paired with clean samples.
>>
VibeVoice recognized gluck gluck gluck as blowjob noises, lmao I'm feeling real unsafe now
>>
>>106508621
We need like a list of sounds it recognizes, it seems a bit random
>>
File: Taylor County PSA.jpg (425 KB, 2655x1500)
425 KB
425 KB JPG
>Real niggas listen to what they feel in they gut after a long shift at the warehouse or when they ridin’ out to the track to hustle horses or pedicabs. If ya real, ya feel: trap beats, Drill tempo (especially from UK drill, or Miami slime), but also music a man love his daughter to.
w-what?
>>
>>106508641
lolwut
>>
>>106508596
https://vocaroo.com/12A6GA08pA5C
This is with ~10s of uncleaned input audio per voice/speaker. The original voices are also low fidelity, that's not VibeVoice crushing them btw.

>>106508604
Yeah, I've been playing around with it for a bit now and it seems quite versatile. I wonder if there's a way to get it to do laughs or perhaps precisely fiddle with the inflection mid sentence? I think I remember doing something like that with TortoiseTTS a while ago.
>>
>>106508831
It seemingly tries to emulate bitrate and noise of the original clip and maybe even exaggerate it.
>>
>>106508848
Wouldn't say it exaggerates the effect. Sounds pretty spot on to me, especially Fox. Only thing I'd like to improve is the inflections and the random bouts of background noise. Probably could all be fixed with better and longer input voices. I should probably clean it up a bit and give it more than 10 seconds per voice.
>>
>>106508831
Are you just using the default settings? I'm getting good results in between ones that just spazz the fuck out lol
>>
>>106506094
>I am curious about supposed speed benefits
RWKV is one specific state-space model (SSM) architecture. The chief practical difference from transformers is an "infinite" lossy context. As the context grows, runtime performance won't degrade, but the memory of older tokens will gradually fade. SSM proponents also argue that the context compression intrinsic to SSMs produces superior results in some use-cases.

Any RWKV models I've played with were retarded, seemingly from the training regime rather than architecture. Bo Peng claims the newer models are at least on-par with transformers of the same size. I'm awaiting the newest 14B to check it out again.
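As a caricature of the constant-cost-per-token property (NOT the real RWKV recurrence, just an exponential-decay toy with made-up numbers):

```python
# Toy sketch of the SSM idea: a fixed-size state updated once per token,
# so memory cost is constant and older tokens fade geometrically.
# This is not the actual RWKV update rule, purely an illustration.
def step(state, token_embedding, decay=0.9):
    return [decay * s + (1.0 - decay) * x
            for s, x in zip(state, token_embedding)]

state = [0.0, 0.0]
for tok in ([1.0, 0.0], [0.0, 1.0], [0.0, 1.0]):
    state = step(state, tok)
# The state stays two floats no matter how long the context gets;
# the first token's contribution has already decayed by decay**2.
```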
>>
>>106508641
https://vocaroo.com/141Mzkn5YKGh
>>
>serious
Is there an open source vision LLM with an open license that's on par with Gemma? I can't find one that's on par with abliterated Gemma cause they are all STEM benchmaxxed.

I guess you can say I'm looking for the Nemo of vision, but sadly Pixtral doesn't cut it either.
>>
what happened to petra? i just saw the name somewhere and made me think of /lmg/ kek
>>
>>106509086
The official RWKV models will never be good because they are trained on EleutherAI-sourced open training data on a shoestring budget. It will take a commercial company to make something half-decent with this architecture.
>>
do we have any prompting guides for vibevoice? curious if there's any way to control it beyond just a simple script
>>
any way of using my 6700xt with my 9070xt for 28gbs of vram?
>>
>>106509719
yeah, why wouldnt there be? just plug them both in. they will run at half pcie bandwidth but that wont really make much of a difference
>>
>>106509086
>I'm awaiting the newest 14B to check it out again
Is this actually happening?
>>
File: 1608766070516.jpg (5 KB, 250x245)
5 KB
5 KB JPG
So when I use safetensors files am I basically running the model at FP16? Do I need to find GGUF files if i want to use say Q8/Q6?
>>
>>106509994
Yeah usually, but it depends on your mood.
>load_in_4bit=True
>>
>>106509994
yes
>>
>>106509994
this >>106510026 is worse
>>
>>106508436
An old Ampere A6000 takes about 25s for 20s of speech, so I'd hope so.
>>
>>106505432
download pinokio, there is already a webui fully working under community scripts. It works better than the comfyui version too lol
>>
>>106509820

>>106506232
>planned
>>
>>106506232
>>106510181
I've been doing some research. It sounds like they're training for 100 different languages with a meme vocabulary/tokenizer, which might be degrading their results. I also wonder if there's been enough hyperparameter testing with respect to the hidden state size. I also can't help but notice no one is using their last 14b model; is it just undertrained, or is there a fundamental issue with scaling this architecture?
>>
>>106510226
As many anons said, RWKV models have always been underperforming memes; that's why no one uses them outside of MS for some reason, but then again MS gave us Phi so they're no strangers to weirdly useless models.
>>
>>106508142
Scamming VCs.
>>
>>106510238
I have been playing around with the 3B model and it gives me hope, I don't agree that it's a meme. Transformers will probably always be better if kv cache size is a non-factor on your machine but I think for slow boil chads RWKV might save local.
>>
>>106510254
>RWKV might save local.
That's been the sentiment of some since before llama even existed... See old comparison in picrel
>>
>>106509994
safetensors and GGUF are just file formats.
They can in principle both store arbitrary data, though ggml (the library providing GGUF) has a particular focus on quantization.
Rule of thumb: safetensors is for Python-based projects, GGUF is for projects using llama.cpp/ggml as the backend.
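To make the quantization part concrete, here's a toy version of what a blockwise 8-bit quant stores: one scale per block plus a small integer per weight, instead of a full FP16 value per weight. Illustrative only, not ggml's actual Q8_0 code:

```python
# Toy blockwise absmax quantization, in the spirit of Q8_0.
# Not ggml's implementation; real blocks are 32 weights wide.
def quantize_block(weights):
    scale = max(abs(w) for w in weights) / 127.0
    if scale == 0.0:
        scale = 1.0  # all-zero block: any scale works
    return scale, [round(w / scale) for w in weights]

def dequantize_block(scale, quants):
    return [scale * q for q in quants]

scale, quants = quantize_block([0.12, -0.5, 0.33, 0.0])
restored = dequantize_block(scale, quants)
# Each restored weight is within scale/2 of the original.
```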
>>
https://www.reddit.com/r/LocalLLaMA/comments/1namz1q/hf_releases_3t_tokens_dataset_sourced_entirely_from_pdfs/
More high quality curated data to reduce our reliance on toxic webslop.
>>
>>106510315
We don't need "high-quality" data, we need data that is relevant for the models' primary end uses. Improved performance on synthetic benchmarks is a red herring and is not an indicator of quality anyway.
>>
>>106510315
Ignore the other guy, this is based. We need long context data like this so models will stop being retarded after the first 8k tokens
>>
>>106510315
Actual link:
https://huggingface.co/datasets/HuggingFaceFW/finepdfs
>>
>>106510347
Utterly pointless when the data of interest (conversational SFT data) is all short and the models are still mostly pretrained on 2k-8k tokens context anyway because of the quadratic costs of attention.
>>
>>106510348
>https://huggingface.co/datasets/HuggingFaceFW/finepdfs

>As we run out of web pages to process, the natural question has always been: what to do next?
Fuck right off, this wouldn't be a problem if you didn't filter 99% of it
>>
>>106510359
>the models are still mostly pretrained on 2k-8k tokens context
There's nothing stopping us from training the last ~3T tokens at a much higher context window, we just need someone to take the leap. It was recently discovered that labs are overspending on training and can lower their batch sizes without degradation in results, we just need a lab that has their priorities straight and actually cares about context length beyond needle in a haystack benchmarks
>>
>>106510393
>There's nothing stopping us from training the last ~3T tokens at a much higher context window,
Isn't that exactly the kind of thing they do already
>>
>>106510398
They do to some degree but that's why the pdfs are based, the more long context training the better
>>
>>106510393
Long-context performance is task-dependent. Pretty much all officially released models have received long-context training, but not with multi-turn conversations. In practice they just want to end the conversations after a few turns, because most existing conversational data is like that.
>>
>>106510342
And what kind of data would be relevant to ERP?
>>
>>106510426
Other than examples of ERP itself, lots of common-sense and/or obvious data that only exists in very diluted form in random web documents.
>>
>>106510418
A multi-turn conversation is just a flavor of text, just a presentation layer. Having long context capabilities is much more fundamental and needs to be done in pretraining. You could realistically train a base model to learn a conversation format in only a couple million tokens
>>
>>106510426
visual novels
>>
>>106510426
whatever they trained the first character.ai models on or the new ones which now successfully capture the spirit of their 2022 models
>>
>>106509035
Same here, I mostly fiddle with temp, though. Seems very sensitive to that. Lowering it to 0.92 helps.
>>
>>106510426
- different types of relationships and development thereof
- physical range of motion
>>
What's the easiest way to get vibevoice running? The official repo needs some Docker bullshit. Has anyone made a .cpp version of their shitware that just werks?
>>
best model for studying?
>deepseek
good enough for solving homework, needs very thorough prompting to tutor though
>kimi
has not been good for homework in my experience

i have not tried GLM 4.5 yet, gpt-oss-120b turns out to be the best at tutoring, maybe im prompting badly, either way help
>>
>>106510703
That's some of the information a LLM will probably only acquire after getting trained on several trillion unfiltered random tokens, hopefully without getting averaged out in the various training batches. I still think that the way LLMs are usually pretrained is not conducive to learning this sort of stuff efficiently.
>>
>>106510238
>As many anons said RWKV models have always been under performing memes, that's why no one uses them outside of MS for some reason, but then again MS gave us Phi so they're no strangers to weirdly useless models.
you forgot bitnet
for some reason MS has a lot of copers who dream of a world where toasters can run models
maybe it's in the jeet blood
>>
>>106510781
i am not buying any more of your gpus jensen
>>
https://github.com/resemble-ai/chatterbox
https://xcancel.com/heysehajsingh/status/1963640592661188857
https://huggingface.co/ResembleAI/chatterbox
How does it compare to Microsoft's voice model?
>>
>>106510850
worse
>>
>she whispered, her voice barely above a whisper
>>
>>106507824
I am too retarded to understand the details, but I remember seeing some graphs ITT showing that a proper finetune is better/more balanced than LoRAs.
>>
>>106507824
iirc one of the reasons is that loras interact with quanted models differently when applied on top instead of merged, and since there are so many different quant levels that gets messy, as well as what others said
>>
>>106510426
discord chat logs
>>
>>106510977
yes daddy :uwu_32:
>>
>>106510367
>Fuck right off, this wouldn't be a problem if you didn't filter 99% of it
I find it funny that most companies tried their best to filter out anything explicit yet kept syrupy "erotica" romances (which is why we have shit like >>106510919 as the standard of chatbot/AI story writing) and random web page backend errors that shouldn't be in any dataset.

>>106510990
He's right anon, discord is what made cai so alive and different from most other models despite their model being outdated and dumb.
>>
>>106510367
it's insane they thought saying something like that was a good idea, those people really believe the internet is just reddit and twitter, my god...
>>
I know llama.cpp is pretty bad at parallel requests; you need to set double the context length if you want n=2 parallel requests, and so on.
How good is exllamav3+tabbyapi in that regard? Now it has tensor parallel support and it doesn't require 2/4/8 GPUs to work; you can have a mix of VRAM sizes too. Are parallel requests handled like in vllm? Does it take up more VRAM to serve more requests at the same time?
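For context, the llama.cpp behavior in question (hedged sketch; the flags are real, the model filename is a placeholder): the `-c` context is split evenly across `-np` slots, so two parallel requests at 8k each need the total doubled:

```shell
# -c is the TOTAL context, divided across -np slots (8192 per slot here).
./llama-server -m model.gguf -c 16384 -np 2
```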
>>
>>106510850
a bit worse but performance is real time with a mid-range gpu
>>
>>106509086
>The chief practical difference with transformers is an "infinite" lossy context. As the context grows, runtime performance won't degrade, but the memory of older tokens will gradually fade.
If it's infinite context, but it's lossy, then what's the point of it being infinite if it will eventually forget things just like transformers?
>>
>>106511189
>runtime performance won't degrade
>>
>>106511228
I too love fast retardation
>>
I too love 4chan
>>
>>106510630
Thanks. I think the problem I was having was the node resampling the audio down to 24000 from what I saved in Audacity. Saving straight at 24000 seems to cut the artifacts a lot.
>>
>>106511430
Oh, good to know, never would've guessed. Thanks for the tip! What character are you trying to clone, btw?
>>
>>106511486
I am mostly testing still to get the best settings so a bit of everything, no real specifics.
>>
>>106511486
>>106511430
speaking of tips, i found that setting cfg to 3 and steps to 5 produces very accurate character impersonation. the comfy node i'm using had the cfg clamped at 2 for no good reason so I had to edit the script
>>
>>106511507
Hadn't even considered that. Thanks, I'll test that too.
>>
I understand that training is not cheap, but the RWKV people are really gimping themselves with small model sizes; performance is just not a bottleneck at that scale.
>>
>>106511507
Steps to 5?? That seems oddly low. My comfyui node had it at 10 and I often cranked it to something between 12 and 20. Bigger number more betterer :)
>>
>>106511581
You're absolutely right!
>>
>>106511581
I learned it from image gen where if you have a very strong checkpoint or lora you can do something like karras - gradient_estimation - 15 steps and it looks great instead of having that signature ai look you'd get with more steps and cfg. sdxl btw
>>
File: 1747951068990741.jpg (93 KB, 1216x684)
93 KB
93 KB JPG
>>106504274
So it's my understanding that the only thing Microsoft got rid of on their official repos was the Large weights, but the code itself is largely unchanged. Is that correct? Or did they fuck with the code before re-release too?
>>
>bytedance/seed-oss-36b
Is this thing any good?
>>
>>106511189
>they reinvented context shift
>>
Has anyone tried using hermes 4? I'm getting {{user}} tokens at the natural ending point of the response but it doesn't actually end there. 2: How do we get it to use reasoning mode in local? I swapped over to generic llama 3 instruct + the enclosed system prompt but it's not doing anything.

Tested on 4.25bpw_exl3 on tabby / ST
>>
>>106510367
>Fuck
Using advertiser-unfriendly language like this is exactly why this whole domain gets filtered out.
>>
>>106511726
They removed all of the code entirely before putting the repo back. You can check for yourself.
>>
>>106511544
If anything it's the opposite. Instead of splitting their limited training compute across 5 sizes including 7B and 14B, they should just train one or two small models well, ideally with transformers equivalents so that they can show the architecture does not adversely affect output quality.
>>
File: 1750989943069498.png (162 KB, 761x680)
162 KB
162 KB PNG
so this is what the superintelligence team has been working on...
>>
>>106511910
llm 2.0 baby
>>
>>106511910
RAG is back, baby
cline and augmentcucks are seething
>>
>>106511910
>Rice University
>>
>>106511868
It's incredibly easy to use, a field day for scammers. I don't think they were concerned about some chud making lolita sex noises. Scammers and voice cloning is more of a real issue.
>>
>>106511910
>>106512007
kek, it's really written "rice university"
>>
>>106511910
yup, just use rag bro
>>
>>106512174
Maybe it was just its abilities in nsfw in general.
>>
Has anyone gotten VibeVoice 7B to generate a long script? I'm trying and it always ends early after 4-5 minutes.
>>
>>106510709
It doesn't need docker. Maybe learn how to read. You have two options:
>text inference python demo via command line (you can add your own voices and it reads a text file..)
or
>install CumragUI and use comfui-vibevoice node
>>
>>106512174
This isn't the first TTS model with voice cloning support. If that was the problem they wouldn't have left the 1.5B up and they would have had an internal ban on making anything with voice cloning capability in the first place.
>>
>>106512210
Never mind, I'm stupid.
>>
>>106512236
>CumragUI
wtf is that
>>
>>106512262
A schizo way of writing ComfyUI.
>>
SAAAAAAARS
https://huggingface.co/YannQi/R-4B
>>
>>106512279
sirs*
>>
>>106512236
both options are total dogshit because using Microsoft's webui or their script requires reloading the model and messing around with temporary files every time you do anything
and using cumshitUI was way slower than their original code for some reason (it only used 1 core and took more than 4x longer).
I'm gonna try and vibe code a better UI for it maybe, one that isn't a cancerous WEB UI.
>>
>>106512278
oh ok, anons always make finding stuff in the archives very easy, I see
>>
>>106512307
>>106512307
>>106512307
>>
Just read their oneshot script and add a while loop and a thing that reads new prompts. You can get an LLM to do it for you. It's not that hard.
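A hedged sketch of that idea, with `load_model` and `synthesize` as stand-ins for whatever the demo script actually calls:

```python
# Load the model once, then loop reading prompts, instead of paying
# the full model-load cost on every run. Both callables are
# placeholders for the real demo's load/generate functions.
def serve(load_model, synthesize):
    model = load_model()              # expensive, done once
    while True:
        text = input("text> ").strip()
        if not text:                  # empty line exits
            break
        synthesize(model, text)
```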
>>
>>106512285
You are like a spoiled little child.


