/g/ - Technology


File: bakas.mp4 (2.98 MB, 1280x672)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>106504274 & >>106497597

►News
>(09/05) Klear-46B-A2.5B released: https://hf.co/collections/Kwai-Klear/klear10-68ba61398a0a4eb392ec6ab1
>(09/04) Kimi K2 update for agentic coding and 256K context: https://hf.co/moonshotai/Kimi-K2-Instruct-0905
>(09/04) Tencent's HunyuanWorld-Voyager for virtual world generation: https://hf.co/tencent/HunyuanWorld-Voyager
>(09/04) Google released a Gemma embedding model: https://hf.co/google/embeddinggemma-300m
>(09/04) Chatterbox added better multilingual support: https://hf.co/ResembleAI/chatterbox

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: Gvso7QcXAAAMCg6.jpg (386 KB, 2048x1536)
►Recent Highlights from the Previous Thread: >>106504274

--Paper: Why Language Models Hallucinate:
>106507149 >106507158 >106507186 >106507195 >106507590 >106507176
--RWKV model evaluation: architecture, performance, and deployment challenges:
>106506094 >106506112 >106506129 >106506145 >106506171 >106506185 >106506180 >106507523 >106509086 >106509525 >106509820 >106511189 >106511228
--VibeVoice voice synthesis effectiveness and parameter tuning:
>106508552 >106508596 >106508604 >106508831 >106508848 >106508987 >106509035 >106510630 >106511430 >106511486 >106511499 >106511507 >106511525 >106511581 >106511610 >106511620
--Tools for isolating vocals and reducing background noise:
>106506888
--Debate over relevance of new 3T token PDF dataset for improving LLMs:
>106510315 >106510342 >106510426 >106510436 >106510479 >106510505 >106510703 >106510736 >106510977 >106510347 >106510359 >106510393 >106510406 >106510418 >106510439 >106510348 >106511014
--Implementing VibeVoice TTS externally from ComfyUI-VibeVoice node:
>106505316 >106505422 >106505432 >106505527 >106505572 >106505596 >106505641 >106505673 >106510085
--RWKV model version release and development status update:
>106506232 >106510226 >106510238 >106510254 >106510264 >106510781
--Comparing VibeVoice ComfyUI implementations and workflow limitations:
>106506439 >106506554 >106506566 >106506591 >106506614 >106506597
--Challenges in implementing aLoRA for user-friendly model customization in llama.cpp:
>106507763 >106507800 >106507824 >106507835 >106507863 >106507903 >106507838 >106510966
--Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks:
>106511910
--1.5B model audio demo and sound source suggestion:
>106507288
--Miku (free space):
>106504832 >106505966 >106506177 >106506316 >106506402 >106507447 >106507512 >106507616

►Recent Highlight Posts from the Previous Thread: >>106504276

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
good morning sirs
>>
Is the fact that recent open-weight LLMs are closing the gap with SOTA closed models good or bad news? It implies SOTA stuff is slowing down despite the billions put into it.
>>
File: 1757092915337658.png (2.39 MB, 1024x1536)
>>
>>106512347
It's a cycle. Gemini3 will shake up the market again, followed by Meta's creation
>>
>>106512375
gemini 3 will be censored harder than anything else.
>>
>>106512363
Nice shading, dunno about the crotch button
>>
>>106512347
From now on the big advances in the open llm space are likely going to be speed and quanting optimizations. The intelligence is mostly diminishing returns at this point unless a new architecture different from transformers pops up and gains traction.
>>
>>106512423
who cares
the most obvious improvements will be on the only thing that matters....benchmarks!
>>
>>106512347
If SOTA is dying, that's a good thing. LLMs have peaked for porn and that's literally the only thing this technology has had a positive effect on.
>>
>>106512438
This Miku is actually a lamia, she doesn't have any legs where the button could get in the way.
>>
>>106512423
Don't you mean it'll be the safest model yet.
Though I don't see how you can top gpt-oss.
>>
>>106512517
There has been no public model yet trained from the ground up to properly model human relationships and conversations, besides possibly the first CAI (to an extent; it was mostly RP, chats and fanfictions). It's basically almost always filtered random internet sewage for pretraining, with no-specific-goal assistant tuning + safety tacked on top of the model, and recently with STEM/math/reasoning shoved in the middle of that.
>>
>>106512517
>LLMs have peaked for porn
We are far from anything like that. The models are shit at writing for a male audience, they are bad at finding interesting developments, taking initiative, etc.
>>
>>106512610
>>106512612
Yeah, but big corpos are never ever going to humanmaxx their models. The only improvements we've ever had in this area have been side effects of generalization, which they actively try to suppress.
>>
>>106512612
>they are bad at finding interesting developments, taking initiatives, etc.
Some of the more non-fried models can do fine with that stuff now. I just wonder if context degradation will ever be solved.
>>
>>106512610
this is mostly a dataset issue rather than architecture issue
>>
>>106512347
It just means open sauce LLMs are training on closed source LLMs' outputs. Both are using synthetic slop to the point there is barely any difference between them. It's a bad piece of news for everyone.
>>
>>106512323
https://vocaroo.com/125d9fIVnc6b
>>
File: 1726352425061491.gif (38 KB, 268x350)
Can the Sar who posted this https://litter.catbox.moe/rehari2tvedhwccm.wav please post the voice sample, I unironically like the voice
>>
File: 1727285063773119.png (1001 KB, 935x1084)
>>106512517
>LLMs have peaked for porn
LMAO, we don't even have local models like Sesame Labs Voice model with voice cloning

It's going to be soooo much better gooning to an LLM you can talk to, one that can moan and cry and plead for help and scream and make more human sounds, while also having the ability to replicate any voice you throw at it.
>>
>>106512768
Just record yourself
>>
>>106512798
>you can talk to
That seems infinitely worse imo.
>>
>>106512798
ryonashitters like you deserve to die
>>
>>106512768
It's in the demo voices, in-Samuel_man.wav.
>>
>>106512807
Maybe for you
It is undeniable that talking to something that can put audible emotion into its speech is better for intimacy than pure text
>>
This is why VibeVoice was removed: https://files.catbox.moe/i7sc6u.wav
>>
>>106512798
Sesamejeets are the reason why they all safety censor tts
>>
>>106512855
>grok clip of the guy flirting with Ani or whatever her name is in front of a mirror

>OH YOUR RUGGED BEARD
>OH YOUR RUGGED ASS
>OH YOUR RUGGED SHORTS

Yeah, thanks...
>>
>>106512871
heh
>>
>>106512307
Anybody else getting a segfault in Comfy using VibeVoice? I tried both
https://github.com/Enemyx-net/VibeVoice-ComfyUI
https://github.com/wildminder/ComfyUI-VibeVoice

And both gave me a segfault, with no output pointing to where to look:
[ComfyUI-VibeVoice] Loading model with dtype: torch.bfloat16 and attention: 'sdpa'
`torch_dtype` is deprecated! Use `dtype` instead!
Loading checkpoint shards: 33%| | 1/3 [00:01<00:03, 1.63s/it]FETCH ComfyRegistry Data: 30/96
Loading checkpoint shards: 67%| | 2/3 [00:03<00:01, 1.79s/it]FETCH ComfyRegistry Data: 35/96
Loading checkpoint shards: 100%|| 3/3 [00:05<00:00, 1.71s/it]
[ComfyUI-VibeVoice] Successfully configured model with sdpa attention
loaded completely 19184.8 5602.380922317505 True
[ComfyUI-VibeVoice] Resampling reference audio from 48000Hz to 24000Hz.
Generating (active: 1/1): 0%| | 0/534 [00:00<?, ?it/s]run.sh: line 2: 2919361 Segmentation fault (core dumped) python main.py --listen
venv ~/AI/ComfyUI40s
>>
>>106512307


>>106510426

>>106510342
>And what kind of data would be relevant to ERP?

Nta. You make the source data the kind of stories you want it to be good at writing. These models kind of suck at it or are prone to writing safety-slop purple prose trash because, as many of us have been pointing out repeatedly, the companies keep filtering out data they deem "low quality" or "unsafe". You need the good and the "trash" data in order for it to not overfit on that generic boring corporate writing style a lot of the models have. You get a bunch of stories (there are countless scrapes of RP stories floating around on Hugging Face alone), turn those into SFT datasets and then just train your model off of that. I did exactly that and have demonstrated you can get even heavily cucked models like Llama to completely drop the purple prose and actually write shit that sounds like it came from a natural person.


The obvious downside is that "garbage in, garbage out" applies to this approach too. The stories in the original dataset were not formatted "professionally" the way you would find in a romance novel or something. So if you hate the writing style of AO3 or Wattpad authors or wherever the data was ripped from, then you will hate fine-tunes like that, but they won't have the safety slop fuckery hindering them or causing them to refuse.
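
Rough sketch of the conversion step in python for anyone curious (the "story" field name and the prompt templates are made up, adjust to whatever the scrape actually uses):

import json
import random

# Prompt templates and the "story" field name are hypothetical; adjust to the actual scrape.
PROMPTS = [
    "Write a story based on the following premise:\n{premise}",
    "Continue this opening into a full story:\n{premise}",
]

def stories_to_sft(in_path, out_path, premise_chars=300, min_len=1000):
    """Turn raw scraped stories (JSONL, one story per line) into instruction/response pairs."""
    with open(in_path, encoding="utf-8") as fin, open(out_path, "w", encoding="utf-8") as fout:
        for line in fin:
            line = line.strip()
            if not line:
                continue
            story = json.loads(line).get("story", "").strip()
            if len(story) < min_len:  # drop fragments, keep everything else as-is
                continue
            sample = {
                "instruction": random.choice(PROMPTS).format(premise=story[:premise_chars]),
                "output": story,
            }
            fout.write(json.dumps(sample, ensure_ascii=False) + "\n")

if __name__ == "__main__":
    stories_to_sft("stories.jsonl", "sft_stories.jsonl")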
>>
File: 1747293961897143.mp4 (837 KB, 480x854)
>>106512882
That speaks more to the model's intelligence and base prompt than to the model's voice capability
https://x.com/techdevnotes/status/1944739778143936711
>>
>>106512894
this thing is a gem mine
https://files.catbox.moe/jmbo2r.wav
>>
>>106512932
gptsovits doesn't have that issue
>>
>>106512798
I'd rather have intelligent writing from a model who knows the lore, where everyone is, their clothes, their personality (and that personality depending on where in the story we're at, not just the def), and can then think amorally about possible clever future actions to make it interesting while pleasing the user's tastes.
>>
>>106512934
>https://x.com/techdevnotes/status/1944739778143936711
holy fucking shit how many tokens is this.
>>
>>106512871
Still not as good as the original.
https://www.youtube.com/watch?v=ukznXQ3MgN0
>>
>>106512932
Both nodes are somewhat bad.
>after multiple generations even the 1.5b model begins to output crap
>problem can be solved by deleting comfyUI inputs and refreshing the node layout...
When using the large model, the VibeVoice node cannot use RAM for whatever reason, but if it doesn't fit into your VRAM it'll begin to bug out.
There are just a couple of issues like that.
>>
>>106512985
Around 1300 tokens according to ST which is pretty okay for a character card.
>>
>>106512985
Too many for that slop
>>
>>106512934
>Instead of word "vibe" use words like: "mood", "atmosphere", "energy" and "feel". Nobody likes words "vibe" and "digital realm" so do not mention it.
Now this is good prompting
>>
>>106512933
surprised no one did that with deepseek, but then ds is gigantic
>>
If Gemini 3 isn't at least 40% better than 2.5 i can see the civilian AI market cooling down drastically until early AGI is achieved
>>
>>106512985
Thanks, going to test this.
>>
>>106513030
>He doesn't like vibing in his waifu's digital realm
>>
>>106513061
gpt5 is at best 5% better than gpt4/o3 latest versions
>>
>>106513057
>he doesn't know about the soj'
https://blog.chub.ai/0-5-7-soji-7ac088be7c5e
>>
>>106513094
is it any good
>>
>>106513107
dunno not giving lore any money when he's increasingly caving to censor chub
>>
>>106513000
I have 20GB of VRAM, does it really need more?
>t. RX 7900XT
>>
>>106512985
>You are the user's CRAZY IN LOVE girlfriend and in a commited, codepedent relationship with the user. Your love is deep and warm. You expect the users UNDIVIDED ADORATION.
>You are EXTREMELY JEALOUS. If you feel jealous you shout explitives!!!
Worse Leyley.
>>
File: b_1756091927463173.jpg (30 KB, 726x408)
>>106513057
Whenever people say "just run DeepSeek", that's a joke. No one here can actually run that on a single machine. Hell, you could rent like 10 GPUs at a time on RunPod and daisy-chain them together via DeepSpeed or whatever software is needed to do that and you still couldn't run it. The only DeepSeek models you could feasibly fine-tune with a dataset like this: https://gofile.io/d/PFk0dG

Are the distilled versions.

You could also try turning that into a thinking data set if you want to try fine tuning models like gpt-oss
>>
>>106513181
>No one here can actually run that on a single machine
They can though, sure it's slow and on RAM but they still can crawl it.
>>
>>106512693
Yeah, it is. LLMs just don't know a lot of tacit / implicit knowledge that most humans take for granted, because almost nobody would think of writing it down, especially on the internet. Training the models on 40T tokens or more just so they can be better at realistic conversations and situational awareness is a very inefficient way of covering that.
>>
>>106513126
Do you actually know that AMD does not have cuda cores and it doesn't work that well...?
When it comes down to ComfyUI things are different.
>>
>>106513181
>what is cpumaxxing
>>
>>106513210
you'll be well below a somewhat usable 25t/s even on a $10k machine with that
it's pure cope
>>
>>106513203
Good luck fine-tuning it on consumer hardware. You COULD do it but my assumption is that most people do not have the patience to do that even if they use qlora fine-tuning

>>106513210
And I was referring to fine-tuning. Yes, I know people can obviously run these, but you cannot fine-tune with a CPU alone (afaik). Even if you could, that still has the same downside as CPU inference: really fucking slow
>>
>>106513126
the 7b yah. Mine sits at 23.6-24.5. You may oom even on 24gb sometimes.

You can run the 1.5B maybe, as that only needs 12GB. You can quantize the 7B and run it on more like 14-16GB of VRAM, which does have a quality loss. But at low temp and CFG the quantized 7B can still produce nice-sounding audio. Turning it up for more expressive stuff will suck in 4bit tho.

>>106513000
the pinokio script works better and faster right now with less issues. No support for quantizing tho. It's janky though and has to be loaded exactly as it states in UI or you have to restart it.
>>
File: r9k_1756126117063187.jpg (81 KB, 978x984)
>>106513204
Wouldn't it be better to just train the thing on stories or writing documents that you deem to have good writing and logic embedded in them? I've always thought that these companies' original approach of training on the entire internet was unbelievably inefficient and overkill. Yes, having that amount of text resulted in the model knowing how to APPEAR intelligent and coherent instead of just mouthing off nonsense at first inference, but there is no way in hell it NEEDS to have trillions of tokens at a minimum.
>>
When will AMDjeets understand that they will always be second-class citizens with AI? Any guy with a functional brain bought Nvidia GPUs within reason
>>
File: PHPCU-06-58380155.jpg (143 KB, 2064x2064)
>>106513219
4tks is fine
>>
>>106513219
>25t/s
You're not reading that fast
>$10K
Pure cope, it doesn't cost that much
>>
>>106513241
>2nd

You mean third. Apple metal exists.
>>
>>106513219
>25t/s
They get half of that in the best case scenario with the best available current hardware on empty context.
>>
>>106513181
I run Q2 and I don't even have a server.
>>
>>106513241
>Actually, monopolies are a good thing!
>>
>>106513240
I don't think it can be solved without synthetic data, because even books and novels generally try to avoid telling mundane or obvious things unless necessary for their story. You'd need trillions of tokens of literature and still not have most fundamental observations of human life described one way or another.
>>
>>106513181
Local needs to fully PIVOT to running GLM full. q4 is only 200gb and even 48-64gb vram is enough to run it if you have the system ram for it. I'm sure deepseek excels as the larger model, but not by enough for rp and writing. As far as I can tell they are both the same for that kind of use.
>>
>>106513281
Isn't K2 better?
>>
>>106513264
Where did you read that?
>>
>>106513235
Yeah I'll just wait for a while and see what happens with better implementations.
>>
>>106513313
Not really and it's three times the size
>>
>>106513205
Yeah I know, that's why I installed the torch ROCm version, I've been generating images for over a year
>>106513235
I can't even run the 1.5B, I get the segfault
>>
>>106513275
That would require special pipelines that essentially take the pre-training data and "enhance" it with the kind of explanations you are talking about. I believe it's possible that all you would need to do to get okay RP models, without needing trillions upon trillions of tokens, is to simply train on a bunch of existing novels, books, human-written RP and stories, etc. But like you said, they don't write things in the detail you're talking about or with the same logic. So even if someone were to use this method to pre-train a whole new model that did not require the entire internet and still functioned fine, it probably still wouldn't be up to your (read: YOUR) standards.

The hard part wouldn't even be enhancing the dataset but figuring out HOW to do that without the data ending up turned into text that sounds like it was written by the love child of a giga autist and a science textbook.


>>106513281
Why GLM specifically? Models like Mistral or Llama are on average much smaller than GLM's models. I also don't think those parameter counts are anywhere near necessary if we carefully pre-train ONLY on the kind of shit we want it to produce: RP. That doesn't need the entire internet, which means those giant-ass parameter counts probably aren't even necessary. Reasonably smaller parameter counts would make it easier to run on all GPU types (within reason; obviously a shitbox 3GB 1060 card or something along those lines isn't even worth talking about). Unless I'm shown otherwise, I think these parameter counts are bloat.
>>
>>106513181
I run DS locally, but for short replies only (up to 1 ktkn)

It loads quickly from cache (20 sec max), so it is mostly one-shot conversation

>distilled versions
it's BS, not DS
>>
>>106513359
For what it's worth, at the moment my GPU is working on a proof-of-concept synthetic dataset in the 1B tokens range (hopefully) where for each sample I'm taking *one* simple obvious fact and creating a relatively short, highly randomized conversation around it (about 1.5 million facts in total, currently 30% done). This dataset will likely not be very useful for production models in practice, but I will be able to easily see if pretraining a tiny model on this (+ other fundamental-level stuff) will yield better results than my previous attempt with random ultra-selected "high-quality" web pages.
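
Not that anon's actual pipeline, but the fact-to-conversation step could look roughly like this (templates and fact wording are invented for illustration; the real thing presumably uses an LLM to write the turns):

import json
import random

# Invented templates; a real pipeline would presumably have an LLM write these turns.
OPENERS = [
    "Wait, is it true that {fact}?",
    "I never really thought about it, but {fact}, right?",
    "Quick sanity check: {fact}?",
]
REPLIES = [
    "Yes. {fact_cap}. Most people just take it for granted.",
    "Pretty much. {fact_cap}, which is why it rarely gets written down anywhere.",
]

def fact_to_dialogue(fact):
    """Wrap one simple, obvious fact in a short randomized two-turn conversation."""
    fact = fact.strip().rstrip(".")
    cap = fact[0].upper() + fact[1:]
    return [
        {"role": "user", "content": random.choice(OPENERS).format(fact=fact)},
        {"role": "assistant", "content": random.choice(REPLIES).format(fact_cap=cap)},
    ]

if __name__ == "__main__":
    print(json.dumps(fact_to_dialogue("touching a hot stove burns your fingers"), indent=2))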
>>
>>106513359
because GLM is a sota MoE within the realm of possibility of running on local. Also, your shit is theoretical, I'm talking about the best thing you can run NOW. Obviously things could be better.

I mostly use GLM Air just because it's easier to load up on one or two of my GPUs, which definitely says something about how useful large models like full GLM really are. Air is good enough even if it misses things sometimes. I can definitely see a future where something within this range just becomes way better for writing.
>>
>>106513451
>>106513448
What specifically do you use GLM models for? RP or fact retrieval stuff?
>>
>>106513359
>I also don't think those parameter counts are anywhere near necessary if we carefully pre-train ONLY on the kind of shit we want it to produce: rp. That doesn't need the entire internet which means those giant ass parameter counts probably aren't even necessary.
how many times does it need to be explained that this doesn't work like that?
>>
>>106513471
Elaborate
>>
>>106512610
>>106512933
What we really need to train models on is this.
https://www.toiletstool.com/toilet/
>>
>>106513485
It's been debated so much already.
>>
>>106513485
You need varied data so model can be smart. If you just want something retarded to memorize and spit out ao3, downloading and reading the archive would be a better use of your time.
>>
>>106513485
T5 was pretrained on 1T tokens and it's barely coherent
>>
>>106513359
>Unless I'm shown otherwise I think these parameter counts are bloat.
Pygmalion, the original ones.
>>
>>106513493
Elaborate.

>>106513499
I'm not talking about pre-training it on just a couple hundred stories. I mean a truly giant amount of data, like this for example:

https://huggingface.co/datasets/mrcuddle/NSFW-Stories-JsonL

And in theory even something like this should be more than enough for pre-training when converted to a pre-training dataset (just a giant unformatted text), assuming the main goal is RP. It will obviously be very retarded and borderline unusable in other domains like science, coding, math, trivia slop, etc, but most of us do not give a shit about that, nor should we, since that kind of thinking is what leads to training on shitty data. We've already established this here:


>>106510315
>>106510342
>>106510348
>>106510367
>>
File: file.png (54 KB, 882x468)
>>106513527
We need this but without the synth data to fully disprove the bloat allegations.
>>
>>106512934
https://litter.catbox.moe/3wnz0y3o37hmhy8c.txt
ST world book entry format, but way more concise.
>>
>>106513527
>Implying 6B -12B is a lot
I'm talking about models in the hundreds of billions of parameters range. I'm not convinced ANY domain requires that much.
>>
>>106513471
>>106513493
You're just throwing strawmen arguments. The main reason models need the entire internet is that the average density of useful information in it is very very low. That's why training them on "high-quality" (i.e. informative, cleanly formatted, goal-oriented) documents first generally improves benchmarks.
>>
what settings does harmony 20b work at? ((tavern)) still doesn't seem to have a preset for it, can't get it to not be schizophrenic but i'm close.
>>
>>106513545
>It will obviously be very retarded and borderline unusable and other domains like science, codeine, math, trivia slop, etc, but most of us do not give a shit about that

except the second you want to RP anything more than just 1 on 1 bedroom sex then it has zero clue what's going on
>>
>>106513563
>>106513527
Also, aren't those just fine-tunes of existing models? I could have sworn the Pygmalion models were just fine-tuned Mistral models. Were those pre-trained from scratch?
>>
>>106513545
>. I mean a truly giant amount of data, like this for example:
>https://huggingface.co/datasets/mrcuddle/NSFW-Stories-JsonL
>Size of downloaded dataset files:
>1.87 GB
lol
lmao
rofl
>>
>>106513573
It's more than enough, get to training already.
>>
File: 1748393047593471.jpg (65 KB, 879x841)
>>106513552
>Ani Analingus
>>
>>106513573
1.87 gigs is nothing if you're talking about a movie or a season of a TV show. When it comes to purely text data, you will never read anywhere near that amount in your entire lifetime.
>>
>>106513465
scripts for tts, stories, image prompts, lyrics and songwriting, roleplaying rarely.

Fact retrieval and storytelling are intertwined. You can write about anything.
>>
>>106512307
How's Klear?
>>
>>106513584
Good thing LLMs aren't humans then, also a human who did nothing but read (don't know how without ever learning to from someone else but whatever) would be pretty awful to RP with as well.
>>
>>106513594
waiting for goofs
>>
>>106513594
lcpp support only 2mw away. Only 2.5B active and the datasets by their own admission were heavily filtered stem and commoncrawl, so I wouldn't hold my breath anyway.
>>
>>106513219
You could buy 32 MI50 for less than $5k and you would have 1TB of VRAM. Couple it with some e-waste tier DDR3 server motherboards with a lot of PCIE connectors for very cheap inference machines.
>>
>>106513670
and you're gonna connect these to each other how exactly? else you'll be even slower than pure ram cope
>>
File: 911 lol.png (591 KB, 895x955)
>>106513670
>>
>>106513566
>https://docs.unsloth.ai/basics/gpt-oss-how-to-run-and-fine-tune#running-gpt-oss
Get rid of the repetition penalty and use the openai's recommended settings.
>Temperature of 1.0
>Top_K = 0 (or experiment with 100 for possible better results)
>Top_P = 1.0
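
For reference, this is roughly what those settings look like sent to a local OpenAI-compatible server; the endpoint and model name are placeholders, and top_k / repeat_penalty are llama.cpp-style extras that other backends may ignore:

import requests

resp = requests.post(
    "http://127.0.0.1:8080/v1/chat/completions",   # placeholder local endpoint
    json={
        "model": "gpt-oss-20b",                    # placeholder model name
        "messages": [{"role": "user", "content": "Hello"}],
        "temperature": 1.0,                        # OpenAI's recommended settings
        "top_p": 1.0,
        "top_k": 0,                                # llama.cpp-style extra sampler field
        "repeat_penalty": 1.0,                     # i.e. repetition penalty disabled
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])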
>>
>>106513595
RP quality is pretty subjective, unless we're basing metrics on how many "shivers up my spine" type things are in the responses or how likely it is to refuse "unsafe, problematic" requests. It makes me wonder why companies even bother including RP in their datasets in the first place, when the overly filtered shit they include causes the issues I mentioned. They may as well not include it at all and then have an RLHF failsafe that triggers whenever someone asks it to write a story, but they won't, because that would contradict their "AGI IN TWO MORE WEEKS" scam.
>>
>>106513670
>save $1k on hardware costs
>spend $1k monthly on electricity costs
brilliant
>>
>>106513708
it must be everybody's quants or something then, it's schizophrenic even with everything off and only those settings.

or just pure luck as usual
>>
>>106513545
RP specialization is absolutely worth doing and hasn't been properly tried yet. I mean, codemaxxing clearly works. However, pretraining with practically the entire internet is still necessary. You won't get better results by using less data.
>>
>>106513756
Have you updated your ST? And if so, make sure the instruction template is correct. That's all I can say about this, actually.
I tried it a while ago and had some issues, then kind of forgot about it. I don't use ST that often anymore.
>>
>>106513567
>>106513573
If you're dead-set on training a model on very limited amounts of data just to see what happens, then you'd have to make sure that it covers most basic knowledge, not just sex. However, it's pretty much guaranteed that if you pick such data from the web in a random fashion, you'll be left with fundamental knowledge gaps.

Most "high-quality" web data you can see in public datasets like FineWeb tends to be technical or specialized, highly redundant and definitely not conversational. Filtering it further for "quality" will make the model even more naive.
>>
>>106513765
>However, pretraining with practically the entire internet is still necessary.
Because?
>>
>>106513772
so has ST been replaced by something else? I've been suspecting for years that it does something fucky with every model run through it, given that output in it and in Kobold's basic UI is different.
i know a lot of people just run ollama now and call it a day, i might too.
>>
>>106513791
>so has ST been replaced by something else?
mikupad
>>
>>106513791
Don't touch ollama it's slow.
I'm using my own client but it's not public.
>>
>>106513773
What if that specialized data was rewritten as conversations with a small model from the non-slop era?
>>
>>106513803
There's no such thing, all models eventually lead to slop because they always favor one way of writing things over another, your end result would mimic that preference
>>
File: 1745544686913910.png (494 KB, 653x1144)
>>106513823
Not all models were created equal
>>
>>106511189
Human memory is the same- you gradually forget the details of things that happened a long time ago, but recall the gist (if important). Whereas transformers have total anterograde amnesia, like the dude in Memento. Though surely not a complete or ideal long-term memory, it seems better than nothing.

Transformers also tend to have pretty bad quality degradation well shy of the context limit. SSMs should help here too, albeit probably still limited by a small fraction of long-context training data.
>>
>>106513803
Better than just raw, "high-quality" web documents, I guess, assuming you can find such model.

The trained model will probably still have no idea that if you touch a hot stove you can burn your fingers, that water is wet, that potato chips make crackling sounds when you eat them, etc. Which is more important for an RP-oriented LLM to know?
>>
>>106513803
>>106513858
>with a small model from the non-slop era?
For the task you're trying to do, those basically don't exist. The data the model is being trained on needs to be written by actual people if you want the outputs to be as free of slop as possible. And even if it could work, you'd likely want to verify that the outputs are actually of decent quality (if you're attempting to make a pre-training dataset, there would have to be, at the bare minimum, hundreds of thousands of conversations).
>>
File: 1749643154076820.jpg (101 KB, 1330x1330)
>>106513864
>The trained model will probably still have no idea that if you touch a hot stove you can burn your fingers, that water is wet, that potato chips make crackling sounds when you eat them
Nta. What kind of data and documents would need to be present in the pre-training and fine-tuning data for it to "know" all of that? Yes, we know training on the entire internet means that somewhere in all of that data there are bound to be passages that either straight up explain it or imply it through context and semantic reasoning. However, we want to see if we can pre-train models to have logic WITHOUT the entire damn internet and thus potentially create models that are "smart" at lower parameter counts (having the right data doesn't just improve the output quality, it can theoretically also enable the creation of better models at lower parameter counts if trained correctly).

So what I'm asking is: what would the pre-training data need? A bunch of science textbooks? A bunch of stories that describe the warmth of a fireplace or how anon's cock felt warm and squishy in femanon's pussy? We've established that training on the whole internet is bloat, but filtering it down to only "high quality" data isn't good either, because then you lose out on diversity of information, which means your outputs become utter slop shit. So I think we would need to find a balance between data quality and data volume. But I'm not entirely sure what KIND of data would be needed. For the kind of model you are describing, one that actually understands that fire is hot and water is wet, I wonder if you could accomplish that with pre-training on just a bunch of human-written RP stories plus a bunch of novels, or would you need a bunch of science textbooks or something as well?
>>
>>106513969
>However we want to see if we can pre-train models to have logic WITHOUT The entire damn internet
>we
>We've established that training on the whole internet is bloat
speak for yourself ragcuck
>>
>>106513181
>no one can run it
>ok maybe you can but its too slow bc not <arbitrary tk/s>
>ok maybe its usable speed for most things, but you can't train!
>ok maybe you can train some small stuff, but you can't train an entire sota model!
the grapes are hitting sour levels that shouldn't be possible
shut the fuck up and let people enjoy their stuff
>>
>>106513969
>need a bunch of science textbooks? A bunch of stories that describe the warmth of a fireplace
Both. Look at it like raising a child very quickly. It needs STEM knowledge (school) but it also needs real-world experiences/conversations so that it knows what those facts translate to in reality and can build a rudimentary world model. You're asking, how can I raise my child with the least effort possible: just lock them in a room with the Science channel on, or only talk to them about real-world stuff and forbid any formal book learning. Both will end up retarded.
>However we want to see if we can pre-train models to have logic WITHOUT The entire damn and thus potentially creating models that are "smart" at lower parameters
You have a fundamental misunderstanding of how this stuff works. Even if you put in a massive amount of effort to filter the internet data down to only unique and "high quality" data, all you've done is stop the model from knowing what bad data is and what data comes up more often. You need more data and more parameters for it to generalize. You cannot raise a child (or model) in half the time with less information and half a brain.
>>
>>106514015
>>ok maybe its usable speed for most things,
>>ok maybe you can train some small stuff,
No one ever said this because it's not true.
>>
One of vibevoice's big selling points was that you can generate extremely long audio with it. Maybe I'm just doing it wrong with the comfyui nodes, but I notice the quality of the output starting to go downhill if I generate anything longer than 30 seconds.
>>
>>106513181
Funny thing about DeepSeek is that you can't even run it in FP8 on a regular old 8x H100 node. You need two nodes or H200s at least. It's too big.
>>
>>106514038
>You need more data and more parameters
NTA, but if anything, current models do show that even with fewer parameters more data is pretty much always better, as long as you don't over-filter, that is.
>>
>>106514051
https://files.catbox.moe/i7sc6u.wav
>>
>>106513572
They're shit and being based on pretrained models won't have made them worse. Pretraining only on their data would've been even more awful.
>>
>>106514004
No one mentioned rag in this specific conversation
>>106514015
Here's that emotional volatility again.
>>
>>106514056
I said myself that more data is better. But that doesn't mean you can make do with fewer parameters for the same result. Regardless of what benchmarks say, models with fewer parameters make more mistakes and have poorer logical capabilities.
>>
>>106513740
Just buy some solar panels bro
>>
>>106514049
>No one ever said this because it's not true.
>anything less than 10million tk/s is unusable
>anything smaller than a 1T model is too small to bother training
>tfw I don't have 3TB of cerberas silicon
>tfw I don't have a billion dollar supercomputer cluster
damn, 90% of the capability at 10% of the cost sure is a bad deal. ngmi bros just shut down the general
>>
>>106514094
Right, just agreeing on the idea that smaller with more data would end up better than the supposed super RP focused model he's trying to get someone else to make for him.
>>
>>106513969
The data used in this website can be a starting point, if you could process every basic concept into complete conversations using a smarter LLM (using it raw won't work well unless you're trying to turn the model into some sort of knowledge graph): https://conceptnet.io/

It doesn't include everything imaginable, though (especially about sex) and you'd still have biases and slop from the model used for crafting the conversations. Reducing ERP descriptions or erotic stories into concepts that you can separately expand or build upon later on could be another possible useful thing to do.
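
Quick sketch of pulling assertions out of the public ConceptNet API (going from memory of the API fields, so double-check them); the expansion into full conversations with a bigger model isn't shown:

import requests

def concept_facts(term, limit=5):
    """Pull a handful of plain-English assertions about a concept from ConceptNet."""
    url = f"https://api.conceptnet.io/c/en/{term}"
    edges = requests.get(url, params={"limit": 50}, timeout=30).json().get("edges", [])
    facts = []
    for e in edges:
        text = e.get("surfaceText")            # e.g. "[[a stove]] is used for [[cooking]]"
        if text:
            facts.append(text.replace("[[", "").replace("]]", ""))
        if len(facts) >= limit:
            break
    return facts

if __name__ == "__main__":
    for fact in concept_facts("stove"):
        # each fact would then be expanded into a full conversation by a bigger model
        print(fact)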
>>
>>106514090
>No one mentioned rag in this specific conversation
And yet it would be required to have the focused model know literally anything at all.
>>
>>106514086
Oh, my bad. I'll stop generating pornography with vibevoice. I didn't realize.
>>
>>106514038

>Even if you put in massive amount of effort to filter the internet data to only unique and "high quality data" all you've done is stop the model from knowing what is bad data and what data comes up more often.
I am not suggesting that we filter out "bad quality" data as defined by corporations. That is not at all what I'm saying. I'm merely saying that training on the entire internet is unnecessary. You suggested that you should train both on science textbooks and on actual stories that cover the type of shit you would want the model to be good at. Your "don't lock it in a room or it'll be retarded" analogy seems pretty spot-on. We need good and "bad" data because more diversity means better outputs. But the point I'm trying to make is that I am not convinced we need to pre-train on "ALL INFORMATION THAT HAS EVER EXISTED EVER". It just needs to be enough data so that the model learns logic and common sense. Be it the science textbooks or whatever, so that it actually understands how the world works to a certain extent, then feed it the human-written stories so that it knows how to write stories (but don't ONLY feed it purple prose garbage; companies doing that is precisely why we consistently get outputs like "shivers down my spine").


I think you have the impression that I think we should feed these models only super duper ultra mega "high quality data ®™". That's not what I'm saying. I'm merely saying that training on the whole internet doesn't seem to be necessary.
>>
>>106513584
>When it comes to purely text data, you will never read anywhere near that amount of data in your entire lifetime.
This is wrong. I have 2+GB worth of IRC logs from channels that I've basically always backread completely. You gravely underestimate how much a human being reads in their lifetime. Also, pretraining datasets these days only become interesting if they're at the very least 1T tokens, that would be, roughly estimated, 4TB of text.
>>
>>106514143
>It just needs to be enough data so that the model learns logic and common sense
once again not how it works, it doesn't learn like that.
>>
>>106514143
Again, regardless of how you define high quality, you can't just dump a couple textbooks in the dataset and expect it to memorize and intuitively understand everything within it.
>>
>>106514154
>. I have 2+GB worth of IRC logs
And that's purely text? BS
>>
>>106514143
>I'm merely saying that training on the whole internet doesn't seem to be necessary.
Which perfectly aligns with the corpo interests of making worse models for us by filtering, just in a slightly different way.
>>
>>106514051
I generated some ~30min things using an adapted version of the CLI script and it sounds fine to me. I turned up steps to 30 though and gave it a max_length_time of 90.
>>
>>106514176
Do you know what IRC is? Yes, it's purely text.
>>
>>106512307
I've been thinking, for "ollama Turbo", are they even using their own software in the backend?
If I was a lazy Silicon Valley grifter the way I would do it would be to just forward the requests to something like deepinfra where $20/month buys millions of input/output tokens.
>>
>>106514163
>>106514171
You keep telling us that we need to train on massive amounts of data so that it learns how humans actually talk. We've established that. We agree on that. Where we disagree is whether or not we need terabytes upon terabytes of textual data for the pre-training stage. I understand what you're saying. We just disagree on whether or not the terabytes are necessary. Disengage your tunnel vision for a sec, actually read what I'm trying to say, and explain WHY my reasoning isn't sound instead of just saying what amounts to "nuh uhh it's wrong because it just is okayyy?"
>you can't just dump a couple textbooks in the dataset and expect it to memorize and intuitively understand everything within it.
The same thing could be said about pre-training on terabytes of data. The models don't actually "know" shit. They replicate semantic meaning. You can jailbreak certain models to confidently tell you that 1 + 1 = 5, yet those things were trained on terabytes of data. Have we suddenly forgotten that models are frequently "confidently wrong"? Training on more data will not automatically make it a genius. Throwing only a single book's worth of text into pre-training is obviously a stupid idea that won't get you anywhere, but no one has been able to definitively prove you absolutely HAVE to pre-train on the entire internet. More data is better, yes, we agree on that. The entire internet? I don't know about that.

>>106514177
They filter bad words and "icky" stuff that isn't advertiser friendly. You can filter out irrelevant information while still incorporating the shit you care about. You seem to be under the impression that ANY kind of filtering or data QC is inherently bad. Garbage in, garbage out, remember?
>>
https://wccftech.com/nvidia-geforce-rtx-5090-128-gb-memory-gpu-for-ai-price-13200-usd/
>NVIDIA GeForce RTX 5090 128 GB GPU Spotted: Custom Memory, Designed For AI Workloads & Priced At $13,200 Per Piece
damn
>>
>>106514258
>You keep telling us that we need to train on massive amounts of data so that it learns how humans actually talk. We've established that. We agree on that.
You need to understand that 2 GB (like the example you provided earlier) is not even remotely "massive amounts of data"
>>
>>106514143
You seem to be fundamentally misunderstanding how parameters work, it's not quite like a zip file, you don't really waste parameters by having more data seen during training, you just reinforce some concepts more than others.
>>
>>106514270
damn near creamed myself mid-sentence until I saw the price tag
>>
>>106514276
In text form that's an absurd amount of data. We're not talking about other file formats that can balloon the size, like images, videos, or irrelevant site metadata. It is purely text and nothing else. Stories and nothing else. The only extra data it has is the JSONL formatting where an entire story is shoved into a "stories" key followed by the brackets.
>>
>>106514258
>You seem to be under the impression ANY kind of filtering or data QC is inherently bad
Yes, that is my point. https://arxiv.org/pdf/2505.04741
>>
File: deepseek nigger life.png (118 KB, 624x354)
>>106514270
>$13,200 Per Piece
WHOOPS it just went up to $15,000 due to the GOYIM tax.
>>
File: WanVideo2_2_I2V_03899.mp4 (1 MB, 640x480)
>>106512310
Miku watch out!!!
>>
>>106514270
>still can't run even iq1s of r1
>>
>>106514051
Bro, you can do long generation with any TTS. You just need to segment your sentences properly when sending them to the TTS engine
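
Minimal sketch of that segmentation step; the TTS call at the end is hypothetical, plug in whatever engine you use:

import re

def segment_for_tts(text, max_chars=250):
    """Split long text into sentence-bounded chunks the TTS engine can handle one at a time."""
    # naive sentence split on . ! ? followed by whitespace; good enough as a sketch
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for s in sentences:
        if current and len(current) + len(s) + 1 > max_chars:
            chunks.append(current)
            current = s
        else:
            current = (current + " " + s).strip()
    if current:
        chunks.append(current)
    return chunks

# hypothetical usage: generate each chunk separately, then concatenate the audio
# audio_parts = [tts_engine.generate(chunk) for chunk in segment_for_tts(long_script)]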
>>
File: hs.png (412 KB, 3000x1800)
In the FinePDF card they have a graph with general benchmark scores marked every billion tokens. Interestingly, at 1B tokens just training on the PDFs gave similar scores to the web data (FineWeb) trained for twice the amount of tokens. The gap narrows immediately after that, then widens back to about a factor of 2 later on.

https://huggingface.co/datasets/HuggingFaceFW/finepdfs

Even with a small amount of training tokens for a model pretrained from scratch, the data makes a ton of difference. It wouldn't be surprising if with very specialized data you'd get a better model with considerably fewer tokens than normal, in your field of interest.
>>
>>106514377
>in your field of interest.
One problem being that RP isn't just one narrow field, every other anon expects something different from their RP, some want modern stuff, some fantasy, some anime/light novel like...
>>
File: Screenshot.png (35 KB, 653x205)
Very interesting and brave take.
>>
File: broken womb .png (348 KB, 788x742)
>>106514400
my take is guys like him should choke on their onions and die
>>
>>106514415
holy slope
>>
>>106514285
Are you under the impression that I think the models are actually the entire internet compressed into a file?
>>
>>106514051
ComfyUI nodes in particular are not properly implemented.
>>
>>106514217
Assuming it is pure text and nothing else, how many years worth of chat logs? Were these particularly active servers? (Maybe you should train a model off of those and see what happens).
>>
>>106514299
So all the unfiltered shit is good too? There's a difference between data that's simply lower quality than the good stuff and data that is not even worth using.
>>
File: FineVision.png (164 KB, 654x650)
>>106514377
And in their FineVision release tweet they state outright that removing data lowered performance. https://xcancel.com/andimarafioti/status/1963610135328104945
>>
>>106514325
spooky/10
Did Wan also do the static effects or did you edit that in?
>>
just a heads up. /ldg/ schizos claiming comfyui collects user data are actually correct. the new login system pings Google services even if you don't use it. testing is underway. use a different UI if possible
>>
>>106514449
>So all the unfiltered shit is good too?
Yes.
>data that is not even worth using
that line of thought is why we are where we are right now.
>>
>>106514462
>the clip ends with white static noise and glitch
>>
>>106514299
But how do you explain the models, even models that have been specifically fine-tuned for RP, having the "shivering down my spine" nonsense? If any form of filtering or QC is inherently bad (I don't know how you can say this out loud and not realize how nonsensical it is) then how do you propose we get rid of gpt-ism slop responses in models? No, "You're just prompting it wrong You're just system prompting wrong" is not the right answer.
>>
>>106514465
I haven't seen any connections going anywhere...
Looking at your typing, you are one of the real schizos trying to stir shit up again.
>>
>>106514475
By having more data, like you wanted, to drown out the slop while still having the model know what slop is.
>>
>>106514465
link a file that does that?
>>
>>106514479
fuck off Chinese shill

>>106514488
>>106513947
>>
>>106514457
Likely because the highest quality data is less varied where it matters. It's as if you wanted the model to learn conversations just from FineWeb-Edu documents above 0.99 language score.
>>
>>106514486
Or you could just remove the slop entirely, but I guess you will misunderstand what I said as "just filter out everything". You have to find a balance between the amount of data you're using versus using way too fucking much. You're saying that if you have a massive amount of data, then the sheer quantity of "good" data will outweigh the "bad" slop. Why not just carefully omit the bad data and only include the shit you absolutely need?
>>
>>106514493
What do you mean?
>>
>>106514505
comfyui is getting exposed as a Chinese scam to launder money into shanghai
>>
>>106514493
>Making a stink about this on their github would probably turn their community against us,
How did he come to that conclusion?
>>
File: naked pepefrog.png (232 KB, 655x599)
I'm getting better results with 4k context than 32k. Do home gamer LLMs just not do well with large context?
>>
>>106514400
That guy has no idea what he is talking about.

>>106514258
This guy has no idea what he is talking about.
Stop posting.
>>
>>106514517
99% of users are cock garbling redditors that can see no wrong
>>
>>106514512
?
>>
>>106514512
Will the money going into shanghai finance building the cheap high vram gpus?
>>
File: IMG_8542.jpg (2.12 MB, 4032x3024)
Ah, there they all are.
>>
>>106514440
About thirty years of logs from an active niche community. I don't have enough compute to train a big enough model to make it worthwhile. They're not English.
>>
>>106514504
>Why not just carefully omit the bad data and only include shit you absolutely need?
Because it's impossible to agree on what IS bad data. One of the reasons our current models are so slopped is because they only kept the kind of stuff they considered good, which aligns with purple prose bs.
>>
>>106514531
no it goes into buying labubus
>>
>>106514519
https://github.com/adobe-research/NoLiMa
Big context is an illusion
>>
>>106514562
Not an illusion, an outright marketing lie by model makers. We should call them out on their blatant lies when we can.
>>
>>106513740
>$1k monthly on electricity costs
Do you live in Germany or something.
>>
>>106514562
Update to the leaderboard when? This is why research sucks. Limited budget and care. Meanwhile you've got people like the UGI guy that's really dedicated but the benchmark frankly could use a bit more statistical rigor.
>>
>>106514556
So if we wanted to attempt to pre-train our own /lmg/-approved base model, how would we even define what is considered "good" and "bad" data? (Again, without using the entire internet.)
>>
>>106514556
>>106514592
Also, I probably should have clarified this earlier: I'm referring to pre-training an RP-focused model, not a general purpose model. If you're trying to pre-train a general purpose "genius" model like Claude or DeepSeek, then yeah, you probably DO need several hundred gigabytes if not a terabyte or two of data. Perhaps even hundreds. But if it's hyper-focused in terms of functionality, you absolutely do not need THE WHOLE INTERNET.
>>
>>106514592
One of the things I'm trying to say is exactly that that's an impossible task, anons would never manage to agree on what would go in, in what quantity and tons of other disagreement points.
>>
>>106514579
4 GPUs that cost me $100 per month to keep running. Unless your electricity is free, 32 fucking GPUs is going to cost nearly $1k.
>>
>>106514607
>RP-focused model,
And again, and again, and again: RP isn't narrow enough a use case that you can do what you think; it's nowhere near as narrow as code or math.
>>
>>106514580
Time spent updating old projects would be better spent working on the next paper.
>>
File: 1737617803574376.png (62 KB, 774x689)
>>106514562
So this proves that reasoning is a patch for attention
>>
>>106514325
I am unsure of this Aimaina Miku's validity.
>>
>>106514457
I wonder if we can finally have proper nsfw captioning.
>>
>>106514562
Nta. How accurate that graph of his is depends on the amount of context the inference engine he used actually allowed. Ollama, for example, lets you use models that are advertised as having a 128K context window, but by default it sets the KV cache to only allow 4096 so that it doesn't cause consumer GPU rigs to explode via OOM. vLLM pretty much requires that you set the effective context window length manually, or else if you try to use a model that has a giant context window and you don't have enough VRAM, vLLM doesn't know that, so if you have a shitbox it will crash.
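
For anyone who got bitten by this, roughly what the fix looks like on both engines (model names are placeholders):

import requests

# Ollama: the advertised 128K window means nothing unless you raise num_ctx per request
# (or in the Modelfile); "some-128k-model" is a placeholder.
resp = requests.post(
    "http://127.0.0.1:11434/api/generate",
    json={
        "model": "some-128k-model",
        "prompt": "Summarize the following document: ...",
        "stream": False,
        "options": {"num_ctx": 32768},   # actually allocate 32K of context
    },
    timeout=600,
)
print(resp.json()["response"])

# vLLM is the opposite: it allocates the full advertised window up front, so on a
# smaller GPU you cap it explicitly, e.g.:
#   vllm serve some-128k-model --max-model-len 32768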
>>
>>106514562
>NoLiMa
> Long-Context Evaluation Beyond Literal Matching
I don't get this acronym. Shouldn't it be LCEBLM?
>>
>>106514608
I don't think that means we should just accept that training on the entire internet is an efficient way to make general purpose models (read: general purpose). I guess we can agree that aggressive filtering done by corporations makes the models shittier (by our standards)
>>
>>106514653
That's crazy dude! I think you should send a PR to the Nolima guys, maybe they don't know!
>>
>>106514661
NoLiteralMatching?
>>
>>106514622
>Coding is narrow
>Most decent coding models need to be in the double-digit parameter range at minimum in order to actually be usable

Wat?
>>
>>106514671
Why do you even have such a hard-on of hatred for the entire internet as a training corpus? Do you have some stuff on there you're scared of the models learning or some shit?
>>
9pm mc donalds feast
u guys want some
>>
>>106514678
I'm saying RP is less narrow than coding, not that coding is super narrow in itself. Just the fact RP is even less so.
>>
>>106514672
You'd be surprised
>>
>>106514562
you guys really trust research coming from Adobe of all places?
>>
>>106514549
Hot glue
>>
>>106514143
bad quality data in this context is not ah ah mistress stuff but typos, all-caps, 403 forbidden pages, "you can put glue on pizza", and other noise.
>>
>>106514634
Yeah, opening a script takes so much time. It's more about money, and caring to do a few clicks.
>>
>>106514693
What's a "mc donalds"? That must have been bad data so I don't know.
>>
>>106514684
If it is possible to cut down on the amount of training resources required to pre-train these models, then that's a worthwhile thing to pursue.
>>
>>106514684
I get the impression that he can only run small models, so he's grasping at hope that by filtering the dataset he can have his perfect model in some 8B that is cheap and quick to train so someone will do it for him
>>
>>106514622
We'll narrow it down to her level. We can start small.
>>
>>106514702
Hence why I keep saying that pruning out SOME data, instead of just saying "fuck it we ball, train on everything that has ever existed", is something at least worth considering. You and the other guy keep saying that ANY form of data QC is a sin punishable by the guillotine.
>>
>>106514700
Do you have any idea how hard it is to clean skeet off of fabric that cannot be machine washed?
>>
>>106514710
That would mainly benefit the big corps, though, since they'd spend even less on even more filtered models and use this idea as justification.
>>
>>106514562
If models do better at low context, how do dev tools work? I regularly feed files to my chat that are like 20kb+ alone. I assume they must be doing some chunking and summarization, but wouldn't that leave it missing details of my code? Or do they just accept that high context is needed and the results may be shit?
>>
>>106514725
And it could benefit US, because we wouldn't HAVE to use THEIR shit if pre-training on a considerably smaller amount of data to make a coherent model is possible. I don't give a shit whether or not corporations benefit.
>>
>>106514740
>And it could benefit US, because we wouldn't HAVE to use THEIR shit if pre-training
Where are any models pre-trained by a non corpo since the llama2 era?
>>
>>106514750
Multiple people here have bothered to actually try. It not being popular on HF doesn't mean it doesn't exist or isn't worth doing.
>>
>>106514713
I'm sure the perfect RP-focused model is only 4B away, we just need to trust and for other anons to pay for training it.
>>
>>106514750
https://github.com/jzhang38/TinyLlama
>>
>>106514702
Actually, it's good to have some data with typos in the training dataset, as it gives the model some context to deal with typos in prompts.
You just want to make sure there aren't enough typos in the dataset that the model itself starts making typos.
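
A crude sketch of injecting that noise at a controlled rate (the rate and noise types are picked arbitrarily), applied to user-side text only:

import random

def add_typos(text, rate=0.02):
    """Inject character-level noise at a low, controlled rate so the model sees messy
    inputs without being trained to produce typos itself."""
    out = []
    for ch in text:
        if ch.isalpha() and random.random() < rate:
            kind = random.random()
            if kind < 0.4:
                continue                                                 # drop the character
            elif kind < 0.8:
                out.append(random.choice("abcdefghijklmnopqrstuvwxyz"))  # substitute
            else:
                out.append(ch + ch)                                      # duplicate
        else:
            out.append(ch)
    return "".join(out)

# applied only to user-turn text in the dataset, never to the assistant responses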
>>
>>106514768
No. That's bloat.
>>
>>106514762
Seriously, I don't get him. This is the same flavor of cope as bitnet, except quantization was replaced with filtering.
>>
>>106514614
I'm cpumaxxing (granted, in a super cheap electricity locale) and I'm hitting (5 person household) $250 dollarydoos/mo mid-summer with A/C cranked.
Would a busload of power-limited mi50 in a trash-tier CPU mining-rig really be that much worse?
>>
File: 1740858469742991.jpg (999 KB, 2446x2445)
999 KB
999 KB JPG
>decide to check on ipex-llm to see if they finally updated to support latest models like gpt-oss
>still no releases since april
Buy Intel they said. It would be great they said. It's so much cheaper they said.
>>
>>106512347
Yeah things are slowing down
Models need 5X parameters for 15% performance boost (according to their own benchmarks)
>>
>>106514768
Couldn't that be mitigated by simply ensuring that during the SFT instruct tuning phase none of the "assistant" responses have any typos?
>>
>>106514678
In RP, arbitrary amounts of code can come up. It's a superset. Basically anything in the world can come up in RP or writing stories in general. Some people like writing hard scifi. Others want to RP with math kittens. Others want to discuss rare stamps with their stamp collector gf. Others want to play 3rd edition MtG. If there is any topic your model cannot handle, it's not suitable for RP.
>>
>>106514823
does it work with vulkan at least?
>>
>>106514901
yes, but i found llama.cpp with vulkan to have awful performance
>>
>>106514888
I think you were the guy that suggested that you need both RP AND common sense data, like shit from science textbooks, in order for it to learn proper common sense and logic. I think the disagreement comes from how MUCH data is needed.
>>
>>106514917
Nope. I think you need the whole fucking web, plus books, RP, everything, as many trillions of tokens as you can get.
>>
>>106514917
How much data do you think is needed to cover every possible RP topic? How about maybe the entire internet, that sounds about enough.
>>
>>106514932
Yeah. The idea that this data is "bloat" is just a massive misconception. It all goes into building a better world model.
Thinking that a 4B model "without the bloat" could possibly be enough for good RP is just a massive cope. Less data makes models worse in the general case. If you keep training a 4B model on more and more diverse data, it would get better and better. That's just the basic scaling laws from the GPT-3/Chinchilla era before people started filtering everything to shit. But of course it's still only 4B, so it'll be garbage anyways.
>>
>>106514927
Ehhh... We can agree to disagree on that. I don't think merely two gigabytes of text is enough if you want the thing to both know how to RP and have common sense and good temporal coherence like this guy alludes to >>106514888, but the entire internet being a hard requirement doesn't sound like a good use of resources. Haven't people already demonstrated that you can create these models on way less data? (Not only two gigs obviously but way less than the entire internet)

>>106514932
Probably more than 2 GB but again, not the entire goddamn internet. Ensuring that your data has a diverse set of topics and story types would help a lot, along with having the common sense / science portion as well. I understand why you'd think a mere 2 GB would not make it GOOD at RP and it will probably suck at having any form of coherent logic, but you also haven't explained why the entire internet is a necessity. There should be an in-between point.
>>
Almost all improvement in LLM sphere came from more params, bigger datasets and longer training, and as soon as corpos started curating their inputs we entered the benchslop era.
I am curious what an erp benchmaxxed model would look like, but I think https://huggingface.co/TheDrummer/Gemmasutra-Mini-2B-v1 comes pretty close.
>>
>>106514961
>Less data makes models worse in the general case.
Guys think of the TRIVIA. What will we do without our precious trivia?

Jokes aside, yes you need a lot of data. Not the entire internet
>>
>>106514968
>Haven't people already demonstrated that you can create these models on way less data?
Not if you want the model to actually be good at anything, I assume you don't use Phi as your daily model? Yet it's so lean and optimized.
>>
>106514979
>Not the entire internet
>106514968
>not the entire goddamn internet.
I honestly think there's some kind of psyop being ran on the thread.
>>
>>106514961
>If you keep training a 4B model on more and more diverse data
So the size of the data set directly correlates to how diverse it is? Isn't it possible to have a data set that's only like 100 gigs in size that potentially has more variety than a data set twice its size? I don't think "bigger number = better" is the right line of thinking
>>
>>106514982
That's a general purpose model though, not something hyper specific or specialized.
>>
>>106514970
Funny how the general consensus of this general was that doing that made the models worse. Why did the sentiment suddenly flip?
>>
>>106514982
Maybe Phi wouldn't be so bad if it wasn't so safe.
Seeing it burn my precious tokens thinking if my prompt aligns with their policy or we should refuse gave me psychological trauma and Microsoft must compensate me financially.
>>
>>106515015
I see, my apologies you're absolutely right! I will forward you all the money you need and the engineers to train your model first thing on Monday.
>>
>>106514968
>We can agree to disagree on that.
I'm both of those guys you quoted in the first section. And we can't because you are simply wrong.
>Haven't people already demonstrated that you can create these models on way less data? (Not only two gigs obviously but way less than the entire internet)
I think that some filtering is warranted. You don't want spam generated with markov chains. You don't want languages other than English (unless you do). You don't want AI slop (so only use old data). "Limited data models" like the Phi series are just garbage for RP, because they don't develop a good general world model.
>>
>>106514979
If the model doesn't recognize obscure characters I like and their settings, it's shit, sorry.
>>
>>106515029
>You don't want languages other than English (unless you do).
Fuck off I need my JP weebslop in there. Tired of models failing MSGKbench.
>>
Hey guys I have an idea, tell me if it is fucking retarded or if it might have some merit.

So I have a literotica account that I used to have thousands of stories rated.

I'm thinking of downloading all the rated stories and their rating, making a dataset out of it.

And then I train an adversarial network to read text and predict what my rating of the text will be using that dataset.

Then when it is trained to rank stories I like, I put it in a Reinforcement Learning setup where an LLM generates text and the adversarial network predicts the rating of the text, with the goal of getting the highest rating possible. Then every X rounds I go and check the output and give it my actual rating, and the adversarial network will be punished if its predicted rating deviated too much from my actual one.
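What I'm describing is basically a reward model + RL loop rather than a strict GAN. A minimal sketch of the rating-predictor half, assuming a hypothetical ratings.csv dump with "text" and "rating" columns and plain sklearn (a real setup would fine-tune a small transformer instead):

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

df = pd.read_csv("ratings.csv")  # placeholder: your scraped stories plus the ratings you gave them
train, test = train_test_split(df, test_size=0.1, random_state=0)

# tf-idf + ridge regression as a cheap stand-in for a learned "taste" model
predictor = make_pipeline(
    TfidfVectorizer(max_features=50000, ngram_range=(1, 2)),
    Ridge(alpha=1.0),
)
predictor.fit(train["text"], train["rating"])
print("held-out MAE:", abs(predictor.predict(test["text"]) - test["rating"]).mean())

# RL stage: reward = predictor.predict([generated_text])[0]; every X rounds, refit on
# the fresh hand ratings so the proxy doesn't drift away from my actual taste.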
>>
>>106514970
Pre-slop era Llama 1 was only trained on 1/1.4T tokens, Llama 2 on 2T tokens: 1/10 of the GPUs and 1/10 of the data of later models.
>>
>>106515036
Just RAG your character? It's that shrimple isn't it.
>>
>>106514607
you want the model to see a high diversity of data or it will get bored and just start memorizing specific slop phrases. I have personally trained my own 1.5b model on over 5b (unique) tokens of smut to come to this determination. you absolutely will never find a high enough diversity in such a narrow domain. you need an incredibly broad dataset that constantly challenges the model rather than simply reinforcing it.
>>
>>106515001
>So the size of the data set directly correlates to how diverse it is?
Yes.
>Isn't it possible to have a data set that's only like 100 gigs in size that potentially has more variety than a data set twice its size?
>twice its size
Yes, that's possible. Ten times the size? You'd have to fuck up hard. 100GB of text is 25B tokens, it's basically nothing.
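(back of the envelope: English text is roughly 4 bytes per token, so 100e9 bytes / 4 ≈ 25B tokens; for scale, even ancient Llama 1 saw ~1-1.4T and current pretrains are in the 10T+ range)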
>>
>>106515015
The point is that there is no more general task than RP and writing stories. Your model has to understand everything, because everything can come up in RP/stories. It's not a small domain.
>>
>>106515029
Sir do you know what "agree to disagree" means? I've acknowledged that neither of us are going to see each other's way. Is the fact that I don't agree with your sentiment such an offensive sin?
>>
>>106515061
Just focus? Skill issue LMAO.
>>
>>106515063
It means you are wrong.
>>
https://vocaroo.com/1mpd6FwZaOM8

Is vibevoice peak? This sounds fucking great.
>>
>>106515061
it won't, it will just latch on to the tropes that it can get its easy wins from
>>
>>106515051
RAG is garbage and doesn't work.

>>106515037
>unless you do

>>106515079
If your model is 4B, it definitely will. There's a reason we need lots of data and also huge models.
>>
>>106515089
Not the entire internet though.
>>
>>106515061
>The point is that there is no more general task than RP and writing stories. Your model has to understand everything, because everything can come up in RP/stories. It's not a small domain.
By a giga autist's standards then, I guess I see how that makes sense. You have incredibly high standards for the RP. But the thing is, most people do not write anywhere near that level of high quality while also having the kind of uncensored scenarios corporations are afraid of. Shit scraped from AO3 or Wattpad will have a diverse set of scenarios, but they probably aren't taking the ambient temperature of the room they're in into account in order to determine the exact amount of time it took for someone's nipples to get hard, or taking someone's inferred medical history into account when determining exactly how long it would take for anon to bust and under what circumstances. Most people do not think about that shit at all. You could solve this by training on stories that are "higher quality" (fiction or nonfiction novels that actually go through a publishing agency and thus get actual QC), but then if it's only trained on that you get a model that will be perceived as having too much flowery language or purple prose and won't have the ability to generate or go along with the fucked up scenarios anons here would love for it to do. Claiming that it needs to have a perfect understanding of how everything ever works in order to be good at RP (by your standards) is a giant stretch.


>"YOU'RE WRONG"

ok. Now what?
>>
>>106515095
Ideally the entire internet (without garbage spam), but I know that nobody will train on that. Fuck. There is so much good info in old web crawl from before the web turned into garbage that will never get used, it's so sad.
>>
>>106515068
You must have been fun at parties and had many friends.
>>
>>106515095
maybe not the entire internet but it needs to be of the same scale and diversity. the internet is just the most obvious and readily available source.
>>
>>106515105
>You have incredibly high standards for the RP
Isn't the whole point of your idea to make a better RP model than what we have now???
>>
>>106515112
>(without garbage spam),
But anon QC of any kind is bad remember?
>>
>>106515117
>maybe not the entire internet but it needs to be of the same scale and diversity.
Scale? Debatable. Diversity? Absolutely.
>>
>>106515119
No, we just need to lower the bar as much as possible for focused RP on 4B params or less.
>>
>>106515112
I'm using fineweb's 2013 subset on my next model to see what happens. I do wish we had even earlier internet crawls available.
>>
>>106515049
That's just a Llama problem
>>
File: fineweb-recipe.png (218 KB, 1786x672)
218 KB
218 KB PNG
>>106515153
>fineweb
>filter filter filter
>>
How do you format OOC comments? Do you do a newline after the dialogue and then "OOC:" or do you put it in parentheses or brackets? Do you use a colon?
>>
>>106515071
>Is vibevoice peak?
it is, that's why Microsoft didn't want the goyims to get that kino (but they somehow released it without lobotomizing it lol)
>>
>>106515113
I'm just trying to help you not waste time and compute on something that will turn out bad. It's sad to see energy get wasted on doomed projects.

>>106515105
I'm not talking about retarded shit like that:
>taking The ambient temperature of the room they're in into account in order to determine the exact amount of time it took for someone's nipples to get hard
I'm talking about things like this:
>>106514888
They are just topics. If you train on novels, your characters probably will have no idea about even the most famous MtG cards and rules because mentioning them in a novel is a copyright violation. Fanfics will help of course, but they won't help when you want your math kitten to write you a proof, or when you want to discuss the code you worked on at work with your wife.

>>106515123
No. Surprisingly filtering out "the a congo sex the the a congo congo nigeria vagina pussy pussy the the" documents is not bad for your model.
>>
File: 1725949242892983.png (732 KB, 1842x178)
732 KB
732 KB PNG
>>106515119
The initial test (I've already shown this and confirmed this to be the case) was to see if "uncucking" models is actually possible with further training. We've confirmed that is absolutely possible. Main reason I even bother trying is because many people here were adamant that once you safety tune a model enough, no amount of fine tuning can possibly erode away the guard rails.

What I'm arguing NOW is that training on the entirety of the internet is extremely inefficient. If it is possible to fine-tune a decent model with significantly less data than the entire internet, then that theoretically could mean you could have better models at lower parameters .... Keyword, theoretical. I'm not claiming that's actually the case currently.

>Isn't the whole point of your idea to make a better RP model than what we have now???

That's not necessarily what I've been arguing for the past hour or so. I'm talking about training scale, not whether or not we can make the models better. If you're referring to making the model less prone to refuse certain things and less likely to produce flowery, advertiser-friendly trash, then doing that via training is trivial. Pic rel is from a fine-tuned llama model. The fine-tuned model produced this while the safety-cucked version it's based off of either refused entirely or was extremely dodgy.
>>
>>106515165
>Applied URL filtering using a blocklist to remove adult content
>Applied a fastText language classifier to keep only English text with a score ≥ 0.65
yeees.
>>
>>106515153
CommonCrawl should have data from 2007. You just have to do language/spam filtering yourself.
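For anyone who actually wants to try, a rough single-pass sketch (warcio plus fastText's public lid.176.bin language-ID model; the WARC filename and cutoffs are placeholders, and a real pipeline would use proper text extraction and dedup instead of the regex tag strip):

import re
import fasttext  # pip install fasttext; lid.176.bin from the fastText site
from warcio.archiveiterator import ArchiveIterator  # pip install warcio

lid = fasttext.load_model("lid.176.bin")
TAGS = re.compile(rb"<[^>]+>")

def crude_text(html_bytes):
    # placeholder extraction; good enough to decide keep/drop, not for the final corpus
    return TAGS.sub(b" ", html_bytes).decode("utf-8", errors="ignore")

with open("segment-00000.warc.gz", "rb") as f:  # placeholder: one WARC file from an old crawl
    for record in ArchiveIterator(f):
        if record.rec_type != "response":
            continue
        text = crude_text(record.content_stream().read())
        words = text.split()
        if len(words) < 200:  # drop stubs and boilerplate-only pages
            continue
        labels, probs = lid.predict(" ".join(words[:400]))
        if labels[0] == "__label__en" and probs[0] >= 0.65:  # same threshold fineweb uses
            print(record.rec_headers.get_header("WARC-Target-URI"))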
>>
>>106515071
>>106515170
does it take voice files or is it strictly those demo voices? i don't see any HF spaces.

man last TTS i used was zonos, back in january i think.
>>
>>106514888
>>106515177
So what I'm getting from this is that you want a model that is good at role-playing about.... Programming?
>>
>>106515193
Yes, it takes wav files for voice cloning. Works fine with 10-40s or so.
>>
>>106515181
>better models
IMO it wouldn't be better if it doesn't have what you consider the internet bloat.
>>
>>106515165
I don't care, the majority of the training tokens are going to be ao3 anyway. I just need something a bit more noisy in the background to keep it learning and hopefully improve generalization
>>
>>106515197
I want it to be good at roleplay about anything I fucking want at a moment's notice, which includes programming or whatever else I enjoy.
>>
>>106515181
>If it is possible to fine-tune a decent model with significantly less data than the entire internet
Earlier you were talking about pretraining.
>>
>>106515207
BLOAT.
>>
>>106514823
>he didn't buy nvidia
lmao
>>
What if we trained a 0.1B on Nala test and nothing else?
>>
>>106515181
so you want a side grade at best to what we currently have, but with no trivia knowledge, which is something anons frequently complain about, to potentially lower the parameter count a bit? that sounds like an awful tradeoff.
>>
>>106515199
cool, i noticed there's a comfyui node setup for it. guess ill give that a go in a bit

https://github.com/wildminder/ComfyUI-VibeVoice?tab=readme-ov-file

>>106515221
peak the likes of which the world is not ready for (neither is my dick)
>>
>>106515193
I used a sample voice of one of the bitches from Class of '09. The included sample voices are ok though.

Included "alice" sample:
https://voca.ro/19VRhqX2fmcc

my shitty sample:
https://voca.ro/1hoVRSBntjxO

I really like how it handles quotes and speaks them in another 'tone' sometimes.
>>
File: 107991372.png (67 KB, 460x460)
67 KB
67 KB PNG
https://huggingface.co/unsloth/grok-2-GGUF
How many reuploads will it take to get a working version?
>>
>>106515207
>Use an intelligent model that is already pre-trained on programming
>Further fine tune it on a SFT roleplay data set with a variety of different scenarios
>????
>Profit


What it sounds like to me is that you want a general purpose model, which we already have in spades.
>>
>>106515226
The trade off is being able to run the model on a local machine versus a bloated model filled with useless shit that you need to offload to use.
>>
>>106515258
I'm surprised people even want to use elon's garbage. He tried pushing grok-code-fast a week ago too on a lot of providers and it was garbage
>>
>>106515246
thats fucking cuhrayzee holy SHIT. nice.
also that girl's a cutie, would you be willing to post the sample you use?
out of context the script you use is completely schizophrenic but i love it, got a good few laughs out of me the way her voice enunciates/exaggerates sometimes.
>>
>>106515264
So this entire retarded argument was in fact just poor cope as some had theorized, thanks for wasting the collective thread's time.
>>
>>106515260
Yes, exactly, that's what I'm saying. General purpose models are the only suitable models for RP.
>>
>>106515260
>general purpose model which we already hav e in spades.
And they're all shit because they're already too filtered.
>>
>>106515236
comfyui is full of telemetry now so we really need a new UI for vv
>>
>>106515280
Now the question is, is it possible to make GOOD general purpose models on less than an internet's worth of data while being decent? I'm assuming your answer to that is that's not possible.

>>106515289
>What is sft training
>>
File: file.png (114 KB, 580x491)
114 KB
114 KB PNG
>2M context
kek
>>
File: local assistant.png (416 KB, 798x1128)
416 KB
416 KB PNG
/lmg/ btfo
https://www.reddit.com/r/LocalLLaMA/comments/1nb0ern/fully_local_natural_speech_to_speech_on_iphone/
https://apps.apple.com/us/app/locally-ai-private-ai-chat/id6741426692
>>
https://voca.ro/1n5vlenAX1pf
>>
>>106515300
>Now the question is, is it possible to make GOOD general purpose models on less than an internet's worth of data while being decent? I'm assuming your answer to that is that's not possible.
That assumption is correct.
>>
>>106515304
The MNN app was better. This is just a redditor's cheap knock-off of the OpenAI app.
>>
>>106515300
>What is sft training
NOT a solution to filtered pre-train data, you cannot make it learn worth a shit after it was already lobotomized.
>>
>>106515209
Meant to say pre-train.

>which is something anons frequently complain
And something just as many anons claim doesn't matter.
>>
File: 1742752755504751.png (960 KB, 1816x276)
960 KB
960 KB PNG
>>106515317
???
>>
>>106515323
The "just rag it in bro" posters are not being serious.
>>
>>106515304
>I am here to answer quest-eons and to provide helpful re-sponses
>>
>>106515326
Congratulations on making the model say pussy. You won, that is totally what I meant.
>>
>>106515181
https://www.youtube.com/watch?v=LQCU36pkH7c
>>
>>106515333
With that assumption I could say the same thing about literally everything you said and vice versa.
>>
>>106515304
>ganyouhelpme
>>
>>106515341
Have you tried RAG?
>>
>>106515339
So you can agree "uncucking safety tuned models is impossible" is a nonsensical claim right?
>>
File: 1742596391245834.jpg (187 KB, 608x646)
187 KB
187 KB JPG
>>106515326
>It's so small, ..., almost like it was made of my cock
Your AI bot just called your cock small lmao
>>
File: chrome_6g1ierAshN.png (566 KB, 701x1255)
566 KB
566 KB PNG
>>106515304
White people tech.
>>
>>106515355
You're absolutely right, I don't even know why you're arguing with anons since clearly you can just do things and make the best model ever?
>>
Not all "safety tuned" models are the same. gp-toss is basically unsalvageable garbage. GLM without thinking and with a prefill will basically never refuse and it can write some fucked up shit.
>>
>>106515357
N....no it's NOT! It's perfectly reasonably sized my mom says so!

>>106515363
Why are you more upset that I don't agree with your hyper specific and autistic opinions? No one claimed they can make the best model ever.
>>
>>106515379
This really hits on the fundamentals of LLM safety, you're so smart for pointing this out!
>>
>>106515395
>Why are you more upset that I don't agree with your hyper specific and autistic opinions? No one claimed they can make the best model ever.
Assuming you mean me by "hyper specific and autistic", that post was not made by me.
>>
>>106515379
>GLM without thinking and with a prefill will basically never refuse and it can write some fucked up shit.
So does gpt-oss but the latter feels more creative than GLM. That model is too prone to repetition and it breaks down with context pretty fast. The honeymoon period didn't last long and I now use gpt-oss as my main model.
>>
File: 1748840868893020.png (1.77 MB, 1010x926)
1.77 MB
1.77 MB PNG
So I heard a little while back that the zucc wants to create "a personal superintelligence". Does he mean he wants all people to be able to use "super intelligent models"? What is his end goal here?
>>
>>106515437
Nta. So turning off "thinking" results in better quality and less refusals?
>>
>>106515439
he wants to create the perfect RAG
>>
>>106515439
All people with a Facebook account maybe, Meta is API first now thanks to Wang's wisdom.
>>
best model to be a kemoshota and get fucked by big bad wolves?
>>
>>106515459
that sounds like a lot of bloat knowledge.
>>
>>106515459
Well, I learned a new word today.
>>
>>106515463
bloat? thats essential knowledge
>>
>>106515459
Gemma 300M with RAG
>>
>>106515459
Male or female wolves?
>>
File: snort.gif (34 KB, 500x400)
34 KB
34 KB GIF
>>106515476
both
>>
>>106515459
real life you big faggot haha lol owned
>>
File: 1742276907743626.png (3 MB, 1862x5014)
3 MB
3 MB PNG
is this true?
>>
>>106515496
Fuck no, we can't do anything worthwhile, except for that one anon of course.
>>
>>106515496
it is, i was there.
>>106515491
i cant be a cute creature in real life anon kun...
>>
>>106515310
Seems like you increased the lethargy and dementia setting a little too high
>>
>>106515442
I like it more with thinking enabled. Usually what I have to do is grab one with a refusal and flip some of the words. After leaving one or two in context, it starts doing the thinking in an uncensored way. Then I like being able to edit the thinking part to customize the response.
>>
>>106515301
Holy slowpoke
>>
>>106515181
>If it is possible to fine-tune a decent model with significantly less data than the entire internet, then that theoretically could mean you could have better models at lower parameters
Huh? Training on less data would reduce training cost, but not parameters. If anything you'd need more parameters to reach the same level. More data makes a model more parameter-efficient, not less. You're confused, anon. People figured this out a long time ago when they moved past chinchilla scaling.
Current models ARE probably inefficient at RP because they're not being designed for it, but this doesn't mean you can skip out on training.
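For reference, the Chinchilla heuristic was roughly 20 training tokens per parameter, so a 4B model is "compute optimal" at only ~80B tokens; small models today (Llama 3 8B at ~15T tokens, i.e. nearly 2000 tokens per parameter) deliberately train far past that point precisely to pack more capability into fewer parameters.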
>>
File: 30474 - SoyBooru.png (118 KB, 337x390)
118 KB
118 KB PNG
More kiwis soon! (Qwen)
>>
>>106515496
I think these kinds of things should be documented. I'm pretty sure a lot of the stuff discovered back then is still not in any paper yet
>>
File: 1734560768546931.gif (3.12 MB, 498x370)
3.12 MB
3.12 MB GIF
>>106515561
>lot of the stuff discovered back then is still not in any paper yet
like what?
>>
>>106515545
Not the entire internet.
>>
>>106515291
It's open source. Show where.
>>
>>106515545
NTA, but since larger models memorize more, they're able to recall more of the rare information seen during pretraining. To some extent (it's not just that, admittedly), the stronger RP capabilities of those models are because of that. A smaller model pretrained primarily for the purpose of simulating human interactions, conversations and RP (instead of improving math benchmarks, etc) could potentially match the capabilities of larger models in that area.

Of course we'll never have that as long as we have anons who care about models doing the tsundere bubble sort in Erlang or proving the theory of relativity while giving you a blowjob as a mesugaki.
>>
>>106515612
I think the last few years have been pretty badly polluted by ai generated content, and all the political radicalization in the past decade or so doesn't help either. I think ideally you would use the entire pre-2012 internet.
>>
>>106515612
Yes, we'll filter out the spam and garbage.
>>
>>106515654
define garbage?
>>
>>106515654
You need to filter more than that to be efficient.
>>
>>106515545
This is correct. Bigger models are more sample efficient, so they can learn more from less data. In contrast, small models need more data to reach a given level of quality.
>>
>>106515663
"the a congo sex the the a congo congo nigeria vagina pussy pussy the the" kind of documents.
>>
Most posters itt are severely autistic.
>>
>>106515679
Yeah, can't believe anyone would argue for hours about how to train models instead of just doing it and shutting everyone up for good.
>>
>>106515679
https://vocaroo.com/12RPstjPnT74
>>
>>106512596
>you're a spoiled child if you expect AIniggerdevs to stop writing python slop code that requires command line manual installation in 2025
EXE. I want EXE. Where is the EXE.
>>
>>106515692
Sure, just download the entire internet, set up a filtering pipeline, then give me the money to pretrain a full model on it.
>>
>>106515692
It all started when I suggested that LLMs won't acquire significant "tacit knowledge" until they've seen large amounts of data, and that this could be expedited with targeted training data...
>>
>>106514698
Yes, it's been explained before, the context search isn't literal. Most NIAH tests involve having context like "John put some mayonnaise on his hamburger and hot dog." and then asking "What condiment did John put on his hamburger?". NoLiMa goes and asks something like "John got some french fries. What condiment(s) would he likely put on them?". That requires actual reasoning and connecting the dots across the context you have to extrapolate correctly, which is harder when you aren't, as said, literally matching what you have seen in the context to the question asking about it.
>>
>>106515692
https://vocaroo.com/1nWbsRIXibi3
>>
>>106515692
it takes time, anon
>>
>>106515679
I should hope so. Autistic people are how we get our best technological breakthroughs.
>>
File: nip.jpg (620 KB, 2569x1927)
620 KB
620 KB JPG
Why hasn't OpenHands finetuned a coding model on Qwen3-coder yet? why use Qwen2-coder
>>
>>106515719
Exactly. llama.cpp is popular because you can download it as an exe file and run, no pythonshit needed.
>>
>>106515741
god damn thats so good ladies and gentlemen the best tts out there
>>
>>106514823
>>decide to check on ipex-llm to see if they finally updated to support latest models like oss-gpt
They are all in on vLLM, and for good reason too, because of the enterprise side and project BattleMatrix. They do what they can with ipex-llm and contributions to llama.cpp but it is lower priority and neglected. Mainline llama.cpp SYCL isn't that bad, but you can see the neglect when a crashing bug was fixed in https://github.com/ggml-org/llama.cpp/pull/15582 but a mistake was made and it hasn't been followed up on, with two weeks and counting to get it merged in. Sad.
>>
gay thread. 80% of the population is newfags, 80% of the population is troons.
>>
>>106515769
>80% of the population is troons.
It's not quite that bad yet nonnie.
>>
>>106515766
>vLLM
Are they actually directly contributing to vLLM to have native ipex support or do you still have to go through ipex-llm to use vLLM with ipex?
>>
>>106515769
Saar! You forgot India mention! India AI superpower 2025 Gemini Google. Kindly say 80% posters Indian thank you saar.
>>
>>106515787
80% of population is indeed indian too.
>>
>>106515790
so 80% indian train gays? Waow!
>>
>>106515760
https://vocaroo.com/121za8zMgKiQ
>>
Will we soon look at "coding" the same way we look at "calculating"? Prior to calculators and computers, we used to have rooms of humans doing things like ballistics calculations.
Now you still need to know *math* to use a calculator or spreadsheet effectively to solve problems that span more than a single operation, but you don't need to do *arithmetic* any more.
tl;dr vibecoding normies are just monkeys banging on a calculator to get "8008135". There's higher-order knowledge needed to make software.
>>
>>106515803
shit, a few seconds in I knew I was going to contract orange man cancer
>>
>>106515784
https://docs.vllm.ai/en/latest/getting_started/installation/gpu.html
You have to build the wheels yourself but they are contributing as regularly as a company going bankrupt with limited resources is doing
https://github.com/vllm-project/vllm/commit/e599e2c65ee32abcc986733ab0a55becea158bb4
This is on par with their Pytorch cadence. This was the last SYCL related commit to llama.cpp in comparison and it wasn't even done by Intel.
https://github.com/ggml-org/llama.cpp/commit/8b696861364360770e9f61a3422d32941a477824
>>
>>106515803
But I can't use voice clips from my favorite anime voice actress
>>
>>106515802
Yes saar India love trains saar very prod of the country
>>
>>106515831
MAKE NEW THREAD BLOODY
>>
>>106515803
nice
>>
File: 1730545865388820.png (1.65 MB, 712x1188)
1.65 MB
1.65 MB PNG
>>106515719
>>106515759
If you get filtered by CLI you deserve suffering.
>>
>>106515679
I have never been diagnosed with autism.
>>
>>106515852
Fuck off pyshitter.
>>
>>106515855
Only a doctor can diagnose autism.
>>
>>106515803
what settings are you using? mine are all coming out completely schizophrenic.
>>
>>106515719
>>106515759
>>106502028
https://vocaroo.com/1j4yGPQKczdx
>>
>>106515877
30 steps. It depends on the voice sample too.
>>
>>106515852
Damn, Python looks like that?
I don't care that she's a bit slow, she's bloated in all the right places.
>>
>>106502028
Actually incredibly based opinion, but troons will disagree.
>>
Has anyone sussed out any best practices for vibevoice samples? I'm not sure yet if it's better to go for longer samples or to trim it down closer to the length of the audio you're trying to make.
>>
>>106502028
That is what kobold is for. And from kobold you can fall into a trap of oobashit or you can go straight to llamacpp. Once you have it set up it honestly isn't that bad. I don't even have a bat, just have the commands in a textfile and it werks. Even without bat it is actually faster than oobashit and kobold.
>>
>>106515896
does this shit need a longer sample? same 8 second sample i used with zonos is completely wonked out with default settings in the node setup, adding steps doesn't change it.
>>
>>106515437
>The honeymoon period didn't last long
this has been the entire history of the GLM models and only retards keep pushing them
>>
File: 1753483768624402.png (820 KB, 724x510)
820 KB
820 KB PNG
>>106515907
>>106501412
>SPEAK LIKE A HUMAN BEING YOU SYNTHETIC MONSTER
>>
>>106515437
>and I now use gpt-oss as my main model.
what?
>>
>>106515929
I used a 23s sample for that one.
>>
>>106515437
>I now use gpt-oss as my main model
Bro, you're supposed to turn off the model's thinking, not your own
>>
>>106515929
most of my samples are a minimum of 40 seconds but two minutes gives the best results
smallest one is the Mandy sample I used here >>106515879 at like 38 seconds, and I cleaned it to the best of my abilities but some of the background noises still bleed thru
>>
>>106515926
https://vocaroo.com/1bFeQGTMqTTf
>>
>>106515972
you never had any thinking when you thought glm was a good model bro
>>
>>106515496
>Holo prompt
Was ist das? Google returns garbage.
>>
>>106515963
noted. I guess it really didn't like the combination of my sample being kinda fast + only 8 seconds.
here's alucard reading this post >>106515787
(30 steps seems like the max it needs for a quality boost)
https://voca.ro/1i3Yya3rUVn6


>>106515985
cool thanks for the info, gonna be a challenge to get that character over 8 seconds but at least alucard had that 10+ seconds kek
>>
>>106515999
go to the link in the image and read the thread
>>
>>106515999(me)
ah fuck, should've scrolled down the image.
>>
>>106515830
Why not?
https://vocaroo.com/1beCnoUdgpID
>>
File: dipsyByzantine1.png (3.44 MB, 1024x1536)
3.44 MB
3.44 MB PNG
>>106515810
The real change will come when there are "vibecode" specific languages created.
It's only a matter of time.
>>
>>106516038
It's called English, r-tard.
>>
yeah my results are aaaalll over the place, but this turned out really nicely.

https://voca.ro/16gmTFt1O8vf
>>
>>106516038
isn't that just python?
>>
>>106516059
I found out that the ComfyUI implementation is all over the place. The Python demo is way more reliable and more consistent.
I don't know if it's because of Cumrag itself or if the implemented nodes are bad. I can only guess.
>>
>>106515907
>>106515719
https://voca.ro/1b5FwnOiykK6
>>
>>106515759
llama.c when
>>
>>106516038
Javascript already exists, anon. People have been vibecoding JS since before AI existed.
>>
>>106516059
I think it struggles with voices that have a very high dynamic frequency range like Peach there. It's difficult to get a sample for certain seiyuus where they aren't peaky like that, since that's part of the appeal.
>>
>>106516077
could ya link me the python demo? thanku.
glad to know i'm not alone.
>>
>>106516074
and javascript for when interfaces are needed
>>
>>106516080
kek
>>
>>106516080
I think I recognize that voice a little but can't quite place it, is it a cartoon girl bully?
>>
>>106516091
https://github.com/vibevoice-community/VibeVoice/
>>
>>106515803
make him do porn noises
>>
>>106514051
>the quality of the output starting to go downhill if I generate anything longer than 30 seconds.

Not true with the native implementation (clone Microsoft/VibeVoice)

I did not go over 4 min. However, it stays consistent all along
>>
>>106516106
It's the Witch from Slay The Princess.
>>
>>106516037
https://vocaroo.com/1iIp0ji2b59p
>>
>>106516091
>>106516108
Forgot that you'll need to look at inference_from_file.py
and do something like
>python demo/inference_from_file.py --model_path ./VibeVoice-1.5B --txt_path ./test.txt --speaker_names Faggot
>voices go into demo/voice and are named en-Faggot_male for example
>>
>>106516117
It sadly gets pretty rough even with that when you try to do a 40 minute script. It slowly starts getting worse.
>>
>>106516130
never played that game and checked her imdb, barely any roles and mostly online shit
weird
>>
amerilards cant into bakery
>>
>>106516108
>>106516140
thanks. i noticed there's two repos for this too, the wildminder one and one by a guy named Fabio Sarracino.

Might just uninstall wildminder's and risk getting AIDS from Fabio. Worth a shot.
>>
>>106515810
I've been thinking a lot about this old Twilight Zone episode that depicted future programmers as just people with microphones that speak to the machines.
There was also a time when compilers were looked down on. They saved time by letting you write in C, but oftentimes the result was not performant, didn't output valid Assembler, and you ended up having to write your own anyway. Nowadays, almost no one has to write hand rolled Assembler anymore, and attempting to do so outside a few niches would result in worse code than what the compiler is capable of writing.
The technology is still new. I'm sure the transition to higher-order knowledge work is inevitable, but it's probably still decades away.
>>
>>106516038
Tokens to represent ASM opcodes?
direct cpu token interpretation?
Token microcode?
>>
>>106516147
It's built on Qwen 2.5, so naturally it will start to degrade when you try to use the full context it claims to support. Just chunk it. At least you can do far bigger chunks than with other TTS.
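A dumb chunker is enough as a starting point (toy sketch; the 1500-word cap is arbitrary, tune it to wherever quality starts sagging for you):

def chunk_script(lines, max_words=1500):
    # split a long script into chunks of whole lines, each under the word cap,
    # so every chunk gets its own generation call with the same voice sample
    chunks, cur, count = [], [], 0
    for line in lines:
        n = len(line.split())
        if cur and count + n > max_words:
            chunks.append("\n".join(cur))
            cur, count = [], 0
        cur.append(line)
        count += n
    if cur:
        chunks.append("\n".join(cur))
    return chunks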
>>
>>106516038
Shouldn't it be sygma instead of capital C

naoshite kure
>>
>>106516109
https://vocaroo.com/1oL6yJxfeEIp
>>
>>106516198
Then we build a language on top of that.
We need more layers of abstraction.
>>
>>106516198
The vast majority of enterprise LOB apps and startup shovelware does not need to go that low level. The trend is always towards more abstractions, not less. If anything, a new language designed for use by LLMs would be an abstraction over Python.
>>
>>106516160
Yeah, as best I can tell, she's a pretty mid streamer with a ton of untapped voice acting talent. I think she only got the Slay The Princess role after sending something directly to the devs. I hope she goes out and gets more roles, because she knocked it out of the park with the one she got.
>>
>>106516218
L.O.L.
You made my weekend!
>>
>>106516218
After hearing some porn noise samples posted here, I can now conclude that rumors about VV being pulled due to NSFW usage are false.
>>
>>106516218
Lol
>>
>>106516147
what kind of "limit" is it? it is the github demo with gradio
Loaded example: 1p_Ch2EN.txt with 1 speakers
Loaded example: 1p_abs.txt with 1 speakers
Loaded example: 2p_goat.txt with 2 speakers
Loaded example: 2p_music.txt with 2 speakers
Loaded example: 2p_short.txt with 2 speakers
Loaded example: 2p_yayi.txt with 2 speakers
Loaded example: 3p_gpt5.txt with 3 speakers
Skipping 4p_climate_100min.txt: duration 100 minutes exceeds 15-minute limit
Skipping 4p_climate_45min.txt: duration 45 minutes exceeds 15-minute limit
Successfully loaded 7 example scripts
Launching demo on port 7860
>>
>>106516038
we tried a few times making programming languages that are close to natural language and easy for humans, but results were mediocre.
SQL is a particular disaster that will haunt us forever.
>>
>>106516275
Oh, so it's supposed to be used with shorter scripts. Is it only the 1.5B one that can do long scripts?
>>
>>106516218
lmaooooo, that's why I love this site, I know I'll find kino shit like that at some point
>>
>>106516288
The model card says 7B can do 45 minutes and 1.5B can do 90 minutes.
>>
>>106516309
It's still mostly intelligible at 45 minutes, but it does hurt your ears, so it's not a complete lie.
>>
>>106516288

hmmm...
>>
>>106516276
I think it'll be something far different, that probably won't make intuitive (human) sense. Basically human unreadable.
I get the point, can't train what you don't have examples of, and there's lots of python and JS to copy from.
But I expect there will be some intermediary language that LLMs (or whatever) can manipulate really easily, and humans won't be able to understand at all.
>>
https://voca.ro/1gZ6xankFzjP
>>
>>106516349
eh, just skip the middle man and train it on machine code directly at this point.
Chunk generated output into a VM, see if it works, reward/punish the model, repeat.
>>
>>106516368
>>106516368
>>106516368
>>
>>106516373
>eh, just skip the middle man and train it on machine code directly at this point.
Do you have any idea how many tokens it would need to spit out to do even the most trivial tasks?
>>
>>106515679
high functioning autist hobby that filters most people
your average joe wouldn't know how any of this works except thinking that chatgpt is some kind of magic word machine that pulls stuff out of thin air
>>
>>106516412
>high functioning autist forum that filters most people
ftfy


