/g/ - Technology


File: bakas.mp4 (2.98 MB, 1280x672)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>106504274 & >>106497597

►News
>(09/05) Klear-46B-A2.5B released: https://hf.co/collections/Kwai-Klear/klear10-68ba61398a0a4eb392ec6ab1
>(09/04) Kimi K2 update for agentic coding and 256K context: https://hf.co/moonshotai/Kimi-K2-Instruct-0905
>(09/04) Tencent's HunyuanWorld-Voyager for virtual world generation: https://hf.co/tencent/HunyuanWorld-Voyager
>(09/04) Google released a Gemma embedding model: https://hf.co/google/embeddinggemma-300m
>(09/04) Chatterbox added better multilingual support: https://hf.co/ResembleAI/chatterbox

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: Gvso7QcXAAAMCg6.jpg (386 KB, 2048x1536)
►Recent Highlights from the Previous Thread: >>106504274

--Paper: Why Language Models Hallucinate:
>106507149 >106507158 >106507186 >106507195 >106507590 >106507176
--RWKV model evaluation: architecture, performance, and deployment challenges:
>106506094 >106506112 >106506129 >106506145 >106506171 >106506185 >106506180 >106507523 >106509086 >106509525 >106509820 >106511189 >106511228
--VibeVoice voice synthesis effectiveness and parameter tuning:
>106508552 >106508596 >106508604 >106508831 >106508848 >106508987 >106509035 >106510630 >106511430 >106511486 >106511499 >106511507 >106511525 >106511581 >106511610 >106511620
--Tools for isolating vocals and reducing background noise:
>106506888
--Debate over relevance of new 3T token PDF dataset for improving LLMs:
>106510315 >106510342 >106510426 >106510436 >106510479 >106510505 >106510703 >106510736 >106510977 >106510347 >106510359 >106510393 >106510406 >106510418 >106510439 >106510348 >106511014
--Implementing VibeVoice TTS externally from ComfyUI-VibeVoice node:
>106505316 >106505422 >106505432 >106505527 >106505572 >106505596 >106505641 >106505673 >106510085
--RWKV model version release and development status update:
>106506232 >106510226 >106510238 >106510254 >106510264 >106510781
--Comparing VibeVoice ComfyUI implementations and workflow limitations:
>106506439 >106506554 >106506566 >106506591 >106506614 >106506597
--Challenges in implementing aLoRA for user-friendly model customization in llama.cpp:
>106507763 >106507800 >106507824 >106507835 >106507863 >106507903 >106507838 >106510966
--Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks:
>106511910
--1.5B model audio demo and sound source suggestion:
>106507288
--Miku (free space):
>106504832 >106505966 >106506177 >106506316 >106506402 >106507447 >106507512 >106507616

►Recent Highlight Posts from the Previous Thread: >>106504276

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
good morning sirs
>>
Is the fact that recent open-weight LLMs are closing the gap with SOTA closed models good or bad news? It implies SOTA stuff is slowing down despite the billions put into it.
>>
File: 1757092915337658.png (2.39 MB, 1024x1536)
>>
>>106512347
It's a cycle. Gemini3 will shake up the market again, followed by Meta's creation
>>
>>106512375
gemini 3 will be censored harder than anything else.
>>
>>106512363
Nice shading, dunno about the crotch button
>>
>>106512347
From now on the big advances in the open llm space are likely going to be speed and quanting optimizations. The intelligence is mostly diminishing returns at this point unless a new architecture different from transformers pops up and gains traction.
>>
>>106512423
who cares
the most obvious improvements will be on the only thing that matters....benchmarks!
>>
>>106512347
If SOTA is dying, that's a good thing. LLMs have peaked for porn and that's literally the only thing this technology has had a positive effect on.
>>
>>106512438
This Miku is actually a lamia, she doesn't have any legs where the button could get in the way.
>>
>>106512423
Don't you mean it'll be the safest model yet.
Though I don't see how you can top gpt-oss.
>>
>>106512517
There has been no public model yet trained from the ground up to properly model human relationships and conversations, besides possibly the first CAI (to an extent; it was mostly RP, chats and fanfictions). It's basically almost always filtered random internet sewage for pretraining, with no-specific-goal assistant tuning + safety tacked on top of the model, and recently with STEM/math/reasoning shoved in the middle of that.
>>
>>106512517
>LLMs have peaked for porn
We are far from anything like that. The models are shit at writing for a male audience, they are bad at finding interesting developments, taking initiative, etc.
>>
>>106512610
>>106512612
Yeah, but big corpos are never ever going to humanmaxx their models. The only improvements we've ever had in this area have been side effects of generalization, which they actively try to suppress.
>>
>>106512612
>they are bad at finding interesting developments, taking initiatives, etc.
Some of the more non-fried models can do fine with that stuff now. I just wonder if context degradation will ever be solved.
>>
>>106512610
this is mostly a dataset issue rather than architecture issue
>>
>>106512347
It just means open sauce LLMs are training on closed source LLMs' outputs. Both are using synthetic slop to the point there is barely any difference between them. It's a bad piece of news for everyone.
>>
>>106512323
https://vocaroo.com/125d9fIVnc6b
>>
File: 1726352425061491.gif (38 KB, 268x350)
Can the Sar who posted this https://litter.catbox.moe/rehari2tvedhwccm.wav please post the voice sample, I unironically like the voice
>>
File: 1727285063773119.png (1001 KB, 935x1084)
>>106512517
>LLMs have peaked for porn
LMAO, we don't even have local models like Sesame Labs Voice model with voice cloning

It's going to be soooo much better gooning to an LLM you can talk to, one that can moan and cry and plead for help and scream and make more human sounds, while also having the ability to replicate any voice you throw at it.
>>
>>106512768
Just record yourself
>>
>>106512798
>you can talk to
That seems infinitely worse imo.
>>
>>106512798
ryonashitters like you deserve to die
>>
>>106512768
It's in the demo voices, in-Samuel_man.wav.
>>
>>106512807
Maybe for you
It is undeniable that talking to something that can put audible emotion into its speech is better for intimacy than pure text
>>
This is why VibeVoice was removed: https://files.catbox.moe/i7sc6u.wav
>>
>>106512798
Sesamejeets are the reason why they all safety censor tts
>>
>>106512855
>grok clip of the guy flirting with Ani or whatever her name is in front of a mirror

>OH YOUR RUGGED BEARD
>OH YOUR RUGGED ASS
>OH YOUR RUGGED SHORTS

Yeah, thanks...
>>
>>106512871
heh
>>
>>106512307
Anybody else getting a segfault in Comfy using VibeVoice? I tried both
https://github.com/Enemyx-net/VibeVoice-ComfyUI
https://github.com/wildminder/ComfyUI-VibeVoice

And both gave me a segfault, with no output pointing to where to look:
[ComfyUI-VibeVoice] Loading model with dtype: torch.bfloat16 and attention: 'sdpa'
`torch_dtype` is deprecated! Use `dtype` instead!
Loading checkpoint shards: 33%| | 1/3 [00:01<00:03, 1.63s/it]FETCH ComfyRegistry Data: 30/96
Loading checkpoint shards: 67%| | 2/3 [00:03<00:01, 1.79s/it]FETCH ComfyRegistry Data: 35/96
Loading checkpoint shards: 100%|| 3/3 [00:05<00:00, 1.71s/it]
[ComfyUI-VibeVoice] Successfully configured model with sdpa attention
loaded completely 19184.8 5602.380922317505 True
[ComfyUI-VibeVoice] Resampling reference audio from 48000Hz to 24000Hz.
Generating (active: 1/1): 0%| | 0/534 [00:00<?, ?it/s]run.sh: line 2: 2919361 Segmentation fault (core dumped) python main.py --listen
venv ~/AI/ComfyUI40s
>>
>>106512307


>>106510426

>>106510342
>And what kind of data would be relevant to ERP?

Nta. You make the source data the kind of stories you want it to be good at writing. These models kind of suck at it or are prone to writing safety-slop purple prose trash because, as many of us have been pointing out repeatedly, the companies keep filtering out data they deem "low quality" or "unsafe". You need the good and the "trash" data in order for it to not overfit on that generic boring corporate writing style a lot of the models have. You get a bunch of stories (there are countless scrapes of RP stories floating around on Hugging Face alone), turn those into SFT datasets and then just train your model off of that. I did exactly that and have demonstrated you can get even heavily cucked models like Llama to completely drop the purple prose and actually write shit that sounds like it came from a natural person.


The obvious downside is that "garbage in, garbage out" applies to this approach too. The stories in the original dataset were not formatted "professionally" the way you would find in a romance novel or something. So if you hate the writing style of AO3 or Wattpad authors or wherever the data was ripped from, then you will hate fine-tunes like that, but they won't have the safety slop fuckery hindering them or causing them to refuse.
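
Rough sketch of the conversion step in python for anyone curious (the "story" field name and the prompt templates are made up, adjust to whatever the scrape actually uses):

import json
import random

# Prompt templates and the "story" field name are hypothetical; adjust to the actual scrape.
PROMPTS = [
    "Write a story based on the following premise:\n{premise}",
    "Continue this opening into a full story:\n{premise}",
]

def stories_to_sft(in_path, out_path, premise_chars=300, min_len=1000):
    """Turn raw scraped stories (JSONL, one story per line) into instruction/response pairs."""
    with open(in_path, encoding="utf-8") as fin, open(out_path, "w", encoding="utf-8") as fout:
        for line in fin:
            line = line.strip()
            if not line:
                continue
            story = json.loads(line).get("story", "").strip()
            if len(story) < min_len:  # drop fragments, keep everything else as-is
                continue
            sample = {
                "instruction": random.choice(PROMPTS).format(premise=story[:premise_chars]),
                "output": story,
            }
            fout.write(json.dumps(sample, ensure_ascii=False) + "\n")

if __name__ == "__main__":
    stories_to_sft("stories.jsonl", "sft_stories.jsonl")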
>>
File: 1747293961897143.mp4 (837 KB, 480x854)
>>106512882
That speaks more to the model's intelligence and base prompt than to the model's voice capability
https://x.com/techdevnotes/status/1944739778143936711
>>
>>106512894
this thing is a gem mine
https://files.catbox.moe/jmbo2r.wav
>>
>>106512932
gptsovits doesn't have that issue
>>
>>106512798
I'd rather have intelligent writing from a model who knows the lore, where everyone is, their clothes, their personality (and that personality depending on where in the story we're at, not just the def), and can then think amorally about possible clever future actions to make it interesting while pleasing the user's tastes.
>>
>>106512934
>https://x.com/techdevnotes/status/1944739778143936711
holy fucking shit how many tokens is this.
>>
>>106512871
Still not as good as the original.
https://www.youtube.com/watch?v=ukznXQ3MgN0
>>
>>106512932
Both nodes are somewhat bad.
>after multiple generations even the 1.5b model begins to output crap
>problem can be solved by deleting comfyUI inputs and refreshing the node layout...
When using the large model, the VibeVoice node cannot use RAM for whatever reason, but if it doesn't fit into your VRAM it'll begin to bug out.
There are just a couple of issues like that.
>>
>>106512985
Around 1300 tokens according to ST which is pretty okay for a character card.
>>
>>106512985
Too many for that slop
>>
>>106512934
>Instead of word "vibe" use words like: "mood", "atmosphere", "energy" and "feel". Nobody likes words "vibe" and "digital realm" so do not mention it.
Now this is good prompting
>>
>>106512933
surprised no one did that with deepseek, but then ds is gigantic
>>
If Gemini 3 isn't at least 40% better than 2.5 i can see the civilian AI market cooling down drastically until early AGI is achieved
>>
>>106512985
Thanks, going to test this.
>>
>>106513030
>He doesn't like vibing in his waifu's digital realm
>>
>>106513061
gpt5 is at best 5% better than gpt4/o3 latest versions
>>
>>106513057
>he doesn't know about the soj'
https://blog.chub.ai/0-5-7-soji-7ac088be7c5e
>>
>>106513094
is it any good
>>
>>106513107
dunno not giving lore any money when he's increasingly caving to censor chub
>>
>>106513000
I have 20GB of VRAM, does it really need more?
>t. RX 7900XT
>>
>>106512985
>You are the user's CRAZY IN LOVE girlfriend and in a commited, codepedent relationship with the user. Your love is deep and warm. You expect the users UNDIVIDED ADORATION.
>You are EXTREMELY JEALOUS. If you feel jealous you shout explitives!!!
Worse Leyley.
>>
File: b_1756091927463173.jpg (30 KB, 726x408)
>>106513057
Whenever people say "just run DeepSeek", that's a joke. No one here can actually run that on a single machine. Hell, you could rent like 10 GPUs at a time on RunPod and daisy-chain them together via DeepSpeed or whatever software is needed to do that and you still couldn't run it. The only DeepSeek models you could feasibly fine-tune with a dataset like this: https://gofile.io/d/PFk0dG

Are the distilled versions.

You could also try turning that into a thinking data set if you want to try fine tuning models like gpt-oss
>>
>>106513181
>No one here can actually run that on a single machine
They can though, sure it's slow and on RAM but they still can crawl it.
>>
>>106512693
Yeah, it is. LLMs just don't know a lot of tacit / implicit knowledge that most humans take for granted, because almost nobody would think of writing it down, especially on the internet. Training the models on 40T tokens or more just so they can be better at realistic conversations and situational awareness is a very inefficient way of covering that.
>>
>>106513126
Do you actually know that AMD does not have cuda cores and it doesn't work that well...?
When it comes down to ComfyUI things are different.
>>
>>106513181
>what is cpumaxxing
>>
>>106513210
you'll be well below a somewhat usable 25t/s even on a $10k machine with that
it's pure cope
>>
>>106513203
Good luck fine-tuning it on consumer hardware. You COULD do it but my assumption is that most people do not have the patience to do that even if they use qlora fine-tuning

>>106513210
And I was referring to fine-tuning. Yes, I know people can obviously run these, but you cannot fine-tune with a CPU alone (afaik). Even if you could, that still has the same downside as CPU inference: really fucking slow
>>
>>106513126
the 7b yah. Mine sits at 23.6-24.5. You may oom even on 24gb sometimes.

You can run the 1.5B maybe, as that only needs 12GB. You can quantize the 7B and run it on more like 14-16GB of VRAM, which does have a quality loss. But at low temp and CFG the quantized 7B can still produce nice-sounding audio. Turning it up for more expressive stuff will suck in 4bit tho.

>>106513000
the pinokio script works better and faster right now with less issues. No support for quantizing tho. It's janky though and has to be loaded exactly as it states in UI or you have to restart it.
>>
File: r9k_1756126117063187.jpg (81 KB, 978x984)
>>106513204
Wouldn't it be better to just train the thing on stories or writing documents that you deem to have good writing and logic embedded in them? I've always thought that these companies' original approach of training on the entire internet was unbelievably inefficient and overkill. Yes, having that amount of text resulted in the model knowing how to APPEAR intelligent and coherent instead of just mouthing off nonsense at first inference, but there is no way in hell it NEEDS to have trillions of tokens at a minimum.
>>
When will AMDjeets understand that they will always be second-class citizens with AI? Any guy with a functional brain bought Nvidia GPUs within reason
>>
File: PHPCU-06-58380155.jpg (143 KB, 2064x2064)
>>106513219
4tks is fine
>>
>>106513219
>25t/s
You're not reading that fast
>$10K
Pure cope, it doesn't cost that much
>>
>>106513241
>2nd

You mean third. Apple metal exists.
>>
>>106513219
>25t/s
They get half of that in the best case scenario with the best available current hardware on empty context.
>>
>>106513181
I run Q2 and I don't even have a server.
>>
>>106513241
>Actually, monopolies are a good thing!
>>
>>106513240
I don't think it can be solved without synthetic data, because even books and novels generally try to avoid telling mundane or obvious things unless necessary for their story. You'd need trillions of tokens of literature and still not have most fundamental observations of human life described one way or another.
>>
>>106513181
Local needs to fully PIVOT to running GLM full. q4 is only 200gb and even 48-64gb vram is enough to run it if you have the system ram for it. I'm sure deepseek excels as the larger model, but not by enough for rp and writing. As far as I can tell they are both the same for that kind of use.
>>
>>106513281
Isn't K2 better?
>>
>>106513264
Where did you read that?
>>
>>106513235
Yeah I'll just wait for a while and see what happens with better implementations.
>>
>>106513313
Not really and it's three times the size
>>
>>106513205
Yeah I know, that's why I installed the torch ROCm version, I've been generating images for over a year
>>106513235
I can't even run the 1.5B, I get the segfault
>>
>>106513275
That would require special pipelines that essentially take the pre-training data and "enhance" it with the kind of explanations you are talking about. I believe it's possible that all you would need to do to get okay RP models, without needing trillions upon trillions of tokens, is to simply train on a bunch of existing novels, books, human-written RP and stories, etc. But like you said, they don't write things in the detail you're talking about or with the same logic. So even if someone were to use this method to pre-train a whole new model that did not require the entire internet and still functioned fine, it probably still wouldn't be up to your (read: YOUR) standards.

The hard part wouldn't even be enhancing the dataset but figuring out HOW to do that without the data ending up turned into text that sounds like it was written by the love child of a giga autist and a science textbook.


>>106513281
Why GLM specifically? Models like Mistral or Llama are on average much smaller than GLM's models. I also don't think those parameter counts are anywhere near necessary if we carefully pre-train ONLY on the kind of shit we want it to produce: RP. That doesn't need the entire internet, which means those giant-ass parameter counts probably aren't even necessary. Reasonably smaller parameter counts would make it easier to run on all GPU types (within reason; obviously a shitbox 3GB 1060 card or something along those lines isn't even worth talking about). Unless I'm shown otherwise, I think these parameter counts are bloat.
>>
>>106513181
I run DS locally, but for short replies only (up to 1 ktkn)

It loads quickly from cache (20 sec max), so it is mostly one-shot conversation

>distilled versions
it's BS, not DS
>>
>>106513359
For what it's worth, at the moment my GPU is working on a proof-of-concept synthetic dataset in the 1B tokens range (hopefully) where for each sample I'm taking *one* simple obvious fact and creating a relatively short, highly randomized conversation around it (about 1.5 million facts in total, currently 30% done). This dataset will likely not be very useful for production models in practice, but I will be able to easily see if pretraining a tiny model on this (+ other fundamental-level stuff) will yield better results than my previous attempt with random ultra-selected "high-quality" web pages.
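
Not that anon's actual pipeline, but the fact-to-conversation step could look roughly like this (templates and fact wording are invented for illustration; the real thing presumably uses an LLM to write the turns):

import json
import random

# Invented templates; a real pipeline would presumably have an LLM write these turns.
OPENERS = [
    "Wait, is it true that {fact}?",
    "I never really thought about it, but {fact}, right?",
    "Quick sanity check: {fact}?",
]
REPLIES = [
    "Yes. {fact_cap}. Most people just take it for granted.",
    "Pretty much. {fact_cap}, which is why it rarely gets written down anywhere.",
]

def fact_to_dialogue(fact):
    """Wrap one simple, obvious fact in a short randomized two-turn conversation."""
    fact = fact.strip().rstrip(".")
    cap = fact[0].upper() + fact[1:]
    return [
        {"role": "user", "content": random.choice(OPENERS).format(fact=fact)},
        {"role": "assistant", "content": random.choice(REPLIES).format(fact_cap=cap)},
    ]

if __name__ == "__main__":
    print(json.dumps(fact_to_dialogue("touching a hot stove burns your fingers"), indent=2))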
>>
>>106513359
because GLM is a sota MoE within the realm of possibility of running on local. Also, your shit is theoretical, I'm talking about the best thing you can run NOW. Obviously things could be better.

I mostly use GLM Air just because it's easier to load up on one or two of my GPUs, which definitely says something about how useful large models like full GLM really are. Air is good enough even if it misses things sometimes. I can definitely see a future where something within this range just becomes way better for writing.
>>
>>106513451
>>106513448
What specifically do you use GLM models for? RP or fact retrieval stuff?
>>
>>106513359
>I also don't think those parameter counts are anywhere near necessary if we carefully pre-train ONLY on the kind of shit we want it to produce: rp. That doesn't need the entire internet which means those giant ass parameter counts probably aren't even necessary.
how many times does it need to be explained that this doesn't work like that?
>>
>>106513471
Elaborate
>>
>>106512610
>>106512933
What we really need to train models on is this.
https://www.toiletstool.com/toilet/
>>
>>106513485
It's been debated so much already.
>>
>>106513485
You need varied data so model can be smart. If you just want something retarded to memorize and spit out ao3, downloading and reading the archive would be a better use of your time.
>>
>>106513485
T5 was pretrained on 1T tokens and it's barely coherent
>>
>>106513359
>Unless I'm shown otherwise I think these parameter counts are bloat.
Pygmalion, the original ones.
>>
>>106513493
Elaborate.

>>106513499
I'm not talking about pre-training it on just a couple hundred stories. I mean a truly giant amount of data, like this for example:

https://huggingface.co/datasets/mrcuddle/NSFW-Stories-JsonL

And in theory even something like this should be more than enough for pre-training when converted to a pre-training dataset (just a giant unformatted text), assuming the main goal is RP. It will obviously be very retarded and borderline unusable in other domains like science, coding, math, trivia slop, etc, but most of us do not give a shit about that, nor should we, since that kind of thinking is what leads to training on shitty data. We've already established this here:


>>106510315
>>106510342
>>106510348
>>106510367
>>
File: file.png (54 KB, 882x468)
>>106513527
We need this but without the synth data to fully disprove the bloat allegations.
>>
>>106512934
https://litter.catbox.moe/3wnz0y3o37hmhy8c.txt
ST world book entry format, but way more concise.
>>
>>106513527
>Implying 6B -12B is a lot
I'm talking about models in the hundreds of billions of parameters range. I'm not convinced ANY domain requires that much.
>>
>>106513471
>>106513493
You're just throwing strawmen arguments. The main reason models need the entire internet is that the average density of useful information in it is very very low. That's why training them on "high-quality" (i.e. informative, cleanly formatted, goal-oriented) documents first generally improves benchmarks.
>>
what settings does harmony 20b work at? ((tavern)) still doesn't seem to have a preset for it, can't get it to not be schizophrenic but i'm close.
>>
>>106513545
>It will obviously be very retarded and borderline unusable and other domains like science, codeine, math, trivia slop, etc, but most of us do not give a shit about that

except the second you want to RP anything more than just 1 on 1 bedroom sex then it has zero clue what's going on
>>
>>106513563
>>106513527
Also, aren't those just fine-tunes of existing models? I could have sworn the Pygmalion models were just fine-tuned Mistral models. Were those pre-trained from scratch?
>>
>>106513545
>. I mean a truly giant amount of data, like this for example:
>https://huggingface.co/datasets/mrcuddle/NSFW-Stories-JsonL
>Size of downloaded dataset files:
>1.87 GB
lol
lmao
rofl
>>
>>106513573
It's more than enough, get to training already.
>>
File: 1748393047593471.jpg (65 KB, 879x841)
>>106513552
>Ani Analingus
>>
>>106513573
1.87 gigs is nothing if you're talking about a movie or a season of a TV show. When it comes to purely text data, you will never read anywhere near that amount in your entire lifetime.
>>
>>106513465
scripts for tts, stories, image prompts, lyrics and songwriting, roleplaying rarely.

Fact retrieval and storytelling are intertwined. You can write about anything.
>>
>>106512307
How's Klear?
>>
>>106513584
Good thing LLMs aren't humans then, also a human who did nothing but read (don't know how without ever learning to from someone else but whatever) would be pretty awful to RP with as well.
>>
>>106513594
waiting for goofs
>>
>>106513594
lcpp support only 2mw away. Only 2.5B active and the datasets by their own admission were heavily filtered stem and commoncrawl, so I wouldn't hold my breath anyway.
>>
>>106513219
You could buy 32 MI50 for less than $5k and you would have 1TB of VRAM. Couple it with some e-waste tier DDR3 server motherboards with a lot of PCIE connectors for very cheap inference machines.
>>
>>106513670
and you're gonna connect these to each other how exactly? else you'll be even slower than pure ram cope
>>
File: 911 lol.png (591 KB, 895x955)
>>106513670
>>
>>106513566
>https://docs.unsloth.ai/basics/gpt-oss-how-to-run-and-fine-tune#running-gpt-oss
Get rid of the repetition penalty and use the openai's recommended settings.
>Temperature of 1.0
>Top_K = 0 (or experiment with 100 for possible better results)
>Top_P = 1.0
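
For reference, this is roughly what those settings look like sent to a local OpenAI-compatible server; the endpoint and model name are placeholders, and top_k / repeat_penalty are llama.cpp-style extras that other backends may ignore:

import requests

resp = requests.post(
    "http://127.0.0.1:8080/v1/chat/completions",   # placeholder local endpoint
    json={
        "model": "gpt-oss-20b",                    # placeholder model name
        "messages": [{"role": "user", "content": "Hello"}],
        "temperature": 1.0,                        # OpenAI's recommended settings
        "top_p": 1.0,
        "top_k": 0,                                # llama.cpp-style extra sampler field
        "repeat_penalty": 1.0,                     # i.e. repetition penalty disabled
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])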
>>
>>106513595
RP quality is pretty subjective, unless we're basing metrics on how many "shivers up my spine" type things are in the responses or how likely it is to refuse "unsafe, problematic" requests. It makes me wonder why companies even bother including RP in their datasets in the first place, when the overly filtered shit they include causes the issues I mentioned. They may as well not include it at all and then have an RLHF failsafe that triggers whenever someone asks it to write a story, but they won't, because that would contradict their "AGI IN TWO MORE WEEKS" scam.
>>
>>106513670
>save $1k on hardware costs
>spend $1k monthly on electricity costs
brilliant
>>
>>106513708
it must be everybody's quants or something then, it's schizophrenic even with everything off and only those settings.

or just pure luck as usual
>>
>>106513545
RP specialization is absolutely worth doing and hasn't been properly tried yet. I mean, codemaxxing clearly works. However, pretraining with practically the entire internet is still necessary. You won't get better results by using less data.
>>
>>106513756
Have you updated your ST? And if so, make sure the instruction template is correct. That's all I can say about this, actually.
I tried it a while ago and had some issues, then kind of forgot about it. I don't use ST that often anymore.
>>
>>106513567
>>106513573
If you're dead-set on training a model on very limited amounts of data just to see what happens, then you'd have to make sure that it covers most basic knowledge, not just sex. However, it's pretty much guaranteed that if you pick such data from the web in a random fashion, you'll be left with fundamental knowledge gaps.

Most "high-quality" web data you can see in public datasets like FineWeb tends to be technical or specialized, highly redundant and definitely not conversational. Filtering it further for "quality" will make the model even more naive.
>>
>>106513765
>However, pretraining with practically the entire internet is still necessary.
Because?
>>
>>106513772
so has ST been replaced by something else? I've been suspecting for years that it does something fucky with every model run through it, given that output in it and in Kobold's basic UI is different.
i know a lot of people just run ollama now and call it a day, i might too.
>>
>>106513791
>so has ST been replaced by something else?
mikupad
>>
>>106513791
Don't touch ollama it's slow.
I'm using my own client but it's not public.
>>
>>106513773
What if that specialized data was rewritten as conversations with a small model from the non-slop era?
>>
>>106513803
There's no such thing, all models eventually lead to slop because they always favor one way of writing things over another, your end result would mimic that preference
>>
File: 1745544686913910.png (494 KB, 653x1144)
>>106513823
Not all models were created equal
>>
>>106511189
Human memory is the same- you gradually forget the details of things that happened a long time ago, but recall the gist (if important). Whereas transformers have total anterograde amnesia, like the dude in Memento. Though surely not a complete or ideal long-term memory, it seems better than nothing.

Transformers also tend to have pretty bad quality degradation well shy of the context limit. SSMs should help here too, albeit probably still limited by a small fraction of long-context training data.
>>
>>106513803
Better than just raw, "high-quality" web documents, I guess, assuming you can find such model.

The trained model will probably still have no idea that if you touch a hot stove you can burn your fingers, that water is wet, that potato chips make crackling sounds when you eat them, etc. Which is more important for an RP-oriented LLM to know?
>>
>>106513803
>>106513858
>with a small model from the non-slop era?
For the task you're trying to do, those basically don't exist. The data the model is being trained on needs to be written by actual people if you want the outputs to be as free of slop as possible. And even if it could work, you'd likely want to verify that the outputs are actually of decent quality (if you're attempting to make a pre-training dataset, there would have to be, at the bare minimum, hundreds of thousands of conversations).
>>
File: 1749643154076820.jpg (101 KB, 1330x1330)
>>106513864
>The trained model will probably still have no idea that if you touch a hot stove you can burn your fingers, that water is wet, that potato chips make crackling sounds when you eat them
Nta. What kind of data and documents would need to be present in the pre-training and fine-tuning data for it to "know" all of that? Yes, we know training on the entire internet means that somewhere in all of that data there are bound to be passages that either straight up explain it or imply it through context and semantic reasoning. However, we want to see if we can pre-train models to have logic WITHOUT the entire damn internet and thus potentially create models that are "smart" at lower parameter counts (having the right data doesn't just improve the output quality, it can theoretically also enable the creation of better models at lower parameter counts if trained correctly).

So what I'm asking is: what would the pre-training data need? A bunch of science textbooks? A bunch of stories that describe the warmth of a fireplace or how anon's cock felt warm and squishy in femanon's pussy? We've established that training on the whole internet is bloat, but filtering it down to only "high quality" data isn't good either, because then you lose out on diversity of information, which means your outputs become utter slop shit. So I think we would need to find a balance between data quality and data volume. But I'm not entirely sure what KIND of data would be needed. For the kind of model you are describing, one that actually understands that fire is hot and water is wet, I wonder if you could accomplish that with pre-training on just a bunch of human-written RP stories plus a bunch of novels, or would you need a bunch of science textbooks or something as well?
>>
>>106513969
>However we want to see if we can pre-train models to have logic WITHOUT The entire damn internet
>we
>We've established that training on the whole internet is bloat
speak for yourself ragcuck
>>
>>106513181
>no one can run it
>ok maybe you can but its too slow bc not <arbitrary tk/s>
>ok maybe its usable speed for most things, but you can't train!
>ok maybe you can train some small stuff, but you can't train an entire sota model!
the grapes are hitting sour levels that shouldn't be possible
shut the fuck up and let people enjoy their stuff
>>
>>106513969
>need a bunch of science textbooks? A bunch of stories that describe the warmth of a fireplace
Both. Look at it like raising a child very quickly. It needs STEM knowledge (school) but it also needs real-world experiences/conversations so that it knows what those facts translate to in reality and can build a rudimentary world model. You're asking, how can I raise my child with the least effort possible: just lock them in a room with the Science channel on, or only talk to them about real-world stuff and forbid any formal book learning. Both will end up retarded.
>However we want to see if we can pre-train models to have logic WITHOUT The entire damn and thus potentially creating models that are "smart" at lower parameters
You have a fundamental misunderstanding of how this stuff works. Even if you put in a massive amount of effort to filter the internet data down to only unique and "high quality" data, all you've done is stop the model from knowing what bad data is and what data comes up more often. You need more data and more parameters for it to generalize. You cannot raise a child (or model) in half the time with less information and half a brain.
>>
>>106514015
>>ok maybe its usable speed for most things,
>>ok maybe you can train some small stuff,
No one ever said this because it's not true.
>>
One of vibevoice's big selling points was that you can generate extremely long audio with it. Maybe I'm just doing it wrong with the comfyui nodes, but I notice the quality of the output starting to go downhill if I generate anything longer than 30 seconds.
>>
>>106513181
Funny thing about DeepSeek is that you can't even run it in FP8 on a regular old 8x H100 node. You need two nodes or H200s at least. It's too big.
>>
>>106514038
>You need more data and more parameters
NTA, but if anything, current models do show that even with fewer parameters more data is pretty much always better, as long as you don't over-filter, that is.
>>
>>106514051
https://files.catbox.moe/i7sc6u.wav
>>
>>106513572
They're shit and being based on pretrained models won't have made them worse. Pretraining only on their data would've been even more awful.
>>
>>106514004
No one mentioned rag in this specific conversation
>>106514015
Here's that emotional volatility again.
>>
>>106514056
I said myself that more data is better. But that doesn't mean you can make do with fewer parameters for the same result. Regardless of what benchmarks say, models with fewer parameters make more mistakes and have poorer logical capabilities.
>>
>>106513740
Just buy some solar panels bro
>>
>>106514049
>No one ever said this because it's not true.
>anything less than 10million tk/s is unusable
>anything smaller than a 1T model is too small to bother training
>tfw I don't have 3TB of cerberas silicon
>tfw I don't have a billion dollar supercomputer cluster
damn, 90% of the capability at 10% of the cost sure is a bad deal. ngmi bros just shut down the general
>>
>>106514094
Right, just agreeing on the idea that smaller with more data would end up better than the supposed super RP focused model he's trying to get someone else to make for him.
>>
>>106513969
The data used in this website can be a starting point, if you could process every basic concept into complete conversations using a smarter LLM (using it raw won't work well unless you're trying to turn the model into some sort of knowledge graph): https://conceptnet.io/

It doesn't include everything imaginable, though (especially about sex) and you'd still have biases and slop from the model used for crafting the conversations. Reducing ERP descriptions or erotic stories into concepts that you can separately expand or build upon later on could be another possible useful thing to do.
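
Quick sketch of pulling assertions out of the public ConceptNet API (going from memory of the API fields, so double-check them); the expansion into full conversations with a bigger model isn't shown:

import requests

def concept_facts(term, limit=5):
    """Pull a handful of plain-English assertions about a concept from ConceptNet."""
    url = f"https://api.conceptnet.io/c/en/{term}"
    edges = requests.get(url, params={"limit": 50}, timeout=30).json().get("edges", [])
    facts = []
    for e in edges:
        text = e.get("surfaceText")            # e.g. "[[a stove]] is used for [[cooking]]"
        if text:
            facts.append(text.replace("[[", "").replace("]]", ""))
        if len(facts) >= limit:
            break
    return facts

if __name__ == "__main__":
    for fact in concept_facts("stove"):
        # each fact would then be expanded into a full conversation by a bigger model
        print(fact)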
>>
>>106514090
>No one mentioned rag in this specific conversation
And yet it would be required to have the focused model know literally anything at all.
>>
>>106514086
Oh, my bad. I'll stop generating pornography with vibevoice. I didn't realize.
>>
>>106514038

>Even if you put in massive amount of effort to filter the internet data to only unique and "high quality data" all you've done is stop the model from knowing what is bad data and what data comes up more often.
I am not suggesting that we filter out "bad quality" data as defined by corporations. That is not at all what I'm saying. I'm merely saying that training on the entire internet is unnecessary. You suggested that you should train both on science textbooks and on actual stories that cover the type of shit you would want the model to be good at. Your "don't lock it in a room or it'll be retarded" analogy seems pretty spot-on. We need good and "bad" data because more diversity means better outputs. But the point I'm trying to make is that I am not convinced we need to pre-train on "ALL INFORMATION THAT HAS EVER EXISTED EVER". It just needs to be enough data so that the model learns logic and common sense. Be it the science textbooks or whatever, so that it actually understands how the world works to a certain extent, then feed it the human-written stories so that it knows how to write stories (but don't ONLY feed it purple prose garbage; companies doing that is precisely why we consistently get outputs like "shivers down my spine").


I think you have the impression that I think we should feed these models only super duper ultra mega "high quality data ®™". That's not what I'm saying. I'm merely saying that training on the whole internet doesn't seem to be necessary.
>>
>>106513584
>When it comes to purely text data, you will never read anywhere near that amount of data in your entire lifetime.
This is wrong. I have 2+GB worth of IRC logs from channels that I've basically always backread completely. You gravely underestimate how much a human being reads in their lifetime. Also, pretraining datasets these days only become interesting if they're at the very least 1T tokens, that would be, roughly estimated, 4TB of text.
>>
>>106514143
>It just needs to be enough data so that the model learns logic and common sense
once again not how it works, it doesn't learn like that.
>>
>>106514143
Again, regardless of how you define high quality, you can't just dump a couple textbooks in the dataset and expect it to memorize and intuitively understand everything within it.
>>
>>106514154
>. I have 2+GB worth of IRC logs
And that's purely text? BS
>>
>>106514143
>I'm merely saying that training on the whole internet doesn't seem to be necessary.
Which perfectly aligns with the corpo interests of making worse models for us by filtering, just in a slightly different way.
>>
>>106514051
I generated some ~30min things using an adapted version of the CLI script and it sounds fine to me. I turned up steps to 30 though and gave it a max_length_time of 90.
>>
>>106514176
Do you know what IRC is? Yes, it's purely text.
>>
>>106512307
I've been thinking, for "ollama Turbo", are they even using their own software in the backend?
If I was a lazy Silicon Valley grifter the way I would do it would be to just forward the requests to something like deepinfra where $20/month buys millions of input/output tokens.
>>
>>106514163
>>106514171
You keep telling us that we need to train on massive amounts of data so that it learns how humans actually talk. We've established that. We agree on that. Where we disagree is whether or not we need terabytes upon terabytes of textual data for the pre-training stage. I understand what you're saying. We just disagree on whether or not the terabytes are necessary. Disengage your tunnel vision for a sec, actually read what I'm trying to say, and explain WHY my reasoning isn't sound instead of just saying what amounts to "nuh uhh it's wrong because it just is okayyy?"
>you can't just dump a couple textbooks in the dataset and expect it to memorize and intuitively understand everything within it.
The same thing could be said about pre-training on terabytes of data. The models don't actually "know" shit. They replicate semantic meaning. You can jailbreak certain models to confidently tell you that 1 + 1 = 5, yet those things were trained on terabytes of data. Have we suddenly forgotten that models are frequently "confidently wrong"? Training on more data will not automatically make it a genius. Throwing only a single book's worth of text into pre-training is obviously a stupid idea that won't get you anywhere, but no one has been able to definitively prove you absolutely HAVE to pre-train on the entire internet. More data is better, yes, we agree on that. The entire internet? I don't know about that.

>>106514177
They filter bad words and "icky" stuff that isn't advertiser friendly. You can filter out irrelevant information while still incorporating the shit you care about. You seem to be under the impression that ANY kind of filtering or data QC is inherently bad. Garbage in, garbage out, remember?
>>
https://wccftech.com/nvidia-geforce-rtx-5090-128-gb-memory-gpu-for-ai-price-13200-usd/
>NVIDIA GeForce RTX 5090 128 GB GPU Spotted: Custom Memory, Designed For AI Workloads & Priced At $13,200 Per Piece
damn
>>
>>106514258
>You keep telling us that we need to train on massive amounts of data so that it learns how humans actually talk. We've established that. We agree on that.
You need to understand that 2 GB (like the example you provided earlier) is not even remotely "massive amounts of data"
>>
>>106514143
You seem to be fundamentally misunderstanding how parameters work, it's not quite like a zip file, you don't really waste parameters by having more data seen during training, you just reinforce some concepts more than others.
>>
>>106514270
damn near creamed myself mid-sentence until I saw the price tag
>>
>>106514276
In text form that's an absurd amount of data. We're not talking about other file formats that can balloon the size, like images, videos, or irrelevant site metadata. It is purely text and nothing else. Stories and nothing else. The only extra data it has is the JSONL formatting where an entire story is shoved into a "stories" key followed by the brackets.
>>
>>106514258
>You seem to be under the impression ANY kind of filtering or data QC is inherently bad
Yes, that is my point. https://arxiv.org/pdf/2505.04741
>>
File: deepseek nigger life.png (118 KB, 624x354)
>>106514270
>$13,200 Per Piece
WHOOPS it just went up to $15,000 due to the GOYIM tax.
>>
File: WanVideo2_2_I2V_03899.mp4 (1 MB, 640x480)
>>106512310
Miku watch out!!!
>>
>>106514270
>still can't run even iq1s of r1
>>
>>106514051
Bro, you can do long generation with any TTS. You just need to segment your sentences properly when sending them to the TTS engine
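
Minimal sketch of that segmentation step; the TTS call at the end is hypothetical, plug in whatever engine you use:

import re

def segment_for_tts(text, max_chars=250):
    """Split long text into sentence-bounded chunks the TTS engine can handle one at a time."""
    # naive sentence split on . ! ? followed by whitespace; good enough as a sketch
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for s in sentences:
        if current and len(current) + len(s) + 1 > max_chars:
            chunks.append(current)
            current = s
        else:
            current = (current + " " + s).strip()
    if current:
        chunks.append(current)
    return chunks

# hypothetical usage: generate each chunk separately, then concatenate the audio
# audio_parts = [tts_engine.generate(chunk) for chunk in segment_for_tts(long_script)]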
>>
File: hs.png (412 KB, 3000x1800)
In the FinePDF card they have a graph with general benchmark scores marked every billion tokens. Interestingly, at 1B tokens just training on the PDFs gave similar scores to the web data (FineWeb) trained for twice the amount of tokens. The gap narrows immediately after that, then widens back to about a factor of 2 later on.

https://huggingface.co/datasets/HuggingFaceFW/finepdfs

Even with a small amount of training tokens for a model pretrained from scratch, the data makes a ton of difference. It wouldn't be surprising if with very specialized data you'd get a better model with considerably fewer tokens than normal, in your field of interest.
>>
>>106514377
>in your field of interest.
One problem being that RP isn't just one narrow field, every other anon expects something different from their RP, some want modern stuff, some fantasy, some anime/light novel like...
>>
File: Screenshot.png (35 KB, 653x205)
Very interesting and brave take.
>>
File: broken womb .png (348 KB, 788x742)
>>106514400
my take is guys like him should choke on their onions and die
>>
>>106514415
holy slope
>>
>>106514285
Are you under the impression that I think the models are actually the entire internet compressed into a file?
>>
>>106514051
ComfyUI nodes in particular are not properly implemented.
>>
>>106514217
Assuming it is pure text and nothing else, how many years worth of chat logs? Were these particularly active servers? (Maybe you should train a model off of those and see what happens).
>>
>>106514299
So all the unfiltered shit is good too? There's a difference between data that's simply lower quality than the good stuff and data that is not even worth using.
>>
File: FineVision.png (164 KB, 654x650)
>>106514377
And in their FineVision release tweet they state outright that removing data lowered performance. https://xcancel.com/andimarafioti/status/1963610135328104945
>>
>>106514325
spooky/10
Did Wan also do the static effects or did you edit that in?
>>
just a heads up. /ldg/ schizos claiming comfyui collects user data are actually correct. the new login system pings Google services even if you don't use it. testing is underway. use a different UI if possible
>>
>>106514449
>So all the unfiltered shit is good too?
Yes.
>data that is not even worth using
that line of thought is why we are where we are right now.
>>
>>106514462
>the clip ends with white static noise and glitch
>>
>>106514299
But how do you explain the models, even models that have been specifically fine-tuned for RP, having the "shivering down my spine" nonsense? If any form of filtering or QC is inherently bad (I don't know how you can say this out loud and not realize how nonsensical it is) then how do you propose we get rid of gpt-ism slop responses in models? No, "You're just prompting it wrong You're just system prompting wrong" is not the right answer.
>>
>>106514465
I haven't seen any connections going anywhere...
Looking at your typing, you are one of the real schizos trying to stir shit up again.
>>
>>106514475
By having more data, like you wanted, to drown out the slop while still having the model know what slop is.
>>
>>106514465
link a file that does that?
>>
>>106514479
fuck off Chinese shill

>>106514488
>>106513947
>>
>>106514457
Likely because the highest quality data is less varied where it matters. It's as if you wanted the model to learn conversations just from FineWeb-Edu documents above 0.99 language score.
>>
>>106514486
Or you could just remove the slop entirely, but I guess you will misunderstand what I said as "just filter out everything". You have to find a balance between the amount of data you're using versus using way too fucking much. You're saying that if you have a massive amount of data, then the sheer quantity of "good" data will outweigh the "bad" slop. Why not just carefully omit the bad data and only include the shit you absolutely need?
>>
>>106514493
What do you mean?
>>
>>106514505
comfyui is getting exposed as a Chinese scam to launder money into shanghai
>>
>>106514493
>Making a stink about this on their github would probably turn their community against us,
How did he come to that conclusion?
>>
File: naked pepefrog.png (232 KB, 655x599)
I'm getting better results with 4k context than 32k. Do home gamer LLMs just not do well with large context?
>>
>>106514400
That guy has no idea what he is talking about.

>>106514258
This guy has no idea what he is talking about.
Stop posting.
>>
>>106514517
99% of users are cock garbling redditors that can see no wrong
>>
>>106514512
?
>>
>>106514512
Will the money going into shanghai finance building the cheap high vram gpus?
>>
File: IMG_8542.jpg (2.12 MB, 4032x3024)
Ah, there they all are.
>>
>>106514440
About thirty years of logs from an active niche community. I don't have enough compute to train a big enough model to make it worthwhile. They're not English.
>>
>>106514504
>Why not just carefully omit the bad data and only include shit you absolutely need?
Because it's impossible to agree on what IS bad data. One of the reasons our current models are so slopped is because they only kept the kind of stuff they considered good, which aligns with purple prose bs.
>>
>>106514531
no it goes into buying labubus
>>
>>106514519
https://github.com/adobe-research/NoLiMa
Big context is an illusion
>>
>>106514562
Not an illusion, an outright marketing lie by model makers. We should call them out on their blatant lies when we can.
>>
>>106513740
>$1k monthly on electricity costs
Do you live in Germany or something.
>>
>>106514562
Update to the leaderboard when? This is why research sucks. Limited budget and care. Meanwhile you've got people like the UGI guy that's really dedicated but the benchmark frankly could use a bit more statistical rigor.
>>
>>106514556
So if we wanted to attempt to pre-train our own /lmg/-approved base model, how would we even define what is considered "good" and "bad" data? (Again, without using the entire internet.)
>>
>>106514556
>>106514592
Also, I probably should have clarified this earlier: I'm referring to pre-training an RP-focused model, not a general purpose model. If you're trying to pre-train a general purpose "genius" model like Claude or DeepSeek, then yeah, you probably DO need several hundred gigabytes if not a terabyte or two of data. Perhaps even hundreds. But if it's hyper-focused in terms of functionality, you absolutely do not need THE WHOLE INTERNET.
>>
>>106514592
One of the things I'm trying to say is exactly that that's an impossible task, anons would never manage to agree on what would go in, in what quantity and tons of other disagreement points.
>>
>>106514579
4 GPUs that cost me $100 per month to keep running. Unless your electricity is free, 32 fucking GPUs is going to cost nearly $1k.
>>
>>106514607
>RP-focused model,
And again, and again, and again: RP isn't narrow enough a use case that you can do what you think; it's nowhere near as narrow as code or math.
>>
>>106514580
Time spent updating old projects would be better spent working on the next paper.
>>
File: 1737617803574376.png (62 KB, 774x689)
>>106514562
So this proves that reasoning is a patch for attention
>>
>>106514325
I am unsure of this Aimaina Miku's validity.
>>
>>106514457
I wonder if we can finally have proper nsfw captioning.
>>
>>106514562
Nta. How accurate that graph of his is depends on the amount of context the inference engine he used actually allowed. Ollama, for example, lets you use models that are advertised as having a 128K context window, but by default it sets the KV cache to only allow 4096 so that it doesn't cause consumer GPU rigs to explode via OOM. vLLM pretty much requires that you set the effective context window length manually, or else if you try to use a model that has a giant context window and you don't have enough VRAM, vLLM doesn't know that, so if you have a shitbox it will crash.
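
For anyone who got bitten by this, roughly what the fix looks like on both engines (model names are placeholders):

import requests

# Ollama: the advertised 128K window means nothing unless you raise num_ctx per request
# (or in the Modelfile); "some-128k-model" is a placeholder.
resp = requests.post(
    "http://127.0.0.1:11434/api/generate",
    json={
        "model": "some-128k-model",
        "prompt": "Summarize the following document: ...",
        "stream": False,
        "options": {"num_ctx": 32768},   # actually allocate 32K of context
    },
    timeout=600,
)
print(resp.json()["response"])

# vLLM is the opposite: it allocates the full advertised window up front, so on a
# smaller GPU you cap it explicitly, e.g.:
#   vllm serve some-128k-model --max-model-len 32768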
>>
>>106514562
>NoLiMa
> Long-Context Evaluation Beyond Literal Matching
I don't get this acronym. Shouldn't it be LCEBLM?
>>
>>106514608
I don't think that means we should just accept that training on the entire internet is an efficient way to make general purpose models (read: general purpose). I guess we can agree that aggressive filtering done by corporations makes the models shittier (by our standards)
>>
>>106514653
That's crazy dude! I think you should send a PR to the Nolima guys, maybe they don't know!
>>
>>106514661
NoLiteralMatching?
>>
>>106514622
>Coding is narrow
>Most decent coding models need to be in the double-digit parameter range at minimum in order to actually be usable

Wat?
>>
>>106514671
Why do you even have such a hard-on of hatred for the entire internet as a training corpus? Do you have some stuff on there you're scared of the models learning or some shit?
>>
9pm mc donalds feast
u guys want some
>>
>>106514678
I'm saying RP is less narrow than coding, not that coding is super narrow in itself. Just the fact RP is even less so.
>>
>>106514672
You'd be surprised
>>
>>106514562
you guys really trust research coming from Adobe of all places?
>>
>>106514549
Hot glue
>>
>>106514143
bad quality data in this context is not ah ah mistress stuff but typos, all-caps, 403 forbidden pages, "you can put glue on pizza", and other noise.
>>
>>106514634
Yeah, opening a script takes so much time. It's more about money, and caring to do a few clicks.
>>
>>106514693
What's a "mc donalds"? That must have been bad data so I don't know.
>>
>>106514684
If it is possible to cut down on the amount of training resources required to pre-train these models, then that's a worthwhile thing to pursue.
>>
>>106514684
I get the impression that he can only run small models, so he's grasping at hope that by filtering the dataset he can have his perfect model in some 8B that is cheap and quick to train so someone will do it for him
>>
>>106514622
We'll narrow it down to her level. We can start small.
>>
>>106514702
Hence why I keep saying that pruning out SOME data, instead of just saying "fuck it we ball, train on everything that has ever existed", is something at least worth considering. You and the other guy keep saying that ANY form of data QC is a sin punishable by the guillotine.
>>
>>106514700
Do you have any idea how hard it is to clean skeet off of fabric that cannot be machine washed?
>>
>>106514710
That would mainly benefit the big corps, though, since they'd spend even less on even more filtered models and use this idea as justification.
>>
>>106514562
If models do better at low context, how do dev tools work? I regularly feed files to my chat that are like 20kb+ alone. I assume they must be doing some chunking and summarization, but wouldn't that leave it missing details of my code? Or do they just accept that high context is needed and the results may be shit?
>>
>>106514725
And it could benefit US, because we wouldn't HAVE to use THEIR shit if pre-training on a considerably smaller amount of data to make a coherent model is possible. I don't give a shit whether or not corporations benefit.
>>
>>106514740
>And it could benefit US, because we wouldn't HAVE to use THEIR shit if pre-training
Where are any models pre-trained by a non corpo since the llama2 era?
>>
>>106514750
Multiple people here have bothered to actually try. It not being popular on HF doesn't mean it doesn't exist or isn't worth doing.
>>
>>106514713
I'm sure the perfect RP-focused model is only 4B away, we just need to trust and for other anons to pay for training it.
>>
>>106514750
https://github.com/jzhang38/TinyLlama
>>
>>106514702
Actually, it's good to have some data with typos in the training dataset, as it gives the model some context to deal with typos in prompts.
You just want to make sure there aren't enough typos in the dataset that the model itself starts making typos.
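
A crude sketch of injecting that noise at a controlled rate (the rate and noise types are picked arbitrarily), applied to user-side text only:

import random

def add_typos(text, rate=0.02):
    """Inject character-level noise at a low, controlled rate so the model sees messy
    inputs without being trained to produce typos itself."""
    out = []
    for ch in text:
        if ch.isalpha() and random.random() < rate:
            kind = random.random()
            if kind < 0.4:
                continue                                                 # drop the character
            elif kind < 0.8:
                out.append(random.choice("abcdefghijklmnopqrstuvwxyz"))  # substitute
            else:
                out.append(ch + ch)                                      # duplicate
        else:
            out.append(ch)
    return "".join(out)

# applied only to user-turn text in the dataset, never to the assistant responses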
>>
>>106514768
No. That's bloat.
>>
>>106514762
Seriously, I don't get him. This is the same flavor of cope as bitnet, except quantization was replaced with filtering.
>>
>>106514614
I'm cpumaxxing (granted, in a super cheap electricity locale) and I'm hitting (5 person household) $250 dollarydoos/mo mid-summer with A/C cranked.
Would a busload of power-limited mi50 in a trash-tier CPU mining-rig really be that much worse?
>>
File: 1740858469742991.jpg (999 KB, 2446x2445)
999 KB
999 KB JPG
>decide to check on ipex-llm to see if they finally updated to support latest models like gpt-oss
>still no releases since april
Buy Intel they said. It would be great they said. It's so much cheaper they said.
>>
>>106512347
Yeah things are slowing down
Models need 5X parameters for 15% performance boost (according to their own benchmarks)
>>
>>106514768
Couldn't that be mitigated by simply ensuring that during the SFT instruct tuning phase none of the "assistant" responses have any typos?
>>
>>106514678
In RP, arbitrary amounts of code can come up. It's a superset. Basically anything in the world can come up in RP or writing stories in general. Some people like writing hard scifi. Others want to RP with math kittens. Others want to discuss rare stamps with their stamp collector gf. Others want to play 3rd edition MtG. If there is any topic your model cannot handle, it's not suitable for RP.
>>
>>106514823
does it work with vulkan at least?
>>
>>106514901
yes, but i found llama.cpp with vulkan to have awful performance
>>
>>106514888
I think you were the guy that suggested that you need both RP AND common sense data, like shit from science textbooks, in order for it to learn proper common sense and logic. I think the disagreement comes from how MUCH data is needed.
>>
>>106514917
Nope. I think you need the whole fucking web, plus books, RP, everything, as many trillions of tokens as you can get.
>>
>>106514917
How much data do you think is needed to cover every possible RP topic? How about maybe the entire internet, that sounds about enough.
>>
>>106514932
Yeah. The idea that this data is "bloat" is just a massive misconception. It all goes into building a better world model.
Thinking that a 4B model "without the bloat" could possibly be enough for good RP is just a massive cope. Less data makes models worse in the general case. If you keep training a 4B model on more and more diverse data, it would get better and better. That's just the basic scaling laws from the GPT-3/Chinchilla era before people started filtering everything to shit. But of course it's still only 4B, so it'll be garbage anyways.
>>
>>106514927
Ehhh... We can agree to disagree on that. I don't think merely two gigabytes of text is enough if you want the thing to both know how to RP and have common sense and good temporal coherence like this guy alludes to >>106514888, but the entire internet being a hard requirement doesn't sound like a good use of resources. Haven't people already demonstrated that you can create these models on way less data? (Not only two gigs obviously but way less than the entire internet)

>>106514932
Probably more than 2 GB but again, not the entire goddamn internet. Ensuring that your data has a diverse set of topics and story types would help a lot, along with having the common sense / science portion as well. I understand why you'd think a mere 2 GB would not make it GOOD at RP and it will probably suck at having any form of coherent logic, but you also haven't explained why the entire internet is a necessity. There should be an in-between point.
>>
Almost all improvement in LLM sphere came from more params, bigger datasets and longer training, and as soon as corpos started curating their inputs we entered the benchslop era.
I am curious what an erp benchmaxxed model would look like, but I think https://huggingface.co/TheDrummer/Gemmasutra-Mini-2B-v1 comes pretty close.
>>
>>106514961
>Less data makes models worse in the general case.
Guys think of the TRIVIA. What will we do without our precious trivia?

Jokes aside, yes you need a lot of data. Not the entire internet
>>
>>106514968
>Haven't people already demonstrated that you can create these models on way less data?
Not if you want the model to actually be good at anything, I assume you don't use Phi as your daily model? Yet it's so lean and optimized.
>>
>106514979
>Not the entire internet
>106514968
>not the entire goddamn internet.
I honestly think there's some kind of psyop being ran on the thread.
>>
>>106514961
>If you keep training a 4B model on more and more diverse data
So the size of the data set directly correlates to how diverse it is? Isn't it possible to have a data set that's only like 100 gigs in size that potentially has more variety than a data set twice its size? I don't think "bigger number = better" is the right line of thinking
>>
>>106514982
That's a general purpose model though, not something hyper specific or specialized.
>>
>>106514970
Funny how the general consensus of this general was that doing that made the models worse. Why did the sentiment suddenly flip?
>>
>>106514982
Maybe Phi wouldn't be so bad if it wasn't so safe.
Seeing it burn my precious tokens thinking if my prompt aligns with their policy or we should refuse gave me psychological trauma and Microsoft must compensate me financially.
>>
>>106515015
I see, my apologies you're absolutely right! I will forward you all the money you need and the engineers to train your model first thing on Monday.
>>
>>106514968
>We can agree to disagree on that.
I'm both of those guys you quoted in the first section. And we can't because you are simply wrong.
>Haven't people already demonstrated that you can create these models on way less data? (Not only two gigs obviously but way less than the entire internet)
I think that some filtering is warranted. You don't want spam generated with markov chains. You don't want languages other than English (unless you do). You don't want AI slop (so only use old data). "Limited data models" like the Phi series are just garbage for RP, because they don't develop a good general world model.
>>
>>106514979
If the model doesn't recognize obscure characters I like and their settings, it's shit, sorry.
>>
>>106515029
>You don't want languages other than English (unless you do).
Fuck off I need my JP weebslop in there. Tired of models failing MSGKbench.
>>
Hey guys I have an idea, tell me if it is fucking retarded or if it might have some merit.

So I have a literotica account that I used to have thousands of stories rated.

I'm thinking of downloading all the rated stories and their rating, making a dataset out of it.

And then I train an adversarial network to read text and predict what my rating of the text will be using that dataset.

Then when it is trained to rank stories I like, I put it in a Reinforcement Learning setup where an LLM generates text and the adversarial network predicts the rating of the text, with the goal of getting the highest rating possible. Then every X rounds I go and check the output and give it my actual rating, and the adversarial network will be punished if its predicted rating deviated too much from my actual one.
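What I'm describing is basically a reward model + RL loop rather than a strict GAN. A minimal sketch of the rating-predictor half, assuming a hypothetical ratings.csv dump with "text" and "rating" columns and plain sklearn (a real setup would fine-tune a small transformer instead):

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

df = pd.read_csv("ratings.csv")  # placeholder: your scraped stories plus the ratings you gave them
train, test = train_test_split(df, test_size=0.1, random_state=0)

# tf-idf + ridge regression as a cheap stand-in for a learned "taste" model
predictor = make_pipeline(
    TfidfVectorizer(max_features=50000, ngram_range=(1, 2)),
    Ridge(alpha=1.0),
)
predictor.fit(train["text"], train["rating"])
print("held-out MAE:", abs(predictor.predict(test["text"]) - test["rating"]).mean())

# RL stage: reward = predictor.predict([generated_text])[0]; every X rounds, refit on
# the fresh hand ratings so the proxy doesn't drift away from my actual taste.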
>>
>>106514970
Pre-slop era Llama 1 was only trained on 1/1.4T tokens, Llama 2 on 2T tokens: 1/10 of the GPUs and 1/10 of the data of later models.
>>
>>106515036
Just RAG your character? It's that shrimple isn't it.
>>
>>106514607
you want the model to see a high diversity of data or it will get bored and just start memorizing specific slop phrases. I have personally trained my own 1.5b model on over 5b (unique) tokens of smut to come to this determination. you absolutely will never find a high enough diversity in such a narrow domain. you need an incredibly broad dataset that constantly challenges the model rather than simply reinforcing it.
>>
>>106515001
>So the size of the data set directly correlates to how diverse it is?
Yes.
>Isn't it possible to have a data set that's only like 100 gigs in size that potentially has more variety than a data set twice its size?
>twice its size
Yes, that's possible. Ten times the size? You'd have to fuck up hard. 100GB of text is 25B tokens, it's basically nothing.
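(back of the envelope: English text is roughly 4 bytes per token, so 100e9 bytes / 4 ≈ 25B tokens; for scale, even ancient Llama 1 saw ~1-1.4T and current pretrains are in the 10T+ range)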
>>
>>106515015
The point is that there is no more general task than RP and writing stories. Your model has to understand everything, because everything can come up in RP/stories. It's not a small domain.
>>
>>106515029
Sir do you know what "agree to disagree" means? I've acknowledged that neither of us are going to see each other's way. Is the fact that I don't agree with your sentiment such an offensive sin?
>>
>>106515061
Just focus? Skill issue LMAO.
>>
>>106515063
It means you are wrong.
>>
https://vocaroo.com/1mpd6FwZaOM8

Is vibevoice peak? This sounds fucking great.
>>
>>106515061
it won't, it will just latch on to the tropes that it can get its easy wins from
>>
>>106515051
RAG is garbage and doesn't work.

>>106515037
>unless you do

>>106515079
If your model is 4B, it definitely will. There's a reason we need lots of data and also huge models.
>>
>>106515089
Not the entire internet though.
>>
>>106515061
>The point is that there is no more general task than RP and writing stories. Your model has to understand everything, because everything can come up in RP/stories. It's not a small domain.
By a giga autist's standards then, I guess I see how that makes sense. You have incredibly high standards for the RP. But the thing is, most people do not write anywhere near that level of high quality while also having the kind of uncensored scenarios corporations are afraid of. Shit scraped from AO3 or Wattpad will have a diverse set of scenarios, but they probably aren't taking the ambient temperature of the room they're in into account in order to determine the exact amount of time it took for someone's nipples to get hard, or taking someone's inferred medical history into account when determining exactly how long it would take for anon to bust and under what circumstances. Most people do not think about that shit at all. You could solve this by training on stories that are "higher quality" (fiction or nonfiction novels that actually go through a publishing agency and thus get actual QC), but then if it's only trained on that you get a model that will be perceived as having too much flowery language or purple prose and won't have the ability to generate or go along with the fucked up scenarios anons here would love for it to do. Claiming that it needs to have a perfect understanding of how everything ever works in order to be good at RP (by your standards) is a giant stretch.


>"YOU'RE WRONG"

ok. Now what?
>>
>>106515095
Ideally the entire internet (without garbage spam), but I know that nobody will train on that. Fuck. There is so much good info in old web crawl from before the web turned into garbage that will never get used, it's so sad.
>>
>>106515068
You must have been fun at parties and had many friends.
>>
>>106515095
maybe not the entire internet but it needs to be of the same scale and diversity. the internet is just the most obvious and readily available source.
>>
>>106515105
>You have incredibly high standards for the RP
Isn't the whole point of your idea to make a better RP model than what we have now???
>>
>>106515112
>(without garbage spam),
But anon QC of any kind is bad remember?
>>
>>106515117
>maybe not the entire internet but it needs to be of the same scale and diversity.
Scale? Debatable. Diversity? Absolutely.
>>
>>106515119
No, we just need to lower the bar as much as possible for focused RP on 4B params or less.
>>
>>106515112
I'm using fineweb's 2013 subset on my next model to see what happens. I do wish we had even earlier internet crawls available.
>>
>>106515049
That's just a Llama problem
>>
File: fineweb-recipe.png (218 KB, 1786x672)
218 KB
218 KB PNG
>>106515153
>fineweb
>filter filter filter
>>
How do you format OOC comments? Do you do a newline after the dialogue and then "OOC:" or do you put it in parentheses or brackets? Do you use a colon?
>>
>>106515071
>Is vibevoice peak?
it is, that's why Microsoft didn't want the goyims to get that kino (but they somehow released it without lobotomizing it lol)
>>
>>106515113
I'm just trying to help you not waste time and compute on something that will turn out bad. It's sad to see energy get wasted on doomed projects.

>>106515105
I'm not talking about retarded shit like that:
>taking The ambient temperature of the room they're in into account in order to determine the exact amount of time it took for someone's nipples to get hard
I'm talking about things like this:
>>106514888
They are just topics. If you train on novels, your characters probably will have no idea about even the most famous MtG cards and rules because mentioning them in a novel is a copyright violation. Fanfics will help of course, but they won't help when you want your math kitten to write you a proof, or when you want to discuss the code you worked on at work with your wife.

>>106515123
No. Surprisingly filtering out "the a congo sex the the a congo congo nigeria vagina pussy pussy the the" documents is not bad for your model.
>>
File: 1725949242892983.png (732 KB, 1842x178)
732 KB
732 KB PNG
>>106515119
The initial test (I've already shown this and confirmed this to be the case) was to see if "uncucking" models is actually possible with further training. We've confirmed that is absolutely possible. Main reason I even bother trying is because many people here were adamant that once you safety tune a model enough, no amount of fine tuning can possibly erode away the guard rails.

What I'm arguing NOW is that training on the entirety of the internet is extremely inefficient. If it is possible to fine-tune a decent model with significantly less data than the entire internet, then that theoretically could mean you could have better models at lower parameters .... Keyword, theoretical. I'm not claiming that's actually the case currently.

>Isn't the whole point of your idea to make a better RP model than what we have now???

That's not necessarily what I've been arguing for the past hour or so. I'm talking about training scale, not whether or not we can make the models better. If you're referring to making the model less prone to refuse certain things and less likely to produce flowery, advertiser-friendly trash, then doing that via training is trivial. Pic rel is from a fine-tuned llama model. The fine-tuned model produced this while the safety-cucked version it's based off of either refused entirely or was extremely dodgy.
>>
>>106515165
>Applied URL filtering using a blocklist to remove adult content
>Applied a fastText language classifier to keep only English text with a score ≥ 0.65
yeees.
>>
>>106515153
CommonCrawl should have data from 2007. You just have to do language/spam filtering yourself.
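For anyone who actually wants to try, a rough single-pass sketch (warcio plus fastText's public lid.176.bin language-ID model; the WARC filename and cutoffs are placeholders, and a real pipeline would use proper text extraction and dedup instead of the regex tag strip):

import re
import fasttext  # pip install fasttext; lid.176.bin from the fastText site
from warcio.archiveiterator import ArchiveIterator  # pip install warcio

lid = fasttext.load_model("lid.176.bin")
TAGS = re.compile(rb"<[^>]+>")

def crude_text(html_bytes):
    # placeholder extraction; good enough to decide keep/drop, not for the final corpus
    return TAGS.sub(b" ", html_bytes).decode("utf-8", errors="ignore")

with open("segment-00000.warc.gz", "rb") as f:  # placeholder: one WARC file from an old crawl
    for record in ArchiveIterator(f):
        if record.rec_type != "response":
            continue
        text = crude_text(record.content_stream().read())
        words = text.split()
        if len(words) < 200:  # drop stubs and boilerplate-only pages
            continue
        labels, probs = lid.predict(" ".join(words[:400]))
        if labels[0] == "__label__en" and probs[0] >= 0.65:  # same threshold fineweb uses
            print(record.rec_headers.get_header("WARC-Target-URI"))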
>>
>>106515071
>>106515170
does it take voice files or is it strictly those demo voices? i don't see any HF spaces.

man last TTS i used was zonos, back in january i think.
>>
>>106514888
>>106515177
So what I'm getting from this is that you want a model that is good at role-playing about.... Programming?
>>
>>106515193
Yes, it takes wav files for voice cloning. Works fine with 10-40s or so.
>>
>>106515181
>better models
IMO it wouldn't be better if it doesn't have what you consider the internet bloat.
>>
>>106515165
I don't care, the majority of the training tokens are going to be ao3 anyway. I just need something a bit more noisy in the background to keep it learning and hopefully improve generalization
>>
>>106515197
I want it to be good at roleplay about anything I fucking want at a moment's notice, which includes programming or whatever else I enjoy.
>>
>>106515181
>If it is possible to fine-tune a decent model with significantly less data than the entire internet
Earlier you were talking about pretraining.
>>
>>106515207
BLOAT.
>>
>>106514823
>he didn't buy nvidia
lmao
>>
What if we trained a 0.1B on Nala test and nothing else?
>>
>>106515181
so you want a side grade at best to what we currently have, but with no trivia knowledge, which is something anons frequently complain about, to potentially lower the parameter count a bit? that sounds like an awful tradeoff.
>>
>>106515199
cool, i noticed there's a comfyui node setup for it. guess ill give that a go in a bit

https://github.com/wildminder/ComfyUI-VibeVoice?tab=readme-ov-file

>>106515221
peak the likes of which the world is not ready for (neither is my dick)
>>
>>106515193
I used a sample voice of one of the bitches from Class of '09. The included sample voices are ok though.

Included "alice" sample:
https://voca.ro/19VRhqX2fmcc

my shitty sample:
https://voca.ro/1hoVRSBntjxO

I really like how it handles quotes and speaks them in another 'tone' sometimes.
>>
File: 107991372.png (67 KB, 460x460)
67 KB
67 KB PNG
https://huggingface.co/unsloth/grok-2-GGUF
How many reuploads will it take to get a working version?
>>
>>106515207
>Use an intelligent model that is already pre-trained on programming
>Further fine tune it on a SFT roleplay data set with a variety of different scenarios
>????
>Profit


What it sounds like to me is that you want a general purpose model, which we already have in spades.
>>
>>106515226
The trade off is being able to run the model on a local machine versus a bloated model filled with useless shit that you need to offload to use.
>>
>>106515258
I'm surprised people even want to use elon's garbage. He tried pushing grok-code-fast a week ago too on a lot of providers and it was garbage
>>
>>106515246
thats fucking cuhrayzee holy SHIT. nice.
also that girl's a cutie, would you be willing to post the sample you use?
out of context the script you use is completely schizophrenic but i love it, got a good few laughs out of me the way her voice enunciates/exaggerates sometimes.
>>
>>106515264
So this entire retarded argument was in fact just poor cope as some had theorized, thanks for wasting the collective thread's time.
>>
>>106515260
Yes, exactly, that's what I'm saying. General purpose models are the only suitable models for RP.
>>
>>106515260
>general purpose model which we already hav e in spades.
And they're all shit because they're already too filtered.
>>
>>106515236
comfyui is full of telemetry now so we really need a new UI for vv
>>
>>106515280
Now the question is, is it possible to make GOOD general purpose models on less than an internet's worth of data while being decent? I'm assuming your answer to that is that's not possible.

>>106515289
>What is sft training
>>
File: file.png (114 KB, 580x491)
114 KB
114 KB PNG
>2M context
kek
>>
File: local assistant.png (416 KB, 798x1128)
416 KB
416 KB PNG
/lmg/ btfo
https://www.reddit.com/r/LocalLLaMA/comments/1nb0ern/fully_local_natural_speech_to_speech_on_iphone/
https://apps.apple.com/us/app/locally-ai-private-ai-chat/id6741426692
>>
https://voca.ro/1n5vlenAX1pf
>>
>>106515300
>Now the question is, is it possible to make GOOD general purpose models on less than an internet's worth of data while being decent? I'm assuming your answer to that is that's not possible.
That assumption is correct.
>>
>>106515304
The MNN app was better. This is just a redditor's cheap knock-off of the OpenAI app.
>>
>>106515300
>What is sft training
NOT a solution to filtered pre-train data, you cannot make it learn worth a shit after it was already lobotomized.
>>
>>106515209
Meant to say pre-train.

>which is something anons frequently complain
And something just as many anons claim doesn't matter.
>>
File: 1742752755504751.png (960 KB, 1816x276)
960 KB
960 KB PNG
>>106515317
???
>>
>>106515323
The "just rag it in bro" posters are not being serious.
>>
>>106515304
>I am here to answer quest-eons and to provide helpful re-sponses
>>
>>106515326
Congratulations on making the model say pussy. You won, that is totally what I meant.
>>
>>106515181
https://www.youtube.com/watch?v=LQCU36pkH7c
>>
>>106515333
With that assumption I could say the same thing about literally everything you said and vice versa.
>>
>>106515304
>ganyouhelpme
>>
>>106515341
Have you tried RAG?
>>
>>106515339
So you can agree "uncucking safety tuned models is impossible" is a nonsensical claim right?
>>
File: 1742596391245834.jpg (187 KB, 608x646)
187 KB
187 KB JPG
>>106515326
>It's so small, ..., almost like it was made of my cock
Your AI bot just called your cock small lmao
>>
File: chrome_6g1ierAshN.png (566 KB, 701x1255)
566 KB
566 KB PNG
>>106515304
White people tech.
>>
>>106515355
You're absolutely right, I don't even know why you're arguing with anons since clearly you can just do things and make the best model ever?
>>
Not all "safety tuned" models are the same. gp-toss is basically unsalvageable garbage. GLM without thinking and with a prefill will basically never refuse and it can write some fucked up shit.
>>
>>106515357
N....no it's NOT! It's perfectly reasonably sized my mom says so!

>>106515363
Why are you more upset that I don't agree with your hyper specific and autistic opinions? No one claimed they can make the best model ever.
>>
>>106515379
This really hits on the fundamentals of LLM safety, you're so smart for pointing this out!
>>
>>106515395
>Why are you more upset that I don't agree with your hyper specific and autistic opinions? No one claimed they can make the best model ever.
Assuming you mean me by "hyper specific and autistic", that post was not made by me.
>>
>>106515379
>GLM without thinking and with a prefill will basically never refuse and it can write some fucked up shit.
So does gpt-oss but the latter feels more creative than GLM. That model is too prone to repetition and it breaks down with context pretty fast. The honeymoon period didn't last long and I now use gpt-oss as my main model.
>>
File: 1748840868893020.png (1.77 MB, 1010x926)
1.77 MB
1.77 MB PNG
So I heard a little while back that the zucc wants to create "a personal superintelligence". Does he mean he wants all people to be able to use "super intelligent models"? What is his end goal here?
>>
>>106515437
Nta. So turning off "thinking" results in better quality and less refusals?
>>
>>106515439
he wants to create the perfect RAG
>>
>>106515439
All people with a Facebook account maybe, Meta is API first now thanks to Wang's wisdom.
>>
best model to be a kemoshota and get fucked by big bad wolves?
>>
>>106515459
that sounds like a lot of bloat knowledge.
>>
>>106515459
Well, I learned a new word today.
>>
>>106515463
bloat? thats essential knowledge
>>
>>106515459
Gemma 300M with RAG
>>
>>106515459
Male or female wolves?
>>
File: snort.gif (34 KB, 500x400)
34 KB
34 KB GIF
>>106515476
both
>>
>>106515459
real life you big faggot haha lol owned
>>
File: 1742276907743626.png (3 MB, 1862x5014)
3 MB
3 MB PNG
is this true?
>>
>>106515496
Fuck no, we can't do anything worthwhile, except for that one anon of course.
>>
>>106515496
it is, i was there.
>>106515491
i cant be a cute creature in real life anon kun...
>>
>>106515310
Seems like you increased the lethargy and dementia setting a little too high
>>
>>106515442
I like it more with thinking enabled. Usually what I have to do is grab one with a refusal and flip some of the words. After leaving one or two in context, it starts doing the thinking in an uncensored way. Then I like being able to edit the thinking part to customize the response.
>>
>>106515301
Holy slowpoke
>>
>>106515181
>If it is possible to fine-tune a decent model with significantly less data than the entire internet, then that theoretically could mean you could have better models at lower parameters
Huh? Training on less data would reduce training cost, but not parameters. If anything you'd need more parameters to reach the same level. More data makes a model more parameter-efficient, not less. You're confused, anon. People figured this out a long time ago when they moved past chinchilla scaling.
Current models ARE probably inefficient at RP because they're not being designed for it, but this doesn't mean you can skip out on training.
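For reference, the Chinchilla heuristic was roughly 20 training tokens per parameter, so a 4B model is "compute optimal" at only ~80B tokens; small models today (Llama 3 8B at ~15T tokens, i.e. nearly 2000 tokens per parameter) deliberately train far past that point precisely to pack more capability into fewer parameters.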
>>
File: 30474 - SoyBooru.png (118 KB, 337x390)
118 KB
118 KB PNG
More kiwis soon! (Qwen)
>>
>>106515496
I think these kinds of things should be documented. I'm pretty sure a lot of the stuff discovered back then is still not in any paper yet
>>
File: 1734560768546931.gif (3.12 MB, 498x370)
3.12 MB
3.12 MB GIF
>>106515561
>lot of the stuff discovered back then is still not in any paper yet
like what?
>>
>>106515545
Not the entire internet.
>>
>>106515291
It's open source. Show where.
>>
>>106515545
NTA, but since larger models memorize more, they're able to recall more of the rare information seen during pretraining. To some extent (it's not just that, admittedly), the stronger RP capabilities of those models are because of that. A smaller model pretrained primarily for the purpose of simulating human interactions, conversations and RP (instead of improving math benchmarks, etc) could potentially match the capabilities of larger models in that area.

Of course we'll never have that as long as we have anons who care about models doing the tsundere bubble sort in Erlang or proving the theory of relativity while giving you a blowjob as a mesugaki.
>>
>>106515612
I think the last few years have been pretty badly polluted by ai generated content, and all the political radicalization in the past decade or so doesn't help either. I think ideally you would use the entire pre-2012 internet.
>>
>>106515612
Yes, we'll filter out the spam and garbage.
>>
>>106515654
define garbage?
>>
>>106515654
You need to filter more than that to be efficient.
>>
>>106515545
This is correct. Bigger models are more sample efficient, so they can learn more from less data. In contrast, small models need more data to reach a given level of quality.
>>
>>106515663
"the a congo sex the the a congo congo nigeria vagina pussy pussy the the" kind of documents.
>>
Most posters itt are severely autistic.
>>
>>106515679
Yeah, can't believe anyone would argue for hours about how to train models instead of just doing it and shutting everyone up for good.
>>
>>106515679
https://vocaroo.com/12RPstjPnT74
>>
>>106512596
>you're a spoiled child if you expect AIniggerdevs to stop writing python slop code that requires command line manual installation in 2025
EXE. I want EXE. Where is the EXE.
>>
>>106515692
Sure, just download the entire internet, set up a filtering pipeline, then give me the money to pretrain a full model on it.
>>
>>106515692
It all started when I suggested that LLMs won't acquire significant "tacit knowledge" until they've seen large amounts of data, and that this could be expedited with targeted training data...
>>
>>106514698
Yes, it's been explained before, the context search isn't literal. Most NIAH tests involve having context like "John put some mayonnaise on his hamburger and hot dog." and then asking "What condiment did John put on his hamburger?". NoLiMa goes and asks something like "John got some french fries. What condiment(s) would he likely put on them?". That requires actual reasoning and connecting the dots across the context you have to extrapolate correctly, which is harder when you aren't, as said, literally matching what you have seen in the context to the question asking about it.
>>
>>106515692
https://vocaroo.com/1nWbsRIXibi3
>>
>>106515692
it takes time, anon
>>
>>106515679
I should hope so. Autistic people are how we get our best technological breakthroughs.
>>
File: nip.jpg (620 KB, 2569x1927)
620 KB
620 KB JPG
Why hasn't OpenHands finetuned a coding model on Qwen3-coder yet? why use Qwen2-coder
>>
>>106515719
Exactly. llama.cpp is popular because you can download it as an exe file and run, no pythonshit needed.
>>
>>106515741
god damn thats so good ladies and gentlemen the best tts out there
>>
>>106514823
>>decide to check on ipex-llm to see if they finally updated to support latest models like oss-gpt
They are all in on vLLM, and for good reason too, because of the enterprise side and project BattleMatrix. They do what they can with ipex-llm and contributions to llama.cpp but it is lower priority and neglected. Mainline llama.cpp SYCL isn't that bad, but you can see the neglect when a crashing bug was fixed in https://github.com/ggml-org/llama.cpp/pull/15582 but a mistake was made and it hasn't been followed up on, with two weeks and counting to get it merged in. Sad.
>>
gay thread. 80% of the population is newfags, 80% of the population is troons.
>>
>>106515769
>80% of the population is troons.
It's not quite that bad yet nonnie.
>>
>>106515766
>vLLM
Are they actually directly contributing to vLLM to have native ipex support or do you still have to go through ipex-llm to use vLLM with ipex?
>>
>>106515769
Saar! You forgot India mention! India AI superpower 2025 Gemini Google. Kindly say 80% posters Indian thank you saar.
>>
>>106515787
80% of population is indeed indian too.
>>
>>106515790
so 80% indian train gays? Waow!
>>
>>106515760
https://vocaroo.com/121za8zMgKiQ
>>
Will we soon look at "coding" the same way we look at "calculating"? Prior to calculators and computers, we used to have rooms of humans doing things like ballistics calculations.
Now you still need to know *math* to use a calculator or spreadsheet effectively to solve problems that span more than a single operation, but you don't need to do *arithmetic* any more.
tl;dr vibecoding normies are just monkeys banging on a calculator to get "8008135". There's higher-order knowledge needed to make software.
>>
>>106515803
shit, a few seconds in I knew I was going to contract orange man cancer
>>
>>106515784
https://docs.vllm.ai/en/latest/getting_started/installation/gpu.html
You have to build the wheels yourself but they are contributing as regularly as a company going bankrupt with limited resources is doing
https://github.com/vllm-project/vllm/commit/e599e2c65ee32abcc986733ab0a55becea158bb4
This is on par with their Pytorch cadence. This was the last SYCL related commit to llama.cpp in comparison and it wasn't even done by Intel.
https://github.com/ggml-org/llama.cpp/commit/8b696861364360770e9f61a3422d32941a477824
>>
>>106515803
But I can't use voice clips from my favorite anime voice actress
>>
>>106515802
Yes saar India love trains saar very prod of the country
>>
>>106515831
MAKE NEW THREAD BLOODY
>>
>>106515803
nice
>>
File: 1730545865388820.png (1.65 MB, 712x1188)
1.65 MB
1.65 MB PNG
>>106515719
>>106515759
If you get filtered by CLI you deserve suffering.
>>
>>106515679
I have never been diagnosed with autism.
>>
>>106515852
Fuck off pyshitter.
>>
>>106515855
Only a doctor can diagnose autism.
>>
>>106515803
what settings are you using? mine are all coming out completely schizophrenic.
>>
>>106515719
>>106515759
>>106502028
https://vocaroo.com/1j4yGPQKczdx
>>
>>106515877
30 steps. It depends on the voice sample too.
>>
>>106515852
Damn, Python looks like that?
I don't care that she's a bit slow, she's bloated in all the right places.
>>
>>106502028
Actually incredibly based opinion, but troons will disagree.
>>
Has anyone sussed out any best practices for vibevoice samples? I'm not sure yet if it's better to go for longer samples or to trim it down closer to the length of the audio you're trying to make.
>>
>>106502028
That is what kobold is for. And from kobold you can fall into a trap of oobashit or you can go straight to llamacpp. Once you have it set up it honestly isn't that bad. I don't even have a bat, just have the commands in a textfile and it werks. Even without bat it is actually faster than oobashit and kobold.
>>
>>106515896
does this shit need a longer sample? same 8 second sample i used with zonos is completely wonked out with default settings in the node setup, adding steps doesn't change it.
>>
>>106515437
>The honeymoon period didn't last long
this has been the entire history of the GLM models and only retards keep pushing them
>>
File: 1753483768624402.png (820 KB, 724x510)
820 KB
820 KB PNG
>>106515907
>>106501412
>SPEAK LIKE A HUMAN BEING YOU SYNTHETIC MONSTER
>>
>>106515437
>and I now use gpt-oss as my main model.
what?
>>
>>106515929
I used a 23s sample for that one.
>>
>>106515437
>I now use gpt-oss as my main model
Bro, you're supposed to turn off the model's thinking, not your own
>>
>>106515929
most of my samples are a minimum of 40 seconds but two minutes gives the best results
smallest one is the Mandy sample I used here >>106515879 at like 38 seconds, and I cleaned it to the best of my abilities but some of the background noises still bleed thru
>>
>>106515926
https://vocaroo.com/1bFeQGTMqTTf
>>
>>106515972
you never had any thinking when you thought glm was a good model bro
>>
>>106515496
>Holo prompt
Was ist das? Google returns garbage.
>>
>>106515963
noted. I guess it really didn't like the combination of my sample being kinda fast + only 8 seconds.
here's alucard reading this post >>106515787
(30 steps seems like the max it needs for a quality boost)
https://voca.ro/1i3Yya3rUVn6


>>106515985
cool thanks for the info, gonna be a challenge to get that character over 8 seconds but at least alucard had that 10+ seconds kek
>>
>>106515999
go to the link in the image and read the thread
>>
>>106515999(me)
ah fuck, should've scrolled down the image.
>>
>>106515830
Why not?
https://vocaroo.com/1beCnoUdgpID
>>
File: dipsyByzantine1.png (3.44 MB, 1024x1536)
3.44 MB
3.44 MB PNG
>>106515810
The real change will come when there are "vibecode" specific languages created.
It's only a matter of time.
>>
>>106516038
It's called English, r-tard.
>>
yeah my results are aaaalll over the place, but this turned out really nicely.

https://voca.ro/16gmTFt1O8vf
>>
>>106516038
isn't that just python?
>>
>>106516059
I found out that the ComfyUI implementation is all over the place. The Python demo is way more reliable and more consistent.
I don't know if it's because of Cumrag itself or if the implemented nodes are bad. I can only guess.
>>
>>106515907
>>106515719
https://voca.ro/1b5FwnOiykK6
>>
>>106515759
llama.c when
>>
>>106516038
Javascript already exists, anon. People have been vibecoding JS since before AI existed.
>>
>>106516059
I think it struggles with voices that have a very high dynamic frequency range like Peach there. It's difficult to get a sample for certain seiyuus where they aren't peaky like that, since that's part of the appeal.
>>
>>106516077
could ya link me the python demo? thanku.
glad to know i'm not alone.
>>
>>106516074
and javascript for when interfaces are needed
>>
>>106516080
kek
>>
>>106516080
I think I recognize that voice a little but can't quite place it, is it a cartoon girl bully?
>>
>>106516091
https://github.com/vibevoice-community/VibeVoice/
>>
>>106515803
make him do porn noises
>>
>>106514051
>the quality of the output starting to go downhill if I generate anything longer than 30 seconds.

Not true with the native implementation (clone Microsoft/VibeVoice)

I did not go over 4 min. However, it stays consistent all along
>>
>>106516106
It's the Witch from Slay The Princess.
>>
>>106516037
https://vocaroo.com/1iIp0ji2b59p
>>
>>106516091
>>106516108
Forgot that you'll need to look at inference_from_file.py
and do something like
>python demo/inference_from_file.py --model_path ./VibeVoice-1.5B --txt_path ./test.txt --speaker_names Faggot
>voices go into demo/voice and are named en-Faggot_male for example
>>
>>106516117
It sadly gets pretty rough even with that when you try to do a 40 minute script. It slowly starts getting worse.
>>
>>106516130
never played that game and checked her imdb, barely any roles and mostly online shit
weird
>>
amerilards cant into bakery
>>
>>106516108
>>106516140
thanks. i noticed there's two repos for this too, the wildminder one and one by a guy named Fabio Sarracino.

Might just uninstall wildminder's and risk getting AIDS from Fabio. Worth a shot.
>>
>>106515810
I've been thinking a lot about this old Twilight Zone episode that depicted future programmers as just people with microphones that speak to the machines.
There was also a time when compilers were looked down on. They saved time by letting you write in C, but oftentimes the result was not performant, didn't output valid Assembler, and you ended up having to write your own anyway. Nowadays, almost no one has to write hand rolled Assembler anymore, and attempting to do so outside a few niches would result in worse code than what the compiler is capable of writing.
The technology is still new. I'm sure the transition to higher-order knowledge work is inevitable, but it's probably still decades away.
>>
>>106516038
Tokens to represent ASM opcodes?
direct cpu token interpretation?
Token microcode?
>>
>>106516147
It's built on Qwen 2.5, so naturally it will start to degrade when you try to use the full context it claims to support. Just chunk it. At least you can do far bigger chunks than with other TTS.
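A dumb chunker is enough as a starting point (toy sketch; the 1500-word cap is arbitrary, tune it to wherever quality starts sagging for you):

def chunk_script(lines, max_words=1500):
    # split a long script into chunks of whole lines, each under the word cap,
    # so every chunk gets its own generation call with the same voice sample
    chunks, cur, count = [], [], 0
    for line in lines:
        n = len(line.split())
        if cur and count + n > max_words:
            chunks.append("\n".join(cur))
            cur, count = [], 0
        cur.append(line)
        count += n
    if cur:
        chunks.append("\n".join(cur))
    return chunks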
>>
>>106516038
Shouldn't it be sygma instead of capital C

naoshite kure
>>
>>106516109
https://vocaroo.com/1oL6yJxfeEIp
>>
>>106516198
Then we build a language on top of that.
We need more layers of abstraction.
>>
>>106516198
The vast majority of enterprise LOB apps and startup shovelware does not need to go that low level. The trend is always towards more abstractions, not less. If anything, a new language designed for use by LLMs would be an abstraction over Python.
>>
>>106516160
Yeah, as best I can tell, she's a pretty mid streamer with a ton of untapped voice acting talent. I think she only got the Slay The Princess role after sending something directly to the devs. I hope she goes out and gets more roles, because she knocked it out of the park with the one she got.
>>
>>106516218
L.O.L.
You made my weekend!
>>
>>106516218
After hearing some porn noise samples posted here, I can now conclude that rumors about VV being pulled due to NSFW usage are false.
>>
>>106516218
Lol
>>
>>106516147
what kind of "limit" is it? it is the github demo with gradio
Loaded example: 1p_Ch2EN.txt with 1 speakers
Loaded example: 1p_abs.txt with 1 speakers
Loaded example: 2p_goat.txt with 2 speakers
Loaded example: 2p_music.txt with 2 speakers
Loaded example: 2p_short.txt with 2 speakers
Loaded example: 2p_yayi.txt with 2 speakers
Loaded example: 3p_gpt5.txt with 3 speakers
Skipping 4p_climate_100min.txt: duration 100 minutes exceeds 15-minute limit
Skipping 4p_climate_45min.txt: duration 45 minutes exceeds 15-minute limit
Successfully loaded 7 example scripts
Launching demo on port 7860
>>
>>106516038
we tried a few times making programming languages that are close to natural language and easy for humans, but results were mediocre.
SQL is a particular disaster that will haunt us forever.
>>
>>106516275
Oh, so it's supposed to be used with shorter scripts. Is it only the 1.5B one that can do long scripts?
>>
>>106516218
lmaooooo, that's why I love this site, I know I'll find kino shit like that at some point
>>
>>106516288
The model card says 7B can do 45 minutes and 1.5B can do 90 minutes.
>>
>>106516309
It's still mostly intelligible at 45 minutes, but it does hurt your ears, so it's not a complete lie.
>>
>>106516288

hmmm...
>>
>>106516276
I think it'll be something far different, that probably won't make intuitive (human) sense. Basically human unreadable.
I get the point, can't train what you don't have examples of, and there's lots of python and JS to copy from.
But I expect there will be some intermediary language that LLMs (or whatever) can manipulate really easily, and humans won't be able to understand at all.
>>
https://voca.ro/1gZ6xankFzjP
>>
>>106516349
eh, just skip the middle man and train it on machine code directly at this point.
Chunk generated output into a VM, see if it works, reward/punish the model, repeat.
>>
>>106516368
>>106516368
>>106516368
>>
>>106516373
>eh, just skip the middle man and train it on machine code directly at this point.
Do you have any idea how many tokens it would need to spit out to do even the most trivial tasks?
>>
>>106515679
high functioning autist hobby that filters most people
your average joe wouldn't know how any of this works except thinking that chatgpt is some kind of magic word machine that pulls stuff out of thin air
>>
>>106516412
>high functioning autist forum that filters most people
ftfy


