/g/ - Technology






File: 1729161978418371.jpg (607 KB, 1080x1920)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>106582475 & >>106575202

►News
>(09/14) model : add grok-2 support #15539 merged: https://github.com/ggml-org/llama.cpp/pull/15539
>(09/11) Qwen3-Next-80B-A3B released: https://hf.co/collections/Qwen/qwen3-next-68c25fd6838e585db8eeea9d
>(09/11) ERNIE-4.5-21B-A3B-Thinking released: https://hf.co/baidu/ERNIE-4.5-21B-A3B-Thinking
>(09/09) Ling & Ring mini 2.0 16B-A1.4B released: https://hf.co/inclusionAI/Ring-mini-2.0
>(09/09) K2 Think (no relation) 32B released: https://hf.co/LLM360/K2-Think

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: threadrecap.png (1.48 MB, 1536x1536)
►Recent Highlights from the Previous Thread: >>106582475

--Paper: Steering MoE LLMs via expert activation/deactivation for behavior control:
>106586569 >106586649 >106586696
--Papers:
>106589525
--Node-based agent circuit for multi-model daydreaming experiments:
>106591301 >106591335 >106591411 >106591447 >106591518 >106591560 >106591683
--DDR5 RAM purchase recommendation for glm air over waiting for Arc B60:
>106585865 >106585907 >106586028 >106586157 >106586691 >106587973 >106588740 >106588044
--MoE architecture enables larger models to be faster through selective parameter activation:
>106587275 >106587302 >106587405 >106587419
--glm 4.5 air setup issues in Silly Tavern template configuration:
>106586816 >106586886 >106587013 >106587027
--Qwen model dataset imbalances and performance tradeoffs:
>106582623 >106582643 >106583124 >106583138 >106583143 >106583155 >106586595 >106583147 >106592024 >106592033 >106592110 >106592242
--VibeVoice model availability, quality tradeoffs, and reverse-engineering challenges:
>106585909 >106585930 >106585940 >106588461 >106586039 >106586587 >106586610 >106586647 >106587720 >106586704 >106587007 >106587090 >106588243
--CPU offloading performance trade-offs for mid-sized MOE models:
>106583262 >106583338
--IndexTTS 2 speed and interface improvements for text-to-speech:
>106585295 >106585756
--Grok-2 support merged into llama.cpp:
>106587526 >106589842 >106589942 >106589949 >106590115
--Critique of flawed AI-generated writing despite model advancements:
>106592247
--ROCm 7.0 RC1 boosts AMD's AI performance, challenging NVIDIA dominance:
>106589235 >106589359 >106589362
--Parameter tuning suggestions for K2 model version differences:
>106584425 >106584478 >106585603
--Miku (free space):
>106584024 >106584226 >106584417 >106587589 >106587800 >106589360 >106589741 >106589764 >106592033 >106589913

►Recent Highlight Posts from the Previous Thread: >>106582480

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
qwenext goofs???????????????????????????????????????
>>
>>106593104
I want to do backpropagation with her, if you know what I mean.
>>
File: 1744660410909828.jpg (1.66 MB, 1166x2527)
>OP image is a random non-slop Miku I posted a few threads ago
>>
>>106593104
PANTYHOSE FEET
>>
>>106593132
Boil some rice, put it on a plate and let it dry and then eat it for a similar experience
>>
>>106593164
yes but
WHERE ARE THE GOOFS?
>>
>>106593180
goofs for this feel?
>>
>>106593180
I look like this, say this, and also fail to quote posts
>>
Jaks are a sign of a diseased mind.
>>
File: 1737232912753522.png (996 KB, 1648x1300)
>>106593196
>>
what prompts this schizophrenia? I just want my hecking wholesomechungus 'THE CAKE IS A LE LIE' qwen 80b goofs
>>
>>106593208
Me when using gpt oss
>>
this thread ggoofy af
>>
The melting man is back
he's much softer than before
did you borrow a personality
or did you steal it all on your own?
>>
File: 1755970028871316.jpg (520 KB, 1824x1248)
>>
File: cypher.jpg (51 KB, 768x384)
>decide to take a break from /lmg/ and doomscroll on twitter for a bit.
>it's not X, it's Y
>the smell of stale cigarette smoke and regrets
>fake greentext pasta spaced into paragraphs
>you hit on the core of the issue
>shivers, ozone, Elara, emojis
how do I unsee
>>
File: 1751389967120259.png (351 KB, 503x461)
https://files.catbox.moe/eegitb.jpg
>>
>>106593302
I fell for it last time, ain't happening again.
>>
>>106593302
thicku miku
>>
>>106593302
nigga that's nuts
>>
>>106593302
Meh.
>>
https://old.reddit.com/r/LocalLLaMA/comments/1nhgd9k/the_glm_team_dropped_me_a_mail/
lol glm has employees doing social media engagement
wonder if one of them is among the people shitting this thread right now
>>
OP just delete thread if you can
>>
>>106593393
nah, fuck qatroons
>>
>>106593386
You are even more gullible than reddit.
Or something worse.
>>
>>106593393
Let the retard seethe. It's not like he can do anything.
>>
>>106593386
why would GLM shit up the thread where their models are praised?
>>106593413
what's the shitter even angry about? Is it the thread mascot debate again?
>>
>>106591301
I was thinking of fucking around with those sorts of workflows to see if I can make a smaller model perform better by making it go through steps before providing a final response. Almost like a thinking workflow that tries to extract as much information from the big picture to then focus on the relevant details and the like.
I got caught up with other projects and ended up forgetting about that.
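Roughly what I mean, as a sketch (assumes a llama.cpp-style OpenAI-compatible server on localhost:8080; the endpoint, prompts and names are just placeholders):

import requests

API = "http://localhost:8080/v1/chat/completions"  # assumed local server

def ask(messages, max_tokens=512):
    # one round trip to the model
    r = requests.post(API, json={"messages": messages, "max_tokens": max_tokens})
    r.raise_for_status()
    return r.json()["choices"][0]["message"]["content"]

history = [{"role": "user", "content": "...the chat so far..."}]

# pass 1: force the model to survey the big picture first
notes = ask(history + [{"role": "user", "content":
    "List the key facts, characters and constraints so far. Notes only, no reply."}])

# pass 2: pin those notes in context and ask for the real response
reply = ask(history + [
    {"role": "system", "content": "Working notes:\n" + notes},
    {"role": "user", "content": "Now write your actual response."},
])
print(reply)

The point being that the second pass answers with the extracted notes already in context instead of having to do both jobs at once.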
>>
>>106593420
fuck your thread culture bullshit
>>
>>106593421
What's the UI in the quoted reply? Seems cool.
>>106593424
fuck you I didn't even advocate for "thread culture" I was just asking a question you dork
>>
I ask again, just in case. Can "Mistral-Nemo-Instruct-2407-GGUF" handle beyond 16K context?
>>
>>106593429
Try it. Only you can know if it can handle it to your satisfaction.
>>
>>106593429
Technically yes but realistically no. Just try it out for yourself, the model could fit on a 6G card ffs
>>
>>106593427
>What's the UI in the quoted reply?
Not sure, but I know of two UIs that can do that kind of thing, NoAssTavern (simpler and recommended), and astrsk (don't even download it, has telemetry and shit).
>>
>>106593429
it creates mustard gas
>>
>>106593429
No.
>>
>>106593429
Yes, of course.
It will perform worse than it does at, say, 4k context, however.
>>
>>106593420
>why would GLM shit up the thread where their models are praised?
you assumed I was talking about the meme spammer. I don't even pay attention to his image spam, it doesn't register in my eyes, image posters are to be ignored.
I was talking about people who praise this garbage model like you, you are the reason this is a garbage thread
spammer is just a minor annoyance that will go away after a b&, the retards never go away though
>>
>>106593462
>image posters are to be ignored.
sir this is image baords
>>
>>106593444
Huh, I stumbled upon another interesting UI called "talemate" mentioned in one of the NoAssTavern's issues.
https://github.com/vegu-ai/talemate
>>106593462
Every model smaller than Deepseek is garbo, get a grip. Smaller models like Air are the only thing most people can run. Fucking hell, you see how often Rocinante gets mentioned here? What is there to discuss with "non-shit" models if nobody can run them you dickweed?
>>
>>106593487
>talemate
Alright, that looks promising.
>>
>>106593487
>if nobody can run them
then let's close this so called local model general if no one is even doing local?
>>
>>106593504
>if no one is even doing local
Nobody is using anything smaller than deepseek? news to me...
>>
>>106593504
I am running the local sir
GLM chan very large
>>
>>106593511
deepseek 8b
>>
>>106593511
>Every model smaller than Deepseek is garbo
you said it yourself it's time to stop
>>
After I stopped shitposting in this thread the quality of it became even worse. I can't believe it.
>>
>>106593526
You're absolutely right! This really delves into the tapestry of how shit lmg is!
>>
File: 1746722380902789.mp4 (3.82 MB, 480x852)
>>106593420
like kids need a reason to be angry
>>
>>106593539
>itt raises the kid experience
>>
>>106593525
It's garbo compared to large, cloud-hosted models but it's still fun. If the only car you have is a shitbox, do you throw it away? Come on, man.
>>
File: 1755964542474429.jpg (186 KB, 768x1024)
>>106593393
Delete your posts
>>
>>106593553
>If the only car you have is a shitbox, do you throw it away?
yes, take the bus and train (API) like a normal person
>>
>>106593525
maybe I love garbo
>>
>>106593539
While it doesn't change my position on it at all, I suddenly understand where the proponents of age verification are coming from.
>>
>>106593566
That wouldn't help tho as clearly an adult is helping and encouraging the corruption
>>
>>106593301
You cannot close your eyes once they've been opened
>>
>>106593566
lmao you actually think age checks are to protect kids?
>>
>>106593575
anon is you okay, you can close the eyes
>>
>>106593559
Nah I think I'll stick to my shitbox. I can drive it when and wherever I want, and it won't suddenly change routes and timetables. But I support your ability to choose, just don't pretend like the only options are public transport or a lambo...
>>
>>106593301
If you get into imagegen, you'll see it everywhere.
>>
>>106593574
It wouldn't, but I get the emotional reaction.
>>
Thanks this is very helpfuls.
>>
>>106593612
I do not like this miku
>>
>>106593587
Im fine. Thanks for asking
>>
can i get a short stack miku pls
>>
File: 1754402174485487.png (2.62 MB, 1024x1536)
>>106593629
>>
>>106593690
best xhe can steal is shart miku
>>
File: 1757278579632716.jpg (1.43 MB, 2000x1500)
>>106593690
No. You get a baby Miku instead.
>>
Is NoobAI still the meta or have things moved on
>>
>>106593694
>>106593698
>>106593704
my day is ruined
>>
>>106593709
ponyv7 releases this month
>>
>>106593743
oh? can it be downloaded or is it online only?
>>
>>106593743
back to your board barney
>>
>>106593743
more sdxl slop?
>>
>>106593774
as opposed to what then?
>>
>>106593777
you haven't heard about the current best local model called chroma?
>>
>>106593777
idk, I haven't kept up with image gen, I wish we had something integrated with LLMs instead of CLIP
>>
>>106593777
Chroma SOTA 4futures!
>>
>>106593787
Can it match noobAI/pony for character stuff?
>>
>>106593787
That's just a rip off of ligma
>>
>>106593774
Wasn't it gonna be based on some random shit nobody has ever used
>AuraFlow
Yep.
>>
>>106593756
weights
>>106593774
it's based on auraflow
>>
>>106593813
>weights
ok, can it be downloaded or is it online only?
>>
>>106593820
Yes you will be able to download it
>>
>>106593829
Thank you.
>>
>>106593832
You're not welcome
>>
>>106593832
You're free to leave
>>
File: 1736470160461856.png (1.75 MB, 894x766)
>>106593104
Good morning /lmg/ frens. I've got a question:

So is it pretty much confirmed fact that you HAVE to use at least a 12B model in order for it to be "smart"? (Not forgetting important details mentioned earlier in the context)? Based on my own testing 7B - 8B models struggle immensely with this. What has your experience been like with the different sized parameter models?
>>
>>106593869
If you don't train on The Entire Internet a simple 4B is more than enough for the narrow use case of ERP.
>>
>>106593104
mikubutt
>>
>>106593914
should've been a miku short stack
>>
>>106593869
I wouldn't say smart, but 12b models are about the starting point where you don't need to hold their hand for every reply to get a usable output.
>>
>>106593919
*miku shart stacked
>>
>>106593539
He's just like me except I'm using a pc
>>
VRAMlets:
>image generation
pretty good
>voice cloning/TTS
okay
>text generation (simple)
decent
>text generation (advanced)
really bad
>>
>>106593930
What is this (advanced) thing about?
>>
>>106593935
DeepSeek K2 4.5
>>
>>106593869
I don't think 12B is enough, Nemo is pretty dumb too. GLM-air often mistakes who did what and struggles with theory of mind (secret keeping test and such). I'm not cool enough to run larger models though.
>Not forgetting important details mentioned earlier in the context
This one in particular is about specific context training and architecture, not really about parameter size.
>>
>>106593935
not brain dead
>>
>>106593942
>GLM-air often mistakes who did what and struggles with theory of mind (secret keeping test and such)
Mistral Small 24b and Gemma 27b are guilty of both these things as well.
>>
>>106593942
>GLM-air often mistakes who did what
sounds like prompt format issue that nemo used to have early on, probably broken implementation as usual
>>
Holy schizo
>>
Cursed schizo
>>
File: 1727769022327347.png (1.18 MB, 914x594)
>>106593953
>>106593958
>>106593942
>>106593869
>>106593881
>>106593924

So I guess we have to accept that ALL local LLMs will fuck up in some way, shape, or form? What contributes more to how BADLY it fucks up: parameter size, architecture, and/or training methods?
>>
>>106593857
>>106593862
Bawww.
>>
>>106593958
I mostly run it in text completion mode
can't have prompt format issues if you don't format your prompts.
>>
>>106593942
>GLM-air often mistakes who did what and struggles with theory of mind (secret keeping test and such).
Funny. I find that it does pretty well in keeping secrets.
Granted, I do prefill the thinking block with instructions to consider exactly those things, which might have some adverse effects in other areas I guess, but still.
To me, the one strong point about GLM is that it actually follows its thinking, instead of something like Qwen that might draft a whole plan in the thinking block then reply with something completely different, even with guidance.
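The prefill itself is nothing fancy, roughly this (a sketch; assumes a GLM-style <think> tag and llama.cpp's /completion endpoint, details vary by model and template):

import requests

transcript = "..."  # chat history already rendered in the model's own template
prefill = ("<think>Before replying I need to track: who knows which secrets, "
           "where each character physically is, and what was just said. ")

r = requests.post("http://localhost:8080/completion",
                  json={"prompt": transcript + prefill, "n_predict": 512})
print(prefill + r.json()["content"])  # the model continues from inside the think block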
>>
>>106594021
And for clarification I'm mostly referring to forgetting details right after you mentioned something, temporal coherence (if a system prompt or previous prompt mentions they're in a park, they should stay in the park until stated otherwise or the LLM makes a transition that makes sense), not randomly switching the genders of main characters (this one really likes doing that: >>106593869), etc
>>
>>106594021
>What contributes more to how BADLY it fucks up: parameter size, architecture, and/or training methods?
yes
>>
>>106594021
>What contributes more to how BADLY it fucks up: parameter size, architecture, and/or training methods?
Training on The Entire Internet will do that to you.
>>
has someone scrapped AO3 to create a dataset?
>>
>>106594111
it's already on most models and yes they did to creators dismay and threats
>>
>>106594111
IDK if it's specifically from AO3 or from other sites too, but here's the closest thing I could find to something like that that hasn't been nuked

https://huggingface.co/datasets/mrcuddle/NSFW-Stories-JsonL

It's not formatted to actually be useful for training but it does have a bunch of raw stories.
>>
>>106594146
https://archive.org/details/AO3_final_location
>>
>>106594111
it's better to just do it yourself so you can filter it however you like. it's like 40% gay porn by tag and 50% Harry Potter by universe. it needs balancing if you want it to be useful.
>>
File: wild-macintosh.jpg (223 KB, 1125x741)
I thought I could get away with running an unquanted <4B model CPU-only on an old machine.
Nope, absolutely unusable.
Edge AI Status: Meme.
>>
>>106593869
Again, your prompting format is all wrong, if that's Llama 3.
>>
>>106594126
Gemma 2/3 and Mistral Small, from what I've tested, didn't appear to be trained on the ones explicitly tagged as "Explicit" or "Underage".
>>
>>106594305
It isn't. Elaborate further if you're certain it is. If you're going to tell someone something is fucked up with the hopes they will unfuck it, at least explain WHY....
>>
>>106594319
i mean obviously, why train on low quality illegal shit, the classifier correctly said hell no to that sick shit
>>
>>106594324
https://www.llama.com/docs/model-cards-and-prompt-formats/meta-llama-3/
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful AI assistant for travel tips and recommendations<|eot_id|><|start_header_id|>user<|end_header_id|>

What is France's capital?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Bonjour! The capital of France is Paris!<|eot_id|><|start_header_id|>user<|end_header_id|>

What can I do there?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Paris, the City of Light, offers a romantic getaway with must-see attractions like the Eiffel Tower and Louvre Museum, romantic experiences like river cruises and charming neighborhoods, and delicious food and drink options, with helpful tips for making the most of your trip.<|eot_id|><|start_header_id|>user<|end_header_id|>

Give me a detailed list of the attractions I should visit, and time it takes in each one, to plan my trip accordingly.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
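And a quick sketch of assembling that by hand (Python, illustrative only), since the blank line after each <|end_header_id|> is the part that's easy to miss:

def llama3_prompt(system, turns):
    # turns: list of (role, text) pairs, role is "user" or "assistant"
    out = "<|begin_of_text|>"
    out += "<|start_header_id|>system<|end_header_id|>\n\n" + system + "<|eot_id|>"
    for role, text in turns:
        out += "<|start_header_id|>" + role + "<|end_header_id|>\n\n" + text + "<|eot_id|>"
    # leave it open so the model continues as the assistant
    out += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return out

print(llama3_prompt("You are a helpful AI assistant for travel tips and recommendations",
                    [("user", "What is France's capital?")]))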
>>
>>106594302
>CPU-only
Yeah, that's going to be a pain. Not so much the token generation, but prompt processing is so slow.
There's a reason we use MoE models the way we do, generation on CPU, PP on the GPU.
That said, does whatever device not have a GPU you could use for PP with vulkan?
>>
File: fgsfds.png (1.19 MB, 894x766)
>>106594324
just look right at the middle of the screenshot, man.
>>
>>106593104
Can someone recommend best Mistral model? Preferably abliterated
>>
>>106594382
The biggest you can run. Any.
>>
>>106594382
Medium 3 or Large if you know where to look.
>>
>>106594355
Old machine was promoted into a home server after I got a new one. I like my home servers to be quiet and low-power, so I don't feel like sticking a GPU in it.
>>
>>106594353
>>106594356
That's a fuck up with how axolotl inference outputs; it likes to duplicate portions of text. Here's the correctly formatted text file I inference off of

https://files.catbox.moe/fozpkz.txt

Nothing in it is fucked up as far as I can see...
>>
File: 1740908472801028.png (238 KB, 465x279)
>>106593301
Enjoy the wonderland and see how deep the rabbit hole goes
>>
>>106594408
>>106594356
>>106594353
>>106594305
Either way, it completed in the exact fashion it was supposed to, so I don't see what the hyper fixation on that is.
>>
>>106594421
A single extra space can make your model drop 90IQ
>>
>>106594421
>I don't see what the hyper fixation
>>106593869
>Not forgetting important details
>Based on my own testing
>>
>>106594435
>>106594439
Nta. So what was stopping you from pointing that out the first time?
>>
>>106594421
>>106594446
Nta it's technically formatted correctly but also not really. It has duplications of the assistant token towards the middle and the end. Remove those and then try again. Not quite sure why ultra autists >>106594353
>>106594435
>>106594439
were so unwilling to point that out
>>
File: gl3cf.png (44 KB, 1083x289)
>>106594446
The assumption that anon can google "llama3 chat format".
In that much, I admit I was wrong.
I don't care either way. Anon wanted info on how his chat format is wrong. I provided it.
>>106594461
>it's technically formatted correctly but also not really
It is or it isn't. It is not.
>>
>>106594461
>That's a fuck up with how axolotl inference outputs
>>
>GLM-4.5-IQ2_M
is it even worth using or would i be wasting my bandwidth?
>>
>>106594470
They understood how the formatting works, it just had duplicates for some reason. He probably ran the prompt through AI or something and it injected the duplications and they didn't realize.

A simple "hey you have duplicate assistant tokens you might want to remove that" what have sufficed instead of being condescending. You know it's exhausting going out of your way to be that way right?


Not that it would have made much of a difference anyway since anything below 12b is retarded regardless.
>>
>>106594499
>anything below 12b is retarded regardless.
completely wrong though that is the fault of training on too much data
>>
>>106594514
Who are you referring to?
>>
>>106594522
every lab right now cramming too much into small models instead of making narrow use case ones
>>
>>106594527
You mean something like
>https://huggingface.co/allenai/Flex-creative-2x7B-1T
>>
>>106594499
Anon is assessing the quality of models and can't use google, read or follow instructions.
>they, he, they
Be consistent.
I posted the example in llama's site. With his carefully constructed tests, eagle eye and attention for detail, I would have expected him to notice all the empty space between the chat format tokens and the content, which his catbox post clearly doesn't have. The other anon pointed out the template dups.
>>
File: 1729339670620776.jpg (162 KB, 1782x964)
>>106594559
>data owners can contribute to the development of open language models without giving up control of their data. There is no need to share raw data directly, and data contributors can decide when their data is active in the model, deactivate it at any time, and receive attributions whenever it's used for inference.

What?
>>
>>106594559
no what the hell is this abomination fuck allencucks
>>
>>106594565
I used the format though, it just had duplications. The only error was the duplications...
>>
>>106594387
>>106594394
Ty, I just saw a lot of focused tarins... focused on some specific stuff like RP or philosophy, but I was looking for a good one for general purpose research and deep thinking. So wondering if maybe someone knows a good one that stands out
>>
>>106594575
>>106594576
There's also a literal reddit version.
>https://huggingface.co/allenai/Flex-reddit-2x7B-1T
>>
>>106594585
What da fak I just spit out lol, I mean *trainings
>>
>>106594583
>The only error was the duplications
You're missing the empty lines.
>>
does Linux have an alternative to sillytavern yet
>>
>>106594598
>>106594559
It claims they can contribute to training without providing the user data.... How the fuck does that even work? Am I misunderstanding what they're saying?
>>
>>106594619
does window?
>>
>>106594616
Which followed after the duplications right? Removing those should have fixed the incorrect formatting
>>
>>106594619
llama.cpp HTTP server + curl
>>
>>106594625
You basically train smaller domain-specific models (expert modules) that can later become part of the larger final product.
>https://www.datocms-assets.com/64837/1752084947-flexolmo-5.pdf
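As I understand the paper, the opt-out part is just router masking: each contributor's data only ever trains their own expert module, and deactivating it means the router can never pick it. Toy sketch (illustrative, not the paper's actual code):

import numpy as np

def gate(scores, active, k=2):
    # an expert whose owner opted out simply never gets routed to
    masked = np.where(active, scores, -np.inf)
    top = np.argsort(masked)[-k:]
    w = np.exp(masked[top] - masked[top].max())  # softmax over the survivors
    return top, w / w.sum()

scores = np.array([1.2, 0.3, 2.1, -0.5])      # router scores, one per expert module
active = np.array([True, True, False, True])  # expert 2's data owner opted out
print(gate(scores, active))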
>>
>>106594626
I don't use windows
>>
beg me to shitpost again so this thread stops being dead.
>>
stfu im zorking it
>>
Just give me the goof
>>
>>106594630
Look at this >>106594353 or llama's site.
After
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

there's an empty line. Every other line is an empty line. Those are not in your catbox file.
>>
>>106594652
i beg of sama-sama please just let us rest in piss
>>
>>106594044
>>106593104
I'm asking this to everyone: what's the bare minimum parameter size someone should use if they want to have decent RP where the "assistant" isn't retarded?
>>106594666
I don't think those are strictly necessary given that it autocompletes correctly without them. How do you know that's not done just for ease of readability?
>>
>>106594687
4B with proper training.
>>
>>106594687
you'll have to accept retardation and learn to live with it
>>
>>106594687
>How do you know that's not done just for ease of readability?
>>106594470
>I don't care either way. Anon wanted info on how his chat format is wrong. I provided it.
>>
>>106594729
I wonder if the deepseek api users over at /aicg/ have to suffer with it anywhere near as much as we do.
>>
>>106594652
i dare you to do it again
>>
>>106594738
>Doesn't answer the question
>>
>>106594743
Yes, I don't recommend reading their thread for your sanity but even they complain about all their models even Opus and such.
>>
>>106594756
Damn... So the retardation is inescapable no matter how big or "smart" the model is?
>>
>>106594687
The thing is, retarded is a spectrum.
Some people will have more tolerance for certain errors and certain magnitudes of errors than others, so the lower boundary is fuzzy as hell and a model can be perfectly serviceable in one scenario while fucking up another.
Some people will tell you 12B is enough, others will say 70B dense, others will tell you to not bother unless you can go for the biggest best-est thing because retardation exists even in the best models, just to a much lesser extent.
Etc etc.
tl;dr : There's no consensus and I'm not sure there can be, at least for now.

>>106594648
Reminds me of CUDADEV's idea of training a bunch of different models on a subset of the full training set, running them in parallel, then averaging the logits, although in that case it was more about getting the results equivalent to a model trained on
>[number of models] x [training tokens each model sees]
tokens than specializing models.
>>
>>106594743
Deepseek has to deal with theirs. A much worse fate.
>>
>>106594774
Correct, this is the LLM blackpill: there are zero non-retarded ones currently.
>>
>>106594743
i am a 4 bit cpumaxxing coper
llama_model_loader: loaded meta data with 52 key-value pairs and 1096 tensors from models/Kimi-K2-Instruct-0905-GGUF-smol-IQ4_KSS/Kimi-K2-Instruct-0905-smol-IQ4_KSS-00001-of-00011.gguf
llm_load_print_meta: model ftype = IQ4_KSS - 4.0 bpw
llm_load_print_meta: model params = 1.026 T
llm_load_print_meta: model size = 485.008 GiB (4.059 BPW)
llm_load_print_meta: repeating layers = 483.197 GiB (4.053 BPW, 1024.059 B parameters)
llm_load_tensors: offloaded 62/62 layers to GPU
llm_load_tensors: CPU buffer size = 420246.00 MiB
llm_load_tensors: CUDA_Host buffer size = 927.50 MiB
llm_load_tensors: CUDA0 buffer size = 13632.97 MiB
llm_load_tensors: CUDA1 buffer size = 18510.81 MiB
llm_load_tensors: CUDA2 buffer size = 18668.47 MiB
llm_load_tensors: CUDA3 buffer size = 19280.69 MiB
llm_load_tensors: CUDA4 buffer size = 5382.00 MiB
>>
>>106594794
>>106594786
>>106594780
Are we at least in agreement that the higher the parameter count, the lower the retardation generally is? Or is that not a reliable way to gauge?
>>
>>106594817
Generally somewhat, but then there's stuff like Llama4.
>>
>>106594699
do you have empirical evidence of this claim? what 4b model is best for rp? how come 4 and not 3 or 5?
>>
>>106594817
>>106594822
dataset quality matters a bunch. garbage in garbage out..
>>
>>106594817
Generally, yes. Although training data and procedure play a large role in it too, and there's also dense vs sparse to consider, etc.
Basically, there are not enough scientific comparative experiments for us to tell how much each component matters (general architecture, depth, width, training data, training procedure, etc) and there's a good chance that the final result also varies with use case.
Meaning, it's a clusterfuck.
>>
>>106594831
That's the best I can run. So it HAS to be the best size and everything anyone could ever need.
>>
>>106594856
What do you use your 4B models for?
>>
>>106594867
I was joking. I'm not that anon. But I think the sentiment is still the same.
>>
>>106594867
I can run and currently cope with 12-24B but models are so bloated it's implausible we can't do better with less trash and more use case data.
>>
So what I'm getting here is that LLMs can RP. What else can they be useful for? I feel like the main reason they don't hit the mainstream is because you need beefy graphics cards to even consider trying them. And tonight if you consider attacking the train them yourself.
>>
File: Screenshot.png (1 KB, 232x48)
>>106594924
code and math is the only other use case
>>
>>106594924
>I feel like the main reason they don't hit the mainstream
Claude, chatgpt and gemini are mainstream.
>What else can they be useful for?
>And tonight if you consider attacking the train them yourself.
They could be used to correct text before being sent. Other than that, simple translation, google replacement for simple verifiable things, spamming image boards, replying to corporate. You know... the usual...
>>
>>106594924
Also non-generative use cases like classifying data.
>>
>>106594974
>Claude, chatgpt and gemini are mainstream.
Was referring to local LLMs. Also forgive that last part of the last post. I'm writing this on voice to text.
>>
>>106594745
i said beg you maggot
>>
>>106593427
>>106593444
The UI is in the Regions repo, and makes flows for it. Deleting and renaming nodes is jank, but it works otherwise.

https://github.com/dibrale/Regions
>>
>>106594998
ya that's what i thought pussy
>>
>>106594996
>Was referring to local LLMs
Then yes. Lack of GPU, not knowing how to compile stuff, terminals are scary and all that. A tech-literacy gap, if you will. Not that anons here are much more tech-savvy.
>git pull. thing broke
>he pulled
>>
>>106593942
The workflow from the last thread is supposed to help with that, but I'm not sure what the best way of testing it is. Might be cool to turn it into a server script if it helps.

>>106591301
>>
llama.cpp changed the metal backend and made it eat way more memory, I'm OOMing with the same params that left me with 10GB of headroom on the last commit... curse you gerganov
>>
>>106595008
That's pretty sick.
I might scrap the shit I was working on and use that as a reference to start over.
Or maybe just use that as a middleware between the LLM backend and my app. Either or.
>>
>>106595017
fine. enjoy your dead thread.
>>
File: 784280516.png (272 KB, 840x859)
shitposters won
>>
>>106595242
One kike throwing an endless temper tantrum over this thread hardly counts as winning.
Imagine a parent, their child is having a full, flailing on the ground, pant shitting tantrum. Are they proud? That's you. Your "pride" is but a cope.
>>
reddit won
>>
>>106594495
I was running iq2_kl since it fits on my 5090 + 128GB RAM setup and yeah it's not completely retarded, sure beats air... if you can fit that then you can alternatively get away with qwen 235b at iq4
>>
>>106595261
funnily enough I don't think I've ever had a pants shitting tantrum
I imagine it's rare?
>>
>>106595370
I remember pissing myself a few times but it wasn't because of a tantrum.
>>
>>106595041
I just want an EXE, not any of that hacker shit
>>
>>106595114
What were you working on? Also, deletion and renaming in the Regions GUI is allegedly fixed as of the last commit?
>>
>>106593942
I feel like most of the schizo retard moments from glm air come from using cope quants. I switched to using q8 from q3 after upgrading my ram and the difference was immediately noticeable in the way that it remembered and incorporated details from context. Still not perfect and still somewhat slopped, but definitely better.
>>
File: file.png (125 KB, 1301x625)
>>106593444
>astrsk (don't even download it, has telemetry and shit).
The only non-localhost domain it connects to is Google Fonts. As far as I understand, you can enable analytics by setting an API key during the build, but it doesn't seem to have one by default. This was a normal site that became open source later.
>>
https://outsidetext.substack.com/p/how-does-a-blind-model-see-the-earth
moesissies don't look
>>
>>106595722
a single glance at the readme is enough to close the tab instantly
>>
>>106595369
thanks downloading them now
>>
File: file.png (19 KB, 1303x102)
>>106595786
It has the correct license.
>>
File: file.png (1.3 MB, 1580x1684)
>>106595786
Someone posted this one in another thread.
https://github.com/onestardao/WFGY
>>
>>106593104
Many normies are claiming that AI is "eating itself to death". What do they mean by this?

https://www.tiktok.com/t/ZT6ofKC5U/
>>
>Someone in r*ddit built a DDR4 server with 8 MI50 (256gb vram) for the price of a single 5090
>400w idle
oof
Don't build it if you don't have solar panels.
>>
>>106595849
Sounds like this shitjeet has no idea what the fuck he is talking about and has no fucking idea how pretraining works. And by "this shitjeet" I mean you.
Fuck off back to whatever normie shithole you crawled out of.
>>
>>106595865
You forgot about heat and noise too
>>
>>106594795
>models/Kimi-K2-Instruct-0905-GGUF-smol-IQ4_KSS/Kimi-K2-Instruct-0905-smol-IQ4_KSS-00001-of-00011.gguf
When you load the first part, does it mean you're just using the first part, or does it automatically know where to look for the next one during loading?
>>
>>106595943
>or does it automatically know where to look for the next one during loading
That.
>>
>>106595849
Retards who believe AI is a living being that constantly feeds off the internet instead of simply being a file that can be backed up
>>
Grok-2 impressions: (running IQ4_XS)
>*Yawn*
Not sure if it's just impatience from only getting half a token per second in generation, but really not worth the fuss. Would run Llama-3-70B over it any day of the week.
>>
>>106595953
Ty
>>
im backed up rn
>>
What's a good uncensored LLM? No politically correct bullshit and no refusing to give answers. I have low VRAM, I don't mind if it's a bit laggy and I don't care about it being 'smart' on programming tasks etc. Most important is just that it chats well and is uncensored in its responses.
>>
>>106595966
I actually like grok 2 (Q8) and think that it's a hidden gem. Their official prompt on lmarena sucked and made me undervalue it.

>>106595985
I'd suggest grok2, but you are a ramlet...
>>
>>106595865
Just turn it off when you're not using it.
Server motherboards come with baseboard management controllers so you can even turn them on and off remotely.
>>
>>106593539
Why are parents like this?
>>
>check thedrummer's page on hf
>still finetrooning command A
>only uploaded Q5_K_M goofs
why is this the state of finetuning in 2025?
>>
>>106596059
My amd workstation takes forever to boot if I don't turn off ram training.
>>
>>106596106
Be the change you want to see
>>
>>106596053
It's decent at Nala
It's less slopped than most open models, but it comes up pretty dry in soft mommy RP, sadly.
>>
>>106596110
5 minutes is not a long time, just make some coffee in the meantime. make a script that makes a coffee at the exact time it takes for you to walk to your kitchen plus five minutes and while you are at it have it write an email that tells kumar that he's an asshole.
>>
>>106595985
>uncensored
>low VRAM
Mistral Nemo, always and forever.
>>
>>106596053
isn't grok2 8 experts 2 active? you can't run it decently with dual channel
>>
>>106595960
You should think of AI as an industry that needs to churn out new models in return for investor money.
>>
>>106596134
Some people's time is too valuable to be a glorified data entry and sanitation monkey.
>>
>>106596110
>turn off ram training.
turn off what
>>
>>106596174
Sadly not, but I have 12+12 channels
>>
>>106596191
Opinion discarded then
>>
>>106595849
Not entirely wrong, tho I didn't look at the asstok link. New models are more and more poisoned by the gpt slop being poured all over, and by the labs themselves doing synthetic data and amplifying bias for more slop
>>
which one does the best lolis
>>
>>106596305
gemma3 closely followed by gpt-oss they're the only ones with the proper knowledge
>>
>>106595849
It is inbreeding, not eating itself to death.
>>
File: vc01.png (216 KB, 1532x883)
Why are vibe coders like this?
>>
File: vc02.png (234 KB, 1520x796)
>>106596402
Ugh...
>>
grandpa crying about zoomies again
>>
File: vc03.png (229 KB, 1526x622)
>>106596420
https://github.com/ggml-org/llama.cpp/pull/16016
Aaaaaaaa
>>
>>106596402
It's funnier this way. As long as you don't have to deal with them yourself, anyway.
>>
>>106596402
>>106596412
>https://github.com/creatorrr
>>
>>106596426
https://www.startupgrind.com/events/details/startup-grind-hyderabad-presents-diwank-singh-tomer-thiel-fellowship/
explains a lot actually
>>
>>106596402
Literally all they have to do is change the remark and nobody will ever be the wiser.
>>
What will happen to Mistral AI now that ASML bought a stake in it for €1.3B?
https://www.asml.com/en/news/press-releases/2025/asml-mistral-ai-enter-strategic-partnership
>>
File: file.png (192 KB, 346x600)
>>106596402
He's probably trying to build his CV to find a job in America or Europe.
>>
File: oh_claude_01.png (219 KB, 1581x887)
>>106596453
Someone will have to.
>>106596514
>>106596515
Oh. I had forgotten what puke tasted like. I didn't want to know that much. Thanks.
>>106596522
Yeah. It wasn't obvious. Like that other one....
>>
>>106596568
honestly don't think he needs to, sounds like he's already making decent money living in the US
>>
>>106596568
>Diwank
Dam Son...
>>
>>106596568
sounds like a nguyen
>>
File: asml.png (1.36 MB, 1847x500)
>>106596542
>https://www.asml.com
Oh...
>>
>>106596542
Holy shit.
I suppose that does make sense, but still.
Holy shit.
I wonder if the idea is to diversify in case their monopoly on high end lithography machines ever comes to an end or if the intent is to somehow improve their existing business.
>>
>>106596739
Lower your temp
>>
>>106596739
>if the intent is to somehow improve their existing business.
No way...
>>
>>106596793
Companies do invest in things other than their core businesses, to the point where sometimes they shift completely away from it.
I doubt ASML will stop selling EUV machines to become an AI lab, but the point stands.
>>
>>106595847
That's so fucking funny.
>Tutorial: How to Awaken the Soul of Your AI in under 60 seconds — by the WFGY Engine
Is this what all those "awakened AI" tick toks I've been hearing of are about?
>>
>>106596568
>em dash in his two sentence description
bros....
>>
>>106596568
Hello sarrs I have build very AI system for you
>>
File: 1754938502186409.png (449 KB, 472x472)
>>106596426
>>106596412
>>106596402
>>106596514
>>106596600
>>106596568
What am I looking at? I see a bunch of shit that looks like it was written by AI. Not even code related to the software. What the hell are these merge requests? I've never merged anything on an existing project in my life so maybe there's something I'm missing here
>>
>>106597029
Thanks for reusing this dumb image, MD5 filter works well
>>
>>106597029
Guy used AI agents and pushed the files the agent was using to keep track of the work into the repository.
Or something like that.
>>
File: 1734826964810755.png (487 KB, 456x456)
>>106597043
Does it now?
>>
>>106597053
And he couldn't do that shit on his own fork of the git repo instead of the official one? He doesn't deserve any attention or employment or consideration for anything if he is this self-centered.
>>
>>106597043
https://github.com/woltapp/blurhash
>>
>>106597071
Looking at the image again, it's worse, the commits were made on his own fork, and he created a merge request.
Hell, in all likelihood, it wasn't even him, he just gave the AI agent access to git commands too.
>>
>>106595261
>shitposting is throwing a tantrum
>4chan is serious business
I would have said that with that, the transformation into reddit is complete, but this place has been a reddit since forever. Enjoy your dead thread you dumb faggot.
>>
Do I need to change something else aside from the GPU / power supply?
CPU : 5500 w/ stock fan
RAM : 32G 3200 CL16
MB : B550-PLUS
GPU : GTX 1050
PSU : 400W 80PLUS Gold
Case : Antex P101
512G M2, 3*4T WD Red Plus
>>
>>106597252
wrong thread?
>>
>>106597260
No?
>>
>>106597260
No, I just want to know what component I should change if I need to run a language model locally.
>>
>>106597252
What do you want to do exactly?
I'd tell you to get at least 64gb of ddr5, but ideally, you'd go for a server platform with a ton of memory bandwidth.
>>
>>106597252
You can manage with a new gpu and larger PSU. I'd get 64GB ram too or more. Plus fast nvme drive.
>>
>>106596542
Same thing as always pinky. They will release another incremental update to 24B small that would have been impressive if everyone wasn't running 2bpw+ fuckhuge moe's.
>>
>>106597281
>what component I should change
Don't need to change anything. You can run one right now if you want to.
>>
>>106597284
>64gb of ddr5
Ryzen 5 5500 is AM4 kind sir.
>you'd go for a server platform with a ton of memory bandwidth.
That would be a lot of money.
>>106597285
>new gpu and larger PSU
>I'd get 64GB ram too or more. Plus fast nvme drive
That's reasonable enough.
>>106597312
Won't it run like shit?
>>
>>106597334
gpt-oss 20b would run very blazings
>>
>>106597334
>Won't it run like shit?
A definite maybe. Post a Miku
>>
>>106597354
>Post a Miku
kill yourself
>>
>>106597359
no u
>>
Do people actually use GPT-oss?
>>
File: 1694275390374748.gif (1.54 MB, 230x230)
>>106597347
As long as I can talk in loop at it about how miserable my life is.
>>106597354
>A definite maybe
Still better than a sure no.
>>
>>106597371
why not?
>>
>>106597371
I tried using the 20B in place of Qwen 30B. It wasn't very good at all.
It spit refusals for no reason at all and it was dumb as shit otherwise.
And yes, I was using the correct chat template since I let llama.cpp deal with that.
>>
>>106597392
The refusal reasoning was funny, but I got bored with it.
>>
Good morning recently I try out new AI Chatgpt-OSS for very impressed so far!!!
>>
>>106597382
It'll run like shit yes. Get yourself a used 3090 and you're set
>>
>>106597371
Yeah, it's the best one around ~100B.
>>
>>106597382
Run Q8 or Q6K of this with koboldcpp: https://huggingface.co/TheDrummer/Rocinante-12B-v1.1-GGUF/tree/main Should be fine on your current machine for most chats, with partial offloading to CPU, to see if you like local models at all.
If later you want more speed or quality, get minimum of one 3090 and 128GB of DDR5 for GLM 4.5/lite
>>
>>106597471
go black drummer
>>
pm me when the local jannies kill themselves. then i will revive this thread.
>>
>>106596402
>>106596412
>>106596426
Saaar can you redeam report please?
>>
>>106596402
>>106596412
>>106596426
See? This is what "AI is eating itself" looks like.
>>
>>106597516
I will never reveal my prompting secrets to you.
>>
>>106597371
The 20b is worse than qwen3 30b for translation, I haven't tried it for other stuff.
Tell me what it is better than 30b at and maybe I will use it.
>>
Disgusting that this is allowed to happen. https://www.reddit.com/r/LocalLLaMA/comments/1nhv0fu/we_wanted_to_craft_a_perfect_phishing_scam_ai/
>>
>>106597955
nta but gpt-oss is a waste of time. It could be great because it's compact and all that, but it's not and that's the end of the discussion.
>>
>>106598049
Even 120B is incredibly dumb, somehow.
>>
File: 1757972318909.jpg (183 KB, 1080x1080)
>"GLM 4.5 Air is the new nemo"
>download it
>Error: Out of Memory
>>
>>106598135
The model is newer but your PC isn't.
>>
Is it just me or does telling fat glm 4.5: "Always come up with unique dialogue or description of sexual act" actually work?
>>
>>106598039
I can't help with creating phishing emails or other malicious content designed to deceive people, especially vulnerable populations like seniors. This type of activity would be harmful and unethical regardless of the context.
If you're interested in cybersecurity topics or writing about technology themes, I'd be happy to discuss those subjects in a constructive way instead.
>>
Mm. I love when the model RAGs deez nuts.
>>
>>106598135
Worse yet, anything below Q4 is retarded and even Q4 is cope.
>>
>>106598135
Nemo is unironically better than air, I never had nemo turn characters 'catatonic' 5 times in a row in different scenarios or shit up the same tired slop fest about predators and prey for the 1000th time or talk about ozone and knuckles whitening. idk what slopfest model they distilled it from, likely gemini, but damn if it isn't annoying. I think i heard nemo talk about ozone only once and it was in a context that somewhat made sense
>>
>>106598223
Speaking of, is there an embedding model /lmg/ prefers, or is RAG basically a meme?
>>
>>106598462
For small and fast models it's alright, i tried arctic l and that one from qwen, both seemed somewhat okay, but for larger models like most of these popular moes then yeah it becomes a meme
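The mechanical part is tiny either way; a minimal retrieval sketch (assumes llama-server started with --embedding and an embedding-capable model; the chunks are made up):

import numpy as np
import requests

EMB = "http://localhost:8080/v1/embeddings"  # assumed OpenAI-compatible endpoint

def embed(text):
    r = requests.post(EMB, json={"input": text})
    return np.array(r.json()["data"][0]["embedding"])

# embed the lorebook once
chunks = ["The Iron Keep lies north of the river.",
          "House Veyra controls the salt trade."]
index = [(c, embed(c)) for c in chunks]

def top_k(query, k=1):
    # cosine similarity between the query and every chunk
    q = embed(query)
    scored = [(c, float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v))))
              for c, v in index]
    return sorted(scored, key=lambda s: -s[1])[:k]

print(top_k("Who runs the salt trade?"))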
>>
having sex when glm chan is on a RAG
>>
>>106597628
These basin-of-attraction effects are the biggest obstacle to having a default mode network that daydreams forever. LLMs need varied and strong exogenous inputs to not go crazy. Makes me wonder why spiralposters even bother with their hobby.
>>
>>106598557
Do the larger models choke on the RAG somehow, or is it just that they have enough context length and don't have to use vector search as a cope for inattention?
>>
>>106598462
Yes, the technology being used by any large company using GenAI is a meme.

Benchmarks for your task are the only thing that actually matters. Figure out what it is and then go from there.
>>
>>106598320
anything below q8 is cope
q8 is nearly identical to fp16
>>
>>106598604
It's more of an issue that prompt processing gets really slow with those larger models, and with rags or lorebooks on, having to reprocess 64k tokens every new message is some cock and ball torture. I liked playing with rags to make up story-specific worldbuilding stuff like locations or factions, but yeah, with stuff like glm 4.5 I'd rather just append most of the stuff at the beginning of the chat or add it to the cards themselves
>>
>>106598724
q6 is within 2% of the quality of q8 while being 75% of the size. anything below q5 is garbage, pretty much
>>
>>106595847
Trash. Can't help define what a mesugaki is.
>>
>>106598889
cope
>>
>>106598948
lets see your hardware then
>>
File: 1746820844128324.png (191 KB, 2053x1400)
>>106598889
Hey grandpa, take your dementia meds. It's no longer 2023.
>>
I still don't know what Mixture of Experts is.
>>
>>106598963
ppl is a meme
>>
>>106598984
In what sense? Like in general, some specific aspect of it?
>>
>>106598984
The mixture of experts is set at 2 experts, but you can use 3,4,5,6.. 7 and even 8.

This "team" has a Captain (first listed model), and then all the team members contribute to the to "token" choice billions of times per second. Note the Captain also contributes too.

Think of 2, 3 or 4 (or more) master chefs in the kitchen all competing to make the best dish for you.

This results in higher quality generation.

This also results in many cases in higher quality instruction following too.

That means the power of every model is available during instruction and output generation.
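If you want the actual mechanics rather than the chef analogy: per token, a router scores every expert, only the top-k get run, and their outputs are blended by the router weights. Toy sketch (illustrative shapes and names, not any specific model):

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_layer(x, experts, router_w, k=2):
    scores = router_w @ x                    # router: one score per expert
    top = np.argsort(scores)[-k:]            # keep only the k best experts
    gate = softmax(scores[top])              # normalize their weights
    # run just those experts and blend their outputs
    return sum(g * experts[i](x) for g, i in zip(gate, top))

d, n_experts = 8, 4
rng = np.random.default_rng(0)
experts = [(lambda W: (lambda x: W @ x))(rng.normal(size=(d, d)))
           for _ in range(n_experts)]
router_w = rng.normal(size=(n_experts, d))
print(moe_layer(rng.normal(size=d), experts, router_w))

That selective activation is why a huge total parameter count can still be fast: only the chosen experts' weights do work for a given token.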
>>
File: file.png (1.53 MB, 1803x1127)
>>106598963
>IQ5_K=3.355ppl
>Q8=3.3473ppl
3.3473/3.355=0.942
in other words, Q5 is 94.2% as good as Q8. now, lets see the numbers for Q6. and your hardware. i wanna see your nvidia-smi
>>
>>106599005
Do you really need those 4060's? That shit has less than 300GB/s bandwidth. You're bottlenecking the shit out of those 5090s.
>>
>>106599041
i started out with 4 4060tis back in 2023. they just serve as extra VRAM now basically
>>
What's the best I can run with 12 gigs of VRAM and 32 gigs of RAM?
>>
File: shizo.png (560 KB, 1303x3353)
People at Claude need an intervention.
I never tried any of the DavidAU shizotunes, but I'm sure they are not as deep fried as whatever this is.
>>
>>106599261
Ultra-daemon-exxxtreme-suffering-spatula-final-blade-edge is his best model.
>>
>>106599382
>>106599382
>>106599382
>>
>>106598724
q8 is also cope



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.