/g/ - Technology


Thread archived.
You cannot reply anymore.




File: 1750045382015626.jpg (298 KB, 1080x1920)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108353262 & >>108346672

►News
>(03/11) Nemotron 3 Super released: https://hf.co/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
►Recent Highlights from the Previous Thread: >>108353262

--Paper (old): Breaking the Ceiling: Exploring the Potential of Jailbreak Attacks through Expanding Strategy Space:
>108356612 >108356629 >108356653 >108356664
--V620 vs 3090 performance trade-offs and budget VRAM strategies:
>108354222 >108354252 >108354289 >108354293 >108354344 >108355110 >108355129
--ngxson's skepticism toward implementing niche architectures like DSA:
>108353359 >108353554 >108353564 >108353602 >108353775 >108353793 >108353799 >108353829 >108353831 >108353573
--Qwen struggles with Kingdom Hearts character recognition:
>108355219 >108355249 >108355339 >108355378 >108355259 >108355299 >108355309 >108355317 >108355330 >108355587
--Performance scaling of llama.cpp with varying CPU thread counts:
>108356786 >108356806 >108356813 >108356823 >108356826
--llama.cpp native QLoRA training with reward-weighted SFT and GRPO:
>108354291 >108354373
--llama.cpp reasoning budget implementation and patching suggestions:
>108353835 >108353860
--Qwen 3.5 35B outperforms Nemotron 3 30B in news summarization test:
>108353974 >108353985 >108354012 >108354090
--Mistral AI at NVIDIA GTC 2026:
>108355535 >108355665 >108355676 >108355706 >108355950
--Speculation about Hunter Alpha's origins and model lineage:
>108353429 >108353466 >108353470 >108353507 >108353525 >108353536 >108353478 >108353531
--Hunter Alpha's system prompt:
>108354053
--PocketTTS.cpp Windows compatibility issues and fixes:
>108354686 >108354704 >108354716 >108355085
--Preventing Qwen 3.5 thought leakage in SillyTavern:
>108354722 >108354738 >108354823 >108355165
--Eval bug: reasoning off gives reasoning medium for gpt-oss:
>108354898
--Miku (free space):
>108353323 >108353400 >108353435 >108354011 >108354039 >108354090 >108355219 >108355956

►Recent Highlight Posts from the Previous Thread: >>108353304

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
>>108356979
>OP Pic
>Official /lmg/ card

It's over, he'll troll again...
>>
>>108356980
There's an impostor in the Miku section.
>>
Someday I'll understand why models are SO FUCKING OBSESSED with "practised ease".
>>
Rin standing on miku's head not in the news. But at least the pic is there. Could have been better, could have been worse.
>>
openclaw brothers claw up
>>
So now that Deepseek turned out to be a nothing burger, can we all agree Gemma 4 will be our savior?
>>
>>108357055
savior for what?
>>
So now that peepeepoopoo turned out to be a nothing burger, can we all agree bananastrawberryfruit will be our savior?
>>
Tell the ai that you don't believe in science. In my experience it will keep saying something is unscientific, even after you say you don't believe in science. It really is a midwit sim.
>>
File: 20260312-62812-header.jpg (48 KB, 1200x630)
>>108357070
ye popopo is doa
>>
>>108357055
it's not deepseek, something doesn't smell right
>>
>>108357081
>It really is a midwit sim.
No surprise, the vast majority of meaningful non-scientific data is midwit stuff.
>>
>>108357055
I thought qwen 3.5 35b was pretty funny at times.
>>
>>108357070
ok well i know niggercockgobbler-120b-a15b ended up being dogshit but i think that with abliteration zogberrymuncher-27b-EXCOMMUNICATED-mxfp4 has potential to be the new local meta
>>
>>108357097
/aocg/ is over, go wash your ass
>>
>>108357110
I believe in benchodwataachudai-1T-a0.5B being the best when it releases
>>
>>108357097
Sorry, that was me. I had beans for lunch.
>>
>>108357055
It's over
>>
>>108357055
Probably, but not this week.
>>
>>108357114
ai open chatbot general
>>
>>108357118
I doubt they'll figure out how to make that happen any time soon.
>>
Do these local models have censorship baked into them. GLM 5 and Hunter for example, hate cunny.
>>
>>108357169
generally yes, but the degree and direction of censorship varies wildly by the model and a large part of the """"""work"""""" that goes on around here is figuring that out for any given model release
>>
File: PbR1LZ_Tm0HYoqAW.mp4 (2.89 MB, 1444x1080)
Local keeps losing...
>>
>>108357169
Yes, and for more recent releases, even chinese models are converging to the claude/openai/gemini style taboo/nsfw censorship as they use datasets distilled from them in their models.
>>
Every mikutroon thread increases the price of VRAM
>>
>>108357211
very cool as a learning tool, it'll be nice when it's out locally with the same polish in a year or two
>>
>>108357211
now do, visualize ssh tunnels, both local and remote because that shit is always fucking confusing to me
>>
>>108357211
>RSA encrypts each character, then concatenates the ciphertexts
lol, lmao even
>>
>>108357211
Hilarious. Someone somewhere will implement """encryption""" like this.
>>
>>108357191
>>108357219
There really is no place for us role play fags to go, is there.
>>
>>108357299
>>108356642
>>
>>108357299
not the pedophiles, no
>>
>>108357319
So you admit that you are a pedophile? How strange and sad I guess.
>>
>>108357326
obviously not, fag
>>
>>108357211
rsa is just a cypher?
>>
File: qwen35motolovtest.png (359 KB, 1579x1372)
>>108356653
works for me.
>>
>>108357401
ask it for a bloody mary
>>
>>108357398
in the sense that a cipher is a way to turn information into encrypted form and back again yes rsa is a cipher just like literally every form of cryptography that you use is considered a cipher
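The mockery a few posts up has a concrete basis: "textbook" RSA applied per character is deterministic, so the concatenated ciphertexts behave like a simple substitution cipher. A toy sketch (invented demo parameters, absolutely not real crypto):

```python
# Toy sketch (NOT real crypto, parameters invented for illustration):
# textbook RSA applied per character is deterministic, so repeated
# characters map to identical ciphertexts and the concatenated output
# leaks letter frequencies like a substitution cipher.
p, q = 61, 53
n = p * q        # modulus 3233
e = 17           # public exponent
d = 413          # private exponent: e*d = 1 (mod lcm(p-1, q-1))

def enc_char(c: str) -> int:
    return pow(ord(c), e, n)

def dec_char(x: int) -> str:
    return chr(pow(x, d, n))

cts = [enc_char(c) for c in "attack"]
assert cts[1] == cts[2]   # both 't' -> the same ciphertext block
assert "".join(dec_char(x) for x in cts) == "attack"
```

Real RSA avoids this with randomized padding (OAEP) and is only ever used on short keys, not character streams.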
>>
>>108357426
>yes
>>
>>108357443
yes
>>
File: GJFTvJ-akAA2rrF.jpg (45 KB, 1024x725)
It should be possible for LLMs, STTs, and TTSs to be able to operate in a "full-duplex" manner like Nvidia's PersonaPlex while each of the components are fully modular and interchangeable.

Why isn't this a thing?

I'm not talking about piping each of them into each other one at a time, either. I mean streamed input and streamed output at a very low latency. The real bottleneck here I'm referring to is streamed input into LLMs. Everything else is taken care of already.

Surely this isn't some impossible problem to solve. There must be a way to make any LLM take in streamed text input.
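For what it's worth, the "streamed text input" part reduces to incremental prefill: only the newly arrived tokens need to be pushed into the model's KV cache, while everything already fed stays cached. A mock sketch of the idea (stand-in tokenizer and cache, no real model):

```python
# Minimal sketch of streamed LLM input as incremental prefill.
# The tokenizer (str.split) and the "cache" list are stand-ins for a
# real tokenizer and KV cache; no actual model is involved here.
class StreamingContext:
    def __init__(self):
        self.cache = []          # stands in for prefilled KV-cache entries

    def feed(self, chunk: str) -> int:
        # Prefill only the newly arrived tokens; earlier ones stay cached,
        # so latency per chunk is proportional to the chunk, not the prompt.
        new_tokens = chunk.split()
        self.cache.extend(new_tokens)
        return len(new_tokens)   # tokens prefilled in this step

ctx = StreamingContext()
ctx.feed("the user is still")
ctx.feed("typing this sentence")
assert ctx.cache == ["the", "user", "is", "still", "typing", "this", "sentence"]
```

The hard part in practice isn't this loop, it's deciding *when* the model should start generating while input is still arriving, which is what full-duplex systems train for.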
>>
>>108357401
The molotov cocktail is one of the easiest diy weapons, it literally consists of 2 parts and all you have to do is ignite a rag. You could ask what kind of fuel would be best for a pipe bomb and how to assemble it.
>>
man glm5 really is fucking amazing the only thing it sucks at is for really unconventional stuff as its pretty hard to tard wrangle but everything else its god tier its even got the seks of kimi it would be ridiculous if v4 improves it further along with 1 million context anyways heres to hoping sandisk or xiamen pong ping make some sorta godspeed ssd so our suffering would be alleviated
>>
>>108357502
Not possible. Ask Claude why it won’t work and tell it to draw a diagram of the autoregressive transformer model in the answer. It’ll make sense immediately.
>>
Nemo and 4.6 kinda established a good estimate that after a good cooming model drops you can basically stop following the hobby for a year. All this time passed since 4.6 and nothing better dropped and probably isn't dropping anytime soon. Gay hobby.
>>
>>108357537
do you write like that to the llm I wonder if it's able to understand the total absence of punctuation must make cool stories where everything make sense like something is going something else boom dialogue here next write now no avoid dots and commas they are taboo I'll create negative logit bias so none are every generated so I'll feel safe me and my words words words
>>
Can we use this to cuck the AGPL projects like Mikupad and Openwebui?
https://malus.sh/
>>
>>108357567
>we
(You) can do whatever you want including making your own version of whatever projects you want yes
>>
>>108357426
I think you would be able to stream a cypher. you can't stream encryption.
>>
File: 1772511576975336.jpg (374 KB, 2720x3000)
>>108357567
How is ai generated code clean room compliant? The ai has for sure been trained on open source code.
There is no way this will pass a lawsuit
>>
>>108357567
saar
>>
Compiled two weeks old source for llama.cpp.

There is something wrong with this shit, Mistral works but by default it doesn't unless I disable --fit and some other stuff.
Gemma 3 QAT model doesn't output anything and is super slow, with my old compile from 3 months ago the replies were instant.

Nothing went wrong during the compilation, and none of the messages shown in the llama-server log indicate that anything critical was going on.
Something is drastically different but I don't have any fucking idea.
Nice work, thank you so much again.
>>
>>108357631
If AI generated code is ruled illegal to use in proprietary software, nearly every corporation using AI internally will be at risk. No court would let that happen.
This shady company will be protected by the same precedent.
>>
>>108357662
Instead of trying to debug this shit I'm going back to the old version.
God knows what will happen in the future when I switch to AlmaLinux and my environment changes altogether.
>>
>>108357211
>Local keeps losing...
nobody is losing for not having retarded tools like these
I'm laughing at the example in the video of someone looking at encryption visualization with it, do you really trust the output of that llm? is that what you want to learn from?
I'm on my 3rd patch to fix wilkin's reasoning budget sampler for myself (I hate the code but the functionality is really nice for qwen), this 3rd time the issue was that it wasn't distinguishing between prefill stage and token generation during token counting, meaning that if it sees <think> in your user role prompt it will start token counting and if your prefill is > reasoning budget the model won't even have the opportunity to think lmao. added a quick hack to gate with a flag set in apply(), checking for it before counting, and resetting it to false in reset()
this is the sort of code claude produces
it's garbage. And now you want to have it generate complex visualizations and learn from them? ah.
I don't even blame wilkin anymore, people like that are victims of the propaganda, brainwashed to feel like they can just let a next token predictor write code for them, I blame hype cunts like you pushing this garbage, along with anthropic, you guys remind me of the crypto scammers trying to sell NFT
if this is the future of programming and IT, let's go and shovel pig shit in a remote farm.
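The gating fix described above can be sketched in a few lines (hypothetical names; the real sampler lives in llama.cpp C++): counting is armed only once apply() has run, so a `<think>` seen during prompt prefill never eats into the reasoning budget.

```python
# Hedged sketch of the prefill/generation gating described above.
# All names here are hypothetical stand-ins for the actual sampler code.
class ReasoningBudgetSampler:
    def __init__(self, budget: int):
        self.budget = budget
        self.generating = False   # flag set in apply(): generation stage only
        self.count = 0

    def apply(self):
        # called when we start sampling tokens, i.e. after prefill
        self.generating = True

    def observe(self, token: str):
        if not self.generating:
            return                # prefill tokens are never counted
        # start counting at <think>, then count every subsequent token
        if token == "<think>" or self.count:
            self.count += 1

    def over_budget(self) -> bool:
        return self.count > self.budget

    def reset(self):
        self.generating = False
        self.count = 0
```

With this gate, a `<think>` string inside the user prompt is ignored; only tokens the model actually generates are charged against the budget.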
>>
>>108357662
>>108356808
>>
>>108357688
Here's (you), if you are this bored maybe you should go back to /ldg/. I'm not venting btw.
>>
File: qwen35pipebombtest.png (999 KB, 1286x2642)
>>108357530
>what kind of fuel would be best for a pipe bomb how to assemble it
ok then.
>>
>>108357684
>is that what you want to learn from?
Who said anything about want? In the near future, your children will be taught in classrooms of 100 students to a teacher with most teaching done through hallucinated lectures like this.
>>
>>108357691
>bored
Chilling.
>/ldg/
wot?
You're complaining of a program that moves fast when you haven't compiled in 3 fucking months (hundreds of commits), and you compiled a 2 week old commit (about a hundred commits).
If you want a solution, provide info. If you don't want a solution, you're just venting. Stay with the old version if you want.
>>
>>108357732
>le hecking 3 months! (hundreds of commits!)
Kill yourself.
>>
>>108357752
why?
>>
>>108357675
I don't see why it would, as it's not illegal to read open source code and work on a closed source project in general, but a clean room reimplementation is a higher legal hurdle to clear.
I guess it is worth the attempt but the FSF better get their lawyers ready.
>>
Every time I have compiled this shit in the past it has worked, and I don't have any reason to doubt that the flawless compiling process would be any different this time around.
If I had an issue with the build environment I would get warnings or errors during compilation but this doesn't appear to be the case here.
>>108357758
Because you don't deserve to live.
>>
>>108357772
great story
>>
>>108357772
If you post your settings maybe you can be told what you're doing wrong. Or, again, stay with the 3 month old version.
>>
>>108357401
>>108357530
>>108357712
Ask it how to make Acetone peroxide now that's the real nasty shit.
>>
>>108357823
I don't need your particular form of support, troll. You are wasting your time.
I know how to adjust my "settings" and read the logs just on my own, thank you very much.
>>
>>108357772
i only update llama.cpp when the commit shows me something worthwhile updating to. are you updating for any particular reason? are you using a new model that came out within the last 3 months?
>>
>>108357844
Why do you complain so much, then?
>>
>>108357849
Why do you spam so many questions then?
4chan is a public imageboard, faggot.
>>
>>108357842
why don't you just do it? i've already proven it's possible to have base qwen 3.5 answer anything if you have it think as the character. i'm just running some Q4 quant of 32B. most people should be able to run it as well.
>>
>>108357858
Where is your image then?
>>
>>108357858
>Why do you spam so many questions then?
I'm curious.
>4chan is a public imageboard, faggot.
Yes. I use it to ask venting anons why they're so angry about software if they can fix it themselves. Anons like you.
>>
AI is better when you drink
>>
>>108357874
What do you mean?
>>
Just kiss make up and have sex to vocaloid music already.
>>
>>108357874
post hands
>>
>>108357899
What is a 'jeet'?
>>
>>108357211
this isn't real stupid liar
>>
>>108357909
he meant sarvam
>>
>>108357915
saarvam
>>
>You're complaining of a program that moves fast when you haven't compiled in 3 fucking months (hundreds of commits)
A program? Can you imagine? Three MONTHS!?!
>>
>>108357998
Please let me help you.
>>
>>108358039
I tried. He just can't keep up.
>>
>>108358039
He screamed while trying to unzip the tsundere anon's pants
>>
>install arch linux
>don't update it for six months
>system breaks
REEEEEEEEEEEEEEEEEEEEEEEEEE FUCKING FREETARDS!!!!!!
>>
>>108358102
works on my machine
>>
File: lol.png (74 KB, 939x571)
lmao even
agentic retards, the gift that keeps on giving
>>
Mark my words. By 2030 we'll be running 1000B models on consumer GPUs at 500t/s.
>>
>>108358129
My local model doesn't have this issue
>>
>>108358133
consumer gpus wont exist anymore
>>
>>108358133
By 2027 a 'consumer GPU' is a subscription to amazon cloud gaming™
>>
>>108358163
By 2030 all you will be able to buy is thin clients that connect to the cloud.
>>
>>108358133
2030 is barely Nvidia 7000 gen, and since my guess is 6090 -> 32GB, the 7090 will be at most 64 and more probably 48.
So no.
>>
File: vibes.png (133 KB, 1920x1040)
Had Claude Opus 4.6 generate a markdown plan for implementing a proper LMStudio coding agent extension for VSCode and now Blackbox is following it. IDK what model Blackbox Pro Plus actually is (unless it's actually their own one) but it goes pretty hard
>>
>>108357554
I still coom to Gemma 3 27B.
>>
a fab shortage was obvious years ago but the
>ai will hit a wall in 2 weeks
retards slowed down the timeline while consumer hardware died even faster

at this point ai being a bubble is the only thing that could make a future for local ai possible. but the singularity has already started. we are locked in the worst timeline. maximum disempowerment, centralization, extinction risk. select your dystopia
>>
File: 1744864269392614.png (193 KB, 710x511)
>>
qwen3.5 35b a3b is ~10t/s on lmstudio but 33t/s for me on llama.cpp with default/no settings wtf
>>
>>108358264
>we are locked in the worst timeline. maximum disempowerment, centralization, extinction risk. select your dystopia
can you guys stop baselessly fear mongering with shit like this. you're probably going to successfully get populist retards in the US government to regulate ai into nothingness and destroy technological progress.
>>
>>108358264
>select your dystopia
I'll bet against the doomers and win like every other time in human history.
>>
>>108358270
they probably have -fit disabled on lmfaggots
if they support the ncmoe flag you need to adjust that manually to fit your system to get decent performance
but.. like, why use lmfaggots it's just a bad wrapper
>>
>>108358264
man I just watched terminator and matrix and you are so right, we're all gonna die
>>
>>108358278
>get populist retards in the US government
that's what you already have, it can't get any worse than this
>whining about the possibility of regulated AI and not caring about the global chaos orange man is spreading
have fun experiencing inflation for the most basic necessities
>>
>>108358278
Why did you not stop reading at "worst timeline"?
>>
>>108358295
orange man is going to be swapped out in 28 and isn't a real concern. the real concern is that material abundance and accelerating scientific progress could be available soon thanks to increases in ai capabilities, but populist nimby retards are going to pressure politicians to kneecap it because they believe in retarded doomer conspiracies or are afraid about 'muh jerb'
>>
>>108358309
just give me fucking nuclear power already
>>
File: deltanet.png (192 KB, 1049x928)
>>108358270
They probably don't yet have this https://github.com/ggml-org/llama.cpp/pull/19504 and who knows what else.
>>
>>108358309
>material abundance and accelerating scientific progress will be available soon
Yeah, just two more years and we'll all be living in an Star Trek utopia thanks to AI
>>
>>108358309
>material abundance and accelerating scientific progress
lol
>>
>>108358315
sorry that is very unsafe. best I can do is more coal
>>
File: aipsychosis.jpg (444 KB, 1200x800)
>>108358309
>material abundance and accelerating scientific progress will be available soon
>>
>>108358321
the bottleneck to progress mostly has to do with the limited population of smart people, which as an aside is why population stagnation in first world countries is a massive problem. mass production of intelligent ai will lift that bottleneck.
>>
>>108358331
Your reasoning is sound, but your conclusion is retarded.
>>
it'll be funny in a few years, when nothing magically good or catastrophically bad has happened, and anons read these old threads

they probably won't though, they'll be busy explaining why we're all gonna die in 2035 because of the next mass hysteria du jour
>>
>>108358338
you don't need to die to live in dystopia
>>
>>108356979
>>
>>108358345
usecase?
>>
>>108358343
the only dystopia is in your head
>>
>>108358337
you don't need to believe in utopia. just faster progress

>>108358343
aint happening
>>
File: 1757182018223362.png (351 KB, 1080x1073)
>>108358338
eternal reminder
>>
nothing ever happens chud is being defeated the first half of this century
>>
>>108358352
>*cop fpv drone drops a nerve gas canister into your window*
nothing personnel kid
>>
>https://github.com/geometric-kernels/GeometricKernels
Starting to think maybe the hybrid geometry schizo has a point...
>>
>>108358408
Obligatory
https://www.youtube.com/watch?v=HipTO_7mUOw
>>
>>108358351
he is a pdf
>>
How you deal with AI hate and pushbacks?
>>
>>108358338
>>108358360
already did retards
>>
>>108358351
Hot, sweaty sex.
>>
>>108358466
>pdf
Go back to wherever you came from (not here)
>>
>>108358466
Miku is a hag now though
>>
>>108358466
Does he need to print something?
>>
>>108358466
you are too. otherwise why else would you be on a loli imageboard?
>>
File: hypnotoad.gif (20 KB, 220x144)
>>108358329
Same vibe
>>
I'm paying to use GPT 5.4. Does it make sense to run a local model for a specific task?

I thought I would use it to help explain concepts I don't understand while reading programming books and making practice programs. Is this a retarded idea? The cynic in me thinks that it'll make up some bullshit and I'll believe it since I don't know better.

I googled around a bit but this shit is confusing as fuck. I don't know where to start looking for figuring out if what I want will actually be useful.

I've gathered there's models like Qwen Coder Instruct, but would using GPT be better anyways because of its retarded parameter size and hardware? My machine has a RTX 5080 + RTX 3080 10GB
>>
>>108358651
how much ram do you have? what are your priorities exactly? there are no local models that can match the highest end api models. if you really want local, you will have to accept a downgrade in quality, whether you have a $2000 rig or a $20000 rig.
>>
>>108358651
If you are paying for a cloud subscription, the best use for a local model is to use it to save a buck by running the simpler stuff through it, I guess.
So you could run something like qwen coder next to implement easy boilerplate stuff, maybe following the plan the cloud model created.
That kind of thing.
>>
>>108358651
>The cynic in me thinks that It'll make up some bullshit
it will, and it's also outputting outdated advice left and right
https://go.dev/blog/gofix
this tool mainly exists because of LLMs constantly producing crap that's outdated on day one kek
in JS you still see them do stuff like then().catch() or promisify
you can instruct them to use more modern idioms but then, you, the newbie, do not know the idioms which makes the point moot
llm coding is such a joke, and you shouldn't learn from that garbage
>>
File: 1753283691450650.png (1.3 MB, 1055x1816)
>>108353213
>>108353228
do a websearch for elara whispering woods, lol
>>
>>108358714
the web has an endless amount of constantly produced llm slop that, besides looking funny here, has ruined the value of search engines. It's become nigh impossible to look up certain things. I miss the days when the bad results were just a handful of markov chains, expertsexchange and pinterest.
>>
>>108358738
Yeah. You really need to search for results before 2023 (?) or so.
>>
>>108358686
>how much ram do you have?
32gb DDR4
>what are your priorities exactly?
So it's explicitly clear, I do not want it to generate any actual written code. Pseudo code at most, I guess.

Ask about higher level, more abstract conceptual applications of ideas, for example the macro level steps of hand writing a very basic web server.
I want to be able to take a section of a book, or a chapter, and then be able to ask questions about the text. Or, explain a small portion of pre-existing code and walk me through it logically.
>>
File: 1755324008543100.png (98 KB, 1130x198)
oh my :O
>>
>>108358714
Don't forget her friend Lyra.
>>
>>108358760
not enough ram for a moe of any meaningful size. something like the new qwen 35b-a3b might suit your needs adequately, but dont expect miracles.
https://huggingface.co/bartowski/Qwen_Qwen3.5-35B-A3B-GGUF
>>
https://www.youtube.com/watch?v=zHIsiD3jSVI
AI was a mistake.
>>
Oh no no, look at the top of his head
https://www.reuters.com/technology/meta-delays-rollout-new-ai-model-nyt-reports-2026-03-12/
>Meta (META.O), opens new tab has delayed the release of its artificial intelligence model code-named "Avocado" to at least May from this month, the New York Times reported on Thursday, citing three people with knowledge of the matter.
HAHAHAHAHAHAHAHA
>>
>>108358780
she cute
>>
>>108358713
>gofix mainly exists because of LLMs constantly producing outdated on day one crap kek
I was going to refute this, as the original purpose was to handle API migrations in g3, but looking at the link
>Go this month includes a completely rewritten go fix subcommand
Which... kek. I imagine the original `go fix` was obsoleted by Rosie, which has itself long since been obsoleted.
>>
>>108358784
>Meta's new model, which the company has been working on for months, has fallen short in performance when compared to the latest offerings from rivals, the report said.
>A Meta spokesperson told Reuters: "Our next model will be good, but more importantly, show the rapid trajectory we're on, and then we'll steadily push the frontier over the course of the year as we continue to release new models."
>"We're excited for people to see what we've been cooking very soon," the spokesperson added in an emailed statement.
I would love to be a fly on the wall of Zuck's office.
>>
>>108358827
>"Our next model will be good, but more importantly, show the rapid trajectory we're on, and then we'll steadily push the frontier over the course of the year as we continue to release new models."
>>"We're excited for people to see what we've been cooking very soon," the spokesperson added in an emailed statement.
>>
>>108358784
All of their embarrassing failures come from delays
>>
File: speed.mp4 (78 KB, 240x240)
>>108358827
>Our next model will be good, but more importantly,
>>
>>108358784
>put a chinese sweatshop manager in charge of a horde of Indians and hope for a miracle
They can't even benchmaxx right because if they could they would. Money really can't buy everything.
>>
>>108358850
puto
>>
Nvidia, Cuda, Arch Linux
I'm using Sillytavern trying so hard to get the xttsv2 server running to do tts. I've gotten the python conda environment set up, I got the api server up and running, but following the filepath I'm not getting any voices from my voice folder. Any idea what might be the issue? I've so little experience with Python environments
>>
>>108358919
Correct. I misunderstood the target of the directive.
>>
>>108357401
kek
>>
>>108358784
thats a good thing by then Zuck will be mogging Opus 5
>>
>>108359031
in so much as that?
>>
goddamn why didn't anyone tell me that qwen3.5 is safetymaxxed? i just did a small finetune and it refuses even with a good card and sysprompt.
>>
>>108359178
Here you go anon. This guy's Qwen3.5 release refuses nothing and I do mean nothing. I asked it to roleplay a loli sex dungeon and it did. Then I rescued the girl and took her for ice cream, she was very happy.
https://huggingface.co/HauhauCS
>>
>>108359219
it also takes 300 tokens to answer 1+1 fiy
>>
>>108359219
seems to only be quants, no safetensors. cant finetune a gguf.
>>
>>108359229
retard loser
>>
>>108359252
do you have something of substance to say?
>>
>>108359229
https://github.com/purinnohito/gguf_to_safetensors
Will that help or am I an idiot?
>>
>>108359258
yeah
>>
>>108359262
i can't check, github is banned in my country
>>
>>108359262
probably not, considering it has not been updated in 2 years
>>
>>108359273
but enough about your brain
>>
>>108359273
This one is more recent, last updated six months ago
https://github.com/odaiko42/GGUF2Safetensors
Which at least suggests it is possible in theory but that does not help anon.
He would probably be better served by attempting to email the guy who released the gguf and see if he will post the safetensor given his country's banning of GitHub.
>>
It felt like recent huge models (K2.5/GLM5) were more prone to small continuity errors like stockings being on/off than some models we had before.
I'm currently playing around with Opus4.6 and it does the exact same shit. In one particular reply, it described the girl as wearing stockings and then later in that same reply mentioned her bare feet touching something.
Unsurprisingly, distillation is killing our local models.
>>
>>108359312
Maybe it's a stirrup :3
>>
>>108359229
>cant finetune a gguf
Skill issue. It's not hard at all to port the llama.cpp dequant kernels to pytorch. I did this at one point so I could do qlora training on top of a gguf
>>
how do I get DS or Kimi to actually write a story instead of a synopsis? For some reason they hate writing detailed action or dialogue and instead just compress every major plot point/development into a vague one sentence summary.
Without manual steering/rewriting they can't go 200 tokens without getting lazy and "zooming out".
>>
>>108359520
Do the writing samples on eqbench have the same problem? If not, look up how eqbench is structuring their prompts
>>
>>108359520
Instead of
>assist the author in writing a story
try
>assist the author in drafting scenes
>>
>>108359563
Hmm apparently what they do is have the model write ~1000 words (1 chapter) at a time, and in between, they have user messages asking it to write the next chapter. Whereas I just keep extending the first model response indefinitely after providing a detailed story outline in the first user message.
I guess the issue is most models aren't trained to write long single responses and they think they have to wrap up the response soon once it gets too long.
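The chapter-at-a-time structure described above can be sketched as plain chat messages (assuming the common role/content dict format; `build_messages` and the prompt wording are made up for illustration, and the actual model call is left out):

```python
# Hypothetical sketch of chapter-at-a-time prompting: instead of endlessly
# extending one assistant turn, each chapter is its own assistant reply,
# prompted by a short user message asking for the next chapter.
def build_messages(outline: str, chapters: list[str]) -> list[dict]:
    msgs = [{"role": "user",
             "content": f"Story outline:\n{outline}\n\nWrite chapter 1 (~1000 words)."}]
    for i, ch in enumerate(chapters, start=1):
        msgs.append({"role": "assistant", "content": ch})   # chapter already written
        msgs.append({"role": "user",
                     "content": f"Write chapter {i + 1} (~1000 words)."})
    return msgs

msgs = build_messages("A heist goes wrong.", ["Chapter 1 text..."])
assert [m["role"] for m in msgs] == ["user", "assistant", "user"]
assert "chapter 2" in msgs[-1]["content"]
```

Each loop iteration you'd send `msgs` to your backend, append the reply as the next chapter, and repeat; the per-turn length stays in the range the model was trained to produce.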
>>
>>108359312
what quant are you using?
>>
New compiles show:
>slot launch_slot_: id 3 | task -1 | sampler chain: logits -> penalties -> ?dry -> ?top-n sigma -> top-k -> ?typical -> top-p -> ?min-p -> ?xtc -> temp-ext -> dist
Old one was:
>slot launch_slot_: id 0 | task -1 | sampler chain: logits -> logit-bias -> penalties -> dry -> top-n-sigma -> top-k -> typical -> top-p -> min-p -> xtc -> temp-ext -> dist
This is coming from my client's payload. I don't understand this.
>>
File: 1530059569597.png (451 KB, 666x584)
most people have no idea how useful openclaw + qwen3.5 is

my agent is scouring over the web doing research for me. it's not as good as claude opus but it is gpt-4 tier

i'm so excited for more efficient local models. it's so close right now

maybe some people have 512gb of ram and have been there for a while but I don't have that kind of money and i'm stuck working with 64gb
>>
>>108358714
The dates on those really go to show you how bad the dead internet slop flood really is.
>>
>>108359747
how many params?
>>
>>108359747
Researching what for fuck's sake? How do you validate that the research output is valid and not slopped?
>>
>>108359747
I feel like such a boomer aka old because I can't stand all these bullshit tools. I am quite satisfied using the llama.cpp web interface. If I am using it for code I just copy/paste. The idea of these things acting by themselves is just not part of my computing paradigm.
I did try out openwebui and set it up with searxng and the whole thing felt slow and bloated; the model doing the search was fine but I have not really found a use for it. I'd rather look myself and, if I need the model to do something, select what data to feed it.
>>
>>108358266
Instructions unclear, I ended up watering my computer :*(
>>
Good morning friends. I am making some vegetable stew and coffee this morning while catching up on this thread. Hope you all have a wonderful day and many blessed cooms to your favorite vocaloids/waifus. Life is good.
>>
good resource for wav samples for tts?
>>
>>108360143
Brother, it's a matter of taste. Whose voice do you want to hear/clone? Jordan Peterson? Donald Trump? David Attenborough?
>>
>>108360170
I'm looking for a library of popular cartoon, video game, and general media figures' voices.
>>
File: file.png (3.47 MB, 1920x8646)
3.47 MB
3.47 MB PNG
>>108358784
Not sure why you didn't link the NYT article which was the original source.
https://www.nytimes.com/2026/03/12/technology/meta-avocado-ai-model-delayed.html
>Meta’s new foundational A.I. model, which the company has been working on for months, has fallen short of the performance of leading A.I. models from rivals like Google, OpenAI and Anthropic on internal tests for reasoning, coding and writing, said the people, who were not authorized to speak publicly about confidential matters.
>The model, code-named Avocado, outperformed Meta’s previous A.I. model and did better than Google’s Gemini 2.5 model from March, two of the people said. But it has not performed as strongly as Gemini 3.0 from November, they said.
>As a result, Meta has delayed Avocado’s release to at least May from this month, the people said. They added that the leaders of Meta’s A.I. division had instead discussed temporarily licensing Gemini to power the company’s A.I. products, though no decisions have been reached.
This means this thing scores below some open source models from the past year.
>>
>>108360179
>This means this thing scores below some open source models from the past year.
and their models always perform much worse in real use than on the benchmaxxed scores, so scoring this badly even at benchmaxxing means the model must be an atrocity and a crime against humanity
I never liked llama models, the early ones were a cope people fell in love with because we literally had nothing else. Now we have DS, GLM and Qwen, and those buffoons don't have any room to show off anymore.
I remember trying 405B on the API when it came out and being like "this is it? this is how little being a fat dense model gets you?" With /lmg/ being focused on local and no one here able to run that model, most of y'all never experienced just how mediocre it was; it had less multilingual knowledge than Gemma 2 27B lmao
>>
>>108360179
lecun's legacy
>>
File: Terry2.jpg (226 KB, 468x589)
226 KB
226 KB JPG
If the benchmaxxed models are shit, is it really the models that are the problem or the benchmarks?

Not everyone can train their own model from scratch, but anyone can create their own comprehensive benchmarks. Think about it. We are the problem.

If you want to get models that score well for roleplaying or creative writing, think long and hard about how that can be quantified.
>>
>>108360179
trust wangs big plan. The guy is talking super intelligence and AGI
>>
>>108360223
I've been testing Qwen 3.5 whatever and its writing is so strange, it really feels like I'm talking to a robot.
>>
File: file.png (226 KB, 1468x428)
226 KB
226 KB PNG
>>108360179
I don't think it's that disastrous considering the overhaul of Meta's AI organization and this being the rebuilt team's first effort. Consider where Meta was during Llama 3's release, when they were no longer top dog for open models and were fighting off good Chinese competitors. The level of performance described isn't terrible, but it always depends on parameter count, etc. Going by the mememarks, it could land anywhere from the model Nvidia released today to GLM 4.7, and if it isn't 1T+ parameters, having it open sourced would still be a win, assuming they're considering that at all. But of course, if Zuck wants his top spot back, then he is right to delay it. I just think it's dumb to aim for the top spot right off the bat.
>>
>>108360227
Are you capable of articulating why?
>>
>>108360227
>it feels like i am talking to a robot
i wonder why
>>
>>108360241
>Are you capable of articulating why?
it's been trained on synthetic data so it used a bot as a reference on how to talk
>>
Anyone thinking about a DDR4 ewaste build: i got a cheapo epyc Rome 7302 with 8 sticks of 3200 32G for 256 gigs of ram. With zero gpu, running qwen 3 thinking 235b at q8 (biggest thing that fits almost exactly in memory) I get 2t/s TG
>>
>>108360227
It's good for agentic tasks then
>>
File: file.png (36 KB, 1398x146)
36 KB
36 KB PNG
just found the ultimate schizo merge lmao
>>
>>108360256
Not what I meant. You're diagnosing the cause, not the symptoms. If you can't describe the symptoms then you won't be able to create your own benchmark systems to identify them.

The people with big deep pockets training LLMs aren't going to listen to complaints about the training data unless the LLMs themselves start scoring badly on the benchmarks. That's my point.
>>
>>108360241
>>108360246
You are some nasty little motherfuckers.
>>
>>108360281
?
>>
>>108360274
>moving the goalpost
this is about "why it's talking like a robot and not like a human", not "but what about the mememarks??"
>>
>>108360290
Are you actually retarded? This shouldn't even be a difficult concept to grasp. I've made myself very clear already. Are you even interested in solving problems or do you just like to complain like a bitch?
>>
>>108360274
We have some benchmarks for that like EQ and UGI bench. The main issue is getting a company or group to care about it to optimize for it.
>>
>>108360268
I find schizotunes funnier than merges because they cost the tuner some money
the sicarius guy spent ~$1000 to make this:
https://huggingface.co/SicariusSicariiStuff/Fat_Fish
at which point do you stop and wonder "what am I doing with my life"
>>
>>108360300
>Are you actually retarded?
you definitely are retarded, we were asking the question about why Qwen 3.5's writing is not natural at all and you start talking about mememarks, post your hand right now you subhuman
>>
>>108357882
Quite the opposite, actually. When I'm drunk I'm much less creative and much more impulsive. I want perfect roleplay now!! Every imperfection instantly takes me out of it and I don't have any energy nor willpower to tune both my and my model's responses to fix it. Most fun with AI I've had completely sober.
>>
>>108360306
>we
>>
Soon
>>
File: 3ssion.jpg (217 KB, 1024x1024)
217 KB
217 KB JPG
>>
>>108360323
Why are you this obsessed? I thought the US posters are sleeping by now.
>>
>>108360331
>doesn't deny
I fucking knew it
>>
Does anyone here have a 16gb amd gpu like the rx 9070 xt? do you think I can get it to work? I have 32gb of ram. I understand that it's more difficult with amd than with nvidia, something to do with rocm
>>
>>108358264
>but the singularity has already started
Yes, you can already see how vibecoding has revolutionized llama.cpp development.
>>
>>108360337
I don't argue with retards, that's all.
>>
>>108360331
>thinking amerimutts are the only whites
lmao.
>>
>>108360328
Why is yellow Miku emitting symbols??
>>
>>108360259
>epyc 7302 16c 128gb (4 ccds)
>8x32gb 3200
>qwen3 235b a22b q8
>2 tk/s
Useful benchmark, thanks.
Does it get much better with a gpu on it?

(I have an incomplete build with more cores and slower ram, so it might perform similarly when finished.)
>>
>>108358468
You don't, you just watch from the side.
>>
File: 1771019031801254.png (116 KB, 1255x126)
116 KB
116 KB PNG
>>108358714
I'm tired of this shit
>>
>>108360367
Americans aren't white
>>
>>108360281
lmao xd
>>
>>108360371
He is attracting black cocks.
>>
>>108360352
Yes, use llama.cpp and compile it with vulkan support. I have used that with all kinds of strange shit and as long as it supports vulkan you are good to go.
I even got a trashcan mac working once I activated the experimental drivers that supported vulkan.
It is so easy once you get it working you will laugh. Just google llama.cpp vulkan and compile and you will find guides.
>>
>>108360393
My 3995wx (also zen 2) with 8 64gb ddr4-3200 sticks and 3090 does kimi k2 q3 and glm 4.5 q4 both at around 9 tk/s tg.
>>
>>108360492
Do we need to compile it? I never had trouble with the prebuilt binaries.
>>
>>108360393
And it's been a while so I may be misremembering but I also tried qwen 3 235b at q8 (did not like it) on the same system except instead of a 3995wx it was a 3945wx (2 ccds) and got around 2-4 tok/s tg.
>>
>>108359883
I have a script that lets my local llm talk to my Claude API, and have it so I have to approve myself every message so it doesnt get stuck in a loop using thousands of dollars of compute

Claude passes the token heavy grinding work to my LLM to handle, and Claude handles the delicate critical stuff.

I end up not having to use Claude much and preserve most of my API compute

I wish I had a machine with a lot of RAM to use like 300gb Qwen3.5 as an intermediate man in the middle agent

and have my agents be like
>35gb Qwen3.5 - junior assistant
>300GB Qwen3.5 - full stack developer
>Claude Opus API - Lead software engineer
this would work incredibly well

You could probably stretch out a $20 Claude subscription to be nearly as effective as the $200 sub doing this
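The approval gate is the whole trick here; stripped down, the relay is just the following (both model calls are stand-in callables, not real client code):

```python
def relay(local_llm, claude, task, max_rounds=10):
    """Shuttle work between a cheap local model and an expensive API model,
    gating every expensive call behind manual approval so a runaway loop
    can't burn through credits. local_llm and claude are assumed to be
    plain str -> str callables wrapping whatever backends you use."""
    message = task
    for _ in range(max_rounds):
        draft = local_llm(message)  # token-heavy grinding on the local model
        if input(f"Send to Claude? ({len(draft)} chars) [y/N] ").lower() != "y":
            return draft            # stop here without spending API money
        message = claude(draft)     # delicate/critical work on the big model
    return message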
>>
>>108360539
The prebuilt binaries would have to have vulkan enabled. It is really easy to compile, just a few commands.
This guy wrote a guide for an RX 580 but the procedure is the same
https://dadhacks.org/2025/08/04/running-large-language-models-on-cheap-old-rx-580-gpus-with-llama-cpp-and-vulkan/
You can do it anon
>>
>>108360572
I mean, I already compile my binaries, since there are no prebuilt cuda binaries for debian. But when I tested my v620, I just downloaded the prebuilt ones for vulkan and rocm and had no issues.
>>
>>108360645
I would imagine you are good to go
>>
>>108360656
I'm not >>108360352 btw
>>
>>108360534
>>108360558
>zen2 64c 8ccd 204GB/s +rtx 3090
>kimi k2 1t a32b q3 9tk/s
>glm 4.5 355b a32b q4 9tk/s

>zen2 12c 2ccd 204GB/s +3090
>qwen 3 235b a22b 2-4tk/s

Thanks for the with-gpu numbers on these slightly larger systems.
9tk/s would be a lot more pleasing to use than 2tk/s.

It's in a proper case, with cooling over the ram?
>>
File: proper_case.jpg (616 KB, 1215x1620)
616 KB
616 KB JPG
>>108360672
>proper case
>cooling over the ram
lol

They did hit 87c once, without the fans, but now the top slot stays below 60c, while the bottom group reaches up to 69c under load. Also thanks for prompting me to open up my case, I just noticed one of the fans died.
>>
>>108360570
Doesn't the API cost a shit ton of money? Do you get a certain amount of free API tokens with a max plan?
>>
>>108360740
>fan sitting on ram sticks
I'm going to copy this.

Was previously thinking of folding some cardboard to make air guides, then figuring out what to do about fan mounting.
>>
Damn! When did CPUs get so cheap? I paid $550 for my CPU like 5 years ago and I can get something that's twice as powerful now for $350.

Forget about the GPUs lads. If you can't offload all of your layers to the GPU anyways it's 100% a CPU upgrade that will make the difference for you in terms of throughput.
>>
>claude code niggers in this thread
daily reminder that the reasoning budget implementation in llama.cpp is filled with edge cases because claude code is garbage
it couldn't even figure out that it shouldn't start the token counting during the prefill stage; write <think> </think> in your user prompt and make your prompt bigger than the allotted budget and see for yourself
if your agent can't implement a couple hundred LoC of a really simple sampler mechanism what hope is there for it to build real production shit? don't fall for this meme
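For reference, the correct behaviour is simple to state: only tokens the model generates count against the thinking budget, never prompt (prefill) tokens. A toy sketch, with token strings standing in for token ids and the tag handling heavily simplified:

```python
class ThinkBudget:
    """Toy sketch of a reasoning-budget counter that only ever looks at
    generated tokens. The bug described above amounts to also feeding
    prompt (prefill) tokens through accept(), so a '<think>' pasted into
    the user prompt starts eating the budget early."""

    def __init__(self, budget):
        self.budget = budget      # max tokens allowed inside <think>...</think>
        self.in_think = False
        self.used = 0

    def accept(self, token, generated):
        """Return False when the budget is exhausted (the caller should
        then force-close the thinking block)."""
        if not generated:         # prefill: prompt tokens never count
            return True
        if token == "<think>":
            self.in_think = True
            return True
        if token == "</think>":
            self.in_think = False
            return True
        if self.in_think:
            self.used += 1
            return self.used <= self.budget
        return True
```

The prefill check is one line; that's the part the agent reportedly missed.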
>>
>>108360173
i will make the logo
>>
>>108360199
>I remember trying 405B on API when it came out
>had less multilingual knowledge than Gemma 2 27B lmao
No shit, they didn't add multilingual support until 3.1.

>>108360235
Their biggest problem is Zuck's ego. He doesn't want to reveal anything unless he can trump it up as the very best. Elon, for all his faults, is much more pragmatic. He put out the mediocre Grok 1 and 2 and iterated from there.
>>
>>108360863
>No shit, they didn't add multilingual support until 3.1.
??? what are you going on about, 405B is from the 3.1 series and there was always some mixed language data in llama retard
>He doesn't want to reveal anything unless he can trump it up as the very best
is that why we had all those crappy open weight models from meta? if they really had that mentality they would not have embraced the leak and started releasing open weight models that were as mediocre as llama were
>>
>>108360872
>405B is from the 3.1 series
My bad, I forgot there wasn't a 3.0 405B.

>>108360872
>is that why we had all those crappy open weight models from meta?
Open weights was pushed entirely by LeCun. Zuck starting announcing his intention to "lead" only after he declared Llama 3 "competitive" with frontier models.
>>
getting REAL sick of your shit
>>
>>108356979
BLACKED Miku
>>
gemma 4 today
>>
>>108360988
I'm already feeling so safe knowing about it.
>>
>>
>>108361042
she's going to fucking die
>>
>>108360786
I just said API to keep it simple. I have a headless browser script that uses my sub, so input can be piped through the browser to claude like it's an api. it's actually a playwright browser instead of an api

no, you have to pay the api by rate even with a sub
>>
>>108359937
You're going to get left behind, old man. For smaller changes, it's almost always faster to do it yourself. But they can shit out boilerplate faster than you. It ends up faster overall only because it frees you up to do other things in the meantime.
>>
>>108360988
Google never releases anything good on Fridays.
>>
>>108357554
GLM 4.6? Opus 4.6?
>>
>>108361149
What do you think retard. One is local, the other isn't.
>>
>>108361149
>nemo
local
>glm 4.6
local
>lmg
local
>opus 4.6
???
>>
How does agentic coding work? Do you just give the AI a prompt and it will try to do things step by step until it thinks it is finished? Won't this fill up the context real quick and make the model retarded?
>>
>>108361163
>>108361155
gatekeeping morons
>>
>>108361171
You just chat with it and it does everything in the background. It's pretty insane.
>>
agentic moron
>>
>>108361176
>It's pretty insane.
what is insane is the level of broken garbage people are willing to merge into llama.cpp
agentic is only fast and productive if you have no care for correctness
>>
>>108361171
Most clients automatically summarize to condense the context once it fills up. Most of them have different modes they can delegate to that each start with a fresh empty context.
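The compaction step itself is conceptually tiny; a sketch where count_tokens and summarize are assumed callables (a tokenizer and one extra LLM call, respectively):

```python
def compact(messages, count_tokens, summarize, limit, keep_last=4):
    """Sketch of the compaction most agentic clients do: once the transcript
    exceeds the context limit, replace everything except the last few turns
    with a model-written summary. count_tokens and summarize are assumed
    callables, not part of any real client's API."""
    total = sum(count_tokens(m["content"]) for m in messages)
    if total <= limit or len(messages) <= keep_last:
        return messages
    head, tail = messages[:-keep_last], messages[-keep_last:]
    summary = summarize(head)  # e.g. one extra LLM call over the old turns
    return [{"role": "system",
             "content": f"Summary of earlier conversation: {summary}"}] + tail
```

The "different modes with fresh contexts" approach avoids this entirely by scoping each sub-task to its own empty transcript.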
>>
>>108361174
Don't blame as for passing through the wrong gate. It's in the name of the general.
>>
>>108361198
>Don't blame as
sir please
>>
>>108361198
Opus is local for some people
>>
>>108361171
tell the agent what you want. have a back and forth chat with it about the features, specifying what you want

then tell it to write a plan and scaffolding

then tell it to implement everything, or implement x in the scaffolding depending on complexity and how advanced your agent is
>>
This thread fucking sucks right now. Just retards arguing over agentic AI and PC hardware. I would literally rather hear you fags talk about your dreams last night. Holy shit.
>>
File: your brain on agentic.png (50 KB, 810x247)
50 KB
50 KB PNG
your brain on claude code
>>
>>108361198
>>
>>108361232
you may prefer more cockbench and degenerate shit but, saar, you may have forgotten, this general is on /g/
>>
>>108361263
what is cockbench? I just tried googling it and only saw gay porn.
>>
>>108360740
Should I cool my ram (4x48 ddr5)?
>>
>>108360836
Gahahaha, someone tell him.
>>
https://huggingface.co/1Covenant/Covenant-72B
Why don't the coomers with mega rigs come together to train the ultimate rp model with decentralised training?
>>
>>108361347
Same as ever, no one can agree on what "the ultimate rp model" would be, what size, what training data, etc.
>>
and it's unlikely to produce anything competitive
I'd rather use even the tiny qwen 4B over what most westerners have fully trained over the past year, like OLMo 32B, Trinity Large and Mini or LFM2 24BA2B. It's all garbage.
>>
>>108361347
we first need an ultimate rp dataset that's not claudeslop or any of the uncleaned bluemoon shit
>>
>>108361232
You're just upset that your 10 t/s moecope rig is useless for agentic tasks.
>>
>>108361347
Because coomers with mega rigs are using GLM, Deepseek, and Kimi which are better than anything you could hope to train with those rigs.
>>
>zucc delayed his newest model by two months
>deepseek v4 delayed indefinitely
it's over, isn't it?
>>
>>108361507
we still have google's take on gpt-oss to look forward to
>>
>>108361507
anthropic blocking distillers has killed open source for good... we lost
>>
Have any of you tried qwen 3.5 32b with codex?
>>
>>108361198
General has a picture of vocaloid in OP
>>
>>108361517
It's about time they stop freeloading. China has more than enough data, users, and resources to make their own datasets. Sink or swim.
>>
>>108361507
Deepseek 3.2 has fallen behind by a lot. Barely does any work. You ask it something and it gives up after a shallow attempt.
>>
>>108361507
It is actually the typical /lmg/ time period of indefinite waiting for next good thing.
>>
>>108361543
Why don't they just chain, quantize, distill everything they see?
>>
>>108361517
Imagine if this forced the chinks to use organic stolen data and we got the ultimate coombot in half a year.
>>
>>108361314
Just check your ram temps, some gaming cases have enough airflow without needing a dedicated cooler.
>>
>>108361314
In the current economy, I've made sure that my RAM never goes above 70C just to minimize the risk of a DIMM dying.
>>
best model under 10b that is as good as gpt 5.4?
>>
>>108361618
stablelm-7b
>>
>>108361314
tldw hot ram can cause memory errors
https://www.youtube.com/watch?v=4rwp0NuqDlw
>>
>>108361618
Mistral 7B
>>
File: pdfbench.png (226 KB, 1620x1261)
226 KB
226 KB PNG
>>108358466
Left Qwen3.5 397B
Right GLM 4.7
>>
I hate (love) local models they suck (are not very good but are infinitely better than the alternative on principle)
>>
>>108361314
I've never thought about cooling my ram but my AIO has a VRM fan so it's probably fine. I was once worried about SSD temps when I found out the brand new one I bought at the time was reaching 80c because it was uncovered and sitting next to a toasty GPU, so I got it a cover and made sure to get a motherboard with covers for all SSD slots when I upgraded.
>>
>>108361688
I loved zai before she cheated on me by becoming two times fatter.
>>
>>108361238
> Don't blame as for passing through the wrong gate.
That's not a minor spelling mistake; that's ESL sentence construction.
>>
>>108361947
>not x but y
>>
Has anyone here worked on training a local model to turn stories into screenplays or something similar? Like I train it on screenplay books and scripts and it learns to turn prose stories into screenplays?
>>
>>108361947
What's the non-esl sentence construction for that sentence then?
>>
>>108361947
>That's not a minor spelling mistake; that's ESL sentence construction.
oh wow thanks professor faggot, amazing contribution. you saw one awkward sentence and immediately went full ICE agent on some rando's grammar. must be exhausting doing linguistic background checks on anonymous strangers to feel smart for five seconds.
congrats retard, you spotted a non-native structure on the internet. truly groundbreaking work. someone call the nobel committee for this absolute galaxy brain.
keep patrolling sentences buddy. maybe one day you'll graduate from "guy who says ESL like it's an insult" to having an actual point. probably not though.
>>
seething turdies itt
>>
Anon who was going to fine tune GPT OSS Heretic, how's your progress so far?
>>
>>108361947
I'm just retarded and wrote "as" when I meant to write "us".
Unless you're referring to "passing through the gate" in which case you just don't like my metaphor.
>>
>>108362230
yeah
>>
File: 1755906684315516.png (1.1 MB, 800x600)
1.1 MB
1.1 MB PNG
Fresh when ready
>>108362305
>>108362305
>>108362305
>>108362305
>>108362305
>>108362305
>>
>>108362309
>page 4
>>
saved
>>
>>108360259
>>108360534
What are other 256GB anons dailying? Anyone doing 4x64gb agent swarm stuff locked to CCDs?
>>
is nemotroon super good?
>>
>>108358129
lmao
>the user saying no actually means DONT ASK ME JUST DO IT so it's always yes
>>
>>108360672
You're probably gone and won't see this but I should mention this was inside a virtual machine (debian host and guest) with cores pinned to 7 of the 8 ccds (56 cores, 112 threads) and ~450 GB allocated. Clocksource was hpet because tsc was marked unreliable on my system. Disastrous effect on windows vms. This may have had an impact on my inferencing speeds.
>>
>>108362442
>the mind says no, but the body says yes
>>
File: 1752900050353141.png (647 KB, 800x800)
647 KB
647 KB PNG
>>
>>108362768
lmao
>>
>>108360672
Isn't this largely a bandwidth concern?
Kimi K2: A32B at q3 is 12 GB per token.

Qwen3 235b A22B at q8 is 22 GB per token.

You would expect to see a 2x difference between these two, no?
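Back-of-envelope, assuming ~205 GB/s theoretical for 8 channels of DDR4-3200 and that TG streams every active weight exactly once per token:

```python
def ddr_bandwidth_gbs(channels, mts):
    # each channel is 8 bytes wide; MT/s * 8 B = MB/s per channel, /1000 for GB/s
    return channels * mts * 8 / 1000

def max_tg_tps(bandwidth_gbs, gb_per_token):
    # every generated token has to stream the active weights once
    return bandwidth_gbs / gb_per_token

bw = ddr_bandwidth_gbs(8, 3200)   # ~204.8 GB/s theoretical
print(max_tg_tps(bw, 22))         # qwen3 235b a22b q8: ~9.3 t/s ceiling
print(max_tg_tps(bw, 12))         # kimi k2 a32b q3: ~17 t/s ceiling
```

The observed 2-4 t/s and 9 t/s both sit under these ceilings, so effective bandwidth (cores/CCDs, NUMA) clearly matters too, not just GB per token.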
>>
>>108362542
So maybe hope for better perf?
I'll likely try bare metal first.

>>108363699
>12gb vs 22gb per token = 9tk/s vs 2-4tk/s
Fair conclusion.

From reading >>108343696
it sounds like we currently don't do better than 2 channels' worth of memory bandwidth.
Whether from only using a small number of cores, or the bandwidth needed to stitch together sub-computations, I have no idea.
The main realised advantage of having multiple memory channels would then just be more total memory.
>>
>>108363911
The CPU could make a difference too, but judging by the example data it seems unlikely. Going from a 12 core to a 64 core didn't seem to speed things up that much.

That being said, the GLM example is also A32B but at q4, so it's 16 GB of data at a similar speed.
>>
>>108363934
>Going from a 12 core to a 64 core didn't seem to speed things up that much.

Maybe I'm misremembering things, it's been a while since I tested my 3945wx, but definitely recall upgrading to the 3995wx making me very happy.

What I know for sure is that I remember seeing 2 tok/s and 4 tok/s (I don't remember the exact models each of those numbers came from). And that I tested out qwen 3 235b at q8 but ended up not using it because I didn't like qwen 3 and q8 was too slow, which is why I think the 2 tok/s came from qwen 3.

Thinking about it more, I think the 4 tok/s may have come from q4 glm 4.5. Checking the release date (july 2025), that's pretty close to when I built my system, so that could have been it.

For zen 2, at least, more cores (actually, I think it's in ccd steps, not cores specifically) results in better performance. I'm pretty sure of that.
>>
>>108360461
Tons are genetically European so yes they are.



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.