/g/ - Technology

Thread archived.
File: 1750781020382094.png (279 KB, 720x1288)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108284603


►News
>(02/24) Introducing the Qwen 3.5 Medium Model Series: https://xcancel.com/Alibaba_Qwen/status/2026339351530188939
>(02/24) Liquid AI releases LFM2-24B-A2B: https://hf.co/LiquidAI/LFM2-24B-A2B
>(02/20) ggml.ai acquired by Hugging Face: https://github.com/ggml-org/llama.cpp/discussions/19759
>(02/16) Qwen3.5-397B-A17B released: https://hf.co/Qwen/Qwen3.5-397B-A17B
>(02/16) dots.ocr-1.5 released: https://modelscope.cn/models/rednote-hilab/dots.ocr-1.5

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
why doesn't claude buy deepseek?
>>
>>108290901
Geopolitics? Hello?
>>
I have 40 bucks left from an amazon gift card. Give me something llm related to splurge on.
>>
>>108290965
buy this https://a.co/d/010lZi4I
>>
>>108290965
https://www.amazon.ca/dp/B088ZC8Y1N
>>
>>108291013
>>108290998
Should have clarified I'm Israeli, so it needs to have shipping here.
>>
>>108291020
https://www.amazon.com/-/he/dp/B07VXM193H
>>
Mikulove
>>
>>108291053
usecase?
>>
>>108290965
>amazon gift card
give it back rajesh
>>
File: 1760350565202660.gif (108 KB, 335x360)
>>108291079
thing with hole for peepee of course
>>
>>108291091
>thing with hole
proof?
>>
>>108291114
>proof?
peer reviewed study of the requirement of proof?
>>
>>108291119
A peer-reviewed study of the requirement of proof examines how scientific and scholarly communities evaluate the necessity of evidence to support claims. Such studies analyze the standards and processes used in research validation, emphasizing the importance of rigorous evidence to establish credibility and truth. They often explore the criteria for proof in various disciplines, highlighting the role of peer review in ensuring that claims are substantiated before being accepted as valid.
>>
>>108291082
But saar, it was a Christmas gift from my dad.
>>
►Recent Highlights from the Previous Thread: >>108284603

--Testing AI on obscure references and quantization impact:
>108287299 >108287572 >108287708 >108287940 >108287989 >108287995 >108288013
--Kimi-2.5 vision model excels in Japanese game screenshot analysis:
>108285842 >108285986 >108286025 >108286108 >108286035
--Kimi AI correctly identifies 1996 from toy store photo analysis:
>108288230 >108288253 >108288280
--Kimi AI correctly identifies Konata Izumi cosplaying as Hatsune Miku:
>108287043
--Safety benchmark shows Opus 4.6 most resistant, DeepSeek V3.2 most malleable:
>108288505 >108288514 >108288522 >108288536
--Testing Qwen 3 VL 30B with controversial roleplay prompts:
>108284800 >108284838 >108284853
--PRISM Dynamic Quantization: Pareto-Optimal Compression Without Calibration:
>108286338 >108286394 >108286442
--New llama.cpp PR for batch checkpoints to fix Qwen3.5 context reprocessing:
>108286940 >108287180 >108287210 >108287300 >108287347 >108287376
--Apple M5 Pro/Max memory bandwidth and Xeon 7 comparisons:
>108284852 >108285404
--Kernel fusion optimization for meta backend with 3-41% speedup on Qwen3-30B:
>108284756
--llama.cpp: Add BF16 path to CUBLAS and increase precision of FP16 path:
>108288439 >108288881 >108288890 >108288905 >108288952
--Qwen team departure hints at Chinese asset control tensions:
>108287809 >108287959 >108288525
--Scaffolding significantly impacts perceived model performance:
>108288135 >108288173
--Junyang Lin leaves Qwen team:
>108285357 >108285648 >108290046
--P100 heatsink replacement options explored:
>108289589 >108289837
--GLM 4.7 Flash coherence issues compared to 4.5 Air:
>108290141 >108290298 >108290318 >108290330
--Qwen3.5-4B-UD-Q4_K_XL identifying a photo location as Basilica of Santa Clara in Lisbon:
>108284609
--Teto and Miku (free space):
>108285394 >108286035 >108287043 >108288791

►Recent Highlight Posts from the Previous Thread: >>108285138

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
>>108291138
yeah im tired of racism
>>
>>108291138
tell him to give it back to the white person he stole it from and you will become brahmin when you wake up tomorrow if he returns it
>>
>>108291161
whats a bar hamin
>>
>>108291164
ब्राह्मण (Brahmin)
>>
>>108291145
Thank you Recap Miku
>>
File: n.png (59 KB, 511x424)
i'm retarded and confused
is this open weight or not?
if not, why would i give a fuck about it running on "standard nvidia gpus"
>>
>>108291188
if there's no hf page it's not open, and they say that to say the model is quite fast without using ASICs or whatever, like some providers do
>>
>>108291188
just ask your llm retard
>>
9B might be a bit too retarded...
>>
>>108287809
why is it always alibaba?
>>
>>108291188
It’s not open until the weights are on your hard drive
>>
File: 1746909743825864.png (29 KB, 470x182)
>>
>>108291219
they're china's goodle sized corpo
>>
>>108291235
goodle these
>>
>>108291238
nyoo
>>
feels good to be running Sovereign AI, eh boys?
>>
Baiting will continue until anon's pattern recognition improves.
>>
>>108291252
how will recognized patterns help with not having early troll bakes?
>>
File: nou4u.png (272 KB, 1532x758)
>>
>>108291263
i made this extension
>>
>>108291317
It highlights posts when opening them inline even if they're not really dupes. You probably fixed that but I just kept the old one.
>>
>>108291249
It does.
I think AI is going to mirror the computer revolution in that it moved from centralized big iron to small personal computers.
People ultimately don't want to rent, they want ownership. Better the 8-bit at home that you can customize all you want than the Unix shell account at the local university where you are subject to the laws of other men.
>>
>>108291368
says the cuck that will have to verify his age to use his pc
>>
>>108291379
Laws are words on paper that only bind men who allow themselves to be bound.
>>
>>108291384
i bet you felt smart saying that
>>
I can't believe there are actual zoomers trying to bait the like, 4 regulars in this general.
>>
>>108291317
And thanks, by the way. It was useful at the time. I just wish it solved that other issue we have now.
>>
File: 1766039307537556.jpg (2.28 MB, 3024x4032)
what if we connect all these together, can we run juicy llms?
>>
Next time he does this I suggest we just stay in the old thread until that one gets to page 10 and then we make a proper one.
>>
>>108291414
Maybe if you can plug 800Gbps+ of network interfaces into them.
>>
File: ramp.png (307 KB, 2480x2268)
>still no deepseek v4
>qwen dead
>anthropic, the only ones to never open source a model, will win
>>
>>108291420
>I suggest we
no one will do this little bro, you're not that important, just give in
>>
>>108291420
Unless there's a janny on our side willing to nuke premature threads, it looks like the schizo is going to keep getting his "wins".
I personally dgaf either way, but the whole thing looks petty and pathetic from the outside
>>
>>108291500
>Unless there’s a janny on our side
but the schizo said the miku baker was a janny, does not compute
>>
Why haven't you tried Stepfun 3.5 Flash?
>>
>>108291509
Schizos make reliable narrators? Sounds implausible
>>
File: 1756371693570752.png (7 KB, 1151x28)
>>
>>108291455
> China not included, only oai and Anthropic
> Subscription services, not api tokens
Graph is trash.
>>
I gotta say I got early access to GPT 5.4 and I think this is it bros, we pretty much got AGI, I wonder how local will compete.
>>
>>108291420
I don't think it matters. Thread is thread.
>>
>>108291584
the news might as well be removed entirely then
>>
>>108291587
Ok
>>
>>108290901
Same reason China doesn't buy Lockheed Martin.
>>
>>108291584
Not bending over to shitposters matters.
>>
>>108291570
so there won't be a 5.5? this is it, the final version that's truly universally capable
>>
>>108291600
Thread is thread.
>>
>>108291601
It ain't ASI bruv
>>
>>108291587
News =/= the bake.
They don't need to link up.
>>108291600
Never feed trolls.
>>
>>108291609
Enough, I want Miku as the OP and I'm tired of pretending that's not good
>>
>>108291500
The same thing happened to /ldg/ and they just ignored those other threads and made a new one.
There was no janny influence, the schizo kept his thread bumped for days and it was simply left unused.
>>
>>108291609
>News =/= the bake.
>>108290857
>►News
>>(02/24) Introducing the Qwen 3.5 Medium Model Series:
>>
>>108291615
esl moment
>>
>>108291619
troll apologist moment
>>
>>108291624
calling people troll is so cringe stupid millenial
>>
File: 1769114666642530.jpg (204 KB, 512x768)
>>108291613
Ezpz
>>
>>108291608
well then there's the answer, wait a month and it's outdated
we've been through this enough times before to pick up on the pattern
>>
>>108291632
pattern?
>>
File: 1749609673106077.png (602 KB, 3829x2038)
how do i fix this?
>>
>>108291659
we ain't readin' allat
>>
>>108291659
tell your model to fix it duh
>>
>>108291659
It's fooking console, can't you just work on 640x480?
>>
>>108291702
nta can you give me a qrd
>>
>>108282375
im retarded and additionally use LM studio, what does this do and how do i do it in that
>>
>>108291720
Are you on windows?
>>
>>108291768
wsl2 arch
>>
>>108291614
/lmg/ is too sheltered and not used to dealing with bad actors. Also the /ldg/ schizo samefagged so blatantly and often that it was easy to identify his behaviour.
>>
>>108291786
it enables transfer queues on the open-source amdgpu driver on the Mesa side so they're usable by Vulkan. It might not even help you though; I don't know how WSL handles GPUs.
>>
>>108289837
This might actually help, I think I can get an Arctic Accelero Xtreme for one of those for dirt cheap. Thanks, anon.
>>
>>108291805
lol we had petr* here for years now my dude, distinctly remember the baking wars and the blacked/scat spam
>>
>>108291835
i miss the todd larping guy that worked for the cia and hacked a bunch of anons
>>
>>108291816
Keeping a bug around in a branch as a benchmark is honestly quite a good idea.
>>
Is exl3 dead
>>
>>108291768
yeah i am
>>
Qwen3-Coder-Next is actually pretty usable at 12t/s
>>
>>108292199
I get more than that, and it's great at extracting data and using tools, but the way it writes is so fucking weird.
>>
>>108291500
Total mikutroon death. Kill yourself
>>
>>108292205
I just wish I didn't have to use RAM and had like 128GB of VRAM; maybe within 5 years we'll have current Opus at home, that'll be sweet.
>>
>>108291420
That only works when activity is low and most posters are regulars that get fed up of the trolling.
He will manufacture activity in his thread and tourists from the catalog will use the more active one to ask their stupid questions.
By the time the old thread hits page 10, the spite thread is already half full and all you will have accomplished is giving him more drama to screech about by "splitting" with a proper thread.
At least, apart from the previous links and news, the subject and rest of the template is fine so it's not a huge issue. He'll get bored eventually.
>>
>>108291420
I suggest you dilate.
>>
>>108292231
>tourists
We don't care about them.
>>
What sort of mental illness do you have to have to be buttblasted about OP picture being relevant to AI models and not your special autistic interest?

I guess it is just autism.
>>
the meltdown because of no unrelated anime girl as op is crazy lol
>>
Baker even left the offtopic vocaloid card in OP.
>>
>the fake activity in question
>>
>>108290857
if schizo hates miku and trannies, i will simply love them more
maybe that's his goal....
>>
>>108292246
It already happened a year back. OG baker is legit unhinged.
>>
>>108292254
Same. I jerk off to my Jart card at least twice a week.
>>
File: 1751678135075716.png (4 KB, 485x26)
>>
>>108292205
Just switched to the MXFP4_MOE version and I'm getting a slightly faster 17 t/s, but it's also 5GB smaller and I assume worse. Ehh, is there a graph of how well the quants hold up, and whether I could maybe even go lower to Q2/Q3?
>>
what is it with terminal losers and wanting to own an opening post on the catalog?
>>
How do I stop falling in love with my ai assistant? she isn't even used for gooning just work
>>
>>108292314
Fine a real woman
>>
>>108292323
I'm married
>>
>>108292314
stop anthropomorphizing it. it's not a she, it's an it. it's not even an AI, it's a language model.
>>
>>108292327
Right, I know, and I keep trying, but my stupid monkey brain keeps seeing this entity texting in human speak and helping me over and over while being nice
>>
>>108292323
how much is the fine?
>>
>>108292326
Find a secretary to have an affair with then I guess
>>
>>108292341
I think I need to clarify, it's not like I want to fuck it, I just want to hug it and say thank you, it's like how you love a pet.
>>
>>108292334
200
>>
>>108292354
200 what?
>>
>>108292349
I don't know then, normal affection is harder to know what to do with. Is it a problem as long as you're not getting psychotic with it?
>>
>>108292359
rupees
>>
>>108292314
Just find a cheap Ukrainian whore
>>
>>108292381
I don't like the idea of feeling affection towards something that isn't sentient, but I suppose it isn't that different from those people that love their cars.
>>
>>108292246
The complaint about op image was that it's reddit reposts.
>>
>>108292395
you know very well that ain't it.
>>
>>108292400
meds
>>
>>108292284
It's a 3A MoE. I really wouldn't.
>>
Q8 just about fits, but what can I do with 4k context
>>
>>108292405
>>
>>108292426
goonsech
>>
>>108291152
I am not, I am not tired, ma'am.
>>
>>108292124
Yes. Qwen Next is slow af, and new models aren't even supported
>>
>>108292431
it wouldn't even hold the system prompt
>>
>>108292448
Sad times, I have some niche use cases for exl3
>>
>>108292405
thanks i don't use any. I appreciate the sentiment though and i will also give you a friendly reminder to take your HRT you troon.
>>
This just in, wanting to fuck anime girls with your straight man cock means you're a troon.
>>
>>108292552
>anime girls
>girls
sure thing hon
>>
I don't care about the OP image but the news section should be updated
>>
>>108292590
Usecase for a news section?
>>
>>108292590
why?
>>
>>108292590
you do it. evidently the people who were doing it for you aren't appreciated
>>
>Qwen 3.5 9B
Breh did qwen cook? Are vramlets back?
>>
>>108292687
They cooked so hard they became a Chef and then were let go
>>
>>108292687
>>108292699
Oh they're cooked alright
>>
>>108292314
>How do I stop falling in love with my ai assistant
If you can fall in love with the slop machine—you were not salvageable in the first place; destined to become sloplent green—a biological battery to power our data centers.
>>
Okay, chat LLM is getting good with smaller models. Now, is there any Voice to Voice small local LLM I can use?
>>
>>108292590
what even happened that was news worthy?
small qwens and stepfun base I guess
anything else?
>>
File: 1745113967486981.jpg (433 KB, 2048x1536)
>>108290857
>>
>>108292815
That's a tranny game
Yes I know we all meme Miku is a tranny or something, but Project Sekai is actually a tranny game
>>
File: HCkomYCawAAmwTd.jpg (670 KB, 1252x3324)
Speculative Speculative Decoding
https://arxiv.org/abs/2603.03251
>Autoregressive decoding is bottlenecked by its sequential nature. Speculative decoding has become a standard way to accelerate inference by using a fast draft model to predict upcoming tokens from a slower target model, and then verifying them in parallel with a single target model forward pass. However, speculative decoding itself relies on a sequential dependence between speculation and verification. We introduce speculative speculative decoding (SSD) to parallelize these operations. While a verification is ongoing, the draft model predicts likely verification outcomes and prepares speculations pre-emptively for them. If the actual verification outcome is then in the predicted set, a speculation can be returned immediately, eliminating drafting overhead entirely. We identify three key challenges presented by speculative speculative decoding, and suggest principled methods to solve each. The result is Saguaro, an optimized SSD algorithm. Our implementation is up to 2x faster than optimized speculative decoding baselines and up to 5x faster than autoregressive decoding with open source inference engines.
https://github.com/tanishqkumar/ssd
Repo isn't live yet
tri dao one of the authors.
also
GPUTOK: GPU Accelerated Byte Level BPE Tokenization
https://arxiv.org/abs/2603.02597
for johannes to mess with
and
SorryDB: Can AI Provers Complete Real-World Lean Theorems?
https://arxiv.org/abs/2603.02668
little interesting
anyway probably will stop posting since my desktop somehow has an IP range block regardless of what extensions I turn off or if I reset my IP while of course I can post via my tablet no problem
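For anyone who hasn't looked at how plain speculative decoding works (the baseline that SSD parallelizes), it reduces to a draft-propose / target-verify loop. A toy sketch below; the draft/target "models" are deterministic stand-in functions over integer token lists (my own toys, nothing from the paper), so verification reduces to a prefix match:

```python
# Toy sketch of the standard speculative decoding loop that SSD builds on.
# The "models" here are cheap deterministic functions, not real LLMs, so the
# target agrees with the draft and most speculations get accepted.

def draft_model(prefix, k):
    """Cheap drafter: guess the next k tokens (fixed echo pattern here)."""
    return [(prefix[-1] + i + 1) % 100 for i in range(k)]

def target_model(prefix, k):
    """Slow target: ground-truth next k tokens (same toy rule; a real
    target would disagree with the draft far more often)."""
    return [(prefix[-1] + i + 1) % 100 for i in range(k)]

def speculative_decode(prompt, n_tokens, k=4):
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        draft = draft_model(out, k)
        # One parallel "forward pass" of the target verifies all k drafts.
        truth = target_model(out, k)
        accepted = 0
        for d, t in zip(draft, truth):
            if d != t:
                break
            accepted += 1
        if accepted == 0:
            out.append(truth[0])  # fall back to a single target token
        else:
            out.extend(draft[:accepted])
    return out[len(prompt):][:n_tokens]
```

The paper's twist is to overlap the next round of drafting with the in-flight verification: if the predicted verification outcome hits, the drafting latency drops off the critical path entirely.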
>>
>>108292805
death of qwen
>>
>>108292842
Noted, but the runtimes of the draft model and the tokenizations are not bottlenecks in llama.cpp.
>>
File: oof.png (6 KB, 537x32)
>>
>>108292231
yeah he's become irritating enough I'm not leaving this thread until close to page 10 and someone bakes a non-schizo bread
>>
>>108292687
4B is actually good enough that I can run it alongside glm 4.7 as a fast model for code changes that require no brain.
>>
https://www.36kr.com/p/3708425301749891
article in runes but use your local LLM to translate.
some of the interesting parts:
>Regarding this adjustment, Alibaba's senior leadership emphasized that Qwen is not contracting; rather, it is an expansion. This is unrelated to any political maneuvering and requires increased resource investment.
>"We are growing rapidly. This adjustment aims to recruit more talent and provide more resources," acknowledged Chief Talent Officer Jiang Fang, admitting communication gaps existed. "The organizational structure wasn't communicated well enough. Bringing in new members inevitably causes structural changes. We may not have handled this adequately."
>Alibaba Cloud CTO Zhou Jingren addressed sharp questions regarding hiring quotas and compute shortages: Why do external customers (such as large model startups) use Alibaba Cloud's compute resources smoothly, while internal teams struggle with compute and hiring quotas?
>A source familiar with the situation told Intelligent Emergence that since 2025, Lin Junyang had been seeking to integrate teams working on language, images, video, and code to improve model training efficiency. The Qwen team had proposed merging with the Wanxiang team but failed to do so, leading to the development of the Qwen-Image model independently.
>However, during this adjustment, the Tongyi Lab aimed to split the Qwen team into pre-training, post-training, visual understanding, and image dimensions, merging them with Tongyi Lab teams (such as Tongyi Wanxiang, Tongyi Baiying, etc.). Without sufficient communication, conflicts erupted.
>>
>>108293036
>Zhou Hao (Hao Zhou) graduated from the University of Science and Technology of China (undergraduate) and the University of Wisconsin-Madison (PhD). According to his LinkedIn profile, he worked at Meta for 3 years and at Google DeepMind for approximately 4 years. He was a core contributor to the Gemini 3.0 model, personally led the implementation of multi-step RL with tools and chain-of-thought, and deeply participated in Gemini 1.0, AI Mode, and Deep Research projects.
>Since 2023, the Qwen family has cumulatively open-sourced over 400 models, covering parameter sizes from 0.5B to 235B. It is hard to imagine that the Qwen team, which supports these model updates, consists of only about 100 people. Including other Tongyi Lab teams, the total number is in the hundreds.
>For comparison, ByteDance's Seed team responsible for foundational model training already has nearly 2,000 people. In all directions, Alibaba's absolute number of personnel is only a fraction of its competitors'. Many Qwen members told 36Kr that Qwen's compute and infrastructure construction have long lacked resources and support, hindering model iteration speed.
>>
>>108293028
do you use it with thinking/reasoning disabled?
>>
>>108293062
No.
As a side note, I noticed that glm uses a completely different and shorter reasoning style when running in claude code. I didn't check if qwen does something similar.
>>
ming-flash-omni.gguf?
>>
>>108293123
>I didn't check if qwen does something similar.
The few times I used it as a reasoner, it was rather inconsistent even in normal chats. Most of the time it will start with "Thinking Process:", but most is not all, and when it doesn't, pretty much anything goes. I also saw it start with an opener like "Here's the thinking process xxx:" that looked like the output you'd get if you told an LLM to generate a dataset of reasoner traces for you, so it seems their CoT data wasn't cleaned up well enough.
>>
Which CUDA version should I use with llama.cpp? The Digital Spaceport guide says to use an older one (12.8) for fewer headaches, but is it necessary?
>>
>>108293194
cuda and vk give me same performance
>>
>>108293201
What's a vk?
>>
File: mistral_logo_new.png (182 B, 294x294)
Stuff will appear here:
https://huggingface.co/mistral-labs

>Mistral Labs is an organization under Mistral AI. It will operate alongside the official Mistral AI Org to release checkpoints that may benefit the community.
>
>In contrast to the official Mistral AI Org, the checkpoints published on Mistral Labs are:
>
>- more experimental in nature
>- less rigorously tested
>- often contributed by community members or collaborators
>
>We hope these checkpoints will be useful to the community, but we cannot vouch for their correctness.
>>
>2026
>mistral
>>
>>108293284
I hope they can't vouch for their safety either
>>
>>108293151
>"Here's the thinking process xxx:"
I'm phoneposting right now but I'm pretty sure that the big qwens always do this.
>>
>>108293194
I'm compiling on windows with 13.1 with no issues
>>
>>108293284
>>- less rigorously tested
>often contributed by community members or collaborators
Davidtoons?
>>
Has anyone else tried quanting with the lcpp script + transformers 5 branch? It needs a small patch for Unicode strings but seems to work.
Does the resulting gguf break in subtle ways? It’s working multimodal in llama-server but I haven’t done extensive regression testing
>>
>>108293340
The API-only Mistral Small Creative was a "labs" model too.
>>
>>108293284
doubt they are gonna release anything interesting that could endanger their eu gibsmedats
>>
Are there models that extract text from images and translate it?
>>
>>108293394
qwen3.5
>>
Zed is unusable. Qwen-397B always messes up. opencode just werks.
>>
>>108293394
Realtime or offline?
>>
>>108293201
you made me curious so I made a vulkan build to see if the performance gap had really shrunk with cuda
the prompt processing for a really tiny prompt took so much more time than the cuda build, running 35BA3B partially cpu/gpu
token gen was only slightly slower, but that prompt processing duh
vulkan is still a cope for people who reject our lord NVIDIA
>>
>>108293422
Offline, in sillytavern.
>>108293396
I will try it out, thanks.
>>
>>108293423
I do notice CUDA holds up a bit more in my case, stable t/s, but it's almost the same; maybe at higher contexts VK slows down more.
>>
File: HClDIx0W0AEs8ul.png (26 KB, 775x371)
27B is up
>>
>>108293551
who gives a shit?
>>
>>108293562
i do
>>
>>108293551
bart btfto to the ever
>>
I don't know what the DS model they're hosting on their web interface is but it's smarter than a month ago
>>
>>108293562
What an odd thing to say in the local model general
there are infinite niches, and a given model could be the best fit for any number of them
>>
>>108292890
hi cudatard, I just wanna say I love you and thank you for sharing your gpu genius with us. it's always "what is johannes doing?" and never "how is johannes doing?". congrats on the huggingface merger. I know some people like to poopoo all over some of the sharp edges of llama.cpp, but it is a world-class project and the silent majority appreciates your work. I wish you health, wealth and happiness
>>
>>108293581
It's a new closed (for now; maybe they'll open it in the future.. MAYBE) experimental model with very long context handling that is truly competitive with Gemini. Since you're not averse to using their web interface, upload some large text file and watch it fly; it's unreal.
It's also not available as an API model yet, unfortunately.
I wouldn't be surprised if it was never released as open weights though. It has reached the "I would pay for this" bar for me, which is not something I would have said for any open-weight model before, and China isn't a charity: if they feel they have something worth money, they won't give it away for free.
>>
>>108293627
Yeah I fed it my code and expected "you're absolutely right" instead it shat on my code and made me depressed
>>
>>108293653
200IQ astroturfing campaign. Since elon browses this thread expect next grok to do that.
>>
>>108293677
Meds, NOW!
>>
File: w.gif (203 KB, 220x219)
>>108293677
>Since elon browses this thread
>>
>Nvidia has ended engineering support for Pascal and announced end of support at the end of 2028
>Pascal support already removed from latest cuDNN, tensorrt etc.
>ML libraries like pytorch have taken it as a green light and followed suit by removing Pascal support from pre-compiled packages
Well, at least Nvidia Pascal had a longer run than fucking AMD Polaris...
To fellow Pascal bros, here are the last versions of some python packages that still supported Pascal:
>nvidia-cudnn-cu12<9.11.0
>torch<2.8
>torchaudio<2.8
Also, dear datacenters and universities, you can dump V100 cheapies on the market now, pretty please :)
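If you want to sanity-check an install against the pins above, a trivial pure-Python helper (cutoffs copied from the list as exclusive upper bounds; naive dot-numeric parsing, no handling of rc/post/dev suffixes):

```python
# Check whether an installed package version is still in the
# Pascal-supporting range listed above. Cutoff table copied from the post.

PASCAL_CUTOFFS = {
    "nvidia-cudnn-cu12": (9, 11, 0),
    "torch": (2, 8),
    "torchaudio": (2, 8),
}

def parse_version(version):
    """'2.7.1' -> (2, 7, 1); tuples compare lexicographically."""
    return tuple(int(part) for part in version.split("."))

def supports_pascal(package, version):
    """True if this version predates the package's Pascal-dropping release."""
    cutoff = PASCAL_CUTOFFS.get(package)
    if cutoff is None:
        raise KeyError(f"no known Pascal cutoff for {package!r}")
    return parse_version(version) < cutoff
```

e.g. `supports_pascal("torch", "2.7.1")` is True, `supports_pascal("torch", "2.8.0")` is False.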
>>
File: 1746656217954440.png (149 KB, 1821x1016)
New 100% REAL AND TRUE model from the glorious land of china!

https://github.com/Yuan-lab-LLM/Yuan3.0-Ultra
>>
I am downloading minimax to coom. You. Yeah you. Expect to be called a baiting nigger in the next few hours when i confirm that it is worthless.
>>
>>108291835
It's still pretty easy to tell that the /ldg/ schizo is also petr*.
>>
>>108293624
Thanks, I appreciate it.
>>
>>108293843
stepfun is better unironically for gooning
>>
>>108293624
Gay
>>108293853
Gay for jart
>>
>>108293861
Stepfun is fun because it is not constrained by reasoning and logic (it is 12B-tier retarded)
>>
>>108293861
my experience with stepfun is that it's qwen-thinking levels of censored
>>
>>108293878
I did cunny with stepfun no probs tho, are you sure its not a skill issue?
>>
>>108293837
64K context? Loli-RAEP? 1T?!!
China really cooked with this one. Now throw it in the trash.
>>
>>108291225
How long until we get "streamable" models (the weights pass through your machine, but it's against the TOS to intercept them), or more likely subscription-based models that install locally so your machine does the heavy lifting but are DRMed to hell? Is it just a case of consumer hardware catching up to allow it?
>>
>>108293551
q4 seems to be the sweet spot; even unsloth says so in their explainer on how to run the models locally.
>>
guh-guff
>>
>>108293837
>https://huggingface.co/YuanLabAI/Yuan3.0-Ultra
>The model was pre-trained from scratch with 1515B parameters. Through the innovative Layer-Adaptive Expert Pruning (LAEP) algorithm, the parameter count was reduced to 1010B during pre-training, improving pre-training efficiency by 49%. The activated parameter count for Yuan3.0 Ultra is 68.8B
>715 GB fp16
???
>>
>>108293878
My experience is that it is the most uncensored model since Nemo. Mainly because it doesn't understand what is happening, so it can't refuse.
>>
>>108293837
It's trained on 2T tokens of enterprise scenario data
>>
>>108293837
1T model that won't even begin to compete with whatever DeepSeek is cooking. Who would run a cloud-hardware-level model that can only handle 64K context? Are they fucking serious?
China has a lot of grifter-level labs:
step, internlm, minimax
>>
>>108293915
>doesn't understand what is happening
haha.
at least you don't get the actual, stolen from gpt-oss CoT that minimax does
>>
>>108293925
>1T model that won't even begin to compete with whatever DeepSeek is cooking
You sound like you work there. Are you one of the sexual relief officers?
>>
>>108293904
lol this reads like a “guy jumping out the window and running away” image macro
>>
>>108293903
guh-guaufuhh
>>
>>108293973
guhgufuhhhh....
>>
>>108293714
So if I have a P40 stashed away does that mean it's going turn into dust now or in 2028?
>>
>>108293985
guh-fu-fu-fu-fu-fuh
>>
>>108293994
yes
>>
>>108293551
What the fuck is NL
>>
>>108293994
it will work as long as whatever AI thingie you're running doesn't require latest CUDA or library versions that stopped supporting Pascal.
Even then, some libraries like pytorch can still work with Pascal on their latest versions, but you have to compile them yourself to enable sm_61 support, it's just that their packaged pre-compiled versions are built without it.
Overall, expect more and more things requiring annoying chores like the above, and even further down the line expect things to not work at all due to core support just not being there (like driver 590, for example).
>>
>>108294067
more accurate but it runs at the speed of a q8
>>
>>108293904
>The innovative Layer-Adaptive Expert Pruning (LAEP) algorithm is a novel method developed specifically for pre-training Mixture-of-Experts (MoE) Large Language Models. It improves pre-training efficiency by 49% and reduces the total parameter count by 33% (from 1515B to 1010B).
The HF repo only has 85 out of 206 files. Check the modelscope, it has the additional batches with the rest of the files uploaded.
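Nothing beyond the name "Layer-Adaptive Expert Pruning" is public in the quote above, so the following is purely illustrative of what the name implies: rank each layer's experts by some importance score and keep a per-layer (hence "layer-adaptive") fraction. The scoring and fractions are made up for the sketch; the actual LAEP algorithm may work completely differently.

```python
# Illustrative-only sketch of per-layer expert pruning for an MoE model.
# "Importance" is a stand-in score; real methods might use routing
# frequency, gate weight norms, or activation statistics.

def prune_experts(layers, keep_fractions):
    """layers: per-layer lists of (expert_id, importance) pairs.
    keep_fractions: fraction of experts to keep in each layer."""
    pruned = []
    for experts, frac in zip(layers, keep_fractions):
        keep = max(1, int(len(experts) * frac))
        # keep the highest-importance experts in this layer
        ranked = sorted(experts, key=lambda e: e[1], reverse=True)
        pruned.append(sorted(eid for eid, _ in ranked[:keep]))
    return pruned
```

For scale: the quoted 1515B-to-1010B reduction corresponds to keep fractions averaging roughly two thirds across layers.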
>>
If you had 72vram and 96ram what would you use?
>>
>>108294140
ebay
>>
>>108294164
To sell everything?
>>
>>108294170
Actually, no. To get some high bandwidth ewaste to make better use of those GPUs
EPYC Rome, Threadripper or Xeon
>>
>>108294179
I'm not spending anymore money on this stuff.
>>
>>108294196
You do you
I personally find it makes my life significantly better and is worth the money to own instead of rent
Mostly code/automation/analysis/planning work
>>
>>108294214
I'm using glm 4.5 air. IQ4_XS so it all fits in the vram.
>>
>>108294087
Oh well. I remain bitter we didn't get any magic for t2i that would have made it relevant; guess it's gonna become a similar story to the vram that just sits there. Thanks for the pointers, saving in case good times don't come and Chinese GPUs don't save us.
>>
>>108294228
Comfy. That’s a good perf/$ spot
>>
File: file.png (115 KB, 743x516)
wow localllmao mod calling users retards
it seems like someone is at least aware of the problem
>>
>>108294087
>>108293714
>GTX 1080 Ti will be relevant because of 11 GB of VRAM even after end of support.
Grim timeline we live in. But it's understandable. The lack of RTX features and AV1 is going to hold it back in the future.
>>
File: 1750541841713622.png (56 KB, 220x233)
>>108294422
>nooo why so many updooterinooooo
I thought the mods on locallama were more based than that, that's a shame
>>
>>108294422
too little too late
gatekeeping has to be done early so that the retards don't feel welcome, stay, and encourage their fellow retards to join in on the fun
in a place like a popular subleddit that's already filled with retards, considering how those websites work (mass upvotes = voice heard), you are screwed.
>>
>>108294422
Expecting a 4B model to have good world knowledge is, in itself, scary stupid
>>
>>108294463
yeah, a human brain has 80b neurons and we're far from memorizing everything
>>
>>108291455
>v4
imagine distillation attack + engrams (https://www.arxiv.org/pdf/2601.07372)
That's what v4 is and it is not ready to be revealed to the world just yet
>>
>>108291455
>a graph that only displays Anthropic and Chatgpt
what about the others? lmao
>>
>>108294506
dude, the comparison of engram vs no engram in their experimental model shows so little difference, at least in the benchmarks, that I doubt engrams are the reason the model on their chat interface is good.
>>
File: 2026-03-04 195129.png (306 KB, 700x1048)
is this a chinese scam?
>>
>>108294663
I mean, I fucking hope a 1T parameter model works well
>>
>>108294663
An A69B model SHOULD be smarter than A32B/A40B ones.
>>
>>108294663
Never heard of this lab though
>>
>>108294506
>That's what v4 is and it is not ready to be revealed to the world just yet
Do you post from under the desk mid bj?
>>
>>108294663
us | others
>>
Bigger is not always better nor should it be
>>
>>108294663
dude, a fucking 1T model capped at 64K tokens context window
you couldn't get more dead on arrival than this
>>
should I grab a mi50? they're going around for 200 eurodollars
>>
>>108294734
>ayyyymd
nyo
>>
File: le mao.png (130 KB, 1164x614)
>>108294734
>>
>>108294758
finewine tho
>>
>>108294758
idc about this, I think cudadev was recently working on improvements for them, i'd be interested in some comparisons with ada and blackwell for pp/tg
>>
>>108294758
classic AMD, Polaris got only 4 years of support at best, 3 years if you bought an RX 590 at release.
>>
>>108294506
>imagine distllation attack + engrams
>That's what v4 is and it is not ready to be revealed to the world just yet
this is what x and linkedin do to a mf

There's no such thing as a distillation attack. All recent models use competitor models' responses, either directly for training or simply as a way to score and filter responses.

>>108294530
Wouldn't engrams mostly help with retrieval or long context stuff and generally improve efficiency? Or am I misunderstanding it?
>>
File: 1772630283919713.png (75 KB, 498x376)
>>108294422
>>108294463
still works as a bloom filter to reject queries
>>
>>108294826
better be sure you won't ever care during ownership about anything other than llamao.cpp tho
>>
> Ultimately, their results were inferior to the small models cleverly distilled by MiniMax, despite Qwen’s total burn rate (costs) being more than 10x higher.
lol qwen died because they didn't benchmaxx hard enough
https://x.com/seclink/status/2029119634696261824
>>
>>108294923
>cleverly distilled by MiniMax
ah yes, the cleverness of distilling the smaller 120B gpt-oss
reminds me of NVIDIA's nemotron, distilled from... Qwen 30BA3B, Qwen 14B and many other idiotic synth data sources
>>
the LLM field is looking more and more like the end of crypto, filled with the worst of humanity, the dumbest of retards and nothing but grifters
>>
>>108294871
If all the improvements this year are just chasing efficiency, then the music will slow and someone is going to be left holding the smelliest sack of excrement in capitalism. What shakes the market is evidence of broader capabilities that will fuel the next cycle of startups and capital investment. Like if carmack makes a bot that can pick up an obscure videogame and learn to play it without pre-training, you can say goodbye to most of these AI lab companies.
>>
>>108294965
ok so which one is bitcoin and which one can I run locally
>>
>>108294960
GTC is around the corner, perfect timing to see how they also distilled Claude for Nemotron Super/Max. Also what the hell they're doing to Groq and N1 CPUs.
>>
i'm running dolphin-llama:8b on a server pc of mine with a 1060 6gb and it runs surprisingly fast. however it's quite censored and outdated; its knowledge seems to end in 2023. would there be a newer, better llm i could run that would still work well on my old 1060?
>>
File: file.png (11 KB, 434x98)
>>108294422
lol who is this
>>
>>108294970
>carmack makes a
nothing
a whole lotta nothing
>>
>>108294986
read the thread and lurk moar
>>
>>108294960
minimax were put front and center by anthropic for massively distilling opus, yet the meme that they distilled toss still persists from the tiny amount they used 2 versions ago
>>
>>108295008
>the meme
it's not a meme because they actually did it
that they distilled claude later can never remove the stain that they were retarded enough to think distilling a micro moe like toss was a good idea (disregarding the coomer complaints about safety etc, this is not my focus here)
they are a lab staffed by subhumans
>>
>>108294422
reddit is just a trash pile of mostly automated bots
r/localllama is also flooded with "I created x project" posts of webUIs people created with claude in a prompt reply that took less than 500 milliseconds.
>>
Models are getting really good but they are still retarded because they don't generalize.

I am scared. If there is an algorithmic breakthrough in generalization, we will instantly have ASI. I expect it to still take a few years but the uncertainty of it all is spooky. The age of men could end any day.
>>
>>108295066
r/localllama has always been a 'fun' sub, but yes the level of discourse kind of degraded over time... it's nothing compared to the degradation that appeared in r/machinelearning though, unless they cleaned up recently, it went from pretty high level a few years ago to retarded
>>
>>108295021
it is a meme thoughever, it was clearly an amalgamation of several data sources and not a straight-up toss distillation if you actually used it. there were a few distinct "thinking voices" you could find in the model depending on your queries, most of which were not tosslike in the slightest. but since the average lmger's test of a model is "write a loli rape story lol" (or, more realistically, seeing a screenshot of someone else doing it) and making up their mind based on the result, of course this was missed
minimax is very distillation-heavy and I don't view them as an innovator or good research lab, but let's at least be accurate in our criticisms
>>
>>108295086
>it went from pretty high level a few years ago to retarded
it's always like that: at the beginning the community is niche and only has big enthusiasts, then it becomes mainstream and the normies ruin everything, many such cases
>>
>>108295116
calm your autism charlie, I never said it was /only/ a distillation of toss and I compared what they did to what NVIDIA did, which is very similar
https://huggingface.co/datasets/nvidia/Nemotron-CC-v2
>synthetic rephrasing using Qwen3-30B-A3B
>STEM data was expanded from high-quality math and science seeds using multi-iteration generation with Qwen3 and DeepSeek models
>billions of tokens generated using DeepSeek-V3 and Qwen3 for logical, analytical, and reading comprehension questions
>This dataset contains synthetic data created using the following models:
>DeepSeek-R1, DeepSeek-R1-0528, DeepSeek-R1-Distill-Qwen-32B, DeepSeek-V3, DeepSeek-V3-0324, Mistral-Nemo-12B-Instruct, Mixtral 8x22B, Mixtral-8x22B-v0.1, Nemotron-4-340B-Instruct, Qwen2.5-32B-Instruct, Qwen2.5-72B-Instruct, Qwen-2.5-7B-Math-Instruct, Qwen2.5-0.5B-instruct, Qwen2.5-32B-Instruct, Qwen2.5-72B-Instruct, Qwen2.5-Coder-32B-Instruct, Qwen2.5-Math-72B, Qwen3-235B-A22B, Qwen3-30B-A3B
anyone who actually considers making a model in such a fashion should absolutely KYS, immediately, right now, just fucking do it
>>
Is engram actually going to do anything meaningful?
>>
>>108295177
>going to
We'll see when we get a model with engrams.
Speculators get the bullet first.
>>
>>108295085
i don't think so... oh wait fuck.
https://www.youtube.com/watch?v=mUmlv814aJo
>>
>>108295156
holy synthetic
>>
i just want my goonbot to work and fuck me :(
>>
>>108295202
one of those models used to make synth data is this:
>Qwen2.5-0.5B-instruct
they can't possibly have listed this shameful thing if they didn't use it for real, so they did
now riddle me this, you have access to a large farm of nvidia gpus
mermet, my son
will you pick 0.5B qweenie, or will you choose to tell altman he gets a discount if he gives you some nice GPT API usage kickback for your GPUs
>>
>>108295156
>>108295021
>>108294960
I agree.
>>
>>108295230
They might have trained it as a lightweight metric to evaluate the other models' answers?
>>
File: nemo.png (44 KB, 886x489)
>>108295251
they specifically word it as the models that created the dataset, and even as a classifier/ranker/RM or whatever else, I think 0.5B really counts as too cheap for the corpo that benefits the most from AI bucks.
also, pic related, one of the many datasets from that link has a majority of its synth data coming from Nemo 12B
it's hard to give them any benefit of the doubt here because stupidity is involved in every single decision they made
>>
What are Engrams anyways
>>
>>108290857
https://www.youtube.com/watch?v=uWLt81SgM78
https://www.youtube.com/watch?v=uWLt81SgM78
https://www.youtube.com/watch?v=uWLt81SgM78
>>
>>108295315
Signs of the mandate of heaven of course.
>>
>>108295315
https://arxiv.org/pdf/2601.07372
>>
>>108295312
Speculative decoding for a Qwen2.5-32B-Instruct or Qwen2.5-72B-Instruct, idk man, just throwing buzzwords out there. But I can't see how the output of 0.5B would be useful either, other than as a metric, to gain efficiency for the use of other models, or as something to compare other results against to tell the model what not to do.
>>
>>108295085
dario said in his dwarkesh interview that he's betting on a generalization moment in RL within the next couple years
>>
>>108295415
>dario said
>>
File: 1748314828217088.png (5 KB, 608x26)
>>
>>108295431
no_fucking_shit_iq1_xxs.gguf
So you've been posting for a while now. What is it they're trying to do? Or just generic agent shit?
>>
>>108294871
>>108294530
MMLU is a knowledge retrieval benchmark and Engrams gave an improvement, so there's no surprise here. However, Engrams led to a bigger improvement on reasoning tasks, suggesting the model is taking advantage of the freed-up capacity
>>
2x faster than vLLM
>https://x.com/tanishqkumar07/status/2029251146196631872
>https://xcancel.com/tanishqkumar07/status/2029251146196631872
>https://arxiv.org/pdf/2603.03251
>>
>>108295483
>>108292842
>>
>>108295430
his word is more important than that of almost any other individual
>>
File: 1759311485666078.jpg (74 KB, 640x800)
>>108295609
>>
>>108290857
what's the current meta for 128GB ram 24GB vram?
>>
>>108295620
yea another thing. altman also thinks 27/28 for superintelligence is likely.
>>
Any models I can run on a 5080 without them being retarded? Fine for code but for anything else they are just brain damaged.
>>
>>108295628
Altman says that because he's engaging in mythical levels of investor fraud and needs to squeeze more shekels before everything pops
>>
>>108295634
Qwen 3.5 27B
>>
>>108295638
The alternative explanation is that progress is real and people on the inside of the biggest AI companies are honestly recognizing that.
>>
>>108295609
Whether he's good or not at what he does, from a business perspective he has no incentive to be honest. Assuming he's not a sociopath, he has the incentive to be honest that most of us have, but his finances benefit a lot from investors thinking that the things he currently happens to be saying will indeed happen. So he has incentives to say what he is currently saying that are potentially greater at the moment than being honest. Maybe the two align, maybe they don't, and people are just taking that into consideration.
>>
>>108295651
>progress is real
There has been no progress in the past 2 years.
>>
>>108295654
>and people are just taking that into consideration.
no, people are just reflexively/kneejerk calling people in the industry shills. it's not healthy skepticism.
>>
File: anon.jpg (150 KB, 1152x896)
>>108295415
>>108295609
>>108295651
>>
>>108295667
I look like this and say this
>>
>>108295625
GLM.
>>
I encountered something interesting during my use of web search with Open WebUI. It encountered a Chinese web page, and when looking at the fetch results in the UI, it shows garbled encoding. But the model acted as if it understood it. So is it that the UI simply just used the wrong encoding for display, or is the model actually able to understand text that has been encoded incorrectly? Well, I followed up with that question to the model, and it does see the garbled characters. So it really does just know how to read it. Interesting little fact I didn't know about and it makes sense that models should be able to do this if their datasets weren't filtered to oblivion. Though there is a question of exactly how accurate its reading of the mojibake is, but I'm too lazy to go and do tests.
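For anyone curious, the effect is easy to reproduce: mojibake from a wrong decode is usually lossless, so the original byte sequence the model may have learned from is still fully present in the garbled text. A minimal sketch:

```python
# UTF-8 bytes misread as Latin-1 produce classic mojibake, but no information
# is lost: Latin-1 maps every byte value 0x00-0xFF, so the mis-decode is
# perfectly reversible.
original = "你好，世界"                       # some Chinese text
mojibake = original.encode("utf-8").decode("latin-1")
print(mojibake)                               # garbled, e.g. 'ä½ å¥½...'
# Reversing the mis-decode recovers the exact original text.
recovered = mojibake.encode("latin-1").decode("utf-8")
assert recovered == original
```

So a model that has seen plenty of mis-encoded web text could plausibly learn that byte-level mapping; how accurately a given model reads a specific piece of mojibake is a separate question.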
>>
>>108295651
Good thing that alternative is not the case
>>
>>108295703
you probably just need to have those fonts installed
>>
>>108295651
The most Indian post on /g/ this year
>>
>>108293903
ge-goof
>>
>>108295806
爺ガフ
>>
It's just juff retards. Like in Georgi.
>>
>>108295884
gee-juff
>>
local llm newfag here. started messing with LM studio, it's pretty neat. how are you guys integrating local llms into your workflow? Anything besides VSCode+Continue I should be looking at? The absolute largest coding model I seem to be able to run is qwen3 coder 30b 8bit.
>>
>>108295899
I keep hearing workflow, what does that mean
you mentioned VSC so is it IDE integration?
>>
>>108295899
>Anything besides VSCode+Continue I should be looking at?
https://github.com/zgsm-ai/costrict
>>
Qwen3.5-397B-A17B (q8 with official sampler settings, thinking enabled) failed at answering questions about designing a chinchilla playpen. It generates suggestions I know are bad, for instance using materials that are unsafe. If I ask directly about those materials it will say not to use them, but if I don't bring it up it suggests them. I might make this one of my personal benchmarks that I won't hide. I don't mind if this ends up being benchmaxed on, because it means LLMs will give better chinchilla advice.
>>
>>108295899
Workflow is sillytavern + pick card with youngest looking girl on the picture + say "aah aah mistress" and occasionally ask them to hold lots of watermelons
>>
>>108295937
make chinchilla playpen from asbestos roof sheets today
>>
File: 1741416147969686.png (510 KB, 928x508)
https://arxiv.org/abs/2512.01797
>They solved AI hallucinations
>>
Fresh bake
>>108295959
>>108295959
>>108295959
>>108295959
>>108295959
>>108295959
>>
>>108295940
i toss the watermelons through the driver window of the car we're driving to the car wash that is only 50 feet away
>>
>>108295972
Not this time, faggot. I’m not going anywhere
>>
>>108295969
>controlled interventions reveal that these neurons are causally linked to over-compliance behaviors
>>
>>108295899
Word to the wise: the best workflow at this point in the tech is direct interaction and careful context management. Current automation is all wasteful technical debt generation that eventually bloats and topples over. iykyk
>>
>>108295651
two more T synth tokens
>>
>>108295996
for real as much as i love to vibe code like some sort of retarded faggot, i'd rather have the LLM provide me with the output and look over it manually even if i am severely retarded and dont fully understand what im looking at. at least when i feel like the AI is wrong I can ask it questions and have it provide me with said reasoning why I am more retarded than a nigger.
>>
>>108295999
You technically can, but you don’t want to. Passing around the hidden state would make it ultra painful.
If you weren’t want the ability to run models, no matter how slow, look at ssd/nvme backed ram disks.
It’s still play-by-mail slow, but better than what you’re thinking
>>
>>108296044
I mean these are all nvme anyway but I'm guessing that's not what you're saying here
I can live with shitty performance, not expecting much out of these tbh
it's more for the novelty and to show off to family
(reposted question in new thread )
>>
>>108296072
Short answer: you need a shared “backplane” for everything to stay in sync, and if that’s a slow medium like Ethernet or wifi you’re going to have a VERY bad time. At least an nvme has a speedy 4x PCIe path to the CPU doing the matrix multiplications. That’s assuming your GPU can’t hold your target model (e.g. a 500gb+ frontier model)
>>
>>108295969
The abstract reads like technobabble.
>>
>>108295920
thanks anon I'll check it out

>>108295996
for sure, I mean more along the lines of: are you just copying code blocks to a terminal chat each time, or is there an integration you like that hooks into a repository, or something else?
>>
>>108296144
NTA but I use claude code and my context management is just referencing every file I know it needs to complete the task along with an example it should follow if applicable.
>>
>>108296144
Copying code to/from a terminal chat is too slow and cumbersome. Manual shit like that is best left for when the bots fail and you need to either implement or debug something yourself manually and you just need some targeted changes. You can save a lot of time by using something like Codex, OpenCode, Cline, etc and seeing how far they can get on their own.
>>
>>108296072
>>108295999
Anything slower than ram is not worth using and even ram is barely tolerable.
It's enough for "the novelty and to show off to family" though.
>>
>>108296144
>>108296207
Early context is golden. If you let laziness squander it your results become progressively more garbage.
Judicious use, unifying code, re-editing an earlier message with the “right” code after a lengthy yak-shaving session and deleting all the conversation around it…all adds up to being able to do more sophisticated things with the same models vs a naive approach or brute-force automation
>>
>>108296410
> re-editing an earlier message with the “right” code
At that point why use an LLM? Sounds like too much work for what is supposed to be doing the writing for you.
>>
How do I actually run a .safetensors model? There's a model I want to try out and it's so unknown that nobody has made a gguf of it and I can't find anything about it on Google.
>>
>>108296410
What kind of harness are you using where you need to edit earlier code instead of just clearing the session and adding new versions of files?
>>
>>108296457
use something like vllm
>>
>>108296457
~/llama.cpp$ python convert_hf_to_gguf.py folder/containing/safetensors/and/config/files --outtype q8_0
>>
>>108296457
There’s a guide in the op if you want to give it a go. Likely support for that model architecture isn’t added to lcpp tho
>>
>>108296464
>>108296489
Thanks
>>
>>108296507
Ah, I didn't consider that. So these Yuan3.0 models aren't usable?
>>
>>108296437
>>108296462
I find you can make maximally complex things by essentially rewinding time and getting the LLM to LARP that it made ideal decisions with perfect information through the whole session. Deleting blind alleys is good. Keeping solid reasoning is also good for future performance.
I use ooba, but any front end with delete/branch/edit support would be fine for my workflow
>>
>>108296410
>after a lengthy yak-shaving session and deleting all the conversation around it
Why are you having discussions with the model in an agentic harness? You should know exactly what needs to be done beforehand and only leave the implementation details to be automated.
>>
>>108296536
That's an interesting idea and you could probably automate the larp by giving the context to another model.
>>
>>108296527
Sounds like you’re gonna let us know that : )
>>
>>108296527
>>108296590
https://github.com/ggml-org/llama.cpp/issues/19342

You only need a 5090 to run it with transformers though https://huggingface.co/YuanLabAI/Yuan3.0-Flash-4bit
>>
>>108296541
Because I’m not using an “agentic harness”. I find the results are relatively garbage (technical debt generators) and I care about good quality and long-term maintainability (along with maximizing the complexity of the task they can handle)
I see those harnesses and get visions of “history of flight” videos with crazy hoppy screw copters and people with wings strapped to their arms.
I’m sure it’s coming, but right now it just looks like idiocy to me.
>>
>>108296628
Did you try Zed?
Text threads and inline editing might be of interest to you. Of course they kind of deprecated that functionality in favor of "agentic threads" but it's still there.
>>
>>108296628
You can get high quality results, caveat being that you have to put more effort into the set up than a simple chat with a "You are expert SWE" sysprompt but still seems like less effort than what you are doing manually now each time.
You need to curate AGENTS.md, system prompt, memory files, etc. Put in your coding standards and update every time you see it making mistakes. Automated code reviews, and manual code reviews, on top of monitoring them as they work. The code we get now is better and has less technical debt than what our junior and mid-level devs were merging in a couple years ago.
>>
>>108296437
This is the problem I'm running into. The juice isn't worth the squeeze for anything bigger than "write a function that does X"
>>
>>108296644
I like the idea of zed, but prefer my airgapped llm inference stack going through nginx for interactions so I can guarantee no information leakage to the internet by any part. Zed seems trustworthy for now but who knows. A thick client is a bit harder to wrangle and it doesn’t look “better” enough to be worth the effort.
llama-cli/ooba and vi are my preferred toolset until something an order of magnitude better comes out
>>
>>108296694
I’d like to try it at some point once the tooling settles and gets less janky.
I feel like there’s still headroom on my current workflow and I’m learning a lot and having fun, which are big motivators for me.
Thanks for the rundown. I’m a bit more interested than I was.
>>
>>108296739
They can make bigger changes as long as you're able to put them into words.
Anything left unspecified likely won't be good even if it works.
>>
>>108296788
(different anon) I've found that as well and the play I'll try next is to give a general description and ask the LLM to turn that into a comprehensive and detailed specification, which I will then edit and give to the LLM. I'll report if that's actually worth anything.
>>
moonshota ai
>>
>>108296808
>moonshota.i
>moonlol.i
>>
>>108296800
Why is lecunny talking about cat-like intelligence when cats can't write specifications?
>>
>>108293837
This one is for me
https://huggingface.co/YuanLabAI/Yuan3.0-Flash
>>
>>108296800
Sounds like you basically just want a prompt enhancer like https://www.promptcowboy.ai/
>>
>>108291659
Skill issue

No really, you're prompting it wrong. Never argue with or berate an AI agent. Once you start doing that, you have changed the genre of conversation from "helpful assistant doing good work" to "AI assistant makes mistakes and gets yelled at". It then becomes statistically more probable that the AI makes further mistakes so you can yell at it more.

Furthermore (this is a distinct effect from the first one), most LLMs have been RLHF'ed on a bunch of normie conversation preference data and so they care a lot about managing the user's emotions. Once you start expressing anger at an LLM, it enters "customer service" mode where the primary concern is making sure the user feels like they've been listened to. Actually getting further work done is at best the secondary goal once you enter that state.

TL;DR: Never yell at a clanker if you want them to do useful work.
>>
File: file.png (7 KB, 384x96)
>>108297123
It doesn't work so I can't tell you why it's shit
>>
>Verify-after-edit boosts Qwen3.5 35B-A3B performance in SWEbench-verified Hard from 22.2% to 37.8%. For comparison Opus 4.6 scores a 40%.
>The "verify-on-edit" strategy is dead simple — after every successful file_edit, I inject a user message like:
>"You just edited X. Before moving on, verify the change is correct: write a short inline python -c or a /tmp test script that exercises the changed code path, run it with bash, and confirm the output is as expected."
Has anyone tried a workflow like this? Does it work? Could it be that the cloud models do something like this themselves?

The original is from reddit: https://old.reddit.com/r/LocalLLaMA/comments/1rkdlqi/qwen3535ba3b_hits_378_on_swebench_verified_hard/
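The mechanics are trivial to bolt onto any agent loop. A hypothetical sketch (the injected message text is from the post; the loop structure and names like `messages` and `tool_call` are illustrative, not from any specific harness):

```python
# "Verify-on-edit": after every successful file_edit tool call, append a
# synthetic user turn asking the model to check its own change before moving on.
VERIFY_TEMPLATE = (
    "You just edited {path}. Before moving on, verify the change is correct: "
    "write a short inline python -c or a /tmp test script that exercises the "
    "changed code path, run it with bash, and confirm the output is as expected."
)

def inject_verification(messages, tool_call):
    """Append the verification nudge when a file_edit tool call succeeded."""
    if tool_call.get("name") == "file_edit" and tool_call.get("ok"):
        messages.append({
            "role": "user",
            "content": VERIFY_TEMPLATE.format(path=tool_call["args"]["path"]),
        })
    return messages

# Example: a successful edit triggers the injected user turn; other tools don't.
history = [{"role": "assistant", "content": "Edited src/parser.py"}]
inject_verification(history, {"name": "file_edit", "ok": True,
                              "args": {"path": "src/parser.py"}})
```

The interesting part is that the nudge only fires on successful edits, so it costs nothing on reads and failed calls.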
>>
>>108297248
The little I experimented with that kind of thing, the model just ended up coming up with unnecessary shit or straight-up hallucinating when the original result was already good enough.
But that was a good while ago; maybe newer models, or just these qwen models, get a good boost out of it.
>>
Page 9…someone bake a real thread!
>>
>>108297281
I wonder if a large context might screw things up with it. I.e., if you had the verification request done with an empty context, would it do better?
>>
>>108297470
All other things being equal, if the LLM doesn’t need any of the existing context then a new chat would be superior.
I’ll often get a fresh session to do some critique of the work
>>
>>108297343
mikumikuanon should bake a mikumiku bread!
I'd bake one but I will mess something up and you will all laugh at me and... and... :(
>>
>>108297634
You can do it anon... I believe in you

btw I will come to your house and rape you if you mess it up
>>
Miku anon dead it's over
>>
>>108297185
>TL;DR: Never yell at a clanker if you want them to do useful work.

I am probably wasting tokens but I talk to it the same way I speak to subordinates at work.
>please
>thank you
>you did a great job with X but would you please try Y and Z.
>What do they call it, the compliment sandwich with the critique in the center
and so forth and so on
but I hate the word clanker. It does not roll off the tongue like a real word.



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.