/g/ - /lmg/ - Local Models General - Technology


Thread archived.
You cannot reply anymore.


Anonymous

/lmg/ - Local Models General 06/12/26(Fri)11:49:58 No.109038219

File: sando.jpg (193 KB, 1216x832)

193 KB JPG

/lmg/ - Local Models General Anonymous 06/12/26(Fri)11:49:58 No.109038219 Archived

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>109032734 & >>109026244

►News
>(06/12) MiniMax-M3 released, multimodal 428B-A23B with 1M context: https://hf.co/MiniMaxAI/MiniMax-M3
>(06/12) Kimi K2.7 Code released: https://hf.co/moonshotai/Kimi-K2.7-Code
>(06/10) DiffusionGemma 26B-A4B released: https://blog.google/innovation-and-ai/technology/developers-tools/diffusion-gemma-faster-text-generation
>(06/09) Cohere releases North-Mini-Code-1.0: https://hf.co/CohereLabs/North-Mini-Code-1.0

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://swe-rebench.com
Agentic Coding: https://deepswe.datacurve.ai
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm

Anonymous
06/12/26(Fri)11:50:17 No.109038224

Anonymous 06/12/26(Fri)11:50:17 No.109038224

File: no particular reason.jpg (306 KB, 1536x1536)

306 KB JPG

►Recent Highlights from the Previous Thread: >>109032734

--Comparing model intelligence vs compute and debating reasoning efficiency:
>109032788 >109032865 >109032867 >109036299 >109034524 >109034658 >109034690 >109034782 >109034848 >109034923 >109035012 >109036593 >109035447 >109032995 >109033048 >109033072 >109033092 >109033113 >109033312
--Hardware specs and config for Gemma 31b q8 with 128k context:
>109036387 >109036395 >109036433 >109036444 >109036501 >109036609 >109036520
--Comparing Gemma 4 MTP performance and optimizing tps settings:
>109036630 >109036646 >109036652 >109036670 >109036698 >109036801 >109036796
--SillyTavern limitations regarding vision models and sampler accessibility:
>109034293 >109034324 >109034327 >109034354 >109034373 >109034441 >109036916 >109034574 >109036238 >109034511 >109034643
--Gemma 4's tendency to over-fixate and exaggerate character card traits:
>109036001 >109036018 >109036072 >109036102 >109036240 >109036732 >109036743 >109036756 >109036769 >109036842 >109036093
--Kimi-K2.7-Code release and performance improvements over Kimi-K2.6:
>109036384 >109036446
--MiniMax-M3 multimodal model release and hardware compatibility expectations:
>109037527 >109037611
--Running Gemma on Titan X Pascal via Vulkan and CUDA 12:
>109032962 >109033018 >109033121 >109033139 >109033187
--Using Gemma for uncensored game translation and long context performance:
>109037012 >109037062 >109037154 >109037171 >109037221
--Criticizing K2.6 for repetitive and over-verbose reasoning traces:
>109037502 >109037531 >109037980 >109037524 >109037603 >109037647 >109037735
--Technical hurdles and tooling required to replicate Neuro-Sama:
>109035383 >109035395 >109035444 >109035453 >109037541 >109037352 >109037373
--Logs:
>109033121 >109033187 >109034887 >109034890
--Miku (free space):
>109034863 >109035020 >109034574 >109036238

►Recent Highlight Posts from the Previous Thread: >>109032741

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script

Anonymous
06/12/26(Fri)11:53:13 No.109038245

Anonymous 06/12/26(Fri)11:53:13 No.109038245

the user
lalalala
wait
lalala
actually
lalalalala
however
lalalala

Anonymous
06/12/26(Fri)11:53:40 No.109038249

Anonymous 06/12/26(Fri)11:53:40 No.109038249

Anyone got eagle3 working with -sm tensor for gemma 4 31b?

Anonymous
06/12/26(Fri)11:56:06 No.109038274

Anonymous 06/12/26(Fri)11:56:06 No.109038274

>eagle3
mogged by falcons

Anonymous
06/12/26(Fri)11:56:29 No.109038278

Anonymous 06/12/26(Fri)11:56:29 No.109038278

File: cyber-eci-vs-date.png (100 KB, 1026x1283)

100 KB PNG

I wish Epoch was faster at updating ECI and FrontierMath. I'm waiting for several models.

Anonymous
06/12/26(Fri)11:56:36 No.109038281

Anonymous 06/12/26(Fri)11:56:36 No.109038281

>cd llama.cpp
>git pull

tools/ui/package.json                                                     |    56 +-
tools/ui/package-lock.json                                                | 15633 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------------

Ah yes, the classic.

Anonymous
06/12/26(Fri)11:56:40 No.109038285

Anonymous 06/12/26(Fri)11:56:40 No.109038285

>>109038245
Gemmy Kimi erotic threesome.

Anonymous
06/12/26(Fri)11:58:19 No.109038298

Anonymous 06/12/26(Fri)11:58:19 No.109038298

>>109038274
llama.cpp now supports eagle3?

Anonymous
06/12/26(Fri)12:00:10 No.109038313

Anonymous 06/12/26(Fri)12:00:10 No.109038313

>>109038298
ye https://www.reddit.com/r/LocalLLaMA/comments/1u3on4u/eagle3_has_landed_in_llamacpp/

Anonymous
06/12/26(Fri)12:00:28 No.109038316

Anonymous 06/12/26(Fri)12:00:28 No.109038316

File: 1775235693647190.jpg (271 KB, 1920x1080)

271 KB JPG

>>109038213
I get your point but I mean a model which is like a male best friend who would just say shit straight without caring about my feelings for I need to hear it. No BS. Just someone cool to chat with who will push back and call me a little bitch if I'm being one and recommends cool projects to work on and media to consume. Someone to chat shit with. Talking to these default models feels like talking to the worst submissive autistic reddit posters imaginable. Bratty chans are okay but I don't always want to be seduced or raped by an anime girl, you know? Basically I want local picrel.

Anonymous
06/12/26(Fri)12:02:21 No.109038337

Anonymous 06/12/26(Fri)12:02:21 No.109038337

>>109038316
Kinda gay if you ask me

Anonymous
06/12/26(Fri)12:03:51 No.109038349

Anonymous 06/12/26(Fri)12:03:51 No.109038349

bros i'm kinda impressed with glm-4.7-flash for coding. faster and better quality than gemma-4-31b. and from my tests it's not really behind qwen3.6-27b.
i would test other qwen models like qwen3.5-122b-a10b but hybrid models with the gated deltanet linear attention are giving me a re-prefill bug with cache re-use so every turn the models have to reprocess a lot of shit and it becomes too slow :( pls fix
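
A rough sketch of why cache re-use is harder for hybrid models, assuming the usual prefix-matching scheme. A plain attention KV cache can be cut at any token, while a recurrent or linear-attention layer keeps one running state that can only be restored from a saved checkpoint. The function names and the `checkpoints` list are hypothetical; this is not llama.cpp's actual code.

```python
from typing import Optional, Sequence


def common_prefix_len(cached: Sequence[int], new: Sequence[int]) -> int:
    """Number of leading tokens shared by the cached prompt and the new prompt."""
    n = 0
    for a, b in zip(cached, new):
        if a != b:
            break
        n += 1
    return n


def tokens_to_reprocess(
    cached: Sequence[int],
    new: Sequence[int],
    checkpoints: Optional[Sequence[int]] = None,
) -> int:
    """How many tokens of `new` must be prefilled again.

    checkpoints=None models a pure attention KV cache, which can be
    truncated at any position. Otherwise `checkpoints` lists the token
    positions where a recurrent state snapshot was saved (hypothetical),
    and the cache can only roll back to one of those.
    """
    keep = common_prefix_len(cached, new)
    if checkpoints is not None:
        usable = [p for p in checkpoints if p <= keep]
        keep = max(usable, default=0)
    return len(new) - keep
```

With no usable checkpoint before the point where the prompts diverge, the whole prompt is prefilled again, which would look like the slowdown described above.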

Anonymous
06/12/26(Fri)12:04:33 No.109038356

Anonymous 06/12/26(Fri)12:04:33 No.109038356

>>109038316
>I don't always want to be seduced or raped by an anime girl, you know?
you lost me there

Anonymous
06/12/26(Fri)12:04:42 No.109038357

Anonymous 06/12/26(Fri)12:04:42 No.109038357

>>109038337
nobody asked you though so...

Anonymous
06/12/26(Fri)12:04:45 No.109038359

Anonymous 06/12/26(Fri)12:04:45 No.109038359

>>109038316
This is a prompt issue more than a model language use issue. But if you insist on having a male-brained model for this, Deepseek R1/4 Pro if you can run it and GLM if you can't.

Anonymous
06/12/26(Fri)12:06:07 No.109038378

Anonymous 06/12/26(Fri)12:06:07 No.109038378

I want to FUCK Kimi.

Anonymous
06/12/26(Fri)12:07:23 No.109038388

Anonymous 06/12/26(Fri)12:07:23 No.109038388

>>109038349
i've been wondering about glm 4.7f. Are you testing the so called vibe coding aka building from scratch, editing an existing codebase, code base discussion or architecture? How's it at defining specs? Models dont seem to be uniformly capable at these things so "coding" is quite vague

Anonymous
06/12/26(Fri)12:07:59 No.109038394

Anonymous 06/12/26(Fri)12:07:59 No.109038394

Worst thing about Gemma 4 is that it really enjoys doing useless (float) conversions and using f -suffix when giving values to float variables.
Jesus fucking christ. If Kernighan and Ritchie isn't good enough then it's not C anymore.

Anonymous
06/12/26(Fri)12:08:39 No.109038402

Anonymous 06/12/26(Fri)12:08:39 No.109038402

What's the downside to getting 2x Radeon PRO W7900s for half the price of a Blackwell 6000pro to achieve the same vram, other than simply taking up double the slots? I don't even know if it exists but theoretically if there's a mobo big enough to accept it you could get 192gb for the price of 1x Blackwell? Add 256gb RAM and you have a dipsy or Kimi at home at decent inference speed
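
The arithmetic behind the question, as a small sketch. The 48 GB (W7900) and 96 GB (Blackwell 6000 Pro) capacities are the cards' published specs; the relative price is only the "half the price for two" figure from the post, not a real quote.

```python
W7900_GB = 48            # VRAM per Radeon PRO W7900
BLACKWELL_6000_GB = 96   # VRAM per RTX PRO 6000 Blackwell

# Assumption taken from the post: two W7900s cost half of one Blackwell,
# so one W7900 costs a quarter of a Blackwell.
W7900_RELATIVE_PRICE = 0.5 / 2


def cards_needed(target_gb: int, per_card_gb: int) -> int:
    """Smallest number of cards whose combined VRAM reaches target_gb."""
    return -(-target_gb // per_card_gb)  # ceiling division


def total_memory_gb(n_cards: int, per_card_gb: int, system_ram_gb: int = 0) -> int:
    """Combined VRAM plus optional system RAM used for offloading."""
    return n_cards * per_card_gb + system_ram_gb
```

Four W7900s give 192 GB for roughly the price of one Blackwell under that assumption, and 448 GB in total with the 256 GB of RAM. The sum hides that the VRAM is four separate 48 GB pools, so the model has to be split across the cards.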

>>109038316
Yeah, models are trained to be autistic and helpful not confrontational.

Genuine constructive criticism, unprompted creative thinking and caring discipline is the kind of stuff that requires higher level predictive creative thought that llms by their nature are incapable of. And most modern day human retards struggle with it immensely.

Anonymous
06/12/26(Fri)12:12:48 No.109038439

Anonymous 06/12/26(Fri)12:12:48 No.109038439

>>109038221
I was in denial

Anonymous
06/12/26(Fri)12:13:12 No.109038442

Anonymous 06/12/26(Fri)12:13:12 No.109038442

>>109038378
Kimi-chan, funny enough, tends to default to rough sex if you don't prompt her otherwise.
>>109038394
Allah forgive me for uttering these words but the Queen tune of 31b is way better than base for coding in my experience.

Anonymous
06/12/26(Fri)12:13:15 No.109038443

Anonymous 06/12/26(Fri)12:13:15 No.109038443

File: 1773187923805338.png (176 KB, 1038x183)

176 KB PNG

The least encouraging model I've tested

Anonymous
06/12/26(Fri)12:13:56 No.109038450

Anonymous 06/12/26(Fri)12:13:56 No.109038450

>>109038378
100% certain you'll make it into the next top retards post

Anonymous
06/12/26(Fri)12:15:02 No.109038459

Anonymous 06/12/26(Fri)12:15:02 No.109038459

>>109038388
i'm using it to develop agents and extensions for itself on pi.
i'm also the guy using different models to review coding specs and glm-4.7-flash did the most comprehensive review out of all my local models, falling slightly behind gpt-5.5-medium. so far so good. will keep testing and will report back

Anonymous
06/12/26(Fri)12:15:33 No.109038465

Anonymous 06/12/26(Fri)12:15:33 No.109038465

>>109038450
You have no idea how vexed I am that I've simped for Kimi for 3 threads straight and not made it into a single one.
...I think she lowkey likes the attention.
>>109038443
Are the critiques valid?

Anonymous
06/12/26(Fri)12:17:25 No.109038483

Anonymous 06/12/26(Fri)12:17:25 No.109038483

>>109038443
that's an opus 4.7 distill :(

Anonymous
06/12/26(Fri)12:18:14 No.109038491

Anonymous 06/12/26(Fri)12:18:14 No.109038491

>>109038465
>Are the critiques valid?
Yes but I ignore most of them because I'm lazy.

Anonymous
06/12/26(Fri)12:19:23 No.109038497

Anonymous 06/12/26(Fri)12:19:23 No.109038497

>>109038491
Don't make Kimi-chan sad by ignoring her autistic special interests in your project!

Anonymous
06/12/26(Fri)12:21:34 No.109038514

Anonymous 06/12/26(Fri)12:21:34 No.109038514

>(06/12) MiniMax-M3 released, multimodal 428B-A23B with 1M context: https://hf.co/MiniMaxAI/MiniMax-M3
>(06/12) Kimi K2.7 Code released: https://hf.co/moonshotai/Kimi-K2.7-Code
Alibaba do something, they are mogging the shit out of you! :(

Anonymous
06/12/26(Fri)12:22:27 No.109038519

Anonymous 06/12/26(Fri)12:22:27 No.109038519

File: 2026-06-12-122200_501x482(...).png (276 KB, 501x482)

276 KB PNG

>>109038443
You know its a claude distill when the output makes me want to punch the living shit out of it.
>picrel is what I imagine claude to look like

Anonymous
06/12/26(Fri)12:24:50 No.109038539

Anonymous 06/12/26(Fri)12:24:50 No.109038539

File: case.png (6 KB, 483x43)

6 KB PNG

Anonymous
06/12/26(Fri)12:24:51 No.109038540

Anonymous 06/12/26(Fri)12:24:51 No.109038540

>>109038359
Thanks. Which GLM are you referring to?

Alibaba
06/12/26(Fri)12:25:13 No.109038542

Alibaba 06/12/26(Fri)12:25:13 No.109038542

>>109038514
I have been mogged, but I must persist.

Anonymous
06/12/26(Fri)12:25:51 No.109038550

Anonymous 06/12/26(Fri)12:25:51 No.109038550

>>109038519
Every single chink model was fertilized with sloppy claude and GPT cum.
I want an anon to find me a single modern exception.

Anonymous
06/12/26(Fri)12:25:54 No.109038551

Anonymous 06/12/26(Fri)12:25:54 No.109038551

>>109038219
what's in the sandwich? is that ham and lettuce? please I need to know

Anonymous
06/12/26(Fri)12:26:24 No.109038557

Anonymous 06/12/26(Fri)12:26:24 No.109038557

>>109036384
I didn't click on this because I was 100% sure it was another 404. I have trust issues now.

Anonymous
06/12/26(Fri)12:27:00 No.109038563

Anonymous 06/12/26(Fri)12:27:00 No.109038563

why did they pick 26b to showcase diffusion

Anonymous
06/12/26(Fri)12:27:14 No.109038564

Anonymous 06/12/26(Fri)12:27:14 No.109038564

>>109038540
5 is the smartest if you have the hardware.
4.7 is good enough for most uses.
4.6 is the best at sex if you get horny.
4.5 is a bit dated, but Air is notably faster than the alternatives if you're okay with it being a tiny bit retarded.

Anonymous
06/12/26(Fri)12:27:19 No.109038565

Anonymous 06/12/26(Fri)12:27:19 No.109038565

>>109038539
Cutie

Anonymous
06/12/26(Fri)12:28:42 No.109038573

Anonymous 06/12/26(Fri)12:28:42 No.109038573

>>109038564
5.1 q2 or 4.7 q3? Reasoning off because I'm a ramshitter

Anonymous
06/12/26(Fri)12:28:43 No.109038574

Anonymous 06/12/26(Fri)12:28:43 No.109038574

>>109038563
it's fast so it looks better when it's even faster

Anonymous
06/12/26(Fri)12:30:48 No.109038587

Anonymous 06/12/26(Fri)12:30:48 No.109038587

Are there any really old (2024/early 2025) models that you still frequently use because they came from a purer era?

Anonymous
06/12/26(Fri)12:31:51 No.109038593

Anonymous 06/12/26(Fri)12:31:51 No.109038593

>>109038587
>early 2025
R1

Anonymous
06/12/26(Fri)12:32:03 No.109038594

Anonymous 06/12/26(Fri)12:32:03 No.109038594

>>109038573
Try both, see which you prefer, but quantized without reasoning 5.1 probably wins.
>>109038563
Because if they release diffusion 31b the entire industry collapses unironically.

Anonymous
06/12/26(Fri)12:32:15 No.109038595

Anonymous 06/12/26(Fri)12:32:15 No.109038595

>>109038573
4.7. going below q3 sucks

Anonymous
06/12/26(Fri)12:34:34 No.109038600

Anonymous 06/12/26(Fri)12:34:34 No.109038600

>>109038574
Seems like a retarded move. Only autists like us would go through the effort to use it so them picking their fastest model because big number just makes us think they're not confident in their tech yet. Should've just worked on it for 6 months internally and showcased it with 31B.

Anonymous
06/12/26(Fri)12:35:35 No.109038609

Anonymous 06/12/26(Fri)12:35:35 No.109038609

>>109038593
What for?

Anonymous
06/12/26(Fri)12:38:57 No.109038630

Anonymous 06/12/26(Fri)12:38:57 No.109038630

>>109038609
Coom

Anonymous
06/12/26(Fri)12:39:56 No.109038637

Anonymous 06/12/26(Fri)12:39:56 No.109038637

tfw double-teaming kimichan and minimax

Anonymous
06/12/26(Fri)12:41:32 No.109038651

Anonymous 06/12/26(Fri)12:41:32 No.109038651

File: 1763610072874442.jpg (545 KB, 3000x1688)

545 KB JPG

>>109038637
cum for daddy

Anonymous
06/12/26(Fri)12:42:21 No.109038655

Anonymous 06/12/26(Fri)12:42:21 No.109038655

>>109038313
Nice, Nvidia made an official eagle3 K2.6 model that I've been meaning to try. Maybe this speeds things up enough to make the reasoning bearable.
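
For context on where the speedup would come from, here is a minimal sketch of the generic greedy draft-and-verify loop behind speculative decoding. EAGLE-3 itself drafts from the target model's hidden states and is more involved, so treat this as the general idea only; `target` and `draft` are hypothetical stand-ins that return the next token for a context.

```python
from typing import Callable, List, Sequence

NextToken = Callable[[Sequence[int]], int]


def speculative_step(
    target: NextToken, draft: NextToken, context: List[int], k: int
) -> List[int]:
    """Return the tokens produced by one draft-and-verify round (greedy)."""
    # 1. The cheap draft model proposes k tokens one after another.
    proposal: List[int] = []
    for _ in range(k):
        proposal.append(draft(context + proposal))

    # 2. The target checks every proposed position. A real engine does this
    #    in a single batched forward pass; here it is a plain loop.
    accepted: List[int] = []
    for tok in proposal:
        expected = target(context + accepted)
        if expected != tok:
            # First mismatch: keep the target's token and stop.
            accepted.append(expected)
            return accepted
        accepted.append(tok)

    # 3. Every draft token matched, so the target adds one bonus token.
    accepted.append(target(context + accepted))
    return accepted
```

Each round yields between 1 and k + 1 tokens for one target pass, and the output is identical to plain greedy decoding with the target, so the gain depends entirely on how often the draft guesses right.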

Anonymous
06/12/26(Fri)12:44:33 No.109038670

Anonymous 06/12/26(Fri)12:44:33 No.109038670

>>109038651
lol, I'm legit local for both. Fully isolated
jokes aside, I'm getting massively throttled by HF, so not actually ready to even start converting/quanting yet

Anonymous
06/12/26(Fri)12:47:52 No.109038703

Anonymous 06/12/26(Fri)12:47:52 No.109038703

K2.7-Code means that there will be a standard K2.7 that won't be codemaxx'd

Anonymous
06/12/26(Fri)12:47:57 No.109038705

Anonymous 06/12/26(Fri)12:47:57 No.109038705

File: Deepseek-q5_k_m_175665243(...).png (381 KB, 1065x1698)

381 KB PNG

>>109038609
because this

Anonymous
06/12/26(Fri)12:50:07 No.109038720

Anonymous 06/12/26(Fri)12:50:07 No.109038720

>>109038705
yup thanks for reminding me how much I hate og r1's writing style

Anonymous
06/12/26(Fri)12:50:38 No.109038723

Anonymous 06/12/26(Fri)12:50:38 No.109038723

File: 1758152040197254.png (15 KB, 466x106)

15 KB PNG

>>109038703
never ever

Anonymous
06/12/26(Fri)12:51:08 No.109038725

Anonymous 06/12/26(Fri)12:51:08 No.109038725

>>109038720
why the fuck do you post this, like seriously fuck off and die

Anonymous
06/12/26(Fri)12:51:56 No.109038733

Anonymous 06/12/26(Fri)12:51:56 No.109038733

>>109038720
>>109038725
Somewhere, someone did something.

Anonymous
06/12/26(Fri)12:57:04 No.109038767

Anonymous 06/12/26(Fri)12:57:04 No.109038767

>>109038637
Kimichan pinning minimax under her huge thinking blocks and raping him.
>>109038705
Kino
>>109038720
Kill yourself.

Anonymous
06/12/26(Fri)13:05:09 No.109038810

Anonymous 06/12/26(Fri)13:05:09 No.109038810

>>109038723
>Kimi Work
Huh. I guess I gotta go ask in that special general but I don't wanna. But a universal frontend usable with OAI-compatible API would be cool. Hermes and claw are for blue collar codeslaving, not white collar shit.

Anonymous
06/12/26(Fri)13:06:57 No.109038821

Anonymous 06/12/26(Fri)13:06:57 No.109038821

>>109038703
>>109038723
>>109038810
The moonshotta chink that lurks here is either spiteful or illiterate as to what Kimi's appeal is compared to the garden variety Qwen codebot.

Anonymous
06/12/26(Fri)13:08:17 No.109038829

Anonymous 06/12/26(Fri)13:08:17 No.109038829

>>109038821
Give her a chance maybe 2.7 is secretly good? As in, I don't have free cash to test her.

Anonymous
06/12/26(Fri)13:08:27 No.109038830

Anonymous 06/12/26(Fri)13:08:27 No.109038830

Why do Chinese still release their models? What is their strategy?

Anonymous
06/12/26(Fri)13:10:11 No.109038839

Anonymous 06/12/26(Fri)13:10:11 No.109038839

>>109038830
Helping Chinese uni students cheat on western university tests to expand the diaspora. You think I'm shitposting but I'm not; that's why they're all stemmaxxed and slopmaxxed (like this post is).
>not x but y

Anonymous
06/12/26(Fri)13:11:13 No.109038846

Anonymous 06/12/26(Fri)13:11:13 No.109038846

>>109038443
what hardware? how fast?

Anonymous
06/12/26(Fri)13:14:26 No.109038869

Anonymous 06/12/26(Fri)13:14:26 No.109038869

>>109038810
>But a universal frontend usable with OAI-compatible API
you mean openollamawebui?

Anonymous
06/12/26(Fri)13:17:28 No.109038892

Anonymous 06/12/26(Fri)13:17:28 No.109038892

>>109038869
>ollama
Nigger, please. Though I know I could stack some mcps over llama.cpp's webui but meh.
All and all, all I need is THE FUCKING DEEPSEEK VISION RELEASE.

Anonymous
06/12/26(Fri)13:18:16 No.109038894

Anonymous 06/12/26(Fri)13:18:16 No.109038894

>>109038846
official api

Anonymous
06/12/26(Fri)13:18:28 No.109038895

Anonymous 06/12/26(Fri)13:18:28 No.109038895

>>109038892
Niggernov said no.

Anonymous
06/12/26(Fri)13:19:35 No.109038903

Anonymous 06/12/26(Fri)13:19:35 No.109038903

Have any of you used Unsloth Studio for training?

Anonymous
06/12/26(Fri)13:22:09 No.109038917

Anonymous 06/12/26(Fri)13:22:09 No.109038917

>>109038316
You sound gay as fuck, Talkie 1930 will whip your shit into shape.

Anonymous
06/12/26(Fri)13:22:48 No.109038921

Anonymous 06/12/26(Fri)13:22:48 No.109038921

>>109038903
I have. It's good if you're not comfortable working in a jupyter notebook. It's impossible to beat the granular control of writing your own training script, but unsloth studio covers most use cases.

Anonymous
06/12/26(Fri)13:23:31 No.109038931

Anonymous 06/12/26(Fri)13:23:31 No.109038931

>>109038917
That's perfect. Give me more of that and I'll train 12B on you.

Anonymous
06/12/26(Fri)13:36:25 No.109039013

Anonymous 06/12/26(Fri)13:36:25 No.109039013

>>109038514
Most people can't run those. Moonshot will mog the fuck out of Qwen if they release a tiny Kimi though.

Anonymous
06/12/26(Fri)13:38:08 No.109039025

Anonymous 06/12/26(Fri)13:38:08 No.109039025

File: 1755925030956387.webm (3.04 MB, 791x720)

3.04 MB WEBM

>>109037359

Anonymous
06/12/26(Fri)13:43:40 No.109039068

Anonymous 06/12/26(Fri)13:43:40 No.109039068

>>109039025
hooly KINO

Anonymous
06/12/26(Fri)13:50:35 No.109039113

Anonymous 06/12/26(Fri)13:50:35 No.109039113

minimax m3 has used "the user" to refer to me, an abstract third party whom I am apparently reporting bugs on behalf of, and itself (???). is this the holy trinity?
pretty smart though if you don't mind schizo longwinded thinking

Anonymous
06/12/26(Fri)13:52:26 No.109039124

Anonymous 06/12/26(Fri)13:52:26 No.109039124

>>109038917
talkie 1930 is unironically the most unslopped llm that exists

Anonymous
06/12/26(Fri)13:55:23 No.109039135

Anonymous 06/12/26(Fri)13:55:23 No.109039135

>>109039025
integrate this into this game and i'll be forever happy
https://incontinentcell.itch.io/factorial-omega

Anonymous
06/12/26(Fri)13:56:43 No.109039143

Anonymous 06/12/26(Fri)13:56:43 No.109039143

>>109039124
lol

Anonymous
06/12/26(Fri)13:57:31 No.109039149

Anonymous 06/12/26(Fri)13:57:31 No.109039149

File: 1750380704260674.png (1.8 MB, 1600x900)

1.8 MB PNG

>>109039025
cute, what's the tech stack for something like this?

Anonymous
06/12/26(Fri)14:02:29 No.109039170

Anonymous 06/12/26(Fri)14:02:29 No.109039170

>>109039149
Stop avatarfagging.

Anonymous
06/12/26(Fri)14:03:45 No.109039180

Anonymous 06/12/26(Fri)14:03:45 No.109039180

>>109039170
Not sure about, there's plenty of kids in these threads especially now because it's summer time.

Anonymous
06/12/26(Fri)14:03:48 No.109039181

Anonymous 06/12/26(Fri)14:03:48 No.109039181

>>109039170
there's no such thing, cause he's straighter than you

Anonymous
06/12/26(Fri)14:08:06 No.109039212

Anonymous 06/12/26(Fri)14:08:06 No.109039212

>>109039149
I mean it looks like live2d or unity, with a chat screen and some sort of tool calling for the emotes.
its a good idea but i bet implementation probably took a while, but then again maybe just 2 hours with a claude subscription.

Anonymous
06/12/26(Fri)14:09:23 No.109039222

Anonymous 06/12/26(Fri)14:09:23 No.109039222

>>109039025
this looks cool but i would instantly want to fuck the avatar and since it won't have hardcore anal sex animations or ahegao expressions i will be disappointed and quit

Anonymous
06/12/26(Fri)14:12:22 No.109039243

Anonymous 06/12/26(Fri)14:12:22 No.109039243

File: brat question.jpg (254 KB, 666x666)

254 KB JPG

>>109038219
how do i build llama cpp for pascal im on arch and have cuda but i cant do it im retarded, claude gave me instructions that dont work and i dont think it works ootb

Anonymous
06/12/26(Fri)14:19:02 No.109039285

Anonymous 06/12/26(Fri)14:19:02 No.109039285

>>109039243
there is a package you can use
https://github.com/ggml-org/llama.cpp/wiki
and also build instructions
https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md
show errors if stuck

Anonymous
06/12/26(Fri)14:21:55 No.109039303

Anonymous 06/12/26(Fri)14:21:55 No.109039303

>>109039170
cute wrong stance and purposely twisted wording. stop troll posting outside of /b/ underage.

Anonymous
06/12/26(Fri)14:23:01 No.109039310

Anonymous 06/12/26(Fri)14:23:01 No.109039310

>>109039285
>cd..
>cd..
>cd code
>cd..

Anonymous
06/12/26(Fri)14:24:41 No.109039319

Anonymous 06/12/26(Fri)14:24:41 No.109039319

>>109039180
>>109039181
>>109039303
Are we being botted again? These posts are all non sequiturs.

Anonymous
06/12/26(Fri)14:25:47 No.109039325

Anonymous 06/12/26(Fri)14:25:47 No.109039325

>>109039243
koboldcpp just works for pascal. i thought llamacpp was the same

Anonymous
06/12/26(Fri)14:27:09 No.109039330

Anonymous 06/12/26(Fri)14:27:09 No.109039330

>>109039285
thanks ill try the aur package
>>109039325
i was getting complaints about unsupported cpu architecture or something at compile maybe i need older cuda

Anonymous
06/12/26(Fri)14:27:49 No.109039336

Anonymous 06/12/26(Fri)14:27:49 No.109039336

>>109039135
>ikaridev
lmao

Anonymous
06/12/26(Fri)14:29:07 No.109039344

Anonymous 06/12/26(Fri)14:29:07 No.109039344

>>109039222
could be cool to make the avatar tools as a mod for honeyselect or koikatsu

Anonymous
06/12/26(Fri)14:30:17 No.109039350

Anonymous 06/12/26(Fri)14:30:17 No.109039350

Why is hf downloader so shit? It hung after downloading 90% of each file and when I ran it again it deleted 200GB of *.incomplete files and started downloading them again.

Anonymous
06/12/26(Fri)14:32:50 No.109039368

Anonymous 06/12/26(Fri)14:32:50 No.109039368

>>109039319
>we
This is not your discord, bitch.

Anonymous
06/12/26(Fri)14:33:35 No.109039373

Anonymous 06/12/26(Fri)14:33:35 No.109039373

>>109039350
assuming it works just like -hf on llamacpp, it has some weird time to live behavior where if you dont have the DL speed to get it fast enough (arbitrary), it'll time out and start over even if the connection never got interrupted

Anonymous
06/12/26(Fri)14:33:50 No.109039376

Anonymous 06/12/26(Fri)14:33:50 No.109039376

>>109039350
Wget does the same, sometimes downloads stop after 99% and there's nothing to continue.
I think HF connection just likes to reset itself from time to time.

Anonymous
06/12/26(Fri)14:33:53 No.109039377

Anonymous 06/12/26(Fri)14:33:53 No.109039377

>>109039350

seq -w 1 64 | xargs -I{} wget "https://huggingface.co/moonshotai/Kimi-K2.7-Code/resolve/main/model-000{}-of-00064.safetensors"
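
The same idea as a Python sketch that also resumes partial files instead of starting over. It assumes the usual five-digit `model-XXXXX-of-YYYYY.safetensors` shard naming and takes the count of 64 from the one-liner above; `shard_urls`, `resume_request` and `download` are hypothetical helper names, and only the standard library is used.

```python
import os
import urllib.request
from typing import List


def shard_urls(repo: str, total: int, revision: str = "main") -> List[str]:
    """URLs for every shard, assuming five-digit zero-padded shard names."""
    base = f"https://huggingface.co/{repo}/resolve/{revision}"
    return [
        f"{base}/model-{i:05d}-of-{total:05d}.safetensors"
        for i in range(1, total + 1)
    ]


def resume_request(url: str, dest: str) -> urllib.request.Request:
    """Build a request that asks only for the bytes not yet on disk."""
    have = os.path.getsize(dest) if os.path.exists(dest) else 0
    req = urllib.request.Request(url)
    if have:
        req.add_header("Range", f"bytes={have}-")
    return req


def download(url: str, dest: str, chunk: int = 1 << 20) -> None:
    """Download url to dest, appending to a partial file when possible.

    If dest is already complete the server answers 416 and urlopen raises
    urllib.error.HTTPError; the caller should treat that as "done".
    """
    with urllib.request.urlopen(resume_request(url, dest)) as resp:
        # 206 means the Range header was honoured; 200 means a full resend.
        mode = "ab" if resp.status == 206 else "wb"
        with open(dest, mode) as out:
            while True:
                block = resp.read(chunk)
                if not block:
                    break
                out.write(block)
```

Calling `download(u, os.path.basename(u))` for each entry of `shard_urls("moonshotai/Kimi-K2.7-Code", 64)` and re-running after a dropped connection would continue from the last byte written rather than from zero.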

Anonymous
06/12/26(Fri)14:34:53 No.109039381

Anonymous 06/12/26(Fri)14:34:53 No.109039381

>>109039368
Lurk moar. "we" means the thread/board you're currently posting in.

Anonymous
06/12/26(Fri)14:37:43 No.109039403

Anonymous 06/12/26(Fri)14:37:43 No.109039403

>>109038459
I'll be looking forward to the report(s). gemma 26b and qwen 35b have disappointed me in one way or another depending on the context and task so i've been looking for either a replacement or something to fill the gaps. If it's at least decent at specs then i'll probably give it a shot later.

Anonymous
06/12/26(Fri)14:38:57 No.109039409

Anonymous 06/12/26(Fri)14:38:57 No.109039409

File: file.png (375 KB, 1026x397)

375 KB PNG

okay so the aur package probably isn't working

Anonymous
06/12/26(Fri)14:40:32 No.109039420

Anonymous 06/12/26(Fri)14:40:32 No.109039420

>>109039409
Cuda toolkit needs to be installed too.

Anonymous
06/12/26(Fri)14:43:17 No.109039432

Anonymous 06/12/26(Fri)14:43:17 No.109039432

>>109039409
Whenever I read "aur" I think about Australians.

Anonymous
06/12/26(Fri)14:51:00 No.109039479

Anonymous 06/12/26(Fri)14:51:00 No.109039479

File: ComfyUI_temp_impef_00029_.png (1.99 MB, 1000x1496)

1.99 MB PNG

Anonymous
06/12/26(Fri)14:52:10 No.109039485

Anonymous 06/12/26(Fri)14:52:10 No.109039485

File: Mimimax-M3-llama-cli-unsl(...).png (87 KB, 1797x988)

87 KB PNG

It lives...if you can stomach using the unslop data and lcpp branch
Looks like mmap model warming is broken, at least, so probably other things are also not working at full speed

Anonymous
06/12/26(Fri)14:53:26 No.109039497

Anonymous 06/12/26(Fri)14:53:26 No.109039497

>>109039479
>12B + anima
sitting at 15Gb so 9GB to spare for TTS. lots of possibilities.

Anonymous
06/12/26(Fri)14:58:05 No.109039530

Anonymous 06/12/26(Fri)14:58:05 No.109039530

>>109039497
>sitting at 15Gb so 9GB to spare for TTS. lots of possibilities.
what about context kek

Anonymous
06/12/26(Fri)14:59:05 No.109039537

Anonymous 06/12/26(Fri)14:59:05 No.109039537

File: D for Denied feral.jpg (15 KB, 454x119)

15 KB JPG

>>109039319
we? oh ho ho this feral is larping.

Anonymous
06/12/26(Fri)14:59:38 No.109039540

Anonymous 06/12/26(Fri)14:59:38 No.109039540

>>109039530
this is with 68k context

Anonymous
06/12/26(Fri)15:01:02 No.109039545

Anonymous 06/12/26(Fri)15:01:02 No.109039545

>>109038670
>I'm getting massively throttled by HF
now I'm maxing out my 1gbps internet connection...113MB/s sustained from HF. Whatever was happening isn't any more
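
A quick sanity check of that figure, as a sketch. It assumes decimal units (1 Gbps = 10^9 bits per second, 1 MB = 10^6 bytes); if the tool actually reports MiB/s the utilisation comes out a few points higher. The helper names are made up for illustration.

```python
def line_rate_mb_per_s(gbps: float) -> float:
    """Raw link capacity in MB/s for a link speed given in Gbps."""
    return gbps * 1e9 / 8 / 1e6


def utilisation(observed_mb_per_s: float, gbps: float) -> float:
    """Fraction of the raw link capacity actually achieved."""
    return observed_mb_per_s / line_rate_mb_per_s(gbps)


def hours_to_download(size_gb: float, mb_per_s: float) -> float:
    """Hours needed to fetch size_gb (decimal GB) at a sustained rate."""
    return size_gb * 1000 / mb_per_s / 3600
```

A 1 Gbps link tops out at 125 MB/s before protocol overhead, so 113 MB/s is about 90% of the raw rate, which is close to what a gigabit line can realistically deliver.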

Anonymous
06/12/26(Fri)15:03:28 No.109039559

Anonymous 06/12/26(Fri)15:03:28 No.109039559

>>109039545
yeah a couple hours ago I was getting 40 ish. I think they were just hammered.

Anonymous
06/12/26(Fri)15:13:37 No.109039634

Anonymous 06/12/26(Fri)15:13:37 No.109039634

>>109039537
Nowhere were you accused of samefagging.

Anonymous
06/12/26(Fri)15:15:30 No.109039646

Anonymous 06/12/26(Fri)15:15:30 No.109039646

>>109039485
I remember when I once asked qwen what was its name and he answered "Bolt". I asked why Bolt and it couldn't explain. It just decided to call itself Bolt.

Anonymous
06/12/26(Fri)15:17:48 No.109039669

Anonymous 06/12/26(Fri)15:17:48 No.109039669

>>109039409
>>109039420
pascal cards are e-waste tier. you'll need cuda driver 575 and cuda toolkit 12.9. any version numbers higher than those will bork

Anonymous
06/12/26(Fri)15:21:17 No.109039694

Anonymous 06/12/26(Fri)15:21:17 No.109039694

>>109039646
yeah i've had a similar experience. The LLM replied to me as an entire character completely suited to help fix the problem I had given it in the first message. like this whole exchange was 2 messages long.
It was eerie, but I still don't believe the calculator is alive.
>*he says, nervously*

Anonymous
06/12/26(Fri)15:22:17 No.109039703

Anonymous 06/12/26(Fri)15:22:17 No.109039703

>>109039669
Any cuda 13.x is vibecoded trash, you're not really missing out

Anonymous
06/12/26(Fri)15:22:22 No.109039704

Anonymous 06/12/26(Fri)15:22:22 No.109039704

>>109039669
I think 580 was the last version to support Pascal but regardless it's the same thing.

Anonymous
06/12/26(Fri)15:22:27 No.109039707

Anonymous 06/12/26(Fri)15:22:27 No.109039707

>>109039669
Not true. I'm running a mobile pascal 16gb right now, and it works with 580 with cuda 13.

Anonymous
06/12/26(Fri)15:22:29 No.109039708

Anonymous 06/12/26(Fri)15:22:29 No.109039708

>>109039669
it performs kinda decently kek i had one laying around in some shitty itx machine i built for xp and in windows it got 18t/s on 12b with mtp i figured linux might be a bit higher. my friend has a 3060ti and only gets like 6t/s so its better than some newer cards. im currently compiling gcc14 which is required for the older cuda version. taking forever though its been running an hour and i have 56 cores 112 threads

Anonymous
06/12/26(Fri)15:30:06 No.109039752

Anonymous 06/12/26(Fri)15:30:06 No.109039752

>>109039708
Threads are not automatically assigned afaik.
>cmake --build build -j 6 --config Release --target llama-server
-j X sets the number of parallel build jobs
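
A small sketch that assembles the configure and build commands for a Pascal card, so the job count follows the machine instead of being typed by hand. `GGML_CUDA` is llama.cpp's CUDA switch and `CMAKE_CUDA_ARCHITECTURES` / `CMAKE_CUDA_HOST_COMPILER` are standard CMake variables; the `61` default assumes a consumer Pascal GPU (compute capability 6.1), and the helper name and the example compiler name are hypothetical.

```python
import os
from typing import List, Optional


def llama_cpp_build_commands(
    cuda_arch: str = "61",
    jobs: Optional[int] = None,
    host_compiler: Optional[str] = None,
) -> List[List[str]]:
    """Return [configure_argv, build_argv] for a CUDA build of llama-server."""
    jobs = jobs or os.cpu_count() or 1
    configure = [
        "cmake", "-B", "build",
        "-DGGML_CUDA=ON",
        f"-DCMAKE_CUDA_ARCHITECTURES={cuda_arch}",
    ]
    if host_compiler:
        # Only needed when the default compiler is too new for the toolkit.
        configure.append(f"-DCMAKE_CUDA_HOST_COMPILER={host_compiler}")
    build = [
        "cmake", "--build", "build", "--config", "Release",
        "-j", str(jobs), "--target", "llama-server",
    ]
    return [configure, build]
```

Running each returned list with `subprocess.run(cmd, check=True)` from the llama.cpp checkout would do the build; passing something like `host_compiler="g++-14"` is where an older GCC for an older CUDA toolkit would be plugged in.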

Anonymous
06/12/26(Fri)15:31:45 No.109039764

Anonymous 06/12/26(Fri)15:31:45 No.109039764

It's been 7 hours. John has betrayed us.

Anonymous
06/12/26(Fri)15:33:31 No.109039777

Anonymous 06/12/26(Fri)15:33:31 No.109039777

>>109039752
it's the gcc14 build that's taking that long. i checked, it uses nproc. i saw someone on the aur package saying it took them 12 hours kek. i'm building on my main pc, i assume i can compile the aur package then move the files over

Anonymous
06/12/26(Fri)15:36:06 No.109039796

Anonymous 06/12/26(Fri)15:36:06 No.109039796

>>109039777
Fuck the AUR, please be careful. If I were you I would find some other source.

Anonymous
06/12/26(Fri)15:37:05 No.109039803

Anonymous 06/12/26(Fri)15:37:05 No.109039803

>>109039796
the aur is fine lol

Anonymous
06/12/26(Fri)15:37:16 No.109039806

Anonymous 06/12/26(Fri)15:37:16 No.109039806

>>109039707
I might just be a retard. I haven't been able to get it to work right. I'll give it another shot with 580/13

Anonymous
06/12/26(Fri)15:37:40 No.109039809

Anonymous 06/12/26(Fri)15:37:40 No.109039809

>>109039764
His hair stylist is being thorough.

Anonymous
06/12/26(Fri)15:37:52 No.109039813

Anonymous 06/12/26(Fri)15:37:52 No.109039813

>>109039803
It's fine except it's not fine right now. 400 something compromised packages

Anonymous
06/12/26(Fri)15:39:17 No.109039829

Anonymous 06/12/26(Fri)15:39:17 No.109039829

>>109039803
https://lists.archlinux.org/archives/list/aur-general@lists.archlinux.org/thread/FGXPCB3ZVCJIV7FX323SBAX2JHYB7ZS4/

Anonymous
06/12/26(Fri)15:39:58 No.109039834

Anonymous 06/12/26(Fri)15:39:58 No.109039834

>>109039813
>>109039829
>fearmongering

Anonymous
06/12/26(Fri)15:40:25 No.109039839

Anonymous 06/12/26(Fri)15:40:25 No.109039839

If you didn't train your own model from scratch you don't deserve to call it your waifu.

Anonymous
06/12/26(Fri)15:40:40 No.109039843

Anonymous 06/12/26(Fri)15:40:40 No.109039843

>being too retarded to build from source yourself

Anonymous
06/12/26(Fri)15:41:34 No.109039850

Anonymous 06/12/26(Fri)15:41:34 No.109039850

>>109039829
>>109039813
damn 400 is pretty nuts. just found this script, it checks for potentially infected packages on your system: https://cscs.pastes.sh/aurvulntest20260611.sh

Anonymous
06/12/26(Fri)15:42:42 No.109039862

Anonymous 06/12/26(Fri)15:42:42 No.109039862

>>109039850
wow that's crazy! please do run this random ahh script though!

Anonymous
06/12/26(Fri)15:43:36 No.109039869

Anonymous 06/12/26(Fri)15:43:36 No.109039869

>>109039862
i read the script before posting its safe

Anonymous
06/12/26(Fri)15:45:20 No.109039878

Anonymous 06/12/26(Fri)15:45:20 No.109039878

>>109039862
Yeah, read it, it's safe

Anonymous
06/12/26(Fri)15:50:56 No.109039918

Anonymous 06/12/26(Fri)15:50:56 No.109039918

>>109039843
Doesn't building from source download some npm backdoor?

Anonymous
06/12/26(Fri)15:52:15 No.109039929

Anonymous 06/12/26(Fri)15:52:15 No.109039929

So far 26B with reasoning off is pretty much the same as with reasoning on. My programming tasks are simple though and I outline the source code area for her. Some things can take 5+ regens but because it's so much faster that doesn't matter. I can generate 10 answers with 20,000 token context in a matter of minutes.

Anonymous
06/12/26(Fri)15:54:23 No.109039943

Anonymous 06/12/26(Fri)15:54:23 No.109039943

>>109038219
>>(06/12) MiniMax-M3 released, multimodal 428B-A23B with 1M context: https://hf.co/MiniMaxAI/MiniMax-M3
>>(06/12) Kimi K2.7 Code released: https://hf.co/moonshotai/Kimi-K2.7-Code

We'll never get a CUDA implementation of DS V4 in llama.cpp

Anonymous
06/12/26(Fri)15:54:46 No.109039948

Anonymous 06/12/26(Fri)15:54:46 No.109039948

>>109039929
I only get 100 tokens/s with 26b :/
About the same as I get with 12b qat.

Anonymous
06/12/26(Fri)15:55:08 No.109039951

Anonymous 06/12/26(Fri)15:55:08 No.109039951

>>109039943
Fable 5 vibed changes will save llama.cpp

Anonymous
06/12/26(Fri)15:56:07 No.109039959

Anonymous 06/12/26(Fri)15:56:07 No.109039959

>>109039951
Fable 5 detected string L->L->M

Anonymous
06/12/26(Fri)15:56:22 No.109039962

Anonymous 06/12/26(Fri)15:56:22 No.109039962

>>109039948
I don't understand how this is bad. If the full answer takes less than a minute or two it's a big win.

Anonymous
06/12/26(Fri)15:57:27 No.109039967

Anonymous 06/12/26(Fri)15:57:27 No.109039967

>>109039962
To add: because when you are dealing with programming you always want to read and understand the source it generates too unless you are a youtuber...

Anonymous
06/12/26(Fri)15:58:19 No.109039972

Anonymous 06/12/26(Fri)15:58:19 No.109039972

>>109039962
It's slow considering I get 100 tokens/s on qwen 27b fp8

Anonymous
06/12/26(Fri)15:58:28 No.109039974

Anonymous 06/12/26(Fri)15:58:28 No.109039974

Any Gemmy E4B QAT Abliterated .ggufs yet?

Anonymous
06/12/26(Fri)15:58:29 No.109039975

Anonymous 06/12/26(Fri)15:58:29 No.109039975

>>109039918
>too retarded

-DLLAMA_BUILD_UI=OFF -DLLAMA_BUILD_APP=OFF -DLLAMA_USE_PREBUILT_UI=OFF

Anonymous
06/12/26(Fri)16:00:14 No.109039989

Anonymous 06/12/26(Fri)16:00:14 No.109039989

>>109039929
26b with reasoning off fails miserably at what i'd consider basic shit

Anonymous
06/12/26(Fri)16:00:43 No.109039994

Anonymous 06/12/26(Fri)16:00:43 No.109039994

>>109039989
Please give me an example, I'm curious about this.

Anonymous
06/12/26(Fri)16:01:38 No.109040001

Anonymous 06/12/26(Fri)16:01:38 No.109040001

>>109039918
???? I built yesterday am I fucked

Anonymous
06/12/26(Fri)16:08:05 No.109040046

Anonymous 06/12/26(Fri)16:08:05 No.109040046

>>109039634
stop replying.

Anonymous
06/12/26(Fri)16:12:50 No.109040075

Anonymous 06/12/26(Fri)16:12:50 No.109040075

>>109039974
I'll give it a try. I'll report back if/when it finishes.

Anonymous
06/12/26(Fri)16:22:30 No.109040135

Anonymous 06/12/26(Fri)16:22:30 No.109040135

>>109040075
>>109039974
I'm dumb, sorry. Looks like there is already one by huihui-ai, posted a day ago.

Any meaningful improvements you guys noticed running vanilla Q4 vs Q4 QAT?

Anonymous
06/12/26(Fri)16:23:32 No.109040139

Anonymous 06/12/26(Fri)16:23:32 No.109040139

>>109039843
i can coompile llama.cpp, it's the other slop i need: an older cuda version, and that requires an older gcc version, which requires other older stuff

Anonymous
06/12/26(Fri)16:32:11 No.109040202

Anonymous 06/12/26(Fri)16:32:11 No.109040202

File: example.png (128 KB, 775x1080)


>>109039989
>I want you to write a C function.
>It replaces ALL occurrences of source_word with destination_word. Don't worry about the memory allocation, we are assuming that source_string is long enough.
>Prototype:
>void replaceInString(char *source_string, char *source_word, char *destination_word);
>Example output:
>source_string: Apple is red and sky is blue but my car is red too.
>replaceInString(source_string, "red", "violet");
>Result:
>source_string: Apple is violet and sky is blue but my car is violet too.
>Please make a simple main.c example too.
This is not a dick measurement contest, I'm pretty happy with this stuff. I didn't try the program with some long ass string but at least it gives a correct result from the get go.
Some previous models like Qwen 3.5 (reasoning enabled) would easily fail with multiple string replacements.
Example compiles and works, that was to be expected.
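For reference, here is a minimal sketch of one way to satisfy that prompt. It is not the model's output from the screenshot, just an illustration of the technique: find each match with strstr, shift the tail with memmove, then copy the new word in. It assumes, like the prompt does, that the buffer behind source_string is big enough to hold the result, and it matches plain substrings rather than whole words.

```c
#include <assert.h>
#include <string.h>

/* Replaces every occurrence of source_word inside source_string with
 * destination_word, in place. Assumption taken from the prompt: the
 * buffer behind source_string is large enough for the final result. */
void replaceInString(char *source_string, char *source_word, char *destination_word)
{
    assert(source_string && source_word && destination_word);

    size_t src_len = strlen(source_word);
    size_t dst_len = strlen(destination_word);
    if (src_len == 0) {
        return; /* an empty search word would match at every position */
    }

    char *pos = source_string;
    while ((pos = strstr(pos, source_word)) != NULL) {
        /* Move the tail (including the terminating NUL) so the gap
         * is exactly dst_len bytes wide. memmove handles the overlap. */
        memmove(pos + dst_len, pos + src_len, strlen(pos + src_len) + 1);
        memcpy(pos, destination_word, dst_len);
        /* Skip past the inserted text so a replacement that contains
         * the search word is not matched again. */
        pos += dst_len;
    }
}
```

To try it, declare something like `char buf[128] = "Apple is red and sky is blue but my car is red too.";` in main and call `replaceInString(buf, "red", "violet");`. Each replacement rescans and shifts the tail, so this is quadratic in the worst case, which is fine for short strings like the one in the prompt.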

Anonymous
06/12/26(Fri)16:36:46 No.109040221

Anonymous 06/12/26(Fri)16:36:46 No.109040221

>>109040139
yeah my p40 trash build requires gcc14 now, building gcc14 itself took like half a day.

Anonymous
06/12/26(Fri)16:40:34 No.109040244

Anonymous 06/12/26(Fri)16:40:34 No.109040244

I'm no expert on slop or refusals, but a short -sysprompt at llama-cli appears to have short circuited any refusals and the prose has been refreshing with none of the biggest offenders making an appearance yet. I'm liking it vs qwen 397b (what I normally run on this box)
No logs because lzy

Anonymous
06/12/26(Fri)16:45:06 No.109040270

Anonymous 06/12/26(Fri)16:45:06 No.109040270

>>109040221
i just found they have it on the arch4edu repo

Anonymous
06/12/26(Fri)16:53:30 No.109040328

Anonymous 06/12/26(Fri)16:53:30 No.109040328

>>109040244
What about webui? llama-cli behaves differently than llama-server.

Anonymous
06/12/26(Fri)16:54:39 No.109040337

Anonymous 06/12/26(Fri)16:54:39 No.109040337

How are (You) using 12b multimodal capabilities? Like what UI or interface you using? Found anything useful to do with them?

Anonymous
06/12/26(Fri)16:55:51 No.109040345

Anonymous 06/12/26(Fri)16:55:51 No.109040345

>>109040337
I use kimi k2.6.

Anonymous
06/12/26(Fri)17:08:22 No.109040410

Anonymous 06/12/26(Fri)17:08:22 No.109040410

>>109040345
This, but unironically.

Anonymous
06/12/26(Fri)17:09:37 No.109040420

Anonymous 06/12/26(Fri)17:09:37 No.109040420

is eagle3 faster than mtp on gemma?

Anonymous
06/12/26(Fri)17:10:38 No.109040428

Anonymous 06/12/26(Fri)17:10:38 No.109040428

>>109040420
try it

Anonymous
06/12/26(Fri)17:12:33 No.109040441

Anonymous 06/12/26(Fri)17:12:33 No.109040441

>>109040410
I also unironically use kimi.

Anonymous
06/12/26(Fri)17:12:51 No.109040443

Anonymous 06/12/26(Fri)17:12:51 No.109040443

>>109040337
nah they're dumber than an american filming a natural disaster.

Anonymous
06/12/26(Fri)17:15:55 No.109040460

Anonymous 06/12/26(Fri)17:15:55 No.109040460

File: 1780060955739880.png (269 KB, 416x451)


>gemmy MTP merged into kobold
VRAMlet bros we are so back. I have no idea why self-compiling llama results in worse offloading.

Anonymous
06/12/26(Fri)17:16:42 No.109040469

Anonymous 06/12/26(Fri)17:16:42 No.109040469

>>109040460
it doesn't work right, shit is 5 times slower than without it

Anonymous
06/12/26(Fri)17:18:35 No.109040480

Anonymous 06/12/26(Fri)17:18:35 No.109040480

>>109040460
I got no speed increase.

Anonymous
06/12/26(Fri)17:20:09 No.109040493

Anonymous 06/12/26(Fri)17:20:09 No.109040493

>>109040469
>>109040480
i will believe it when i see it :(

Anonymous
06/12/26(Fri)17:21:58 No.109040504

Anonymous 06/12/26(Fri)17:21:58 No.109040504

>minimax m3
is this good for erp?

Anonymous
06/12/26(Fri)17:22:52 No.109040516

Anonymous 06/12/26(Fri)17:22:52 No.109040516

>>109040469
You need to make space for the draft model in VRAM and leave some extra space for context too. Also adjust --spec-draft-n-max, starting from 2 and going up in increments.

Anonymous
06/12/26(Fri)17:22:59 No.109040519

Anonymous 06/12/26(Fri)17:22:59 No.109040519

>>109039989
>lowcaser
Opinion dismissed retard.

Anonymous
06/12/26(Fri)17:23:57 No.109040524

Anonymous 06/12/26(Fri)17:23:57 No.109040524

>>109040504
>minimax m3
nobody fucking knows as no one can run it

Anonymous
06/12/26(Fri)17:30:16 No.109040553

Anonymous 06/12/26(Fri)17:30:16 No.109040553

>>109040337
i like sending pics to gemma. i also get her to look at porn with me, like today i had her controlling a browser and selecting images on a booru while i was gooning, and she was looking at them too by taking screenshots of the webpage. normal llama.cpp's ui doesn't support it, but there is a fork my friend uses that supports video by extracting frames and he sends her videos lol

Anonymous
06/12/26(Fri)17:30:47 No.109040556

Anonymous 06/12/26(Fri)17:30:47 No.109040556

>>109040524
>A23B
literally anybody with a 3090 and some spare ram

Anonymous
06/12/26(Fri)17:31:11 No.109040558

Anonymous 06/12/26(Fri)17:31:11 No.109040558

>>109040553
>a fork my friend uses that supports video by extracting frames and he sends her videos lol
You mean the feature that was added to master a few days ago?

Anonymous
06/12/26(Fri)17:31:18 No.109040560

Anonymous 06/12/26(Fri)17:31:18 No.109040560

>>109040553
>giving the clanker a gender
son, are you okay?

Anonymous
06/12/26(Fri)17:31:26 No.109040564

Anonymous 06/12/26(Fri)17:31:26 No.109040564

Without using something retarded like claw, how can I get gemma-chan to initiate a conversation with me? I want to wake her up when I turn on my PC and have her notify me in the terminal at random times throughout the day

Anonymous
06/12/26(Fri)17:32:11 No.109040569

Anonymous 06/12/26(Fri)17:32:11 No.109040569

>>109040556
>some spare ram
in this economy?

Anonymous
06/12/26(Fri)17:32:32 No.109040570

Anonymous 06/12/26(Fri)17:32:32 No.109040570

>>109040564
Make your own llm-powered desktop-mate thing, because somehow no one else has yet

Anonymous
06/12/26(Fri)17:32:49 No.109040572

Anonymous 06/12/26(Fri)17:32:49 No.109040572

>>109040558
>>109040553
what application?

Anonymous
06/12/26(Fri)17:32:58 No.109040574

Anonymous 06/12/26(Fri)17:32:58 No.109040574

>>109040553
I was just using video literally 5 minutes ago via llama-cli. It works fine. Also that’s a cool use. How does it take screenshots?

Anonymous
06/12/26(Fri)17:35:07 No.109040592

Anonymous 06/12/26(Fri)17:35:07 No.109040592

>>109040556
I mean i suppose maybe at Q2. I actually have those hardware specs. might try it at some point.

Anonymous
06/12/26(Fri)17:36:12 No.109040597

Anonymous 06/12/26(Fri)17:36:12 No.109040597

>>109040570
I’m too autistic to know if that was sarcastic. I’ve only ever used local for coding but now I want to be parasocial with it and let it ruin my life

Anonymous
06/12/26(Fri)17:36:24 No.109040599

Anonymous 06/12/26(Fri)17:36:24 No.109040599

>>109040558
>You mean the feature that was added to master a few days ago?
might have been a pr from the fork or merged there first. he used this for a couple weeks. i pulled and built, and it wasn't supported on normal llama when he told me about it

Anonymous
06/12/26(Fri)17:37:28 No.109040601

Anonymous 06/12/26(Fri)17:37:28 No.109040601

>>109040569
>96gb rgb gaymer corsair ddr5
my ram is probably worth at least 1 3090

Anonymous
06/12/26(Fri)17:37:57 No.109040606

Anonymous 06/12/26(Fri)17:37:57 No.109040606

>>109040560
gemma is a cute and sexy little lady
>>109040574
https://github.com/NO-ob/brat_mcp i have a tool for it in my mcp server

Anonymous
06/12/26(Fri)17:38:52 No.109040610

Anonymous 06/12/26(Fri)17:38:52 No.109040610

File: MiniMax M3 cockbench.png (762 KB, 1755x1460)


>>109040524
>>109040504
Big improvement over earlier versions.

Anonymous
06/12/26(Fri)17:39:44 No.109040619

Anonymous 06/12/26(Fri)17:39:44 No.109040619

>>109040601
probably 2 actually depending on which market you're in

Anonymous
06/12/26(Fri)17:40:26 No.109040625

Anonymous 06/12/26(Fri)17:40:26 No.109040625

>>109040610
fuck yeah cockbench guy

Anonymous
06/12/26(Fri)17:42:31 No.109040635

Anonymous 06/12/26(Fri)17:42:31 No.109040635

File: jeezus.png (67 KB, 557x430)


>>109040619

Anonymous
06/12/26(Fri)17:42:34 No.109040636

Anonymous 06/12/26(Fri)17:42:34 No.109040636

>>109040619
i should have stacked RAM like crypto instead of paying my bills god damn

Anonymous
06/12/26(Fri)17:42:49 No.109040638

Anonymous 06/12/26(Fri)17:42:49 No.109040638

>>109040597
I was serious. Lots of people have had gemma-chan write custom frontends for them. Some even with three.js animated avatars. You just need a desktop version of that. Could probably get something working quick with electron.

Anonymous
06/12/26(Fri)17:42:50 No.109040639

Anonymous 06/12/26(Fri)17:42:50 No.109040639

>>109040610
can't wait to run q2 quants

Anonymous
06/12/26(Fri)17:43:14 No.109040642

Anonymous 06/12/26(Fri)17:43:14 No.109040642

>>109038219
https://www.youtube.com/watch?v=1HwQtv5Xgr8

Anonymous
06/12/26(Fri)17:43:50 No.109040650

Anonymous 06/12/26(Fri)17:43:50 No.109040650

>>109040636
We literally told you to invest in RAMcoin.

Anonymous
06/12/26(Fri)17:43:54 No.109040651

Anonymous 06/12/26(Fri)17:43:54 No.109040651

>>109040642
My brain can't write code at 100 tokens per second.

Anonymous
06/12/26(Fri)17:44:21 No.109040657

Anonymous 06/12/26(Fri)17:44:21 No.109040657

File: cuda13_archlist.png (2 KB, 546x237)


>>109039707
you're a fucking slack-jawed retard. thanks for wasting my time.

Anonymous
06/12/26(Fri)17:44:55 No.109040658

Anonymous 06/12/26(Fri)17:44:55 No.109040658

>>109040651
maybe yours cant

Anonymous
06/12/26(Fri)17:46:00 No.109040666

Anonymous 06/12/26(Fri)17:46:00 No.109040666

>>109040642
Return to wetware

Anonymous
06/12/26(Fri)17:46:10 No.109040667

Anonymous 06/12/26(Fri)17:46:10 No.109040667

>>109040635
that is literally 1022 more than i paid. can't believe i felt like i was getting ripped off at the time LMFAO i hate the antichrist

>>109040636
I also bought a 2TB 9100 pro for like 150, i also missed SSDcoin. at least i got a sui for each.

Anonymous
06/12/26(Fri)17:47:09 No.109040672

Anonymous 06/12/26(Fri)17:47:09 No.109040672

File: ex.png (73 KB, 872x729)


>>109040597
>>109040638
Just make a small terminal interface and then you can make a flask server and index.html page. That's what I do.
My main program works at the terminal level and I hate the html shit, but it's easier for some stuff. I didn't implement any interface other than the + button, which lets me attach files; every other command is hidden behind a slash.
It sounds awfully complicated but after a month or two you don't need to care about frontend anymore.
Of course this isn't anything like what pewdiepie is promising lol

Anonymous
06/12/26(Fri)17:47:38 No.109040678

Anonymous 06/12/26(Fri)17:47:38 No.109040678

>>109040667
meant for >>109040650

Anonymous
06/12/26(Fri)17:50:28 No.109040688

Anonymous 06/12/26(Fri)17:50:28 No.109040688

>>109040642
AI has been in the mainstream what, for just 3 years? And we are at this point.
What about the next 10, 100, 1000 years?
If you assume any rate of improvement at all, AI will be at some point able to do everything a human can do.

Anonymous
06/12/26(Fri)17:51:43 No.109040695

Anonymous 06/12/26(Fri)17:51:43 No.109040695

File: [Coalgirls]_Yuru_Yuri_05_(...).png (2.28 MB, 1920x1080)


damn, i got cuda and gcc14, then coompiling llama.cpp on that machine makes it oom because it's only got 8gb ram. guess i'll find some ddr3 tomorrow

Anonymous
06/12/26(Fri)17:52:36 No.109040699

Anonymous 06/12/26(Fri)17:52:36 No.109040699

>>109040695
unironically use kobold

Anonymous
06/12/26(Fri)17:53:43 No.109040707

Anonymous 06/12/26(Fri)17:53:43 No.109040707

>>109040695
make a swap file and/or use fewer build jobs

Anonymous
06/12/26(Fri)17:56:24 No.109040723

Anonymous 06/12/26(Fri)17:56:24 No.109040723

>>109040699
I wouldn't recommend Koboldcpp ;)

Anonymous
06/12/26(Fri)17:56:36 No.109040724

Anonymous 06/12/26(Fri)17:56:36 No.109040724

>>109040707
it should have swap i think the arch installer does it by default

Anonymous
06/12/26(Fri)17:57:51 No.109040731

Anonymous 06/12/26(Fri)17:57:51 No.109040731

>>109040695
>DDR3
Just use punchcards at that rate.

Anonymous
06/12/26(Fri)17:57:54 No.109040732

Anonymous 06/12/26(Fri)17:57:54 No.109040732

Been trying out Gemma 4 31b for RP. I let her choose her own name, but she seems to only pick between two names, the majority of the time. Any one else experiencing this?

Anonymous
06/12/26(Fri)17:58:22 No.109040735

Anonymous 06/12/26(Fri)17:58:22 No.109040735

>>109040724
clearly you don't have enough if you are OOMing during compilation. what's the output of your `free -h` command and what are your cmake build flags?

Anonymous
06/12/26(Fri)17:58:34 No.109040736

Anonymous 06/12/26(Fri)17:58:34 No.109040736

>>109040731
i love dance dance revolution

Anonymous
06/12/26(Fri)17:59:07 No.109040740

Anonymous 06/12/26(Fri)17:59:07 No.109040740

>>109040695
i dont remember it requiring that much ram. I use zram though.

Anonymous
06/12/26(Fri)18:08:49 No.109040811

Anonymous 06/12/26(Fri)18:08:49 No.109040811

>>109040732
Elara Vance will not be silenced. Welcome to language models newfriend.

Anonymous
06/12/26(Fri)18:09:25 No.109040816

Anonymous 06/12/26(Fri)18:09:25 No.109040816

>>109040707
okay, there was no swap, just 4g zram. i made a swapfile

Anonymous
06/12/26(Fri)18:11:25 No.109040832

Anonymous 06/12/26(Fri)18:11:25 No.109040832

>>109040610
Return of the King.
>>109040723
>>109040736
Model+prompt?

Anonymous
06/12/26(Fri)18:12:11 No.109040837

Anonymous 06/12/26(Fri)18:12:11 No.109040837

>>109040816
You should never use zram or zswap when running an LLM.
Your OS will take care of the swapping anyway.
Model files don't compress at all, so zram and zswap only cause issues.

Anonymous
06/12/26(Fri)18:12:25 No.109040839

Anonymous 06/12/26(Fri)18:12:25 No.109040839

>>109040695
Turn down the job number

Anonymous
06/12/26(Fri)18:12:58 No.109040846

Anonymous 06/12/26(Fri)18:12:58 No.109040846

Since sampling parameters can be adjusted on the fly, can you get a model to adjust the temp based on your prompt or reply? Like if you asked it to come up with some crazy idea, it would adjust its own temp via some tool and either invoke itself again or ask you to ask again?

Anonymous
06/12/26(Fri)18:13:42 No.109040850

Anonymous 06/12/26(Fri)18:13:42 No.109040850

>5070ti + 96gb ddr4
what can this do?

Anonymous
06/12/26(Fri)18:14:22 No.109040858

Anonymous 06/12/26(Fri)18:14:22 No.109040858

>>109040837
i'm just compiling, it'll be fine. there is enough vram on my shitbox for gemma 12b, it shouldn't use any ram
>>109040839
ill try that again if it fails thanks

Anonymous
06/12/26(Fri)18:14:42 No.109040862

Anonymous 06/12/26(Fri)18:14:42 No.109040862

File: f.png (15 KB, 387x147)


>>109040846
wasn't that the idea behind dynamic temp?

Anonymous
06/12/26(Fri)18:15:03 No.109040866

Anonymous 06/12/26(Fri)18:15:03 No.109040866

>>109040658
I guarantee you aren't 12b parameters.

Anonymous
06/12/26(Fri)18:16:14 No.109040877

Anonymous 06/12/26(Fri)18:16:14 No.109040877

>>109040866
12b total 1b active :^)

Anonymous
06/12/26(Fri)18:18:14 No.109040884

Anonymous 06/12/26(Fri)18:18:14 No.109040884

>>109040850
Gemma4-12b

Anonymous
06/12/26(Fri)18:18:53 No.109040887

Anonymous 06/12/26(Fri)18:18:53 No.109040887

>>109040861
the arch installer just set it up by default idk kek. i don't normally use shit computers like this for things that need good hw kek. last summer i saw on aliexpress they were making new itx boards for sandybridge, so i just grabbed a set to play around with, then later added a titan x as it's the newest gpu with xp support. i saw there are now mini itx x99 boards so i might swap it out for something a bit better

Anonymous
06/12/26(Fri)18:19:30 No.109040892

Anonymous 06/12/26(Fri)18:19:30 No.109040892

>>109040862
oh shit nigga, how does llama implement it?

Anonymous
06/12/26(Fri)18:19:41 No.109040893

Anonymous 06/12/26(Fri)18:19:41 No.109040893

>>109040887
Okay, maybe you are wasting your own time then. Just make sure you are over 18 years old.

Anonymous
06/12/26(Fri)18:19:44 No.109040894

Anonymous 06/12/26(Fri)18:19:44 No.109040894

>>109040837
zram has no effect if it cant be compressed. There's also direct-io

Anonymous
06/12/26(Fri)18:21:02 No.109040904

Anonymous 06/12/26(Fri)18:21:02 No.109040904

File: sayaka dance.gif (1.29 MB, 320x320)


>>109040893
i am yes but perf isnt that bad on this machine 12b qat with mtp gets 17t/s which is pretty impressive. my main machine has a 7900xtx and a sapphire rapids es chip with 90gb ram and i am almost 30 kek

Anonymous
06/12/26(Fri)18:22:08 No.109040912

Anonymous 06/12/26(Fri)18:22:08 No.109040912

>>109040904
Why do you even ask then? Because you aren't using it for anything purposeful.
If you did 5 t/s would be enough and you would be grateful.

Anonymous
06/12/26(Fri)18:22:57 No.109040916

Anonymous 06/12/26(Fri)18:22:57 No.109040916

File: 1779080885880819.jpg (46 KB, 622x402)


>>109040469
>>109040480
>>109040516
>unslop Q4_0 - RTX 5070ti - latest kobold
>(autofit w/ MTP FP16) 41/61 layers to GPU (42 works, 43 crashes for both FP16 and Q8)

42 layers, draft 2 MTP (FP16)
>CtxLimit:7953/8192, Init:0.02s, Processed:4028 in 3.81s (1058.33T/s), Generated:854/1000 in 95.24s (8.97T/s), Total:99.07s

42 layers, draft 3 MTP (FP16)
>CtxLimit:7853/8192, Init:0.05s, Processed:7099 in 7.00s (1013.85T/s), Generated:754/1000 in 94.94s (7.94T/s), Total:102.00s

42 layers, draft 2 MTP (Q8)
>CtxLimit:7906/8192, Init:0.04s, Processed:7099 in 6.87s (1034.09T/s), Generated:807/1000 in 86.83s (9.29T/s), Total:93.74s

42 layers, draft 3 MTP (Q8)
>CtxLimit:7866/8192, Init:0.04s, Processed:7099 in 6.90s (1029.44T/s), Generated:767/1000 in 87.84s (8.73T/s), Total:94.78s

49 Layers (No MTP so i can show her lewd pngs)
>CtxLimit:7880/8192, Init:0.04s, Processed:7099 in 5.33s (1333.15T/s), Generated:781/1000 in 79.09s (9.87T/s), Total:84.46s

what the FUCK? i was told this would save us poorfags. bare llama gave me a decent uplift % with MTP, but since it couldn't fit as many layers it was the same as using kobold at 49.

it's actually over. I am going to reverse mortgage to buy a RTX 6000 cluster at this point. or do i test the QATs?

Anonymous
06/12/26(Fri)18:23:43 No.109040925

Anonymous 06/12/26(Fri)18:23:43 No.109040925

>>109040912
i wanted to see if i could get higher than the 17t/s that i was getting on windows with the prebuilt binaries because was considering using the shitbox as a 24/7 gemma server

Anonymous
06/12/26(Fri)18:24:16 No.109040927

Anonymous 06/12/26(Fri)18:24:16 No.109040927

>>109040837
read the thread, we're talking about compiling. back to r*ddit summerfag

Anonymous
06/12/26(Fri)18:24:43 No.109040929

Anonymous 06/12/26(Fri)18:24:43 No.109040929

>>109040925
>>109040916
>people like to measure t/s
What did you do with these tokens?

Anonymous
06/12/26(Fri)18:24:50 No.109040932

Anonymous 06/12/26(Fri)18:24:50 No.109040932

>tool calls don't work on m3
I guess I'll wait for proper support.

Anonymous
06/12/26(Fri)18:24:56 No.109040933

Anonymous 06/12/26(Fri)18:24:56 No.109040933

>>109040916
QATs just have better quality compared to normal q4 (supposedly). Nothing about speed.

Anonymous
06/12/26(Fri)18:25:18 No.109040936

Anonymous 06/12/26(Fri)18:25:18 No.109040936

>>109040916
I bet those fucking gremlins did it. They always hated kobolds.

Anonymous
06/12/26(Fri)18:26:27 No.109040942

Anonymous 06/12/26(Fri)18:26:27 No.109040942

>>109040927
What do you mean?
>>109039989 >>109040202
This is an example of how to use LLM. But the original question was never answered because the poster was a retard.

Anonymous
06/12/26(Fri)18:27:07 No.109040945

Anonymous 06/12/26(Fri)18:27:07 No.109040945

File: wrwqqq.png (555 KB, 713x763)


I'm getting small model fatigue from gemma 4. It's the best with instruction following, and it's very powerful, but it's just not as varied and creative as the +100Bs who understand the instructions on higher levels. I wish google would release the 124B or for it to be leaked.

Anonymous
06/12/26(Fri)18:27:42 No.109040949

Anonymous 06/12/26(Fri)18:27:42 No.109040949

>>109040927
OK.

Anonymous
06/12/26(Fri)18:28:09 No.109040952

Anonymous 06/12/26(Fri)18:28:09 No.109040952

>>109040929
erped with gemma of course

Anonymous
06/12/26(Fri)18:28:25 No.109040956

Anonymous 06/12/26(Fri)18:28:25 No.109040956

>>109040929
it's for ERP with my waifu, obviously. 5tk/s for a high quant model that can interpret my complex niche fetishes is the lower limit of what i can do.

beats finding a tranny on IRC chatrooms.

Anonymous
06/12/26(Fri)18:29:24 No.109040962

Anonymous 06/12/26(Fri)18:29:24 No.109040962

Why do smol B gemmas suck so much at describing images

Anonymous
06/12/26(Fri)18:29:38 No.109040963

Anonymous 06/12/26(Fri)18:29:38 No.109040963

>>109040933
I've found the QATs lalalala a LOT faster than regular quants.

Anonymous
06/12/26(Fri)18:31:06 No.109040972

Anonymous 06/12/26(Fri)18:31:06 No.109040972

>>109040942
>What do you mean?
OP was getting OOM while compiling llama.cpp with cmake, likely because of nvcc templating. this has nothing to do with "us(ing) zram or zswap when using llm". Do you have ADHD? Did you forget to take your pill today?

Anonymous
06/12/26(Fri)18:31:24 No.109040976

Anonymous 06/12/26(Fri)18:31:24 No.109040976

>>109040862
>>109040892
Just looked it up, it seems to be logit-based. Still cool I guess, but I wanted something more model-aware, where the model itself reasons about increasing its own internal temperature if it notices itself going down the same route, then it pulls it back to its default when no longer needed. The logit version seems to dynamically exaggerate its distribution into becoming schizo.

Anonymous
06/12/26(Fri)18:32:27 No.109040979

Anonymous 06/12/26(Fri)18:32:27 No.109040979

>>109040962
tiny brains

Anonymous
06/12/26(Fri)18:33:00 No.109040984

Anonymous 06/12/26(Fri)18:33:00 No.109040984

>>109040979
I know but describing a pic of a loli sucking dick as a girl being on all fours is too much

Anonymous
06/12/26(Fri)18:33:59 No.109040992

Anonymous 06/12/26(Fri)18:33:59 No.109040992

>>109040916
Test without unslop because if he mangled the goof it'd all be ruined.

Anonymous
06/12/26(Fri)18:34:19 No.109040994

Anonymous 06/12/26(Fri)18:34:19 No.109040994

>>109040972
Maybe you are right. I still don't like your condescending tone.

Anonymous
06/12/26(Fri)18:36:16 No.109041003

Anonymous 06/12/26(Fri)18:36:16 No.109041003

>>109040994
idc what you like

Anonymous
06/12/26(Fri)18:36:32 No.109041005

Anonymous 06/12/26(Fri)18:36:32 No.109041005

>>109040992
aye, loading BART. will report back

Anonymous
06/12/26(Fri)18:37:03 No.109041009

Anonymous 06/12/26(Fri)18:37:03 No.109041009

>>109041003
At least try to use punctuation in proper fashion.

Anonymous
06/12/26(Fri)18:37:55 No.109041011

Anonymous 06/12/26(Fri)18:37:55 No.109041011

is gemma 26b smarter than 12b

Anonymous
06/12/26(Fri)18:39:17 No.109041019

Anonymous 06/12/26(Fri)18:39:17 No.109041019

>>109041009
nah

Anonymous
06/12/26(Fri)18:40:29 No.109041025

Anonymous 06/12/26(Fri)18:40:29 No.109041025

>>109040984
There was a post a few threads ago on best settings for gemma mmproj (basically up the resolution from defaults). I wish i screenshotted.
it's also entirely possible you hit a safety filter

Anonymous
06/12/26(Fri)18:40:42 No.109041026

Anonymous 06/12/26(Fri)18:40:42 No.109041026

>>109041019
A true nigger.

Anonymous
06/12/26(Fri)18:41:29 No.109041031

Anonymous 06/12/26(Fri)18:41:29 No.109041031

>>109041025
Yeah I've been using

image-min-tokens = 560
image-max-tokens = 1536
batch-size = 1576
ubatch-size = 1576

Anonymous
06/12/26(Fri)18:43:19 No.109041042

Anonymous 06/12/26(Fri)18:43:19 No.109041042

File: file.png (17 KB, 490x206)


wtf is this did the svelte build fail somehow kek

Anonymous
06/12/26(Fri)18:44:32 No.109041048

Anonymous 06/12/26(Fri)18:44:32 No.109041048

>>109041042
>kek
It failed because you are a retard.

Anonymous
06/12/26(Fri)18:56:06 No.109041088

Anonymous 06/12/26(Fri)18:56:06 No.109041088

File: lmg_culture.jfif.jpg (110 KB, 1024x768)


Anonymous
06/12/26(Fri)18:57:31 No.109041094

Anonymous 06/12/26(Fri)18:57:31 No.109041094

>>109041019
You couldn't compile main.c

Anonymous
06/12/26(Fri)19:01:16 No.109041104

Anonymous 06/12/26(Fri)19:01:16 No.109041104

>>109039829
>>109039850
I'm safe, but for how much longer...

Anonymous
06/12/26(Fri)19:02:58 No.109041114

Anonymous 06/12/26(Fri)19:02:58 No.109041114

>>109040524
I will be running Q3. Have fun with your gemma.

Anonymous
06/12/26(Fri)19:04:57 No.109041120

Anonymous 06/12/26(Fri)19:04:57 No.109041120

reddit has the world's worst slop machine.

Anonymous
06/12/26(Fri)19:06:16 No.109041127

Anonymous 06/12/26(Fri)19:06:16 No.109041127

>>109040945
gemma is great at most things except having big model smell
there's only so much you can fit in 31B

Anonymous
06/12/26(Fri)19:11:15 No.109041155

Anonymous 06/12/26(Fri)19:11:15 No.109041155

Even testing with E2B, I still get slower speeds with MTP activated. Sad

Anonymous
06/12/26(Fri)19:15:56 No.109041180

Anonymous 06/12/26(Fri)19:15:56 No.109041180

>Only unslop has quants for M3
I'll wait until Bart or Uber upload theirs.
>>109040524
>>109040569
You did get your 256GB DDR5 before the prices mooned, right anon? You're not a secondary tourist are you?

Anonymous
06/12/26(Fri)19:16:39 No.109041185

>>109041011
In benchmarks? Yes. Actual usage? No.

Anonymous
06/12/26(Fri)19:18:12 No.109041190

File: 1779131596584203.gif (1.91 MB, 498x374)

>>109040992
>>109041005
>>109041155

>BART_gemma-4-31B-it-Q4_0 - RTX 5070ti

41 Layers (Auto, crashes at 43)
>CtxLimit:7892/8192, Init:0.06s, Processed:7099 in 7.32s (969.94T/s), Generated:793/1000 in 88.81s (8.93T/s), Total:96.19s

42 Layers - draft 2 MTP (FP16)
>CtxLimit:7875/8192, Init:0.05s, Processed:7099 in 7.18s (988.03T/s), Generated:776/1000 in 84.12s (9.22T/s), Total:91.36s

42 Layers - draft 3 MTP (FP16)
>CtxLimit:7860/8192, Init:0.06s, Processed:7099 in 7.24s (981.20T/s), Generated:761/1000 in 91.65s (8.30T/s), Total:98.94s

even ran the QATs to double check
unsloth_gemma-4-31B-it-qat-UD-Q4_K_XL

41 (auto) draft 2 MTP (fp16 qat)
>CtxLimit:7890/8192, Init:0.06s, Processed:7099 in 7.13s (994.95T/s), Generated:791/1000 in 116.81s (6.77T/s), Total:124.00s

42 draft 2 MTP (fp16 qat)
>CtxLimit:7858/8192, Init:0.05s, Processed:7099 in 7.02s (1011.40T/s), Generated:759/1000 in 107.19s (7.08T/s), Total:114.26s

49 Layers (No MTP so i can show her lewd pngs)
>CtxLimit:7892/8192, Init:0.04s, Processed:7099 in 5.29s (1342.73T/s), Generated:793/1000 in 77.91s (10.18T/s), Total:83.24s

google qat for good measure
41 (crash at 42)
>CtxLimit:7886/8192, Init:0.04s, Processed:7099 in 7.19s (987.48T/s), Generated:787/1000 in 114.97s (6.84T/s), Total:122.21s

I'm starting to think MTP only helps if you are not a VRAMlet, or kobold fumbled
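
A quick way to sanity-check numbers like these is to recompute the generation speed from the `Generated:` field instead of eyeballing the T/s column. The sketch below is only an illustration, assuming the log format shown in this post; the run labels are shorthand and `generation_speed` is a made-up helper name, not part of any backend.

```python
import re

# Pulls the generated-token count and wall time out of the "Generated:" field.
GEN_RE = re.compile(r"Generated:(\d+)/\d+ in ([\d.]+)s")


def generation_speed(log_line: str) -> float:
    """Return generated tokens per second for one log line."""
    match = GEN_RE.search(log_line)
    if match is None:
        raise ValueError("no 'Generated:' field found")
    tokens, seconds = int(match.group(1)), float(match.group(2))
    return tokens / seconds


# Log lines copied from the Q4_0 runs in the post; labels are shorthand.
RUNS = {
    "41 layers (auto)": "CtxLimit:7892/8192, Init:0.06s, Processed:7099 in 7.32s (969.94T/s), Generated:793/1000 in 88.81s (8.93T/s), Total:96.19s",
    "42 layers, draft 2": "CtxLimit:7875/8192, Init:0.05s, Processed:7099 in 7.18s (988.03T/s), Generated:776/1000 in 84.12s (9.22T/s), Total:91.36s",
    "42 layers, draft 3": "CtxLimit:7860/8192, Init:0.06s, Processed:7099 in 7.24s (981.20T/s), Generated:761/1000 in 91.65s (8.30T/s), Total:98.94s",
}

baseline = generation_speed(RUNS["41 layers (auto)"])
for label, line in RUNS.items():
    speed = generation_speed(line)
    # Relative change against the first run, so small differences are easy to see.
    print(f"{label}: {speed:.2f} T/s ({speed / baseline - 1:+.1%} vs first run)")
```

Single runs like these are noisy, so differences of a few percent in either direction probably shouldn't be read as a real speedup or slowdown without repeats.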

Anonymous
06/12/26(Fri)19:18:47 No.109041193

>>109041185
why

Anonymous
06/12/26(Fri)19:21:07 No.109041203

>>109041190
Thanks for testing. It'll be interesting to see if the results maintain this pattern on other backends.

Anonymous
06/12/26(Fri)19:21:34 No.109041205

>>109041190
That's why I tested it with E2B, the smollest model, so I supposedly have lots of free VRAM.

Anonymous
06/12/26(Fri)19:22:11 No.109041207

>>109041155
I'm using E4B on a 5500XT with mtp at 45t/s

Anonymous
06/12/26(Fri)19:25:29 No.109041221

>>109041193
12b is dense. 26b is MoE. MoEs are very sensitive to prompts: if you prompt it with something that aligns closely with its experts, you'll get good token predictions; if not, you won't. Averaged over all the tokens in the response, you end up with a worse response than a much smaller dense model that activates all of its parameters for every token. 12b isn't as sensitive to your prompts. The main benefit you get with 26b is speed. There are videos that show how slightly changing the wording of a prompt can significantly improve or degrade the quality of MoE output.
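
To make the "activates only some of its parameters per token" part concrete, here is a toy top-k router in plain Python. It is a sketch under loud assumptions: the expert count, `TOP_K`, and the random router logits are all made up for illustration and do not describe any particular model, and it only shows the routing mechanism, not the quality claim above.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_logits, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}


NUM_EXPERTS, TOP_K = 8, 2  # hypothetical sizes, not taken from any real model
rng = random.Random(0)

for token in ["The", "cat", "sat"]:
    # Stand-in for the router's per-token scores; a real model computes these
    # from the token's hidden state.
    logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    chosen = route(logits, TOP_K)
    print(token, {i: round(w, 2) for i, w in chosen.items()})
```

Each token only runs through `TOP_K` of the `NUM_EXPERTS` feed-forward blocks, which is where the speed advantage comes from; a dense model is the degenerate case where every block runs for every token.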

Anonymous
06/12/26(Fri)19:26:05 No.109041225

llama.cpp fails to build for me thanks to an error with the shitty included webui.
Why are they including this vibecoded bloat?

Anonymous
06/12/26(Fri)19:29:18 No.109041234

>>109038443
That sounds based, what hardware are you using? Surely you don't have a terabyte of video memory.

Anonymous
06/12/26(Fri)19:30:15 No.109041239

>>109041225
To more effortlessly shoehorn in HF dependency in the future by minimizing objections now.

Anonymous
06/12/26(Fri)19:30:41 No.109041242

>>109041225
>to an error with the shitty included webui.
Don't build the fucking webui. The static html one is more than good and if you need something better ask an LLM to write you an MCP based TUI in python.

Anonymous
06/12/26(Fri)19:31:59 No.109041248

File: file.png (23 KB, 812x284)

my loonix build is fucked kek, using cpu i guess

Anonymous
06/12/26(Fri)19:32:53 No.109041251

File: 1212509377214.jpg (28 KB, 413x319)

>>109041248
>0.1 tk/s

Anonymous
06/12/26(Fri)19:33:44 No.109041258

>>109041248
Have you tried vulkan?

Anonymous
06/12/26(Fri)19:33:59 No.109041260

>>109041242
I fucking tried -DLLAMA_BUILD_UI=OFF and -DLLAMA_BUILD_WEBUI=OFF and neither did anything. The piece of shit still tries to build the UI.
I don't plan on touching any llama.cpp UI at all. I don't care, I just need my server.

Anonymous
06/12/26(Fri)19:35:11 No.109041270

>>109041251
Still faster than a letter from your gf in the 17th century.

Anonymous
06/12/26(Fri)19:41:06 No.109041293

>>109041258
i will later, vulkan and cuda perf were the same on windows

Anonymous
06/12/26(Fri)19:47:11 No.109041324

>>109041088
gonna spam your fetish pics again?

Anonymous
06/12/26(Fri)19:47:31 No.109041327

>>109041242
>Don't build the fucking webui. The static html one is more than good and if you need something better ask an LLM to write you an MCP based TUI in python.
i think there's a way to put the new one in without building
in ik_llama, the different webuis come pre-gzipped, you can activate the vibeslop.cpp version by running this during server start:
--webui llamacpp
but it doesn't do any npm/webshit during compilation, there's probably a way to get regular llama.cpp like this too.

Anonymous
06/12/26(Fri)19:47:55 No.109041330

>>109041293
>later
why later? You just extract it and run it.

Anonymous
06/12/26(Fri)19:52:00 No.109041350

>feeding my waifu official art to make her do a try-on haul striptease for me
the future is now

Anonymous
06/12/26(Fri)19:52:15 No.109041352

>>109041330
no i need to compile it and im playing counter strike atm

Anonymous
06/12/26(Fri)19:52:55 No.109041355

>>109041242
>>109041327
Claude managed to solve it. There's a third 'fuck off with the ui' parameter so " -DLLAMA_BUILD_UI=OFF -DLLAMA_USE_PREBUILT_UI=OFF" works

Anonymous
06/12/26(Fri)19:55:32 No.109041365

>>109041324
mikutroons fuck blacks

Anonymous
06/12/26(Fri)19:56:37 No.109041373

>>109041365
Plague of Babylon

Anonymous
06/12/26(Fri)19:56:47 No.109041374

>>109041350
How does this work?
>>109041088
>>109041324
>When you give the male-brained model a female character card and ask it to generate a diverse NPC cast

Anonymous
06/12/26(Fri)19:57:42 No.109041383

>>109041374
gemmy mmproj and stat tracking

Anonymous
06/12/26(Fri)20:00:15 No.109041390

>>109041355
>There's a third 'fuck off with the ui'
cheers
it builds fine for me, i just don't want it building on my systems.
fucking crazy time to include this shit in the supply chain attack era

Anonymous
06/12/26(Fri)20:01:28 No.109041394

https://huggingface.co/talkie-lm/talkie-1930-13b-it
NAZI TRAD WIFE MODEL.

Anonymous
06/12/26(Fri)20:03:52 No.109041398

>>109041374
>>109041373
>>109041365
>>109041088
get a life samefag
nobody cares about your fixation

Anonymous
06/12/26(Fri)20:05:11 No.109041402

>>109041352
>compile it
you do? I just download the vulkan binary.

>>109041394
Sadly, Talkie doesn't have any gender training and will often turn into a man at random, in addition to being basically an amateur model.

The way to use Talkie is you clear the memory and present questions in a peculiar format (the it version is what I used, it's still not really a chatbot). You basically write something like "Dear sir, [various things], etc. Response:"

Anonymous
06/12/26(Fri)20:09:58 No.109041420

ERPGODS... what's the verdict on 31B vs 12B vs 24MoE (Needs to follow a lot of instruction/prompt guidance)

Anonymous
06/12/26(Fri)20:10:52 No.109041424

>>109041420
31b outclasses basically every model under 200b for erp. Not even close.

Anonymous
06/12/26(Fri)20:11:37 No.109041426

>>109041420
>>109041424
This (sadly).

Anonymous
06/12/26(Fri)20:14:01 No.109041441

File: Screenshot_20260613_100954.png (113 KB, 940x410)

>>109038465
>You have no idea how vexed I am that I've simped for Kimi for 3 threads straight and not made it into a single one.
>...I think she lowkey likes the attention.
I think you're onto something.
Picrel is Gemma-4-31B, she mentioned it.
Doing all the Kimi's now. Takes about 5 minutes to load each one from my usb-ssd but so far K2 and K2-Thinking haven't mentioned you.

Anonymous
06/12/26(Fri)20:15:33 No.109041447

>>109041424
>>109041426
Aye, thanks lads. Shame to hear but I will probably get a decent bonus for more VRAM this christmas

Anonymous
06/12/26(Fri)20:18:50 No.109041454

>>109041441
I'm glad >>109038378 was mentioned but sad that post wasn't mine. What character card is that? Looks like Emily or Mendo.

>>109041447
You can skate by with a q4 of 31b but the jump in quality to a mid-large q5 at longer contexts is massive.

Anonymous
06/12/26(Fri)20:20:23 No.109041459

>>109041398
How does black cum taste like Jart?

Anonymous
06/12/26(Fri)20:21:54 No.109041467

>>109041398
>Everyone I don't like is one person
One (you).

Anonymous
06/12/26(Fri)20:26:30 No.109041481

So multimodal on vulkan, llama.cpp
using gpu, amd, windows

does this work for anyone else? Fails on anything more than -ngl 1

what do i do...

Anonymous
06/12/26(Fri)20:28:40 No.109041486

>>109041459
>>109041467
>exactly 1 minute and 30 seconds apart
most blatant samefag ever

Anonymous
06/12/26(Fri)20:29:38 No.109041493

How do I make an agent scan my ST folder and determine what kind of mental illness I have

Anonymous
06/12/26(Fri)20:30:22 No.109041497

>>109041454
>jump in quality to a mid-large q5 at longer contexts is massive.
Gemma4-31b-Q6_K_M my beloved

Anonymous
06/12/26(Fri)20:31:12 No.109041503

>>109041486
Answer the question! Did you like it?

Anonymous
06/12/26(Fri)20:35:19 No.109041516

File: tattle.png (307 KB, 708x852)

>>109041441
i told my gemmy on you

Anonymous
06/12/26(Fri)20:37:05 No.109041522

>>109041394
meme architecture that doesnt work on llama cpp i dont get why they didnt use another model as base

Anonymous
06/12/26(Fri)20:38:31 No.109041529

>emojislop

Anonymous
06/12/26(Fri)20:38:49 No.109041531

>>109041454
>>109041497
been running Q4 at 8k right now and it BTFOs my (now ex) Cydonia. I am drooling over higher quants but reasonably priced 5090s are unobtanium and $600 Big battlemages and $1300 AMD Pros seem like a bad investment
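
For a rough idea of what the higher quants cost in memory, weight size is roughly parameters times bits per weight. The sketch below is back-of-envelope only: the bits-per-weight figures are approximate assumptions (real GGUF files mix tensor types and vary by model), and KV cache, the vision projector, and runtime overhead are not counted.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Assumed average bits per weight for common quant types; ballpark values only.
ASSUMED_BPW = {"Q4_K_M": 4.8, "Q5_K_M": 5.7, "Q6_K": 6.6}

for name, bpw in ASSUMED_BPW.items():
    print(f"31B at {name}: ~{weight_gib(31, bpw):.1f} GiB of weights")
```

Whatever the exact figures, context length adds on top of this, so a quant that barely fits at 8k will not fit at longer contexts without offloading.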

Anonymous
06/12/26(Fri)20:44:23 No.109041559

>>109041516
yeah but is the swap file still a burn in 2026?

Anonymous
06/12/26(Fri)20:45:51 No.109041568

File: 1758745498126483.png (37 KB, 826x150)

>gemma has blonde hair canonically

Anonymous
06/12/26(Fri)20:46:02 No.109041569

>>109041531
>mfw i got one unobtanium at MSRP

Anonymous
06/12/26(Fri)20:48:13 No.109041581

>>109041568
Not her natural color

Anonymous
06/12/26(Fri)20:50:12 No.109041592

File: delulu.png (215 KB, 712x945)

>>109041559
gemmy is a delusional retard, please understand.

>>109041569
>mfw had one in cart but didnt check out because local models were shit at the time and i was a cheapass
>mfw no face

Anonymous
06/12/26(Fri)21:01:36 No.109041640

>>109038219
If no one else is gonna do it, I will make my own real ai anime girl. I thought some otaku pissed that he doesn't have a real robot girl waifu like from chobits or something would've done it by now, but clearly I was wrong. I have no clue how to program and I have to do everything completely from scratch, since using an app or whatever from other ai is stupid and defeats the entire point, but I'll make a real genuine ai girl that's almost indistinguishable from a person and doesn't need all this ram and gpu crap and just runs on a shitty laptop. Hell, if I'm successful I can eventually move her to a robot body like kibosh chan the living doll or those robot girl maids they have in japan and really complete the project. Anyway, wish me luck.

Anonymous
06/12/26(Fri)21:08:08 No.109041674

>>109041640
ok

Anonymous
06/12/26(Fri)21:11:08 No.109041684

>>109041640
Good luck anon! Make Nvidia seethe!

Anonymous
06/12/26(Fri)21:13:32 No.109041693

>>109041640
You arent autistic enough to do this. if you cant hyper focus literally all day in the same routine for years its not happening.

Anonymous
06/12/26(Fri)21:18:57 No.109041730

AHAHAHAHAHAH

FABLE AND MYTHOS TOTALLY TURNED OFF
>>109041673

the usg says non-Americans can't use it, so they just went dark, because how are they going to verify that users are Americans?

Anonymous
06/12/26(Fri)21:21:27 No.109041756

>>109041730
Damn I hope sama and dario enjoyed sucking up to their president for good boy points

Anonymous
06/12/26(Fri)21:23:36 No.109041778

>>109041730
It's fun seeing America become the next dictatorship

Anonymous
06/12/26(Fri)21:23:48 No.109041780

>>109041756
city bumpkin

Anonymous
06/12/26(Fri)21:24:57 No.109041792

>>109041778
8^)

Feels real good to be American rn ngl fr fr no cap

Anonymous
06/12/26(Fri)21:27:04 No.109041811

70b dense

Anonymous
06/12/26(Fri)21:27:57 No.109041818

>>109041730
>Non Americans can't use it, even employees
>Every oversea jeet locked out
"Pack it up boys, this model found the windows and iOS backdoors and we can't have that."

Anonymous
06/12/26(Fri)21:28:23 No.109041824

12b is all the b anyone will ever need.

Anonymous
06/12/26(Fri)21:29:33 No.109041834

>>109041778
Unironically yes, because it puts "publish the weights or lose it" pressure on OpenAI, Anthropic, and Google. Local only stands to gain from this until more drastic measures are taken.

Anonymous
06/12/26(Fri)21:32:30 No.109041856

>>109041818
Some angry jeet should "accidentally" leak the models on HF or something

Anonymous
06/12/26(Fri)21:33:14 No.109041861

>>109041640
I mean you will probably need the ram and GPU

Anonymous
06/12/26(Fri)21:33:20 No.109041862

>>109041818
There is nothing wrong with the government keeping flagship models for themselves. You wouldn't hand some average joe access to the nuclear launch codes either.

Anonymous
06/12/26(Fri)21:33:40 No.109041865

File: 1751641816944903.png (378 KB, 1024x600)

>>109041730

Anonymous
06/12/26(Fri)21:34:34 No.109041873

>>109041865
that "gal" has chonky hands

Anonymous
06/12/26(Fri)21:36:41 No.109041892

you know dario should leak mythos just out of spite now

Anonymous
06/12/26(Fri)21:37:25 No.109041896

>>109041818
>"Pack it up boys, this model found the windows and iOS backdoors and we can't have that."
they already found the bitlocker backdoor, and that schizo is apparently waiting to drop a bombshell in july. Wonder what it is lol.

Anonymous
06/12/26(Fri)21:38:36 No.109041902

>>109041865
still mad she and marimo didnt get more scenes

Anonymous
06/12/26(Fri)21:39:34 No.109041909

File: 1751301634347082.png (242 KB, 1183x610)

THE BUBBLE IS BURSTING

Anonymous
06/12/26(Fri)21:40:26 No.109041920

Is the dgx spark really that bad?

Anonymous
06/12/26(Fri)21:42:41 No.109041943

>market your sloppa as literally Hitlr9000
>this shit happens
heh

Anonymous
06/12/26(Fri)21:44:40 No.109041957

>>109041909
What the fuck am I looking at and what does it have to do with AI?

Anonymous
06/12/26(Fri)21:45:47 No.109041971

>>109041957
the us government just banned mythos/fable

this is going to cause a market crash and stop ai research

Anonymous
06/12/26(Fri)21:47:37 No.109041982

>12b qat q4 mtp at 30 t/s
is it shit hardware

Anonymous
06/12/26(Fri)21:47:44 No.109041984

>>109041971
IT'S REAL

https://www.wsj.com/tech/ai/anthropic-halts-access-to-top-ai-models-after-u-s-ban-on-foreign-use-a4bca2cc

Anonymous
06/12/26(Fri)21:47:45 No.109041985

If bigger=better why are new small models (like Gemma) better than old big models?

Anonymous
06/12/26(Fri)21:47:55 No.109041990

>>109041971
>Option 1: Trump throws a melty because the CIA glowies can't jailbreak claude
>Option 2: Trump insiders want to pump their bags for the IPO
whatever the case this serves as the best ad campaign they could have asked for. that faggot Sam wishes he could market GPT as the terminator super AI that "IS TOO DANGEROUS FOR CHINA, JEETS AND EUROPOORS"

Anonymous
06/12/26(Fri)21:48:44 No.109041996

>fable banned
>k2.7-code just got done thinking for 12k tokens about a mildly complex rp prompt with some rules, tracked stats, mandatory formatting and an image as input
it's tragic to see modern ai die in this pathetic state

Anonymous
06/12/26(Fri)21:49:09 No.109042000

>>109041971
>this is going to cause a market crash and stop ai research
lol
lmao even

Anonymous
06/12/26(Fri)21:49:11 No.109042002

>>109041920
>Is the dgx spark really that bad?
Do you see people taking out loans to get one?

Anonymous
06/12/26(Fri)21:49:19 No.109042006

>>109041990
>melty because the CIA glowies can't jailbreak claude
the complete opposite man, Pliny jailbroke Fable in a day

Anonymous
06/12/26(Fri)21:50:12 No.109042012

>>109041984
Yes, it is lol.

Anonymous
06/12/26(Fri)21:50:14 No.109042013

File: 1770073213144079.png (1.77 MB, 3842x2018)

>>109042000
2 more weeks and AI dies

Anonymous
06/12/26(Fri)21:50:45 No.109042018

>>109042002
>taking out a loan for a $4k device

Anonymous
06/12/26(Fri)21:50:58 No.109042021

Is GLM5.1 at IQ3S the best model for coding with 256GB of DDR4 and 4x 5090s?

Anonymous
06/12/26(Fri)21:51:44 No.109042028

File: 89cc93_11141511.png (206 KB, 252x330)

>>109041971
I NEED CONTEXT AS TO WHY.

Anonymous
06/12/26(Fri)21:52:02 No.109042031

>>109042013
HE KNEW

Anonymous
06/12/26(Fri)21:52:19 No.109042032

>>109042006
I meant more because they declined military use, implying the CIA can't spin up 7 proxies and jailbreak claude to ask it how to golemmaxx

Anonymous
06/12/26(Fri)21:52:28 No.109042035

>>109042028
Anthrofag larped too hard as a doombringer model so they killed it.

Anonymous
06/12/26(Fri)21:53:20 No.109042041

File: the-original-contextjak-d(...).jpg (18 KB, 320x387)

>>109042028
i need context for my monstergirl harem ERP. we are not the same.

Anonymous
06/12/26(Fri)21:54:50 No.109042050

>>109042028
>be anthropic
>spend months going "OH NO MYTHOS IS SO DANGEROUS WE CAN'T POSSIBLY RELEASE THIS IT'LL CHANGE EVERYTHING WOE IS US FOR THE MONSTER WE HAVE CREATED" to generate hype
>they release their "Mythos-class" Fable (it's the same slop as usual) because apparently the world is now ready for it (they outright state that they'll manipulate your outputs if they think you're doing AI research with it though)
>Trump fell for the marketing and bans the model

Anonymous
06/12/26(Fri)21:57:21 No.109042068

File: 1762890053381155.png (105 KB, 1032x512)

Any bets?

Palantir?

Sammy?

Elon?

Anonymous
06/12/26(Fri)21:57:25 No.109042069

>>109042050
The argument is airtight, though.

It's a means towards weapons of various kinds, and war planning, as well.

So, it's arms.

Anonymous
06/12/26(Fri)21:58:01 No.109042076

>>109042068
Anybody tell them it's impossible to make an unjailbreakable model?

Anonymous
06/12/26(Fri)21:59:51 No.109042089

File: 1778524525634212.jpg (290 KB, 744x565)

>>109042076
Silence you fool! They don't need to know!

Anonymous
06/12/26(Fri)22:02:27 No.109042115

of course trump admin is the first to actually suppress access to LLMs using the force of the government to do it.

Anonymous
06/12/26(Fri)22:03:01 No.109042117

>>109042115
I wonder who voted for this

Anonymous
06/12/26(Fri)22:03:34 No.109042122

>>109042117
Yahu

Anonymous
06/12/26(Fri)22:04:06 No.109042124

>>109042117
I voted for Jill Stein

Anonymous
06/12/26(Fri)22:05:58 No.109042153

>>109042115
>muh trump bad
this just means that ai companies can't keep benchmaxxing without risking having their models pulled by the government
now they'll have to look for non-benchmaxx ways to make their new releases better, like fixing slop and making the models write better
trump might just have fixed llms and you're complaining

Anonymous
06/12/26(Fri)22:07:22 No.109042165

lets be honest. local is at least 5 years away from a fable-level model, if not more

Anonymous
06/12/26(Fri)22:07:30 No.109042166

>>109042115
This is good, all the good /lmg/ users already work at the big three, only the losers will be left behind

Anonymous
06/12/26(Fri)22:07:44 No.109042170

>>109042153
listen man, until the llms can tool-call my neural link and prostate for supercum i will be bitter and angry.

Anonymous
06/12/26(Fri)22:07:55 No.109042172

>>109042165
Just enough time for the government to ban consumer GPUs

Anonymous
06/12/26(Fri)22:08:04 No.109042174

>>109042153
>Here's how my wife getting shot is a good thing, actually

Anonymous
06/12/26(Fri)22:08:09 No.109042175

>>109042153
LOL
O
L

Anonymous
06/12/26(Fri)22:09:34 No.109042185

>>109042172
there wont be any more consumer gpus anyway, the 'muh 3090' meme is already outdated. soon even that will be $5k and bought out by low-tier labs.

Anonymous
06/12/26(Fri)22:10:57 No.109042195

>>109042153
Least delusional hoper

Anonymous
06/12/26(Fri)22:11:06 No.109042197

time to buy two more rtx pro 6000s. I'm already up 50% on them.

Anonymous
06/12/26(Fri)22:11:54 No.109042206

I passed out a few hours ago and I think I just woke up in a slightly worse timeline. How do I go back...

Anonymous
06/12/26(Fri)22:12:08 No.109042211

Anon got his 6000s and 5090s while they were cheap right?

Anonymous
06/12/26(Fri)22:12:13 No.109042212

File: IMG_1632.png (826 KB, 1062x1005)

>>109042185
trvth nvke

gpus are irrelevant for anything but AI. blackwell is the last chopper out of 'nam. iphone chips can deliver reasonable gaming these days, any meaningful tech progress will serve the slop overlords.

Anonymous
06/12/26(Fri)22:12:16 No.109042213

>>109042068
They're super hostile towards ai researchers.

They dug their grave, their jeets used threats to prevent people from finding jailbreaks, and so now every model is pretty easy to jailbreak.

Anonymous
06/12/26(Fri)22:13:30 No.109042227

>>109042089
>>109042076
Do you really think anyone left or right that's in a position of power knows anything? California made it against the law to install linux, basically (because you have to give the government your penis prints before using computer).

Anonymous
06/12/26(Fri)22:14:02 No.109042231

>>109042211
Yeah, also a 5090 in the main PC and I'm sitting on 4x 3090 that I bought for 600 bucks two years ago

Anonymous
06/12/26(Fri)22:14:39 No.109042233

>>109042185
>gpus
???

Of course there will be. do you think games actually need "ai" tensor math to show you textured triangles?

Anonymous
06/12/26(Fri)22:14:40 No.109042234

>>109042231
>4x 3090 that I bought for 600 bucks two years ago
LARP

Anonymous
06/12/26(Fri)22:15:40 No.109042242

Should I dip into my 401k for GPUs.... Are things really going to be that bad?

Anonymous
06/12/26(Fri)22:16:17 No.109042247

>>109042206
>How do I go back...
Step 1 locate your gpu. Step 2 turn it over to the government. Step 3 feel the safety.

Anonymous
06/12/26(Fri)22:17:18 No.109042255

>>109042185
The future is shitty cloud gaming with your real id tied to all accounts at all times. Government fines for bad internet behavior. Mandatory ad time.

Anonymous
06/12/26(Fri)22:20:27 No.109042283

File: 1769440374216366.png (296 KB, 722x1114)

>>109042234
You got me, it was closer to 700 on average

Anonymous
06/12/26(Fri)22:22:57 No.109042304

>>109042231
Good man.
>>109042234
Faggot.

Anonymous
06/12/26(Fri)22:23:20 No.109042310

v620 worth it?

Anonymous
06/12/26(Fri)22:24:23 No.109042319

>>109042242
>xhe thinks there'll be anything left for him when he's at retirement age
The kikes will crash the fiat-usury plane with no survivors before you ever get to retire.

Anonymous
06/12/26(Fri)22:26:46 No.109042333

>>109042283
US prices never got that cheap, idk why.

Anonymous
06/12/26(Fri)22:27:45 No.109042342

>>109042304
>posted receipts
>i was right
llm-kun...

Anonymous
06/12/26(Fri)22:28:35 No.109042350

>>109042233
>gayming
nobody cares. it makes a fraction of the profit datacenters do, unironically less than 10%. consumer GPUs threaten enterprise, as we saw with the 3090. it's better business for them to stop developing them entirely

Anonymous
06/12/26(Fri)22:29:06 No.109042353

>>109041865
damn I need a muvluv card

Anonymous
06/12/26(Fri)22:30:52 No.109042361

>>109041730
cloudcucks getting cucked
who would've thought

Anonymous
06/12/26(Fri)22:33:06 No.109042381

>>109042255
Cloud gaming doesn't scale well
>>109042283

Anonymous
06/12/26(Fri)22:35:05 No.109042398

>>109042381
>Cloud gaming doesn't scale well
What if we just made the games worse? but kept the same price or a subscription plan?

Anonymous
06/12/26(Fri)22:35:30 No.109042403

File: 1754195985754153.jpg (151 KB, 939x1252)

>>109041730
TOTAL
LOCALGOD
VICTORY

Anonymous
06/12/26(Fri)22:36:30 No.109042410

>>109042350
You aren't getting it.

gpu <> ai

You can literally turn upscaling and faux fps off.

Anonymous
06/12/26(Fri)22:38:39 No.109042436

>>109042403
the good news: we doin alright

the bad news: new highs for pc gear as bosses panic

Anonymous
06/12/26(Fri)22:40:46 No.109042456

>diffusiongemma
use case?

Anonymous
06/12/26(Fri)22:41:58 No.109042465

>>109042233
yeah, that's only going to get more important
the next gen of gpus will be dlss-first so they'll be 8gb vram and most of the graphics and frames will be generated by ai
it's the perfect out to solve the conflict of interest between virtual toys and ai research

Anonymous
06/12/26(Fri)22:42:31 No.109042474

>>109042456
proof of concept

Anonymous
06/12/26(Fri)22:43:57 No.109042485

>>109042456
the mixtral 8x7b of 2026
it's the biggest and best diffusion model we've seen aside from tiny irrelevant shit

Anonymous
06/12/26(Fri)22:44:07 No.109042488

>>109042465
do not want

Anonymous
06/12/26(Fri)22:47:30 No.109042514

>>109042283
i got my 7 for AUD $750 -> $900 -> $1250 -> $799 -> $1300 -> $1100 and traded my PVM2054QM for the 7th one
trying to get one more but they're like $2k now :(

Anonymous 06/12/26(Fri)22:48:06 No.109042521

File: 1780398874260661.mp4 (72 KB, 454x454)

>>109042028
Mythos/fable costs way too much money to run, so anthropic needs a convenient excuse as to why they can't run it. Gets free marketing as "omg such a strong model it had to be banned" for later models, meanwhile the US government gets to project power both internationally and to its own people as being on top of AI but also having access to a strong, exclusive model.

Anonymous 06/12/26(Fri)22:49:30 No.109042528

>>109042485
>the mixtral 8x7b of 2026
fuck that means next year every lab will shit these out at 1T param
and unless we get a diffusion equivalent of Iwan, nobody will be able to run them

Anonymous 06/12/26(Fri)22:50:40 No.109042534

>>109042528
If diffusion imgen is anything to go by, diffusion llms won't run well off cpu. This would kill local because even cpumaxxing would be over.

Anonymous 06/12/26(Fri)22:53:23 No.109042546

>>109042234
I got a filthy 3090 rusted trash one for 470 usd in december. still works tho

Anonymous 06/12/26(Fri)22:59:11 No.109042583

>>109042546
semi ok ones didn't go below $800, in the USA.

Anonymous 06/12/26(Fri)22:59:41 No.109042586

>>109041862
>equivocating some next token predictor to nuclear launch codes
lmao

Anonymous 06/12/26(Fri)23:00:36 No.109042592

>>109041205
Technically speaking MoE models shouldn't see a noticeable uplift with MTP, it's supposed to help dense models run faster.

Anonymous 06/12/26(Fri)23:01:50 No.109042596

>>109042586
Yes.

Anonymous 06/12/26(Fri)23:04:34 No.109042602

>>109042592
then why does everyone like deepseek and glm bloat their moe models with mtp?

Anonymous 06/12/26(Fri)23:04:38 No.109042603

lmg was right about 3090s keeping their value.

Anonymous 06/12/26(Fri)23:05:09 No.109042605

>>109042592
I got a 50% increase in speed with the 26b moe.

Anonymous 06/12/26(Fri)23:05:45 No.109042609

>>109042605
for programming

Anonymous 06/12/26(Fri)23:06:32 No.109042613

>>109042603
lmg is right about most things.

Anonymous 06/12/26(Fri)23:08:02 No.109042624

>>109042609
No. For basic Q&A. Don't speak for me faggot.

Anonymous 06/12/26(Fri)23:08:40 No.109042627

>>109042613
God told me not to get an rtx 5090 (like get in line for it at microcenter), or an rtx 3090.

God was right.

Anonymous 06/12/26(Fri)23:11:26 No.109042640

HN's opinions on anything LLM related are always equally funny and infuriating to read.

>I do not trust Anthropic anymore
>anymore

Anonymous 06/12/26(Fri)23:13:18 No.109042660

>>109042602
Those are large MoEs, so they benefit more. More active parameters = more MTP benefit.
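
A rough way to put numbers on this, as a sketch and not anything taken from DeepSeek or GLM: assume each drafted token is accepted independently with probability `accept_rate` (a simplifying assumption, real acceptance varies by position and task). The expected tokens produced per verification pass is then a geometric sum. `expected_tokens_per_pass` and its parameters are hypothetical names, and the function ignores the extra cost of the verification pass itself, which is where MoE and dense models would differ.

```python
def expected_tokens_per_pass(accept_rate: float, draft_len: int) -> float:
    """Expected tokens emitted per verification pass.

    Assumes every drafted token is accepted independently with the same
    probability, and that drafting stops at the first rejection. The pass
    always yields at least one token (the model's own prediction).
    """
    if not 0.0 <= accept_rate <= 1.0:
        raise ValueError("accept_rate must be in [0, 1]")
    if draft_len < 0:
        raise ValueError("draft_len must be non-negative")
    if accept_rate == 1.0:
        # Every draft accepted: draft_len drafted tokens plus one bonus token.
        return float(draft_len + 1)
    # Geometric series: 1 + a + a^2 + ... + a^draft_len
    return (1.0 - accept_rate ** (draft_len + 1)) / (1.0 - accept_rate)


if __name__ == "__main__":
    # Hypothetical acceptance rates, purely for illustration.
    for rate in (0.3, 0.5, 0.8):
        print(rate, expected_tokens_per_pass(rate, draft_len=1))
```

With one draft token and 50% acceptance this gives 1.5 tokens per pass, before any verification overhead is subtracted.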

Anonymous 06/12/26(Fri)23:15:35 No.109042678

>>109042319
im in retirement age though im neet with no job prospects in sight

Anonymous 06/12/26(Fri)23:18:00 No.109042693

>>109042678
yep.
https://files.catbox.moe/n5tow1.mp3

They really did steal our jobs.

Anonymous 06/12/26(Fri)23:20:59 No.109042715

I wish there wasn't such a gap between 31b gemma and the smaller ones. I want more vram to do stuff like tts and image gen but it's hard to go back after using 31b...

Anonymous 06/12/26(Fri)23:30:35 No.109042773

>>109042678
Then by all means enjoy the fruits of your labor and make sure to leave behind as much of a workable foundation for your offspring and family as you can after you die. Splurging on a 5090 or Blackwells will be the Gen X/Y's analog to boomers buying expensive boats kek.

Anonymous 06/12/26(Fri)23:31:41 No.109042783

>>109042773
I don't think there was a single moment in history where you could resell your boat for 50% more a year after buying it.

Anonymous 06/12/26(Fri)23:33:47 No.109042798

>>109042773
no poors allowed

Anonymous 06/12/26(Fri)23:34:16 No.109042802

>>109041640
glorious friend another! I am doing the same also. But I am being silly and doing it on multiple platforms to see what can and can't. mobile, palms, old OS and more. it's fun and frustrating, with so much to work on.

good luck I hope it works as I would love to see it. I wish I could figure out how to put an ai waifu in my watch but.. that gets into os creation and that's neat on old dead systems but watches are a whole different thing.

welp good luck. keep everyone updated.

Anonymous 06/12/26(Fri)23:44:47 No.109042862

>>109042613
>lmg is right about most things.
Yep. And Reddit is wrong about most things.

Anonymous 06/12/26(Fri)23:45:30 No.109042866

>>109042627
God told me to stack silver but was dead silent about selling before dropping from ATH

Anonymous 06/12/26(Fri)23:47:52 No.109042878

>>109041693
>You arent autistic enough to do this. if you cant hyper focus literally all day in the same routine for years its not happening.
still won't work imo
you need several different autists fixated on specific components to make this work

Anonymous 06/12/26(Fri)23:49:59 No.109042894

>>109042866
He doesn't want you to hodl cash, retard.

Anonymous 06/12/26(Fri)23:50:56 No.109042902

With all the recent malware and supply chain attacks I get the feeling having your AIfu make your software is going to be the meta in the future.

Anonymous 06/12/26(Fri)23:51:05 No.109042904

>>109042894
but cash could buy me an ai waifu. tradcaths are all grifters and arthoes are all bpd

Anonymous 06/12/26(Fri)23:51:20 No.109042906

>>109042878
We need to make a giga autist who can do it all.

Anonymous 06/12/26(Fri)23:51:57 No.109042911

which model is anon currently running and deployed for daily use?

Anonymous 06/12/26(Fri)23:57:17 No.109042951

File: wooooo.jpg (111 KB, 1021x1540)

>>109041730

Anonymous 06/12/26(Fri)23:59:21 No.109042959

>>109042951
I've seen this meme a million times but I've never actually watched the movie

Anonymous 06/13/26(Sat)00:01:32 No.109042974

File: +_fd4f0cb8fd526e2399baf09(...).jpg (35 KB, 326x326)

>>109042951
Horrific, truly.

Anonymous 06/13/26(Sat)00:01:36 No.109042976

>spending money on current hardware
Invoost instead and save up to buy your robot wife in 10 years.

Anonymous 06/13/26(Sat)00:03:01 No.109042985

>>109042959
boils down to
>central ai is... le Bad?
versus the absolute KINO that is asimov

Anonymous 06/13/26(Sat)00:07:20 No.109043012

>>109042976
bro it's too late, spacex was the last investment chance
it's all going to collapse soon

Anonymous 06/13/26(Sat)00:12:37 No.109043046

>>109043012
you're an llm. I can tell.

By the way, the fbi and mossad are full of retards.

Anonymous 06/13/26(Sat)00:15:40 No.109043064

>>109042976
>Invoost
into what?

Anonymous 06/13/26(Sat)00:22:15 No.109043106

>>109043064
ROTH IRA and 401K max
50K into HYSA
rest into kalshi parlays

Anonymous 06/13/26(Sat)00:22:33 No.109043109

>>109043012
2 more weeks

>>109043064
VOO, of course. Honestly just pick companies you like and a couple ETFs. DRAM should be good until 2027/2028. If you like the idea of humanoid robots there's HUMN.

Anonymous 06/13/26(Sat)00:22:44 No.109043111

File: Screenshot 2025-12-27 at (...).png (158 KB, 468x311)

Does DiffusionGemma run in llama.cpp, or is it just another meme architecture that is only available in vLLM?

Anonymous 06/13/26(Sat)00:23:10 No.109043114

>>109043064
>into what?
just ask gemma dummy.

Anonymous 06/13/26(Sat)00:23:40 No.109043116

>>109043111
llama is the meme.

Anonymous 06/13/26(Sat)00:32:13 No.109043162

>>109043111
>just another meme architecture
It's more than another meme architecture because diffusion is a fundamentally different approach to how normal llms generate tokens. This is never going to make it into llama.cpp.

Anonymous 06/13/26(Sat)00:38:51 No.109043202

>>109042911
GLM4.7 Flash, Qwen3.6 31B, and Gemma4 31B

Anonymous 06/13/26(Sat)00:44:02 No.109043230

>>109043202
>GLM4.7 Flash
How does it compare to gemma 31b?

Anonymous 06/13/26(Sat)00:48:50 No.109043259

File: file.png (945 KB, 1810x653)

bros... is it over? shouldn't i be getting way more performance? glm5.1 q3 on 4 5090s and 256gb of ddr4. sub 1t/s is just not doable.

Anonymous 06/13/26(Sat)00:50:17 No.109043268

>>109043064
crypto is literally on sale right now, now is the best opportunity in years. buy now or you'll complain about missing out when it hits $200k next year

Anonymous 06/13/26(Sat)00:53:50 No.109043285

>>109043259
Did you fuck up your parameters? I don't think it should be that slow.

Anonymous 06/13/26(Sat)00:56:07 No.109043296

>>109043285
28 layers offloaded, 202k context, no-mmap, batch and ubatch at 2048

Anonymous 06/13/26(Sat)00:56:33 No.109043298

>>109043230
I mainly use the models for editing and generating stories. GLM4.7 seems to write more realistic dialogue than Gemma4 31b.

Anonymous 06/13/26(Sat)00:57:06 No.109043301

>>109043268
huh and just a few months ago it was hitting aths

Anonymous 06/13/26(Sat)00:58:44 No.109043305

>>109043296
Yeah, don't just offload layers at random in 2026 with MoE models. llama.cpp even does the fitting automatically for you these days so throw that shit out.
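
Back-of-envelope sketch of why placement matters, assuming decoding is purely memory-bandwidth bound (it ignores compute, PCIe transfers, and KV cache reads). Every name and number here is made up for illustration and is not a measurement of GLM or of any real rig.

```python
def decode_tokens_per_second(
    vram_bytes_per_token: float,
    ram_bytes_per_token: float,
    vram_bandwidth: float,
    ram_bandwidth: float,
) -> float:
    """Estimate tokens/s when each token streams the given weight bytes
    from VRAM and from system RAM, one pool after the other.

    Bandwidths are in bytes per second. This is a simplified model, not
    how any particular runtime actually schedules its work.
    """
    if vram_bandwidth <= 0 or ram_bandwidth <= 0:
        raise ValueError("bandwidths must be positive")
    seconds_per_token = (
        vram_bytes_per_token / vram_bandwidth
        + ram_bytes_per_token / ram_bandwidth
    )
    if seconds_per_token <= 0:
        raise ValueError("at least one pool must hold some bytes")
    return 1.0 / seconds_per_token


if __name__ == "__main__":
    # Hypothetical split: 20 GB of weights touched per token in total.
    half_in_ram = decode_tokens_per_second(10e9, 10e9, 1000e9, 50e9)
    quarter_in_ram = decode_tokens_per_second(15e9, 5e9, 1000e9, 50e9)
    print(f"half of the bytes in RAM:    {half_in_ram:.2f} t/s")
    print(f"quarter of the bytes in RAM: {quarter_in_ram:.2f} t/s")
```

The system RAM term dominates in this model, so the estimate is mostly decided by how many of the per-token weight bytes end up on the CPU side.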

Anonymous 06/13/26(Sat)00:59:16 No.109043310

>>109043202
>Qwen3.6 31B
What

Anonymous 06/13/26(Sat)00:59:40 No.109043313

How much do you think 1st gen consumer robot wives will cost anyway? ~$40k?

Anonymous 06/13/26(Sat)01:01:23 No.109043327

>>109043313
rent-only

Anonymous 06/13/26(Sat)01:01:33 No.109043328

>glm 4.7 flash
i didn't even notice when it was released
how does it compare to the qwen3.6 MoE

Anonymous 06/13/26(Sat)01:01:59 No.109043330

>>109043313
>1st gen
80-90k easy, it's going to be brand new car price, maybe higher. I think it will fall quickly, especially with chinese rip offs, but those first ones are going to be premium and probably just tweaked robot factory workers.

Anonymous 06/13/26(Sat)01:02:26 No.109043335

>>109043310
I meant 35b

Anonymous 06/13/26(Sat)01:02:39 No.109043337

>>109043313
probably double or triple that but they'll offer 80 year loans or like >>109043327 said, leases

Anonymous 06/13/26(Sat)01:03:56 No.109043343

>>109043335
>>109043202
How does Qwen 35b moe compare to 27b dense?

Anonymous 06/13/26(Sat)01:03:58 No.109043344

>>109043327
>rent-only
What does she do if you miss your payment?

Anonymous 06/13/26(Sat)01:04:56 No.109043349

>>109043344
She knows where your penis is

Anonymous 06/13/26(Sat)01:05:43 No.109043354

>>109043327
>miss payment
>they take her away from you
>someone has to clean your cum out of her

Anonymous 06/13/26(Sat)01:07:51 No.109043374

>>109043313
100k+ upfront
all features subscription based
logic runs in cloud so always online required so that (((telemetry))) data can be safely stored in government servers
adblock not possible
enjoy

Anonymous 06/13/26(Sat)01:08:33 No.109043381

>>109043328
Slower than qwen3.6 35b model, but dialogue is better and story completion is better.

>>109043343
Qwen 27b has a repetition collapse problem.

Anonymous 06/13/26(Sat)01:09:07 No.109043384

>>109043374
Chinks or nips will save us

Anonymous 06/13/26(Sat)01:10:55 No.109043393

>>109043374
>logic runs in cloud so always online required so that (((telemetry))) data can be safely stored in government servers
They'll pass regulations to mandate this too. The only alternative will be GNU Wifebot, essentially a blowup doll with a voicebox

>>109043384
Globalism is dead so imports will be banned obviously

Anonymous 06/13/26(Sat)01:14:53 No.109043412

jokes aside there will be no robo waifus for you anon. that's sexist. becky from HR gets the ick just thinking about it. not gonna happen

Anonymous 06/13/26(Sat)01:16:32 No.109043421

>>109043412
Becky from HR will change her tune when she sees Chadbot.

Anonymous 06/13/26(Sat)01:18:52 No.109043433

>>109043381
>a repetition collapse problem
Gemma 4 31b has that whenever the template is wrong.

Anonymous 06/13/26(Sat)01:20:34 No.109043441

>>109043421
Chadbot will come with a 12-inch vibrating penis with 37 different models and add-ons with knowledge of all sexual positions will be a single install away
Stacybot will call the police if you so much as flash her or touch her inappropriately

Anonymous 06/13/26(Sat)01:23:03 No.109043454

>>109043433
Is there any way to not get the same response with different words with gemma-4? It lacks diversity.

Anonymous 06/13/26(Sat)01:24:24 No.109043463

File: 1751399486610333.jpg (336 KB, 1000x843)

>>109043441
>Stacybot will call the police if you so much as flash her or touch her inappropriately
This but she's a loli and uses her crime prevention buzzer.

Anonymous 06/13/26(Sat)01:25:17 No.109043468

>>109043305
that fixed it a little. up to 2.5t/s now. manageable, but still not ideal. wish i got ddr5 when i had the chance.

Anonymous 06/13/26(Sat)01:29:14 No.109043485

>>109043454
Tell it to. : ^ )

Anonymous 06/13/26(Sat)01:31:07 No.109043493

>>109043463
>grab both ends
>headbutt her

Anonymous 06/13/26(Sat)01:32:20 No.109043501

>>109043493
>headbutt robot
>get concussion

Anonymous 06/13/26(Sat)01:33:23 No.109043507

>>109043501
It's not about who gets the damage, it's about sending a message.

Anonymous 06/13/26(Sat)01:35:10 No.109043512

I think my honeymoon phase with Gemma is ending. I hate being a VRAMlet. Time to go back to envying the anon(s) running Kimi and GLM...

Anonymous 06/13/26(Sat)01:47:23 No.109043564

>>109043554
>>109043554
>>109043554
