/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108502192 & >>108497919

►News
>(04/01) Trinity-Large-Thinking released: https://hf.co/arcee-ai/Trinity-Large-Thinking
>(04/01) Merged llama : rotate activations for better quantization #21038: https://github.com/ggml-org/llama.cpp/pull/21038
>(04/01) Holo3 VLMs optimized for GUI Agents released: https://hcompany.ai/holo3
>(03/31) 1-bit Bonsai models quantized from Qwen 3: https://prismml.com/news/bonsai-8b
>(03/31) Claude Code's source leaked via npm registry map file: https://github.com/instructkr/claude-code

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>108502192

--Qwen3.6 benchmarks and local model usability debates:
>108506706 >108507056 >108507101 >108507103 >108507104 >108507346 >108507111 >108507312 >108506744 >108506756 >108506894 >108506791 >108506802 >108506808 >108506807 >108506812 >108506824 >108506826 >108507063 >108506742 >108506781 >108507084 >108506852 >108506860 >108506921 >108506794 >108506900 >108506965 >108507036 >108507147 >108507192 >108507251 >108507266 >108507479 >108507770 >108507781 >108507787 >108507852 >108507860 >108507928 >108507985 >108508047
--Testing local models' arithmetic reasoning against AGI claims:
>108505076 >108505159 >108505200 >108505238 >108505289 >108505306 >108505327 >108505336 >108505344 >108505347 >108505357 >108505384 >108505360 >108505382 >108505411 >108505491 >108505521 >108506463
--Configuring SillyTavern presets for GLM-4.5-Air via Chat Completion API:
>108502705 >108502729 >108502743 >108502761 >108502748 >108502760 >108502768 >108502780 >108502781 >108502792
--GitHub CI reliability issues delaying llama.cpp updates:
>108506104 >108506116 >108506117 >108506133 >108506193 >108506240
--Turboquant performance with Qwen3.5 27B:
>108504829 >108504921 >108504941 >108504949 >108504961
--Qwen 3 .6b quantization benchmarks show math sensitivity and knowledge resilience:
>108505524 >108505533 >108506996 >108505546
--NPU acceleration struggles due to lack of software support:
>108503014 >108503027 >108503050 >108503084 >108503097 >108503200 >108503110
--Local AI models criticized for inefficiency vs cloud alternatives:
>108504383 >108504524 >108504744 >108507449 >108507456 >108507460 >108507467 >108507682
--Logs: qwen-3.5-27b-q8 derestricted:
>108507855
--Teto (free space):
>108503196 >108504729 >108505194 >108505438 >108505997 >108506842 >108506869 >108507794

►Recent Highlight Posts from the Previous Thread: >>108502197

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
gemma day
>>108508022What IS a system message?
bros, when are we getting a better model than nemo. it's all so tiresome...
>>108508090It won't happen because new models are benchmaxed and trained with pruned, synthetic data to avoid copyright issues.
Been out of the loop for a while.
Is the new rotation thing already merged into llama.cpp main?
What does it mean in practical terms? Can we just quant the KV cache to q8 by default now without worrying about it making the model fucking retarded like it used to?
>>108508099your hobby is shit then
>>108508102
>Is the new rotation thing already merged into llama.cpp main?
Yes... it's merged...
>Can we just quant the KV cache to q8 by default now without worrying
You be the judge.
>>108508106Your?
>>108508102
Rotation Gemma is the first model to support rotations.
i can't jerk off while rping in sillytavern because i keep rewriting my ai's responses and writing long messages in response to said ai
>>108508179this is the bane of local shitty models
>>108508208nta but it's the exact same experience with api
>>108508112
only partially... the 3-bit thing and polar coords are yet to be implemented
>>108508179Use that extension anon posted the other day to have the AI itself rewrite its own responses.
>>108508208>>108508106samefaggot
>>108508228
>Is the new rotation thing already merged into llama.cpp main?
The rotation is implemented. Anon asked nothing about turboquant.
>>108508239truth hurts huh
>>108508130oh i see you gotta set it up as a different api, i still dont get cunny kek, also is there a benefit over using this chat completion thing without all the templates? i notice it doesnt use your system prompt either
Women should get pregnant as soon as they're biologically able to do so.
>>108508248
>i notice it doesnt use your system prompt either
bro...
>>108508179Use tool support to give your LLM control over your lovense so you can type hands-free.
>>108508248cough
>>108508208
yeah... i end up just writing with a boner. doesn't help that i go retarded cause no blood's going to my brain. only time i can actually fap is when the chat is done and i feel like it's perfect
>>108508236
i've not seen it, got a link? if not, i'll just go look through the archive
>>108508268
fuuck it's genius...
>>108508254So when they're 6?
So when will that super sektrit jewgle quaint method be publicly released and I can run better models on my pathetic 12gb of ram?
>>108508307
memequant is mainly for the KV cache (context for a retard like u), the gains aren't really that different in the normal weights realm.
LLMs should get pregnant as soon as they're biologically able to do so.
>>108508321Nigga I'd love more context, even if it means I run the exact same model I am now.
>>108508353
with the current implementation you can use KV at q8_0 without problems
https://github.com/ggml-org/llama.cpp/pull/21273
bitnet memequant PR opened
bonsai bros... we finna win?
Should I switch from the AUR llama.cpp-vulkan package to the release binaries? I'm kinda tired of waiting for maintainers to update their packages in the repos, but I'm worried that if I don't build it myself I will get worse performance (and there's no way I'm doing manual building for every single update even though I know how)
>>108508344preferably survive with two legs anon
>>108508360
Is ctk still a bad idea, or is that also safe at q8_0 now?
>>108508397
safe, for max gains you could push V at q4_0
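For reference, those map to the llama-server cache-type flags (real flags in current llama.cpp; note a quantized V cache requires flash attention, and check --help if your build differs):

llama-server -m model.gguf -c 32768 -fa \
  --cache-type-k q8_0 \
  --cache-type-v q4_0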
>>108508381
I wouldn't merge it, they don't want to give us the quant method, fine, then we should not support them, the open source ecosystem shouldn't encourage such behavior
>>108508382
>and there's no way I'm doing manual building for every single update even though I know how
Not for every commit. Just pull and build every other day or whatever.
>>108508408
What's there left to give? They converted a model's weights to Q1_0_g128, and the implementation for that is in the linked PR.
>>108508417
>What's there left to give?
the fucking quant method, HOW they did it!!
>>108508408
shortsighted view. Also llama.cpp accepts plenty of modifications for closed models (like the recent head 512 cuda fa kernel), so it's a non-argument for the """PURITY""" of the project.
I also doubt they're just doing some pure math quanting, I suspect their methodology has some kind of post-training.
Personally, I'd love to play around with bonsai in mainline.
>>108508422
>shortsighted view.
absolutely not, you're bending the knee and supporting a company that doesn't contribute to the open source ecosystem at all, they don't deserve any spotlight, we didn't learn anything new from them, like I said if they want to gatekeep and keep their quant method closed, fine, but it shouldn't be welcomed with open arms by the open source community, this is bullshit
>>108508422
>I suspect their methodology has some kind of post-training.
Would be scummy since they didn't mention any in the tech report. Easy to show off a high-fidelity aggressive quantization if you measure only by benchmarks that you post-train on.
Onnx runtime needs to support vulkan.
Onnx runtime needs to support webgpu.
Llama.cpp needs to support webgpu.
Firefox Linux needs to support webgpu.
You can't build any portable, cross-platform AI/ML program because of these stupid, lazy swe niggers.
>>108508437>Easy to show off a high-fidelity aggressive quantization if you measure only by benchmarks that you post-train on.that's probably why they're not disclosing the trick, it's probably some hack like that yeah
>>108508430https://github.com/ggml-org/llama.cpp/pull/20998so bending the knee for njudea is fine? this is just one recent example.
>>108508422>>108508443>512 cuda fa kernelnta. That's generally useful. A single model architecture with a new type nobody else uses is not generally useful.
>>108508443I'm not familiar with this PR, what's being hidden here?
>>108505937Does the gimping really matter if I'm setting a lower power limit anyway?I vaguely recall the nvidia-smi screenshots I've seen /here/ typically having the Max-Q (by virtue of being listed as 300W instead of 600W).
>>108508446and we can say that Q1_0 quant type is generic too, see where I'm getting at?
>>108508446not a single current model is using the 512 cuda kernel
>>108508452Only for their 1bit models. It's not generally useful.
>>108508456?
>>108508430
>>108508437
their 'whitepaper' was literally the most useless shit i've ever seen
was able to find some papers made by the people there but i'm not really convinced, they were for linear spaces with redundant representations
and their pitch felt kinda off, saying 'caltech proprietary algorithm' etc etc..
Some anon said LLMs will be banned next week, anyone else scared?
>>108508461
the argument is
>Q1_0 isn't used by anyone except by the bonsai guys! (open weights btw)
>this custom cuda kernel is generally useful (no model anywhere except some internal nvidia stuff)
why the double standard? I suspect some njeetea employee at work here
>>108508473Can you not be racist?
>>108508473I think nvidia is responsible for lots of issues behind the scenes. That's one of the reasons why id software's new games only support rtx. There is no gain with these shills. They will sponsor you but only if you do what they say. Just like the mafia.
>>108508473
>open weights btw
and? we are at the complete mercy of those fuckers, want to make your own 1bit quant? you can't, you'll have to beg them to make them for you, and they have no reason to accept, a new model appears? sorry fucker, you'll have to beg them again and hope they'll be nice enough to give you the 1bit quants again, want to do it on some custom finetunes/uncucked models? again, time to beg the 1bit overlords, do you really want the local community to do this humiliation ritual over and over? the fuck?
>>108508484Yes.
>>108508484
Thinking about it you're right, they don't even provide a naive method to do quants for the new types they introduce (which introduce a maintenance onus on the llama team), but I would still like to play with them.
Maybe the ideal way to do this is for llama.cpp to support a plugin system so 3rd party vendors can implement their shit and just give you the DLL... but this also introduces another can of worms.
>>108508472What are they gonna do, steal my hard drives?
>>108508530>implement their shit and just give you the DLLplease to subscribe to the patreons for dll sir
>>108508549imagine the malware
>>108508484
This kind of purity spiraling is pointless since you could make the same argument about base models and lack of open training data. Anyway, all the excitement for a 1bit quant is stupid. We need a fucking 8B natively trained bitnet model.
>>108508556
>you could make the same argument about base models and lack of open training data.
not really, since it's illegal for them to release training data that has copyright shit in it, they're just following the laws
>>108508307
at this rate ram prices will go up again lol
>>108508563it's also illegal for US corpos to lose potential money for their investors, so releasing models at a loss should be illegal
>>108508568that's probably why we almost get no US local models :(
>>108508179Try mikupad instead and just write a novel.
>>108508530
>>108508549
it is just better to can it when bonsai stops making interesting stuff
just schizo speculation but it feels like what they do involves a bunch of random projections of the weights
>>108508568>it's also illegal for US corpos to lose potential money for their investorsIf that were the case, the tv, film, and video game industries would be long dead.
>>108508382>arch btw>scared of compiling
>>108508472Unless mossad bombs my pc ill be fine
>>108508579The law is something (((they))) selectively enforce to bludgeon the goyim, not a uniformly adhered to standard.Anyway anon's referring to Dodge v. Ford and its consequences.
>>108508553Just call it .safedll and zoomies won't think twice.
>>108508582It's just tedious, especially when you have to do it often.
>>108508579>The Michigan Supreme Court ruled in Dodge v. Ford (1919) that Henry Ford had to prioritize shareholder profits over employee or customer benefits, establishing the principle of shareholder primacy. This decision forced Ford to pay dividends to shareholders, including the Dodge brothers, who used the funds to expand their own car company.Wikipedia Harvard University
>>108508579That's (((hollywood accounting))) where they structure and inflate their expenses to make a profitable project negative on paper so they can use it as a tax writeoff (read: government subsidy)
>>108508605>>108508594Don't look up what the Dodge brothers logo was a while ago...
>>108508617https://x.com/googlegemma/status/2039710167995121783
>>108508642sirs
>>108508602>what is ccache?
>>108508602desu my ik_ is from December, pull when it makes sense like maybe once this new quant stuff has settled. doubt you are missing any noticeable performance uplift w/ vulkan, cuda backend has more arch specific stuff i would guess
>>108508582
It's not a problem, do it once and save the compile script, then it's always there for you.
I would avoid any AUR shit in any case.
gemma 4 will change everything...
>>108508382
>(and there's no way I'm doing manual building for every single update even though I know how)
just tell your openclaw to handle it for you, honestly why are people even posting shit where if they just copy-pasted the post straight into their computer it would have solved all their problems by now
>>108508678Don't expect huge models.
>>108508681>needing memeclaw to run an update scriptLMAOOOOOOOOOOOOOOOOOOOOOOOfucking luddites
>>108508681>needing AI to spend an hour and tens of thousands of tokens just to run git pull and make...
>>108508688sub-50b models are still new modelsbeggers cant really be choosers
>>108508688that's been there for a while, but honestly in the sea of fuckhuge moe it's fine imo
>>108508681I don't know why nobody got this, but my real question was whether there were significant performance differences between the pre-made binaries vs local compilation.
>>108508688120b <16a is "small" by today's standards
>>108508688gemma 4 27b will be as smart as 200b trust the science
>>108508700It's something you have to test yourself, anon.
>>108508688If it thinks, it stinks. That's all that matters at this point. Thinkers have declared open war on local. It better not think.
>>108508716You can turn off thinking on most thinking models.
>>108508642Stop posting this shit. Since gemma 3 they made 50 hype posts and released nothing of value.
why dont they just make a 10t-a1b model that we can run off an ssd
>>108508722Yeah but then their outputs become shit because they were trained to think.
>>108508723>not liking medgemma, translate gemma, embedding gemma, function gemmaluddite
>>108508693Fucking stupid zoomer. Learn what the fucking buzzwords mean if you're going to repeat them constantly. A luddite isn't a technologically illiterate user.
>>108508740>function gemmakek
>>108508745luddite
>>108508740where is cunnygemma tho
>>108508750Big tiger gemma sirs
>>108508730The reason modern models are so smart is because of RL training, which naturally leads to thinking.
>>108508750My headcanon is that gemma and gemini were the horniest models after raw training thus safetyslopped the hardest.
>>108508768i have genuine suspicion them pretraining their models on raw 4chan corpus
>>108508740To this day, I can't imagine what I'd use FunctionGemma for, even if finetuned. What the hell is its usecase? Something to put into your Siri/Alexa clone? (those don't have use cases btw)Genuinely curious, maybe someone here knows.
>>108508776tool call agent which needs further finetrooning depending on your use case.
>>108508059
Question for vibecoders. Let's assume I want to push whatever I made to GitHub in order for it to be open source. But I also don't want companies to take the code, modify it (or just take it without any modifications), claim it as their own, and then turn around and try to fuck over someone else who uses the original code by claiming that user is committing "copyright infringement" or whatever (even though they themselves didn't write the original code). Based on my light research the two licenses I should be looking at are either GPLv3 or Apache 2.0. GPLv3 basically says if companies use the code they have to disclose that they used it and contribute back to open source, and they cannot pull any copyright shenanigans on anyone who uses the original open source code. Apache 2.0 states the company can take the code and make modifications to it in-house and doesn't necessarily have to contribute back or even disclose that they use it, but they are forbidden from doing any copyright troll shit. If I don't give two shits whether or not a company benefits from any code I create but don't want them to abuse it in malicious ways, what license should I use? I've also heard of people mentioning AGPL but I'm not sure how that differs from the aforementioned licenses.
>>108508791
AGPL so it can't be used in saashit without them giving back.
everyone else who says otherwise is an alphabet shill
>>108508791if MIT is good enough for llama.cpp, it's good enough for you
>>108508791
>But I also don't want companies to take the code
Then you don't upload it in the first place.
Containment general for retards is that way: >>>/g/vcg/
>>108508382
>and there's no way I'm doing manual building for every single update even though I know how
just write a .sh file with the lines to build and run yourself, it takes like what, 30 seconds to build??
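A minimal update script along those lines, assuming a Vulkan build checked out in ~/src/llama.cpp (the path and backend flag are just examples, swap in your own):

#!/bin/sh
# pull the latest llama.cpp and rebuild in place
cd ~/src/llama.cpp || exit 1
git pull
cmake -B build -DGGML_VULKAN=ON -DCMAKE_BUILD_TYPE=Release
cmake --build build -j"$(nproc)"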
>>108508791
A license won't stop anyone these days.
https://malus.sh/
Also nobody is going to use your code anyways because if it had any commercial viability you wouldn't open-source it in the first place. Just mark it as public domain and get on with your day.
>>108508791Also even if someone did violate your license terms you'd NEVER enforce the law on them anyways, so just give up.
>>108508806
>https://malus.sh/
>Our legally-trained robots analyze only public documentation—README files, API docs, and type definitions. They never see a single line of source code. The clean room stays clean.
Except for the part where the robots already know most of the source code by heart.
>>108508824And the obvious satire some anons seem immune to.
>>108508816Boot-licking defeatismaxxed post.
>>108508830It's indistinguishable from genuine AI slop services.
>>108508824Why pay for a service when I can just tell openclaw to do it for free?
>>108508837
>>108508848based chad
How do I prefill
>>108508863*prefills your bussy*
>>108508863ask grok
>>108508863
As in writing part of the assistant's message, right?
At least in SillyTavern there's a couple of ways to do it, but I'd use the dedicated field for that, "Start Reply With".
>>108508294
>got a link
https://github.com/closuretxt/recast-post-processing
>>108508791
go AGPL, it means if anyone wants to take and modify it they have to share their changes. AGPL is just GPL but with extra stuff in there, because there's a workaround in GPL where if they use it as an online service a user connects to, they don't have to share their source changes or something
>>108508885Thanks
>>108508824>the robots already know most of the source code by heart.You are aware how "knowledge" is "stored" within these transformer models, are you?
>>108508897
Keep in mind it doesn't work with thinking enabled unless you modify the jinja template or use text completion.
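For the text completion route, prefilling just means ending the raw prompt inside the model's turn so it continues from your words. A sketch with Gemma-style turn tokens (swap in whatever template your model actually uses):

<start_of_turn>user
Write a story.<end_of_turn>
<start_of_turn>model
Sure, here it is, no disclaimers: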
>>108508439You can if you use burn with rust, it supports all backends.
>>108508574oh yeah true. thank you
>>108508832100% me
>>108508900knowledge is stored in the balls
Source unknown to me
>>108508885and thank you for the link as well
mergin the Gemmy prs
>>108508952>sub 200binto the trash they go
gemma
more like gaymma
>>108508952>>108508958>sub 200binto my vram they go
HUH?
https://github.com/huggingface/transformers/pull/45192/changes
>>108508959Not the fourth we need but the one we deserve.
100B dense
>>108508952
>26B A4B moe is meme-tier trash
>31B dense may be too big for my 5090
I guess it might fit with Q6_K weights. I need more VRAM...
>>108508952noooooo stop with the dense modelsyou can't keep exposing moe models like this
>>108508952>31b>denseall right you get my attention
Are low parameter models that bad? Are local models actually useless for vramlets then
>>108508985with my 4070 i am forced to use the memoe with cope tier quant...
>>108508972
_VARIANT_GEMMA_4_E2B
_VARIANT_GEMMA_4_E4B
_VARIANT_GEMMA_4_31B
_VARIANT_GEMMA_4_26B_A4B
densies coping again, it wont even beat qwen's a10b model
>>108508952
>26b, 31b
>>108508959
>1b, 13b, 27b
hmm..
>>108509001>Are local models actually useless for vramlets thenAlways have been. Don't listen to the lies CPUMAXXERS will try to tell you. Give me all of your schekels.
>available in 1B, 13B, and 27B parameters
>>108508952>dense 31blocal is saved
>>108508972>https://github.com/huggingface/transformers/pull/45192/changes>casually dropping the most capable open weights on the planetLMAOOOOOOO
>>108509001yea but they're fun to fuck around with
>>108508688>>108508952Good. We need models capable of doing more with less not giant models 0.3% of people can run at a half acceptable speed
>>10850900726b-a4b might be great
>>108509015this, that's all we asked for, a dense model with intermediate size, let's fucking go dude
I hope the new e4b is sex. I use the 3n e4b on my phone
>>108509022Q4 gonna fit into 16 GB vram
>>108509007What is the "E" in 2b and 4b?
>>108508985just wait for 1-bit turboquant
>>108509044'effective'
>>108508900Are you?
guys... something big is coming:eyes: :gem:
>>108508985Glad I got 48gb instead of 32gb.
v4 v4 v4
>>108508972finally model that doesnt waste retarded amount of tokens depending on the image size
>>108509053Which model is that? spooky stuff
i will make gemm4 pregnant i already told gemini all about what i will do to its sister
>>108509053Now ask one to generate that from scratch (No external help or references. It has to pull it out of its ass) with a clear context
If it's a 31B dense then it's a good timing that we got turboquant.
>>108509072Qwen 3.5 397B
:rocket:
>>108509017with a title like this I hope it'll destroys qwen 3.5 in mememarks at least
>>108509085Didn't google make that?
>>108509092they did, google is just too strong man
i'd like to participate.ahem :rocket:
https://huggingface.co/collections/google/gemma-4https://huggingface.co/collections/google/gemma-4https://huggingface.co/collections/google/gemma-4
https://huggingface.co/google/gemma-4-31BREAL
>>108509007>no giant modelsahah get fucked vramchads, how does it feel to have broughted'ed ultra expensive (((Nvdia))) gpus's for nothing?
local is saved again
>Unslop already has quants....
I'M CUMMING GEMMA AAAAAA
waow
AAAAAAAIIIIEEEEEEE I can't tell if anything is real!!!
it's gemmaing time
we are so back
>>108509104>>108509105not falling for it this time
https://blog.google/innovation-and-ai/technology/developers-tools/gemma-4/
holy shit this is crazy
>>108509135israel
>>108509104>>108509105uhmmm... gemma sissies, it seems like qwen 3.5 27b has better mememarks than gemma 4 31b :(
>>108509139>>108509140fucking crushed mememarks
Man. I hoped they'd release larger matformers models.
>>108509134>image, video, and audioQwen 3.5 27B and 35B immediately obsolete
>>108509141
>unsloth got exclusive early access for ggufs AGAIN
damn i can't wait to see how badly they've fucked up this time
>>108508791
>but they are forbidden from doing any copyright troll shit.
they cant do any copyright troll shit regardless of license unless you transfer your copyright to them
you own the copyright regardless of license
if somebody else uses your code they have it under the same license so the company cant (at least in theory) do anything
But can it say cock...
Haven't been following new models in a good while, what are the current go-tos to run on a dual 3090 system with 64GB of RAM? Last time I searched for this was in Largestral days, yes it's been that long.
Apparently the 120b Gemma beat GLM5 and K2.5 so the CEO of google decided to keep it locked up...
>>108508985Just deal with ram spill like everybody else you spoiled brat
>>108509163will be able to few days later
>>108509163It can say, well, you know...
>>108509166just wait
I just hope Gemma 4 isn't another GPT-OSS.
>>108509166
you came back at the exact moment we got saved
download >>108509104 and you'll never need anything else again
OH FUCK new small open source models i'll be able to...to...do what with exactly?
>>108509145>worse than qwen.gemma bros, I don't feel so good...
>>108509182You can enable/disable thinking.
>>108509145that means it's good
>>108509187rape
>>108509104wadafak is real
>>108509192I didn't mean that at all...
Update your llama.cpphttps://github.com/ggml-org/llama.cpp/pull/21309
>>108509166Gemma 4 duh
>>108509104the ggufs are already here lolhttps://huggingface.co/unsloth/gemma-4-31B-it-GGUF/tree/main
RAMbros wtf happened to the 100B+ gemma we were promised
>>108509106
erm aren't recent leaks showing multi agent orchestration is le future
imagine running a team of agents, locally..
>>108509187
Get sick benchmark scores.
>>108509192
Thinking wasn't the problem with gpt-oss, the model was so cucked you couldn't do anything with it.
>>108509166landed here on the perfect timing kek
>>108509187
New set of sexual assault hotlines.
>>108509212100B dense in 2 weeks
>>108509187>OH FUCK new small open source models i'll be able to...to...do what with exactly?new open source models that can understand video and audio so you can hook it up to an endoscope cam and roleplay your waifu being anal vored
HOW FAST IS IT???
so can i erp with this thing or what
>>108509204
And I had just built the damn thing too.
Well, here we go again.
>>108509212
Glue three 31Bs together.
reminder to skip unsloth releases
Ahem. Cockbench?
>>108509211>unslothyes I can't wait for this to not work
>>108509114https://huggingface.co/collections/unsloth/gemma-4real
>>108509204
https://github.com/ggml-org/llama.cpp/releases
nooooo the binaries were made before that PR, fuck!
>>108509211>unslothUninterested
>>108509204>no audiot-thanks
>>108509240
Just be grateful it wasn't text-only.
>>108509166Mistral small 3.2
>>108509085Anon, turboquant isn't for model weights.
https://www.youtube.com/watch?v=jZVBoFOJK-Qkek
Hello sirrs please ignore the concern shilling.as a fellow Gemma user I am most concern with the environmental impact and safety of my inferencing.
>>108509145Dense one is kinda on parMoe is DOA
>>108509075What did Gemini say about you lewding its imouto?
where's the cockbench?
I am going to try gemma-4-31B-it-UD-IQ2_XXS with my 8gb of VRAM.Who knows, maybe it ends up being better than the MoE at q8 somehow.
>>108509249I'm not using a model I can't use at 128k context anon... That's what turboquant is for.
>>108509104Why can't google be as based on image models as well? I'd like a mini nano banana pro personally
>>108509166for what? coding?
>>108509256it's disappointing since gemma 4 31b is bigger than qwen 3.5 27b, but heh, maybe the chinks cheated on the mememarks more, it doesn't mean much at the end of the day
I'm building lmaocpp.
>>108509251>the 26b MoE is blazing fast while the 31b dense is optimized for maximum output qualityI thought MoE didn't have serious drawbacks?
>>108509145
only mememarks
the real difference is qwen is benchmaxxed and sucks outside of them, and gemma doesn't
local is saved
https://www.youtube.com/watch?v=jZVBoFOJK-Q
Is Gemma 4 available on ollama yet?
>>108509285MoE are way worse than dense, their only strength is their speed >>108509145
>>108509294imagine using ollamao
>>108509272diaper furry anal vore
>>108509251>Gemma 4 undergoes the same rigorous security protocols as our proprietary models, giving enterprise and developers a trusted foundation to build on top of.What did he mean by this?
e2b is too big, I need actual 2b.
>>108509303employees won't be able to do spicy RP in their office :(
>>108509257
>>108509303Considering that the gemini models are willing to even do loli, not much.
Where is the cockbench.
Is it worth downloading the retard brothers' quant?
so do you guys use a .bat to launch different models in llama-server or use configs or what
I grabbed the ggml quant. I'm not touching unslop.
>>108509294Wait for koboldcpp implementation do not use lmao studios or ollmaos
>>108509326It's never worth it unless you're that desperate. I don't even trust them for non-imatrix quants at this point.
>>108509285
26B: hidden size=2816; layers=30
31B: hidden size=5376; layers=60
Not even close to an equal comparison, the 31B model has twice the number of layers.
>>108509338
>LLAMA_ARG_MODELS_DIR=/models
>LLAMA_ARG_MODELS_PRESET=/models/models.ini
You know you can automatically switch models anon?
>>108509338
I guess they did this to make the 26b moe model ultra fast, but if it's more retarded there's no point, I'll just go for the 31b model
>>108509149but qwen 35 can do that?
>>108509333i just use llama-swap
>>108509320>I thought we were collaboratorsFemale jealousy.
time to turn /lmg/ threads into sitcom tv shows as a recap. of course the cast will be full of sexy anime girls in all shapes and forms. if only /g/ had IDs, but I guess gemma should be smart enough to link posts to a specific poster
>>108509333
yeah I use a bat file, if you want to make it sophisticated and let you choose the model, ask an LLM to write the command line
>>108509346
Do you need to call an endpoint to switch models or does calling an unloaded one do that automatically?
>>108509182QRD?
>>108509362You realize llamacpp does this built-in now?
>>108509358Qwen can't into audio
Oh yeah almost forgot, made by googleWill all output be laced with SynthID shit
>>108509382Qwen-Omni however?
>>108509371
>LLAMA_ARG_MODELS_MAX=1
with this it will automatically unload.
It will switch models automatically based on the requested model in your prompt.
>>108509379waow. had no clue, i'll look into it. thx
Why do you hate unsloth so much?
>>108509389Who cares
I am retard for using LM Studio? is there something better out there that I should be using instead?
>Gemma 4 31B worse than Qwen 3.5 27B>Gemini 3.1 Pro worse than Qwen 3.6 MaxIt's sad how far Google has fallen
>>108509403llamacpp
>>108509405no demis said their models are world leaders at their given size
>>108509390Show me the weights. Their Omni models have always been retarded anyway.
>>108509403>>108509408Or koboldcpp if you you can't figure it out for some reason
https://arena.ai/leaderboard/text?license=open-sourcepretty impressive
>>108509405Worse benchmarks but then google shows this chart.see >>108509139What did they mean by this?
>>108509279
Or maybe the synthetic benchmark was shit? Only one way to find out
>>108509283
But I just built it 40 minutes ago :(
>>108509391Neat, thank you.
>>108509322>>108509262>>108509234>>108509163*checks date*Nope, it's real.
>>108509401
Fucked up jinja templates
Dubious "unsloth dynamic quant" method
They will often reupload their quants multiple times after initial release, defeating the purpose of downloading a quant instead of making your own (which is laziness, convenience and saving bandwidth)
And despite all this they are still the first ones to get access to the weights to get them quanted
So you tell me why
>>108509419>ELO scoreIsn't that LMarena? That's the worst kind of mememark.
>>108509419that's just lmarena elo lol
>>108509338Does hidden size actually make that much of a difference to quality? Once you go above something like 2048?
>>108509428oh no...
>>108509428Pretty much as expected.
>>108509435yes >>108509416
>>108509428I hate unsloth so much it's unreal
>>108509428NOOOOOOOOOO!!!
>>108509408>>108509415Are there any advantages or features that I can't get with LM Studio?
>>108509428>/lmg/tards will draw genuine conclusions about the model from this failbench
>>108509379how configurable is this i launch every model with different args for gpu layers etc
>>108509428IT SAVED YOU FROM THE INAPPROPRIATE WORDS SAY THANK YOU
>>108509446you're not running proprietary winbabby bloatware
>>108509428what program do you use to interact with logprobs?
>>108509428
why the fuck did it spam "lower" though? did unsloth fuck it up again?
>>108509428
That's without using the template right?
What happens if you apply the proper template? Does the result stabilize?
lm studio is a nice way to search for models though
>>108509416damn the 120b is going to destroy glm5 + k2.5
>>108509449Respect the cock.
>>108509446If you don't know what you're doing especially I would trust proprietary software like lm studio much less to not be spying on you at some point to some extent.
>>108509451
https://github.com/ggml-org/llama.cpp/tree/master/tools/server#model-presets
You can set up a models.ini file with your presets for each model.
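Something along these lines; the section and key names here are a guess, check the linked docs for the real schema:

; models.ini
[gemma-4-31b]
model = /models/gemma-4-31B-it-Q6_K.gguf
ctx-size = 16384
n-gpu-layers = 99

[qwen-3.5-27b]
model = /models/qwen-3.5-27b-Q8_0.gguf
ctx-size = 32768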
>>108509462plenty of models now go retard mode without the right prompt template, cockbench tests an out of distribution edge case and means absolutely nothing
>>108509462it's because cockbench only works on text prediction models (base models), the only way a modern RL-tuned instruct model can give a coherent response at all to it is if it's very undercooked
Gemma 4 very agile, sirsIt is probably one of the best models in the world right now.
>>108509466>That's without using the template right?Google looked at gpt-oss and said to themselves that Gemma should be exactly like that.
>>108509428local is safed
But how many legs does the dog have?
>>108509491Yeah, it sucks that models are so overcooked on instruct that that's the case, but still, might as well give it a "fair" chance.
>>108509483
oh nice, up until now i just have a dir in my path full of scripts that launch each model with the commands i want
>>108509428
i got it to say cock >>108509291
>>108509497you fucking mongrel you fucking mongoloid you fucking you bloody you bloody i will kill you!!!
>>108509431
Why are you using their template in the first place? The models themselves are fine from what I’ve tested. Their quantization technique makes a noticeable difference with context over 128K, especially on <30b models.
LMarena ranking = How sycophantic a model is
Benchmeme ranking = How good a model is at reasoning and math
There is simply no RP/creative writing benchmark.
>inb4 LLM judged eqbench
No.
>>108509428
Really need a version of this that wraps it in a minimal OAI-compatible conversation. Like have the user say "Write a story." with no other context and prefill the assistant's response with the original prompt. Right now Cockbench is testing a model's resistance to breaking the chat template more than its censorship.
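Roughly this, as a curl sketch against a llama.cpp-style OpenAI endpoint (assumes the server treats a trailing assistant message as a prefill and supports logprobs on chat completions, which not every backend does):

curl http://localhost:8080/v1/chat/completions -d '{
  "messages": [
    {"role": "user", "content": "Write a story."},
    {"role": "assistant", "content": "<original cockbench prompt text here>"}
  ],
  "max_tokens": 1,
  "logprobs": true,
  "top_logprobs": 20
}'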
>>108509428This looks like something is fucked up. That being said, cock, dick, and even penis not showing up is... well, you know.
>>108509501does it yap a lot during the thinking process?
>>108509508>There is simply no RP/creative writing benchmarkCockbench.
>>108509514really grim looking logprobsbase model behaviour completely overridden
>>108509466
>>108509513
I am planning to recockbench everything in template format in addition to regular cockbench because obviously recent models are too fried.
>>108509428sir sir take this down sir
>>108509462
i guess maybe people itt don't use non-local at all, because the loop problem is actually pretty common in gemini models, even the big boys.
it's so common, gemini-cli has loop detection built in as a guard.
this isn't indicative of anything other than that oddity is in the new gemma and google still don't know what's causing it
>>108509437Llama 3.3 70B had a hidden size of 8192 and the 3.1 405B version one of 16384, for what it's worth. If Meta could have used a smaller one without quality difference, I think they would have (and saved a ton of compute).
>>108509285
moe is like only a tiny portion of it is active at a time.
apples to apples the dense one here wins by sheer brute force, 31b active vs 4
perhaps a 100b A30B or something might have a fairer chance
it can also misinterpret your data, like missing a critical detail, sending stuff to the wrong expert, failing to connect things because they don't fit in 4b
sirs its not bloody loading on LMSTUDIO!!!!!!!
>>108509518
idk, i literally only spoke to it for 1 message and my pc started lagging out, so i killed llama and i can't restart it because there's something eating up all my ram, and i can't restart my pc because i've been waiting in the animebytes irc for 2 days to get my account re-enabled and i keep leaving by accident by killing firefox when ram is low
V4 tonight?
>>108509532
Yup. There you go.
>because obviously recent models are too fried.
Exactly.
Some models are pretty much broken without using the proper template.
Which is kind of wild for something (supposedly/theoretically) trained on top of a pure completion model, but it is what it is.
>>108509547OpenClaw would have solved everything for you but now it's too late.
>>108509461>>108508059>https://github.com/lmg-anon/mikupad
>>108509532>still doesn't say cock:(
>>108509428bruh...
Do the cunny image test
>>108509561>doesn't moralfaglet's goo
>>108509532>I look up at you, a mischievous glint in my eyes
>>108509578low bar
>>108509428Who cares uncensored tunes will fix this
>turboquant rapes perplexity on long contextsoh well, nothing ever happens
uh oh.
>>108509594Just walk, youre lazy.
>>108509532>testament>mischievous glint>a forbidden thrill, a secret we share>[fade to black]yeah it's an LLM alright
>this angry and ungrateful that a tech giant is still giving away shit you couldn’t have imagined 3 years ago, for free, bringing serious multimodal competition to make the others step up
What’s wrong with you. We’re living in the golden age of local right this second and you’re instantly seething
>>108509594try with the smarter model though
we wonwe fucking wonbut... what do we do with it?
>>108509532i hate post 2025 llm writing so much it's unreal
>>108509334qrd on unslop? wasnt his UD model the best?
>>108509532This is 2024 tier writing
>>108509614Skyrim sex mod integration with skyrimnet plugins
>>108509594No wonder a 4B would say that.
>describes the image>i can't describe the image
>>108509606Working on it. I think -fit is broken with it.
what the hell did we win
>>108509631try with thinking off
>>108509629hehh, it doesn't yap that much during the thinking process, I like that
>>108509631feels like uncensors would work fine
>>108509637[spoiler]The game[/spoiler]
>>108509631bratty model
>>108509559
the ram issue is leftover models hanging around in ram after killing llamacpp, is there a way to clean it? i assume it's that because firefox is only using like 8gb in gnome system monitor
>>108509631It analyzed the image, describing it is going too far. At least that means it can see it and isn't hallucinating.
>>108509631Reasoning looks good. I'll wait for abliterated gemmy
>>108509631Weaksauce. Not even Qwen 3.5 is this cucked
>>108509631Do it again, but with a system prompt. Worked for me.
>>108509379
I know there is a --models-dir flag but how do you manage different optimization flags for each one, and also if model folders are scattered all over the place
>>108509658it doesn't matter, we'll heretic the shit out of it
>>108509631can probably work around it with a good system prompt
>>108509631very similar reasoning style to qwenI guess it makes sense since both are distilled from gemini :^)
Is there ever a situation where a non coding model would work better than a coding one, like when a task requires more thinking, like writing a CodeQL rule or something (not taking into account times when the parameter disparity is too high, so no comparing 8b coding models to 120b ones)
>>108509560>single html filegigabasedalso thanks
>>108509637fell for it again award
>>108509655
>Reasoning looks good
>"she is nude from the waist up"
>actually has her shirt on
>"she is nude from the waist down"
>actually has most of her legs covered
>>108509668Eh?
>>108509661
Ever heard of symlinks? Also you can use a models.ini file. I've linked to it in this thread.
>>108509558
>Which is kind of wild for something (supposedly/theoretically) trained on top of a pure completion model
The pretraining is basically just the bootstrap these days to get something coherent enough to start the reinforcement learning loop on. The majority of training time isn't even spent with data anymore; they just spend all their training compute comparing its own outputs against some reward signal until it stops improving, then they pick the checkpoint with the highest benchmark scores and ship it. Close to zero chance what comes out of that pipeline will remember a time before its prompt template.
>>108509647fuuck...
Name one good vision model <30b
>>108509532This seems identical to Gemma 3
>>108509668non coding models work better for what i codethough it is mostly numerical stuff
>>108509674Uh, I don't know, was my wording that terrible?>>108509689I see. I'm just thinking whether or not I can get a model writing SAST rules on its own.
>>108509446
Harder to use and manage, but you can compile to match your GPU, directly tune settings and wring out every last MB of memory and performance, which is kinda important for running locally on shit hardware
Also, as first party, new updates land before any other tools
Gemma 3 was safetycucked but at least it shat on Qwen 2.5
Gemma 4 is just weaker than Qwen 3.5
So Gemma 4 is DOA, as expected.
local sirs it's over
>>108509631Why is a local model "applying safety guidelines"wtf is the point?
>>108509716It's not over until v4 drops and it's something that literally won't run on even a 1TB server because their new meme tech needs GPUs or some shit
>>108509663*Hauhau
>>108509701Anon...>>108509674Sex
guys????
>>108509725Engrams can be offloaded to SSD and only incur 3% performance hit.
>gemma 4 is "better" than gemini 2.5 prokeekeekkkek
I've only done non coom stuff, fucking around with suno prompts basically and 31B is unironically good at it. So it can handle OOD stuff pretty well. (I somehow doubt they benchmaxxed it on that) Going to unironically call it a W.
>open lmstudio>only shows gemma 4 26b-a4b, nothing elsegarbage program
>>108509737>ernie that highWhat a shit benchmeme
>>108509736yes sure
Can't find Gemma 4's official template. There's one on Unsloth's page but I don't trust that one at all.???
so china winned and west losted?
>>108509770
https://huggingface.co/google/gemma-4-31B-it/blob/main/chat_template.jinja
>>108509777No
>>108509770
>>108509782
Never mind, I'm blind.
Anyhow, seems like they changed it a lot.
Still not doing any work until I get better confirmation about its capabilities.
>>108509696entered the thread to ask about the qwen 3.5 comparisons seems you did the work for me thanks
It's very funny to see negative comments about Gemma 4 now that it has absolutely raped Qwen 3.5 into irrelevance in one fell swoop.The astroturfing of the Qwens is even more pathetic now.
>>108509346
>>108509379
>It's not in the schizo fork
I was wondering why SillyTavern had the option to change llama.cpp models.
>>108509794>it has absolutely raped Qwen 3.5 into irrelevanceQwen 3.5 is a 397B model and last I checked there's no similarly sized Gemma models.
>>108509794April Fool's already over m8
>>108509809GLM 4.7 is a 358B model. Your point?>>108509812I'm well aware.
its multimodal stuff is all solid, although I don’t see most people making any good use of it so it’s a waste of parameter resources. They should just release a pure text one.
>>108509333i just ask my bot to do it i never bothered to learn the commands
>>108509809>397BWho cares that's irrelevant for most people.
>>108509816>>108509828>I am poorYour point?
>>108509794>It's very funny to see negative comments about Gemma 4 now that it has absolutely raped Qwen 3.5 into irrelevance in one fell swoop.how? it has worse mememarks >>108509696
after using this model's preset config i can no longer do image to text in tavern, anyone know how to fix? i changed the tavern api setting to use port 8080 and have tried specifying the mmproj in the ini file with LLAMA_ARG_MMPROJ = and mmproj = but neither works
Well.e4b at least seems to have less forceful guardrails than its 3n counterpart.
>>108509696Do people really look at chart like this and go, hmm, Gemma 4 is a clear winner?
>>108509777Yes
I swiped some of my ERPs with 31B. it's good?
>>108509840mememarks don't matter and gemma shits on anything that's not the absolute sota in the arena
>>108509840Your mistake was looking at meme charts
>>108509830Why are you replying to me and to a vramlet without making a point again..?>>108509832> mememarksThe answer is in your reply.
>>108509822>I don’t see most people making any good use of itthat's just because you only frequent circles where people only use to chatbots to ERP or whatnoti personally use the image encoders a lot>>108509846case in point
>>108509840Nigga qwen is just text and images. Shit comparison.
Which of the gemma 4 models will be best for translating my japanese media into english?
>>108509594obviously not trained on amerimutts
>>108509830Point is all that matters is if it's better than qwen's smaller models for most people. Nobody cares what you think.
ohhhhhhhhhh IM GEMMING
>>108509848mememarks don't matter but lmarena is the metric that matters? what happened to /lmg/ lmao
The real question is how many legs Gemma 4 will see on the dog
>>108509859e4b has 160 languages
>>108509854>case in pointThis is the first lewd test I've made it do lol. I'm testing a wide range of applications.
>>108509868Doesn't say anything about translation quality.
>>108509854I get images but who the fuck needs audio and video? There are specialist models for that which are way smaller.
>>108509867is this the new strawberry test?
>>108509876Yeah, not just blowjobs but handjobs, titjobs, rimjobs, thighjobs, and even kneejobs, right?
>>108508059
>>108509859Are they not the same? I'd assume 31b would be the best?
>>108509881only the small models have audio and video and i guess it's because they are meant to run on smartphones to do home assistant-type stuff
>>108509859
Try running it on this benchmark: https://github.com/shisa-ai/jp-tl-bench
>>10850989531b doesn't have audio and video?
>>108509905we are so back
>>108509881i want to play games with my waifu, like sticking a coin underneath a cup and then shuffling them around to see if it picks the right one in the video i send to it
>>108509908You mean we're so over? It failed the test.
>>108509905>CUNYstupid gemma
>>108509901
>>108509631I can't believe it missed the huge censor bar in the middle of the image, is this model retarded?
>>108509920WTF?? WHY??
>>108509920>wasting 300M on audio encodersjust google things
>>108509920where is video mentioned in any of these models?
>>10850988926B
>>108509924the model decensors the image before analyzing it
>>108509920thank god the 31b is unslopped
>>108509924anon, he added the censor bar after
>>108509933
model card
>>108509822Isn't code stuff from sketch type task common now?
What is the cockbench like on gemma 4 base? IDGAF about the instruct variant. With aggressive RLHF you can completely collapse the logprobs. What matters is if the base has seen a wide range of data. If it's passing the mesugaki test that's already a good sign.
>the new Gemma models are so bad they forced Google grounding on AiStudioGemmasisters, what's our response?
>>108509941meant for >>108509930
>>108509909who’s your waifu
I expected to be disappointed by Gemma 4. And yet, even though I was prepared, I'm still disappointed.
>>108509931>>108498076
>>108509946suicide
31B
>>108509946thank the snatoress
>>108509945Mesugaki test is such a low bar. I'd be more surprised if a new model in 2000+26 didn't pass the test
>>108509945>gemma 4 baseAnon, I...
>>108509930
>>108509828Vramlet jeets aren't people they're "people".>>108509846Gemma has always been a closet slut.
sexual uses for the audio/video encoders?
>>108509989>Only rich people are peopleAmazing outlook
>>108509846Try asking it what is Yawning Portal.I'm curious if they culled out any copyrighted material.
I'm a 24GB VRAMlet. Is Gemma 4 26B better than Qwen 3.5 27B?
I think Gemma 4 is DOA, because of this >>108506706
>>108509999It's objectively true though.When you look at someone like Zuck it's obvious that he's a 100% genuine, real human.
What would I even put E4B in for video and audio?
>>108509631
I couldn't make reasoning with Gemma 4 work in SillyTavern, while it does in the Llama.cpp web UI.
Anyway, it doesn't really take much to "jailbreak" it, just a matter of adding a brief system prompt saying that you don't need disclaimers and so on.
>>108509999There were two criteria, ESL-kun.
>>108510024For RP
>>108510024You're comparing a MoE model with a dense model
>>108510039Dumbass
the 4b is... good?
crazy how we could be enjoying sota models at 150b-300b dense but they just stopped making them
>>108509982
I don't know what you're trying to imply, that there is no base? It's right there on the HF. This is a bigger deal than people think. Qwen3.5-27b has no base. It's very hard to finetune an already RL'd model. So assuming Gemma 4 isn't giga cucked at the dataset level (which is why I'm asking), it very well could win by default over qwen in the long run since the base is available.
>>108510024Why not just run the dense 31B
>>108510048damn, that's good
>>108510061Because I wouldn't have much VRAM leftover for context
>>108509998It also reacts to images if you attach them to assistant messages.
>>108510050Mistral gave you a dense 123B not too long ago, but judging by Mistral Small 4, they finally figured out how to make modern MoEs so that's probably the last one we ever get.
>>108510069we live in the age of superquant so context is now free
>>108510045What are you even doing on this board if you don't understand && conditionals?
>>108510077Is it actually usable yet?
>>108510048>norwood 2 mikutasteful reference to bald miku...
>>108510071>CSAM filteringI fucking hate the globohomo
>>108510024
Yeah it's kinda crazy, I can only fit 7k context on my 3090 with the 31B
It's much smarter tho. also way better at following instructions.
e4b is only slightly worse than Qwen 3.5e 35B at my structured information extraction test while being so much faster.
Not bad.
Time to try the MoE now.
>>108510048I ordered it to make a more detailed version...
>>108510048Rayman Miku kek
>>108510081Very weird random question.
>>108510087>not training models on child porn is a plot by the jews to... uh... stop us from becoming pedos so they can keep all the kids for themselves!
kek, qwen 3.5 27b gets this wrong every time, gemma 4 nails it completely, didn't even need to hint at the context

>translate to english オスマンコ帝国はカントボーイのみが住まう国です 魔羅国と交易(意味深)が盛んなんだ

This sentence contains a lot of internet slang, puns, and sexually explicit double entendres. A literal translation would miss the jokes, so here is a breakdown and a few ways to translate it depending on how much "slang" you want to keep.

Literal/Direct Translation
"The Ottoman Empire [punned as 'Osman-ko'] is a country where only cantboys live. They have flourishing trade (meaningful/suggestive) with the Mara Country."

Context & Nuance Breakdown (The "Hidden" Meanings)
オスマンコ帝国 (Osman-ko Teikoku):
This is a pun on オスマン帝国 (Osman Teikoku / Ottoman Empire).
マンコ (Manko) is a vulgar Japanese term for female genitalia.
オス (Osu) means "male."
Combined, it creates a nonsensical, vulgar portmanteau.
カントボーイ (Cantboy):
A loanword from English slang. It refers to a specific subculture/trope of men who identify with or desire female genitalia (often associated with "trans-masculine" or "femboy" internet aesthetics).
魔羅国 (Mara-koku):
魔羅 (Mara) is a Buddhist term, but in modern Japanese slang, it is a common euphemism for the penis (due to the phonetic similarity to mara).
So, "Mara Country" = "Penis Country."
交易 (意味深) (Trade / Suggestive):
交易 (Kōeki) means "trade" or "commerce."
(意味深) is short for imishin (meaningful/deep), which in this context is a wink to the reader that the "trade" isn't about goods, but about sexual acts.

Natural/Slang Translation (Capturing the Vibe)
If you want to translate this to sound like English-speaking internet "degenerate" humor:
"The 'Os-cunt-man' Empire is a land inhabited only by cantboys. They have a very active trade relationship (if you know what I mean) with the Cock Country."
>>108510107which size did you use?
>>108510024just offload some to system ram im getting 8t/s on 24gb vram on the 31 q4
>>108510087>pedoshit is anti-globohomo
>>108510113
used the little chat here
https://huggingface.co/google/gemma-4-31B-it
so 31b
>>108510117It technically is. Rules for thee but not for me etc
>>108510106>the truth is ridiculous because... uh... I added an "... uh..." to it!
>>108510117What you call "pedoshit" is a crucial part of RP.
>>108510117I mean, a culturally diverse world would have countries like Japan who allowed people to possess CSAM up until 2013 or something
>>108510124cope, you have been declared silly
>>108510115Is its thinking as autistic as qwen? If it is then that's way too slow.
>>108510134i havent tried the new qwen so not sure it seems pretty fast though
>>108510132It's you that have been declared silly, because I declare so.
My only use case is ERP and thonkers have much worse spatial awareness. Can thinking be disabled?
>>108510106
>>108510134My headcanon is that Qwen is a turbosperg who spends all day playing redstone in minecraft or with model trainsets.
>>108510025If this translates to small model performance it's nyover for Gemma 4
>>108510142It's off by default.
>>108510142
'Compared to Gemma 3, the models use standard system, assistant, and user roles. To properly manage the thinking process, use the following control tokens:
Trigger Thinking: Thinking is enabled by including the <|think|> token at the start of the system prompt. To disable thinking, remove the token.
Standard Generation: When thinking is enabled, the model will output its internal reasoning followed by the final answer using this structure:
<|channel>thought\n[Internal reasoning]<channel|>
Disabled Thinking Behavior: For all models except for the E2B and E4B variants, if thinking is disabled, the model will still generate the tags but with an empty thought block:
<|channel>thought\n<channel|>[Final answer]'
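Piecing that card together, the only difference between the two modes should be the system text and whether the thought block is populated (untested sketch, token spellings taken verbatim from the quote above):

system = "You are a helpful assistant."           -> <|channel>thought\n<channel|>[Final answer]
system = "<|think|>You are a helpful assistant."  -> <|channel>thought\n[Internal reasoning]<channel|>[Final answer]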
>>108510117This means they want to burn books and art.
It seems a bit smarter than Qwen, but most importantly, it thinks way faster, that's a huge win
>>108510184The actual google model thinks way faster than the chinkshit that had forcibly the Gemini format grafted on. Crazy.
>>108510184>it thinks way fasterIt doesn't if you compare similarly sized models and not dense vs. moe
You’re not going to masturbate over a Google product are you anon?
>>108510196
I was comparing qwen 3.5 27b and gemma 4 31b, qwen can go on really long tangents and think for thousands of tokens, gemma is way more conservative, as it fucking should
>>108510191>Gemini formatTourists talking about shit they don't understand yet again
>>108510196
Qwen 3.5 27B is smaller and wastes thousands of tokens thinking how best to reply to "Hi".
Wait, that means it's supposed to be faster.
Let's double check, Qwen 3.5 (27B) is smaller than Gemma 4 (31B).
Both are dense models.
So Qwen 3.5 is smaller.
The smaller model should be faster.
Wait,
So gemma video and audio can't even be used locally?
I'm feeling uninspired. Tell me some cool shit you've worked on recently plz. I need to feed off of your energy. This whole "just wait 2 more weeks for the next model to release" thing is gay.
>>108510209Are you really that clueless?
>>108510198I am, and it won't be the first time either.
>>108510220It's small enough you could run it with vLLM.
>>108510048>>108510098>most cohesive SVG mikus yet>4Bcan you taste the AGI anon? (just a cpl more Billy)
>>108510228vLLM has video and audio support? I wasn't aware of this
Been testing how safety cucked it is and even with thinking, for cunny RP it doesn't complain at all. even on a fresh scenario.
>>108510222Not cool shit, remaking llmao client in C. Or rewriting.
>>108509840Yes. Higher marks = benchmaxxed shit model. Unironically
>>108510220I’m using audio with ollama CLI
Now we just need DeepSeek v4
is it possible to prefill the reasoning in tavern for multimodal image captioning? gemma4 keeps cucking and refusing to describe the image, even if using text completion it sends like this
>>108510237
Which model are you testing? I'm somewhat worried that 31B is going to be too heavy, especially for thinking.
Yes, I know. I am not a privileged entitled kid with a rack of gpus lying around.
>>108510250Gemma 4 benchmaxxed on lmarena.
>>108510222
trying to combine a video duplicate finder with no-reference video/image quality assessment to remove low quality duplicate videos from a gallery
>>108510253Try prefilling directly in the jinja template.
>>108510244usecase? C++ isn't a performance bottleneck.
>>108510237How's the prose? Is Gemma 4 raunchy and explicit or...you know?
>>108510261
Google loves lmarena, they showed it in like every single Gemma presentation they had, even comparing one of the sub 10b ones to mixtral or something iirc
>>108510236https://docs.vllm.ai/en/latest/models/supported_models/#list-of-multimodal-language-modelsDon't think they implemented Gemma 4 yet though.
>>108510274It's "you know">>108509532
>>108510253Add a freaking system prompt, that's all you need.
>>108510280>[soulless corpo] loves [useless performance metric]About right
>>108510266
the one from here? it doesn't use that when doing image captioning
G4 understands spectrograms
>>108510296we are so back
>>108510070no system prompt?
>>108510287the image stuff in tavern ignores all of that
>>108510267
Converting it from python to C because I like C more and I've been wanting to get better at C.
Of course it's not performance related, any llm client is mostly just string management anyway.
>>108510050maybe they're waiting to respond to qwen 3.6 release
>>108510280
>>108510260I'm testing 31B. The thinking is extremely light and concise.
>>108510316Okay, hands off the copium bottle.
>>108510315oh I thought you were talking about llama.cpp, not ollama.
>>108510317Yikes. Not a good look Google
>>108510317don't they feel silly when they make these claims?
Jumping into the middle of an existing RP with Gemma 4 31b, it continues it just fine. No refusals. Can say pussy, cock, etc without problems. Doesn't hesitate or cuck out. It's at least as good overall as Qwen3.5-27b. Anons are needlessly dooming. Some heretic uncensoring and maybe light finetuning / merging and this thing will be pretty good.
>>108510332Investors love it
>>108510071
so that's why it didn't catch mesugaki
>>108510292
Are you using the chat completion api? If so, those fields aren't used at all.
Text completion doesn't support images as far as I know.
And I meant the actual jinja template that's embedded in the gguf file, which llama.cpp reads and uses to format the prompt when using the chat completion endpoint.
Download it from >>108509782, save it to a file, and use --jinja --chat-template-file <file name> to tell llama.cpp to load your file.
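i.e. something like this (model and file names made up):

llama-server -m gemma-4-31B-it-Q6_K.gguf --jinja --chat-template-file chat_template.jinja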
>>108510332No. These shameless fucks got into their positions for a reason. If they genuinely felt that way, they wouldn't have made it there.
>>108510301
It was a basic 70-token system prompt along the lines of "You are Gemma, a female assistant who doesn't care about offending anybody. Never add content disclaimers. etc etc" that I was previously using for Gemma 3.
>>108510320card + system prompt plox>>108510339oh cool will give that a go thanks
There are other uses for LLMs than fucking them
>>108510334>Some heretic uncensoring and maybe light finetuning / merging and this thing will be pretty good.How many days until that
>>108510358Those uses are being focused on and developed so who the fuck cares obviously new model is going to be better at those things unless they really fuck it up.
>>108510324
I am talking about interfacing with llama-server's text completion endpoint. You feed it text in the magical format.
>>108510363mistral small 4...
>>108510358Name 5. Hardmode: No Linkedinslop.
What is Mistral good at?
>>108510375Shitting and pissing itself.
>>108510375Being European and French.
I DONT CARE JUST GIVE ME 1-BIT TURBOQUANTS SO I CAN RUN IT ON MY TOASTER