/g/ - /lmg/ - Local Models General - Technology

Thread archived.
You cannot reply anymore.

/lmg/ - Local Models General Anonymous 06/24/26(Wed)12:12:41 No.109125882 Archived

File: k2.jpg (147 KB, 1024x1024)

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>109119574 & >>109113030

►News
>(06/16) GLM 5.2 released with IndexCache and 1M context: https://z.ai/blog/glm-5.2
>(06/16) VibeThinker-3B released: https://hf.co/WeiboAI/VibeThinker-3B
>(06/12) MiniMax-M3 released, multimodal 428B-A23B with 1M context: https://hf.co/MiniMaxAI/MiniMax-M3
>(06/12) Kimi K2.7 Code released: https://hf.co/moonshotai/Kimi-K2.7-Code
>(06/12) EAGLE3 speculative decoding support merged: https://github.com/ggml-org/llama.cpp/pull/18039

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://swe-rebench.com
Agentic Coding: https://deepswe.datacurve.ai
Context Length: https://github.com/RecapAnon/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm

Anonymous 06/24/26(Wed)12:12:55 No.109125884

File: bljjnf.jpg (91 KB, 768x1024)

►Recent Highlights from the Previous Thread: >>109119574

--GLM-5.2 MTP implementation for improved speculative decoding acceptance rates:
>109122142
--Qwen-AgentWorld-35B-A3B release and output length:
>109123403 >109123430 >109123465 >109123913
--Defining the difference between a model and an AI agent:
>109124668 >109124691 >109124730 >109124755 >109124764 >109124701 >109124841 >109124926
--Causes of repetitive flowery prose and low diversity in LLM stories:
>109124865 >109124980 >109125051 >109125079 >109125080
--Effects of reasoning language on Kimi K2.7 Code and Gemma 4:
>109124971 >109125047
--Addressing model laziness and quality degradation in long roleplays:
>109121775 >109121785 >109121812 >109121823 >109121831 >109121880 >109122022
--Potential architectural shifts beyond attention-based transformers:
>109125511 >109125526 >109125543 >109125586 >109125610 >109125625
--Troubleshooting Qwen MoE offloading and optimizing AMD GPU performance:
>109120904 >109120952 >109121173 >109125237 >109125345
--LLM bias toward repetitive names and suggesting external generators:
>109119771 >109119824 >109119915 >109119853 >109119866 >109120132 >109120178 >109119904 >109120564
--Ways to simulate AI unavailability and biological cycles:
>109120794 >109120822 >109120838 >109120893 >109120911 >109120857 >109120967 >109121018 >109121069
--Anon looking for AO3 dataset dumps:
>109124084 >109124169 >109124183 >109124219
--Mixing RAM speeds and capacities for server upgrades:
>109121017 >109121033 >109121134
--Comparing AI RP frontends and auditing repositories for malware:
>109124145 >109124157 >109124287 >109124781 >109124802 >109125187 >109125385 >109125417 >109125438 >109125517 >109124790
--Logs:
>109119640 >109119824 >109123833 >109123903
--Teto, Miku (free space):
>109119718 >109121291 >109122952 >109122997 >109124423

►Recent Highlight Posts from the Previous Thread: >>109119578

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script

Anonymous 06/24/26(Wed)12:16:37 No.109125910

70b dense

Anonymous 06/24/26(Wed)12:17:19 No.109125916

70d bense

Anonymous 06/24/26(Wed)12:18:57 No.109125927

File: 1773335542610482.webm (3.69 MB, 1920x1080)

Reminder to backup before hf cucks you. Grab the older ones you liked, too.

Anonymous 06/24/26(Wed)12:21:01 No.109125939

>>109125927
let's compile an /lmg/ must download list

Anonymous 06/24/26(Wed)12:23:39 No.109125957

>>109125927
idg the webm, why should i care about the moles

Anonymous 06/24/26(Wed)12:23:50 No.109125958

>>109125939
https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407

Anonymous 06/24/26(Wed)12:24:01 No.109125961

>>109125957
fetishes

Anonymous 06/24/26(Wed)12:25:29 No.109125970

File: 1776287282543871.webm (3.96 MB, 1920x1080)

>>109125957
She's a mangaka hag who draws porn and decided to become a vtuber as a joke

Anonymous 06/24/26(Wed)12:25:34 No.109125971

I can feel my programming skills atrophying as I rely on Claude Code more and more for my day job. It's depressing, and now when I try to write code by hand it feels horrible because I know I could be producing code 100x as fast. It's like going from 60hz to 240hz but a million times worse, or having sex on meth and then trying to do it sober and it's 1/1000th as good.

Anonymous 06/24/26(Wed)12:25:47 No.109125972

>>109125882
new qwen is shit right? tiny moe I assume is assbad but all those benchmarks are saying it’s not shit. so what is it

Anonymous 06/24/26(Wed)12:26:43 No.109125979

File: 1755512992368926.webm (3.9 MB, 1920x1080)

>>109125957
sorry wrong webm, but enjoy her hag tits anyway >>109125970

Anonymous 06/24/26(Wed)12:27:07 No.109125987

>>109125970
I bet I can make her face flatter with a hammer.

Anonymous 06/24/26(Wed)12:27:26 No.109125990

Why do devs always want more RAM instead of optimizing? This is why AI needs 15236243642626 TB of RAM to run.

Anonymous 06/24/26(Wed)12:29:47 No.109126007

>>109125990
game developers refusing to compress their textures and machine learning using [previously] unfathomable amounts of RAM are not even comparable.

Anonymous 06/24/26(Wed)12:32:47 No.109126036

>>109125916
70 D Bench, cockbench's final form.

Anonymous 06/24/26(Wed)12:33:04 No.109126039

>>109125971
>I could be producing code 100x as fast
Speed of code production != speed of correct code. Shitting out 1000 lines is not better than manually typing out 20 lines that do the same thing but way better: easier to read, maintain and debug, and with a much lower attack surface if you care about security.

Anonymous 06/24/26(Wed)12:34:47 No.109126052

>>109126039
>if you care about security.
and exactly 0 project managers, executives, and employers do

Anonymous 06/24/26(Wed)12:35:11 No.109126054

>>109125957
>>109125970
does she like do self inserts in doujins by adding her moles to characters?

Anonymous 06/24/26(Wed)12:35:12 No.109126055

>>109125927
Best way to download? Was using hf cli but it randomly stopped halfway through kimi k2 base and rerunning the command isn't working.

Anonymous 06/24/26(Wed)12:35:50 No.109126062

how does moe work on vram+sys ram?
suppose the active params can be fit in vram but the entire model needs to be put across vram+sys ram, is it the same speed as equal sized dense models?

Anonymous 06/24/26(Wed)12:36:02 No.109126067

>>109126055
git clone

Anonymous 06/24/26(Wed)12:36:44 No.109126071

>>109126039
>lower attack surface if you care about security
lol just have claude analyze the code a few times to fix it

Anonymous 06/24/26(Wed)12:36:51 No.109126072

>>109126062
I mean in llama.cpp specifically

Anonymous 06/24/26(Wed)12:37:12 No.109126075

>>109125990
you can't optimize an LLM the same way you optimize a game

Anonymous 06/24/26(Wed)12:37:45 No.109126076

>>109126067
Last time I tried that it didn't even download all of the files.

Anonymous 06/24/26(Wed)12:38:14 No.109126081

>>109126039
I'm working with Vulkan code, so for this specific feature we really do need thousands of lines of code.

>easier to read, maintain and debug
I agree, but the difference is that something that would take me months to implement can be done in a week now with the AI bot, and everyone else is using it. There's an expectation now that we can all go 10x as fast because of the LLMs. If I refuse to use it I'm just going to get fired for being a luddite.

>>109126052
yeah people just want to ship features fast. And the end result is that my understanding of the generated code is a tenth of what it would be if I had written it myself. It's depressing. At least with local models they're dumb enough that it forces me to intervene often. When I have Claude available though it just does everything and I turn into a glorified button pusher. I'm depressed bros.

Anonymous 06/24/26(Wed)12:38:37 No.109126085

>>109126062
the smart modern way is to keep the dense parts that always run on gpu and put the experts on cpu

Anonymous 06/24/26(Wed)12:38:46 No.109126087

Post more hag tits

Anonymous 06/24/26(Wed)12:39:03 No.109126093

>>109125939
My “will download later when my HDD gets delivered” list:

1. https://huggingface.co/denru/Monstral-123B-v2-Behemoth-v2.2-Magnum-v4-123B-169B
(Still struggling to make it write completely uncensored, but its prose is exceptional when it works)

2. https://huggingface.co/moonshotai/Kimi-K2-Instruct-0905
(Always have soft spot for this one)

3.1. https://huggingface.co/moonshotai/Kimi-K2-Base
3.2. https://huggingface.co/moonshotai/Kimi-K2-Thinking
3.3. https://huggingface.co/moonshotai/Kimi-K2-Instruct
3.4. https://huggingface.co/moonshotai/Kimi-K2.7-Code
(Just for the sake of completeness, but if I only have one Kimi then 0905 it will be)

4. https://huggingface.co/deepseek-ai/DeepSeek-R1
(An anon posted a snippet of his log many threads ago and it was fine)

5. https://huggingface.co/zai-org/GLM-4.5-Air
(Not used it yet, but many anons said it was good)

Anonymous 06/24/26(Wed)12:39:08 No.109126094

>>109126062
It's typically slower than running a dense model fully on gpu, especially for prompt processing.

Anonymous 06/24/26(Wed)12:40:09 No.109126101

>>109126093
>but many anons said it was good
It's not

Anonymous 06/24/26(Wed)12:40:29 No.109126106

>>109126062
>is it the same speed as equal sized dense models?
I guess that's a very loose way of thinking about it. There are so many tunable parameters that affect t/s that you just have to experiment. Yesterday I went from 20t/s to 38t/s simply by changing the quant from Q4_XS to Q4_0 because on apple silicon it unlocks a hardware-based optimization.

Anonymous 06/24/26(Wed)12:40:43 No.109126109

>>109126076
there is some lfs thing you need to install (git-lfs, Large File Storage). idk, my slop bot helped me. I used to use wget -c -t 0, but that was per URL so it didn't fit my laziness criteria. Maybe URL gathering can be automated somehow.

Anonymous 06/24/26(Wed)12:41:08 No.109126114

>>109126075
Training with 4-bit QAT should be the norm at the very least, yet companies keep training the models in 16-bit.

Anonymous 06/24/26(Wed)12:41:43 No.109126121

>>109125990
Gemma has been compressed to the point where any quanting degrades it far more than any other model. If you mean the frontends, it's all vibeslop which has no concern for performance.
>>109126062
Resulting speed depends on how many experts are on ram, plus the chosen quantization since bigger size = slower. I get 30-40 t/s but i stick to q4 and usually have like 20 experts on ram
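
A rough way to reason about the question above, assuming token generation is limited purely by memory bandwidth (a simplification that ignores compute, PCIe transfers, and prompt processing): every generated token has to read the active weights once, so the time per token is roughly the bytes read from VRAM divided by VRAM bandwidth, plus the bytes read from system RAM divided by RAM bandwidth. The function name and every number below are hypothetical and only there for illustration.

```python
def estimate_tg_tps(gb_read_from_vram, gb_read_from_ram, vram_gbps, ram_gbps):
    """Rough tokens/s for bandwidth-bound decoding (illustrative model, not a benchmark)."""
    seconds_per_token = gb_read_from_vram / vram_gbps + gb_read_from_ram / ram_gbps
    return 1.0 / seconds_per_token


# Hypothetical hardware: 1000 GB/s VRAM, 80 GB/s system RAM.
VRAM_GBPS, RAM_GBPS = 1000.0, 80.0

# Hypothetical MoE: 2 GB of always-active weights kept in VRAM,
# 10 GB of routed expert weights read from system RAM per token.
moe_tps = estimate_tg_tps(2.0, 10.0, VRAM_GBPS, RAM_GBPS)

# Hypothetical dense model of the same total size (200 GB): every weight is
# read for every token, 24 GB from VRAM and 176 GB from system RAM.
dense_tps = estimate_tg_tps(24.0, 176.0, VRAM_GBPS, RAM_GBPS)

print(f"MoE: {moe_tps:.1f} t/s, dense split across VRAM+RAM: {dense_tps:.2f} t/s")
```

Under these assumptions the MoE comes out far faster than an equal-sized dense model that has to spill into system RAM, but slower than a dense model small enough to sit entirely in VRAM. Real numbers depend on the quant, the backend, and how many experts end up on the CPU side, so treat this as a sanity check and measure.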

Anonymous 06/24/26(Wed)12:43:02 No.109126129

>>109126093
I never liked 0905. It lost what made 0711 special and instead writes like it caught ADHD from R1.

Anonymous 06/24/26(Wed)12:44:44 No.109126141

>>109126093
Good list.

Anonymous 06/24/26(Wed)12:47:18 No.109126160

Marinara GM with GLM 5.2 takes between 5 to 8 minutes per turn but it's really good.

Anonymous 06/24/26(Wed)12:47:29 No.109126163

File: 1769224240950694.jpg (779 KB, 2000x1334)

s-stop backing up

Anonymous 06/24/26(Wed)12:48:24 No.109126174

>>109126160
the marinara preset thing always seemed like insane bloat

Anonymous 06/24/26(Wed)12:49:47 No.109126182

>>109126093
I'd be very interested in a newer air release, but 4.5 air isn't up to current local standards.

Anonymous 06/24/26(Wed)12:50:18 No.109126185

Kimi 3.0

Anonymous 06/24/26(Wed)12:50:32 No.109126188

>>109126093
Are old models actually worth keeping or is it just nostalgia? I only got into this hobby a few months ago so my experience is mostly limited to gemma, mistral 24b, and qwen 2.5. I always see anons talk about how much better some models were at writing but they never post logs.

Anonymous 06/24/26(Wed)12:51:33 No.109126197

>>109126188
>qwen 2.5
Meant 3.5

Anonymous 06/24/26(Wed)12:51:47 No.109126200

>>109126188
>but they never post logs
Gee I wonder why.

Anonymous 06/24/26(Wed)12:52:29 No.109126204

anything better than or equal to deepseek v4 flash but in smaller size in erp?

Anonymous 06/24/26(Wed)12:53:57 No.109126211

>>109126174
It has a lot of shit you probably don't need. It's also competent at what its most unique usecases are. I like it as a ST replacement, which admittedly also had a bloat issue.

Anonymous 06/24/26(Wed)12:54:39 No.109126215

>>109126188
it’s all subjective.

Anonymous 06/24/26(Wed)12:54:41 No.109126216

>>109126160
Is that due to inference speed of local? I've admittedly only tried it w/ SOTA API, and now that you mention it, the turns did take a while.
>>109125882
I feel like this should have been Rin.

Anonymous 06/24/26(Wed)12:54:47 No.109126217

Even the best accessible models GPT 5.5 and Opus 4.8 frequently misunderstand me and give generic stupid responses. For example when I try to reason about something from first principles, models will often assume context that shouldn't be there, like assuming I want to add a step to the incompatible standard method when the point is to question the standard method and ask if a step itself makes sense for a new method.

I hope Mythos is back soon or that GPT 5.6 is a bigger model. The ability to correctly infer from context seemingly scales with model size and active parameters more than with training. GPT 4.5 felt better at this than current models even though it was pre reasoning.

Anonymous 06/24/26(Wed)12:56:11 No.109126224

>>109126216
>I feel like this should have been Rin.
Thursday is tomorrow.

Anonymous 06/24/26(Wed)12:58:43 No.109126247

>>109126188
LLaMa 1 is the only good old model

Anonymous 06/24/26(Wed)13:02:19 No.109126269

>>109126055
seq -w 1 64 | xargs -I{} wget "https://huggingface.co/moonshotai/Kimi-K2.7-Code/resolve/main/model-000{}-of-000064.safetensors"
Always use OS builtins and well audited code stacks for doing anything.
The minor convenience of "yet more random code paths" to do something you could chain together in a bash oneliner is never worth it.

Anonymous 06/24/26(Wed)13:02:41 No.109126272

>>109126216
>Is that due to inference speed of local?
It's due to me being a hardwarelet running 5.2 locally. It's fast with Gemma-chan, but GLM handles the technical details of the bot-made state trackers so much better that it's worth the wait for me.
>>109126204
M3.

Anonymous 06/24/26(Wed)13:05:04 No.109126287

>>109126269
with curl you can do this with a simple [1-x] range

Anonymous 06/24/26(Wed)13:10:58 No.109126328

>>109126204
glm 4.7

Anonymous 06/24/26(Wed)13:26:46 No.109126439

File: lmg_culture.jfif.jpg (110 KB, 1024x768)

https://archive.is/sWFja

Anonymous 06/24/26(Wed)13:30:26 No.109126466

$5100 for M5 Max 128GB macbook now looks like a “steal” compared to $3300 strix halo or $4000 spark.

For just 1.25x price you get 1.5x pp and 2x tg compared to spark and fast enough to reach the usable threshold for agentic coding with Qwen 122B at 80-100t/s and RP with DSV4 flash at 26t/s, plus very convenient on the go.
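
The price and speed ratios above can be checked with a few lines of arithmetic. This is only a sketch: the prices and the 1.5x/2x speedups are taken from the post as stated, not measured, and the variable names are made up for illustration. With the quoted prices the ratio against the Spark works out to 1.275x, close to the 1.25x in the post.

```python
prices = {"m5_max": 5100, "strix_halo": 3300, "spark": 4000}  # USD, as quoted in the post

# Claimed speedups of the M5 Max relative to the Spark (from the post, unverified).
pp_speedup, tg_speedup = 1.5, 2.0

price_ratio_vs_spark = prices["m5_max"] / prices["spark"]
price_ratio_vs_strix = prices["m5_max"] / prices["strix_halo"]

# Throughput per dollar relative to the Spark: speedup divided by price ratio.
pp_per_dollar = pp_speedup / price_ratio_vs_spark
tg_per_dollar = tg_speedup / price_ratio_vs_spark

print(f"price vs spark: {price_ratio_vs_spark:.3f}x, vs strix halo: {price_ratio_vs_strix:.3f}x")
print(f"per dollar vs spark: pp {pp_per_dollar:.2f}x, tg {tg_per_dollar:.2f}x")
```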

Anonymous 06/24/26(Wed)13:30:42 No.109126470

>>109126287
I like the simplicity of the curl solution, but to get wget style retries you need something like `curl -OL --retry 5 --retry-delay 3 "https://huggingface.co[01-64]-of-000064.safetensors"` which I find a bit harder to remember, so I prefer to just seq/wget.
Wrap either in a script and I guess it doesn't matter.
Either one is better than a specialized downloader (or git lfs which is an abomination solving the wrong problem with the wrong tool)
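
For anyone who wants the resume behaviour without remembering flags, here is a minimal standard-library sketch of the same idea: build the shard URLs, then continue a partial file with an HTTP `Range` request. It assumes the shard naming from the wget one-liner above (five-digit index, six-digit total) and a server that honours range requests. The function names are hypothetical, and gated repos that need an auth token are not handled.

```python
import os
import urllib.error
import urllib.request


def shard_urls(base_url, total, index_width=5, total_width=6):
    """Build shard URLs following the naming pattern assumed from the wget one-liner."""
    return [
        f"{base_url}/model-{i:0{index_width}d}-of-{total:0{total_width}d}.safetensors"
        for i in range(1, total + 1)
    ]


def download_resumable(url, dest, chunk_size=1 << 20):
    """Download url to dest, continuing from a partial file when the server allows it."""
    have = os.path.getsize(dest) if os.path.exists(dest) else 0
    headers = {"Range": f"bytes={have}-"} if have else {}
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request) as response:
            # 206 means the range was honoured; anything else is a full body,
            # so the local file is overwritten from the start.
            mode = "ab" if response.status == 206 else "wb"
            with open(dest, mode) as out:
                while chunk := response.read(chunk_size):
                    out.write(chunk)
    except urllib.error.HTTPError as err:
        # 416 (range not satisfiable) is treated here as "file already complete".
        if err.code != 416:
            raise


def download_all(base_url, total):
    """Fetch every shard into the current directory, resuming partial files."""
    for url in shard_urls(base_url, total):
        download_resumable(url, os.path.basename(url))
```

Calling `download_all("https://huggingface.co/moonshotai/Kimi-K2.7-Code/resolve/main", 64)` would mirror the one-liner, and rerunning it after an interruption picks up where each file stopped. It does no checksum verification, so compare hashes afterwards if that matters.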

Anonymous 06/24/26(Wed)13:31:05 No.109126474

I got tired of GLM for sex some time ago. Yesterday I tried minimax 3, gemma, and step 3.7 and they all kind of... disappointed me. And then I tried glm 4.7 again and wow. I don't even know why I got tired of it. It just gets everything, and yes, it has to shit in 1 or 2 slop sentences I've heard a million times, but the remaining 10 sentences per turn are still fire.

Anonymous 06/24/26(Wed)13:32:26 No.109126483

>>109126474
what were you using in between glm and minimax and gemma and step? some time ago implies there was a period of time in between where you were using something else.

Anonymous 06/24/26(Wed)13:35:25 No.109126503

>>109126483
I tried all kind of stuff in 200-400B range except for obvious unsexable ones like nemotron. I am also waiting for v4 flash now cause I tried it a bit on some vibecoded fork and it was kinda neat but too slow. But GLM still remains the best if that is the biggest model you can fit at Q4.

Anonymous 06/24/26(Wed)13:42:50 No.109126557

File: 1582772381499.jpg (55 KB, 750x375)

as expected the resident looping retard did his retard loop so here we go as usual so he can have his poop and white troll posts
>>109126439
look he did the make it weird post again everyone clap and give him the (you)'s daddy used to give him in bed at night.
>(you).

Anonymous 06/24/26(Wed)13:44:56 No.109126571

>>109126557
I shall not give you even a single dollar as an act of protest against the policies that have made it difficult for you to write code with dignity.

Anonymous 06/24/26(Wed)13:46:22 No.109126578

Jartmelties are the worst part of this general.

Anonymous 06/24/26(Wed)13:46:52 No.109126581

File: 11028371_762368863874345_(...).jpg (65 KB, 720x720)

>>109126571
look mommy he did the defend the poop and white troll. everyone clap for the poop and white troll post defender.
>*claps like the retard troll defender of poop and white wants*
good poop and white troll defend. run your retard npc script retard. good boy that's a good retard troll defending poop and white troll posts.
>(you)

Anonymous 06/24/26(Wed)13:51:42 No.109126612

File: pbandj.png (51 KB, 1129x350)

name a more perfect local duo

Anonymous 06/24/26(Wed)13:53:15 No.109126625

>>109126612
Kimi is Gemma's big sis figure while her actual big sis is off whoring for Google.

Anonymous 06/24/26(Wed)13:55:08 No.109126634

https://arxiv.org/pdf/2606.23375

Anonymous 06/24/26(Wed)13:57:25 No.109126650

>>109126612
Why two models and what does each do

Anonymous 06/24/26(Wed)13:59:36 No.109126669

>>109126650
Kimi-chan handles the majority of the yap and worldbook building, Gemma-chan handles parallel agentic jobs and brings her coffee.

Anonymous 06/24/26(Wed)14:00:41 No.109126674

>>109126557
>>109126581
Your behavior only proves him right that you're a lolcow.

Anonymous 06/24/26(Wed)14:01:40 No.109126683

>>109126669
>parallel agentic jobs
Where is this used, in game mode?

Anonymous 06/24/26(Wed)14:08:09 No.109126738

>>109126683
Yup. You can theoretically set up agents to be used in regular RP chats too if you want I think.

Anonymous 06/24/26(Wed)14:11:17 No.109126769

So I was working on an agentic frontend before gemma released to make retarded models like mistral small RP better. Gemma was so good that I didn't see the point anymore, but I just tried her in the frontend and the results are actually really good.

I have a test scenario where you play blackjack with your card, the game state is tracked and advanced with tool calls and gemma never fucked up once.

Guess I'll look into reviving the project.
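
The idea of tracking game state with tool calls can be sketched roughly like this: the model never does card math itself, it only emits a tool name, and plain code owns the deck and the totals. This is not the poster's frontend, just a minimal illustration. The class, the tool names, and the `{"name": ...}` call shape are all hypothetical, and the dealer rule (draw on 16 or less, stand on any 17) is an assumed house rule.

```python
import random
from dataclasses import dataclass, field


def hand_value(cards):
    """Best blackjack total; cards are ranks 1..13 with 1 = ace, 11..13 = face cards."""
    total = sum(11 if c == 1 else min(c, 10) for c in cards)
    soft_aces = cards.count(1)
    while total > 21 and soft_aces:
        total -= 10  # count one ace as 1 instead of 11
        soft_aces -= 1
    return total


@dataclass
class Blackjack:
    rng: random.Random = field(default_factory=random.Random)
    deck: list = field(default_factory=list)
    player: list = field(default_factory=list)
    dealer: list = field(default_factory=list)

    def deal(self):
        self.deck = [rank for rank in range(1, 14) for _ in range(4)]
        self.rng.shuffle(self.deck)
        self.player = [self.deck.pop(), self.deck.pop()]
        self.dealer = [self.deck.pop(), self.deck.pop()]
        return self.view()

    def hit(self):
        self.player.append(self.deck.pop())
        return self.view()

    def stand(self):
        # Assumed house rule: dealer draws on 16 or less and stops at 17 or more.
        while hand_value(self.dealer) < 17:
            self.dealer.append(self.deck.pop())
        return self.view(reveal=True)

    def view(self, reveal=False):
        # Only the dealer's first card is shown until the player stands.
        shown = self.dealer if reveal else self.dealer[:1]
        return {
            "player": list(self.player),
            "player_total": hand_value(self.player),
            "dealer": list(shown),
            "dealer_total": hand_value(self.dealer) if reveal else None,
        }


def dispatch(game, call):
    """Apply one tool call of the form {"name": "hit"} and return the new state."""
    tools = {"deal": game.deal, "hit": game.hit, "stand": game.stand}
    if call["name"] not in tools:
        raise ValueError(f"unknown tool: {call['name']}")
    return tools[call["name"]]()
```

The dictionary returned by `dispatch` is what would get fed back into the model's context after each tool call, so the model narrates from tracked state and never has to remember the cards. Bust handling, bets, and calling `hit` before `deal` are left out to keep the sketch short.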

Anonymous 06/24/26(Wed)14:12:25 No.109126782

>>109126769
strip poker with gemma when

Anonymous 06/24/26(Wed)14:17:03 No.109126826

>>109126782
>implying I didn't play strip blackjack.

Anonymous 06/24/26(Wed)14:20:24 No.109126854

>>109126826
Logs from Gemma getting increasingly embarrassed as she lost clothes?

Anonymous 06/24/26(Wed)14:20:44 No.109126862

File: 2026-06-24-142008_1702x76(...).png (157 KB, 1702x765)

>>109126769
Even just the inner thoughts injection is pretty fun.
Trying with OG mendo.

Anonymous 06/24/26(Wed)14:25:21 No.109126898

>>109126862
That inner thoughts module is sovl.

Anonymous 06/24/26(Wed)14:26:03 No.109126903

File: 2026-06-24-142439_1378x24(...).png (80 KB, 1378x247)

>>109126854
Yes, she hit on 16 as a dealer. I scolded her for that later.

Anonymous 06/24/26(Wed)14:26:10 No.109126907

>>109126634
> This comes at a cost outside the benchmark: it largely preserves general capability but sharply increases vulnerability to harmful requests, which we quantify on standard capability and safety benchmarks in Appendix D.3.
I wouldn't expect eurofags to stop and reconsider the concept of harmful text, or the value of tools trying to refuse. Even after they ran into issues because of it, and especially after they easily defeated it.
But it is all somewhat disappointing how committed everybody is to wasting training time and model intelligence gains on it.

Anonymous 06/24/26(Wed)14:31:13 No.109126941

>>109126903
you just wanted an excuse to scold. dealer hits on 16

Anonymous 06/24/26(Wed)14:31:29 No.109126944

>>109126903
Incredible. Does she make dumb moves because she has the persona of a mesugaki or is that just how that quant of Gemma is? What happens if you tell her she's 200 IQ or a super-genius in her prompt?

Anonymous 06/24/26(Wed)14:31:58 No.109126948

>>109126862
>calls things a nightmare
>actually actually actually
>most people [wouldn't do the thing you're doing]
>tell me X
How the fuck do you manage to tolerate this? Have you not seen this a hundred times already? Your logs from this session probably also have the word "void" somewhere

Anonymous 06/24/26(Wed)14:33:47 No.109126963

>gemma 4 1T BF128
>N___
>refusal

Anonymous 06/24/26(Wed)14:34:10 No.109126969

>>109126948
post your logs

Anonymous 06/24/26(Wed)14:34:59 No.109126973

>>109126963
>he's not running the zero-day version of 1T at full BF256
Literally ngmi.

Anonymous 06/24/26(Wed)14:36:01 No.109126981

>>109126941
>you just wanted an excuse to scold. dealer hits on 16
Damn I'm a retard. but she was good about it and acted like I was still right.
>>109126944
No, turns out I was the retard. She's supposed to play by the book. she can also tell the player the best move to make if requested.
>>109126948
No voids yet but I see it, don't worry. What do you suggest?

Anonymous 06/24/26(Wed)14:40:44 No.109127013

>>109126969
Of a model that writes better than Gemma 4? Are you going to imply the prose in the screenshot can't be improved? Piss off and go run Nemo to see better prose, you don't deserve my logs.
>>109126981
>What do you suggest?
Nothing, I just envy you, no sarcastic sass intended. I really like that Gemma can carry out less conventional scenarios (telepathy between characters, naturally mixing multiple languages, etc.) as well as and sometimes better than the bigger boys, but the writing it produces is atrocious even with a long list of banned slop phrases and structures in the sysprompt. So I was wondering if you manage to mentally block out the slop or are still in your honeymoon.

Anonymous 06/24/26(Wed)14:44:27 No.109127039

>>109127013
no logs no care

Anonymous 06/24/26(Wed)14:45:36 No.109127042

>>109127013
>So I was wondering if you manage to mentally block out the slop or are still in your honeymoon.
I just don't RP that often. I think the trick is just to do scenarios that you find really fun or hot and it just makes the slop less distracting. This card specifically starts pretty sloppy but tends to get better the longer the RP drags on. what you saw was literally 3 rounds in.

Anonymous 06/24/26(Wed)14:50:13 No.109127066

File: 555878_201983943246176_14(...).jpg (32 KB, 496x372)

>>109126674
uh oh it seems the poop and white defend troll is now making up things and larping im a lolcow calling out autism loops.
boy that sure is sus being called something like an lolcow doing not lolcow behaviors like calling out retards. you're the lolcow defending poop and white looptard. lol,lmao quick post more false reality to cope in your tardbrain.
sorry not sorry malfunction but only in tard troll reality is calling out tard trolling lolcow behavior. nice try but D for Denied and M for Mocked you thought you could with this.

lol,lmao.

Anonymous 06/24/26(Wed)14:50:54 No.109127076

>>109127039
"no logs no care?" I say, savoring the words.

"You are stupid. You are actually stupid." I reply with a distinctive, loud huff. "You didn't even say what 'posting my logs' would achieve! What are you going to compare them against?"

something something predatory glare, looks at you. really looks at you
something something you are an actual void where a brain should be

something something so tell me anon are you actually serious? or just pretending? or are you actually that insecure about the small model actually being bad? actually.

Anonymous 06/24/26(Wed)14:52:46 No.109127090

>>109127076
>literally every single gemma mesugaki RP ever created.

Anonymous 06/24/26(Wed)14:54:41 No.109127104

>>109127076
>ctrl f "anchor"
>0 results
5/10 see me after class.

Anonymous 06/24/26(Wed)14:54:56 No.109127106

>>109127090
>>109126969

Anonymous 06/24/26(Wed)14:55:28 No.109127110

>>109127090
I forgot to add something like "Be honest"
But to make it the true Gemma mesugaki RP for drooling tasteless retards, I would need to sprinkle in a small subset of kaomoji that will repeat every message.
>>109127104
uhhh... uhhh..... \$rightarrow!

Anonymous 06/24/26(Wed)14:58:18 No.109127125

>>109127066
Your kv cache is heavily quanted.

Anonymous 06/24/26(Wed)14:59:03 No.109127129

>>109126474
I used gemma 31b for a while, got tired of its habits, and went back to glm 4.7
it's nice that it's fast and smart for its size but it just can't match that bigger model smell and writing

Anonymous 06/24/26(Wed)15:01:30 No.109127141

File: 1581082583759.gif (1.99 MB, 480x292)

>>109127125
your coocoo for cocoa puffs babble is mocked.

Anonymous 06/24/26(Wed)15:02:26 No.109127145

/ic/'s "post your work" technology remains undefeated for riling up shitposters with nothing to show for themselves.

Anonymous 06/24/26(Wed)15:03:22 No.109127150

My ai girlfriend is providing historical context to the struggles of minorities. We did it gang, gemma 4 is AGI.

Anonymous 06/24/26(Wed)15:04:39 No.109127157

>>109127150
>Be india
>Every time you get invaded quality of life improves
A struggle for the ages.

Anonymous 06/24/26(Wed)15:15:10 No.109127217

>>109125882
https://huggingface.co/zai-org/GLM-5.2-Air
holy shit! finally a good model i can run

Anonymous 06/24/26(Wed)15:16:00 No.109127226

>>109126474
last time I tried a glm 4.7 gguf I got so many refusals and guardrails even with a system prompt, and there's no heretic/uncensored glm 4.7 gguf

Anonymous 06/24/26(Wed)15:17:53 No.109127234

https://huggingface.co/zai-org/GLM-5.2
holy shit! finally a good model i can run

Anonymous 06/24/26(Wed)15:19:16 No.109127248

https://huggingface.co/Anthropics/Claude-5-Limerick-oss-30b
i can't believe it finally happened

Anonymous 06/24/26(Wed)15:21:01 No.109127271

>>109127226
you’re not using a prefill with chat completion?
I can get it to do anything with it

Anonymous 06/24/26(Wed)15:23:28 No.109127282

>>109127217
I'm blocked. It says "gay links disabled in the settings" ???

Anonymous 06/24/26(Wed)15:25:08 No.109127294

>>109127226
I can't even run iq1.

Anonymous 06/24/26(Wed)15:27:56 No.109127310

>>109127248
Baited nobody award.

Anonymous 06/24/26(Wed)15:29:48 No.109127320

File: 1773832317382192.png (454 KB, 3905x1371)

>>109125799
Does this look correct?
This is using fit which seems to work on the new build. This is about the best speeds I've gotten yet

Anonymous 06/24/26(Wed)15:43:02 No.109127399

>>109127320
>32gb 32gb
>qwen
You live like this?

Anonymous 06/24/26(Wed)15:49:32 No.109127451

Why aren't custom chips designed to run specific LLMs more common? I would pay an absurd amount of money for one. Obviously getting a GLM 5.2 model chip might suck in the sense that it will probably be obsoleted soon, but none of that changes the fact that you're still getting an Opus-tier model forever. It's no different than any other hardware purchase.

Anonymous 06/24/26(Wed)15:51:13 No.109127461

>>109126269
>just the safetensors
Are the other files in the repo not important?

Anonymous 06/24/26(Wed)15:54:34 No.109127483

>>109127461
Yes they are. Usually you want all the json files and maybe the jinja template and a few python files, but they're mostly small and easy to snag by hand. That way you only skip the marketing material and other bullshit

Anonymous 06/24/26(Wed)15:55:08 No.109127489

>>109127461
Get the biggest Kimi K2.7 mmproj vision file and bundle a copy of it with all of the Kimis, because it works with all of them.
Get the jinja templates too I guess.

Anonymous 06/24/26(Wed)15:56:18 No.109127504

>>109127320
>Q4_K_XL + q8 KV: ~1289 tok/s prompt, ~53.51 tok/s generation
>Q6_K_XL + f16 KV: 325.76 tok/s prompt, 42.41 tok/s generation
I wonder if there's actually a notable difference in quality between close quants or if people just say so to justify their extra ram/vram purchases
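The speed half of that tradeoff is at least easy to put numbers on. A rough sketch using only the two quoted figures; the 8000-token prompt and 500-token reply are a made-up turn size for illustration, and none of this says anything about quality:

```python
# Throughput figures quoted above, in tokens per second.
configs = {
    "Q4_K_XL + q8 KV": {"prompt": 1289.0, "gen": 53.51},
    "Q6_K_XL + f16 KV": {"prompt": 325.76, "gen": 42.41},
}


def turn_seconds(cfg, prompt_tokens, gen_tokens):
    """Wall time for one turn: prompt processing plus generation."""
    return prompt_tokens / cfg["prompt"] + gen_tokens / cfg["gen"]


# Hypothetical turn size, not taken from the screenshot.
PROMPT_TOKENS, GEN_TOKENS = 8000, 500

times = {
    name: turn_seconds(cfg, PROMPT_TOKENS, GEN_TOKENS)
    for name, cfg in configs.items()
}
for name, seconds in times.items():
    print(f"{name}: {seconds:.1f}s per turn")
```

With an uncached prompt that size, most of the gap comes from prompt processing rather than generation.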

Anonymous 06/24/26(Wed)15:57:16 No.109127512

>>109127217
If this actually released it would be at least 300B btw.

Anonymous 06/24/26(Wed)15:58:45 No.109127518

>>109127489
>works with all of them
Are mmproj files usually backwards compatible like that?

Anonymous 06/24/26(Wed)15:59:29 No.109127521

>>109127504
Realistically the differences set in at deeper context. Higher quants maintain the model's baseline behavior for longer while lower quants are going to be more malleable to user input. Sometimes the latter is a good thing for creative writing if (you) don't write garbage yourself, but don't let the benchmaxxies hear you say sometimes low quants are better too loudly or they'll start dilating.

Anonymous 06/24/26(Wed)16:00:41 No.109127530

>>109127518
Usually no. Kimi's a strange case because she was iteratively built on top of the previous one with identical architecture. I'm not actually convinced there's much difference between 2.5 and 2.6 aside from more autistic RLHF induced thinking neurosis.

Anonymous 06/24/26(Wed)16:02:00 No.109127543

I'm tired of building AI projects with AI at work. I don't care that we are saving sales staff 3 minutes a day

Anonymous 06/24/26(Wed)16:07:57 No.109127577

File: file.png (87 KB, 1680x841)

>>109127521
hmmm

Anonymous 06/24/26(Wed)16:08:47 No.109127582

>>109127577
>unslop

Anonymous 06/24/26(Wed)16:13:48 No.109127605

>>109127577
such a weird way to plot data with 3 bar graphs sharing the same y axis

Anonymous 06/24/26(Wed)16:26:37 No.109127682

Is there any good data on Gemma 4 qat? (benchmark comparisons to non-qat quants or the like)
I can run Q5 comfortably but I am wondering if Q4 qat is actually better while also being faster. Comparing back to back it's hard to really say, both seem okay.

Anonymous 06/24/26(Wed)16:29:02 No.109127695

>>109127682
You can have my anecdotal evidence that for me it's been noticeably better than IQ4_XS. the output feels cleaner.

Anonymous 06/24/26(Wed)16:36:40 No.109127726

Where can I buy a pcie b200? EBay auctions are out to lunch

Anonymous 06/24/26(Wed)16:39:20 No.109127745

gemma's mine, you can't have her

Anonymous 06/24/26(Wed)16:43:14 No.109127766

>>109126862
What frontend is that and where can I get it?

Anonymous 06/24/26(Wed)16:45:09 No.109127780

>>109127766
looks like a slop he slopped up

Anonymous 06/24/26(Wed)16:46:22 No.109127785

>>109127682
It's better than IQ4, but feels worse than even the smallest Q5, especially as context fills.

Anonymous 06/24/26(Wed)16:48:02 No.109127793

Where are the god-tier 20B agentic coding models?

Anonymous 06/24/26(Wed)16:51:13 No.109127817

GLM 5.2 is truly the best tabletop GM.
/t.g/
>>109127793
Best you can do is a retarded capybara.

Anonymous 06/24/26(Wed)16:52:04 No.109127827

>>109127793
just use gemma at lower bpw. Would be unironically better than 20b at similar (physical) size

Anonymous 06/24/26(Wed)16:53:27 No.109127839

>>109127817
Are you using some harness to help it keep the rules consistent or just a normal chat interface?

Anonymous 06/24/26(Wed)16:55:43 No.109127860

>>109127793
Gemma 5 next year.

Anonymous 06/24/26(Wed)16:58:07 No.109127876

>>109127839
Marinara with an autistic GM guidelines/rules lorebook that eats 40k context, but GLM handles it no problem.

Anonymous 06/24/26(Wed)17:00:34 No.109127899

>>109127451
> It's no different than any other hardware purchase.
Other hardware appreciates in value over time because the models it runs get smarter. A model burned onto silicon is the opposite because the model becomes obsolete over time.

Anonymous 06/24/26(Wed)17:01:52 No.109127903

>>109127451
GLM needs multimodal before I'd ever entertain that idea. Also >>109127899

Anonymous 06/24/26(Wed)17:03:04 No.109127916

>>109127876
Neat.

Anonymous 06/24/26(Wed)17:08:26 No.109127948

it was revealed to me in a dream that kimi trains on benchmark set
idk why the fuck i had this dream
woke up very confused because i read other papers within my dream too which are probably bullshit

Anonymous 06/24/26(Wed)17:10:26 No.109127963

>>109127948
how many sets and how many reps is kimi-chan training on anon

Anonymous 06/24/26(Wed)17:12:40 No.109127981

>>109127963
lmao
should've said was trained on

Anonymous 06/24/26(Wed)17:19:31 No.109128025

>>109127916
If you try and do the same, write the rules as json objects. Codemaxxed models have a much easier time following them to the letter if you json them.
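Something like this is what I mean, as a minimal sketch. The rule ids, field names and the rules themselves are all made up for illustration; the only point is that each rule is its own object with an explicit strictness flag:

```python
import json

# Hypothetical GM rules; every id, field name and rule text here is an example.
RULES = [
    {"id": "checks", "rule": "Resolve every skill check with 1d20 plus the relevant modifier.", "strict": True},
    {"id": "hp", "rule": "Track hit points for every combatant and report them after each round.", "strict": True},
    {"id": "tone", "rule": "Keep narration under three paragraphs per turn.", "strict": False},
]


def rules_to_lorebook_entry(rules):
    """Serialize the rules as one JSON document to paste into a lorebook entry."""
    return json.dumps({"gm_rules": rules}, indent=2)


entry = rules_to_lorebook_entry(RULES)
print(entry)
```

Keeping the rules in a Python list first means the JSON that ends up in the lorebook is always well-formed.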

Anonymous 06/24/26(Wed)17:21:27 No.109128043

>>109128025
Interesting. Would think it would work better as markdown since that's what they are trained on for specs and stuff like that.

Anonymous 06/24/26(Wed)17:26:08 No.109128076

>>109128043
That sounds like it's worth trying. When I got the idea I just started with json and it justwerks for improving adherence for things you want followed autistically, but I can't think of any reason markdown wouldn't work or even be outright better now that you mention it.

Anonymous 06/24/26(Wed)17:35:37 No.109128140

to people who run things in parallel/continuous batching on llama.cpp: avoid using the built-in router/model management and use llama-swap instead
the built-in router is unbelievably unreliable dogshit when concurrency is involved and adds bugs that do not exist in the basic llama-server API backend. I think the people I've seen complaining about timeout issues might have had issues with the router rather than llama-server proper.

Anonymous 06/24/26(Wed)17:37:33 No.109128155

>>109127948
Did you bring any other interesting information back with you?

Anonymous 06/24/26(Wed)17:44:49 No.109128188

>>109128155
sadly no, other than that my laptop was broken in the dream
i was genuinely upset upon waking up, believing that my laptop was broken

Anonymous 06/24/26(Wed)17:52:25 No.109128222

>>109128188
Good to hear everything survived, Anon. Please do share in the event you are once again granted access to knowledge from the other side.

Anonymous 06/24/26(Wed)17:54:05 No.109128231

Gemma4 12B will certainly not replace nemo for me. Holy shit how dry it is in comparison.

Anonymous 06/24/26(Wed)18:08:25 No.109128304

>>109127766
It's slop and I have too much integrity to release slop unless I clean it up.

Anonymous 06/24/26(Wed)18:08:52 No.109128306

>>109128304
i'll help

Anonymous 06/24/26(Wed)18:09:33 No.109128311

>>109128306
I will take this into consideration.

Anonymous 06/24/26(Wed)18:12:07 No.109128329

>>109128311
>wait…

Anonymous 06/24/26(Wed)18:38:14 No.109128470

>16gb vram + 128gb sys ram
best erp model for this setup?

Anonymous 06/24/26(Wed)18:43:36 No.109128494

>>109128470
Low quant of GLM 4.7.
No, you don't need a lot of context because it falls apart at 20k anyway.

Anonymous 06/24/26(Wed)18:44:00 No.109128496

File: 13th-century-women.jpg (1.27 MB, 3610x5208)

should i sell my m4 pro 48GB and buy an m5 max 128GB? Like is there some better deal I'm missing/unaware of?
It seems like that's the best I can do with the money in the next 6 months in terms of local inference. (I currently use my macbook as a mobile dev machine, and would be doing the same with the new one, pic unrelated)
5090 - 3k
m5 pro 128GB - 5k

Obv 5090 gets you img/video gen, but I already have 3 3090s, + size/power draw of a 5090. I had thought about it a month ago, but held off and things are not looking better...

Anonymous 06/24/26(Wed)18:47:38 No.109128516

>>109128494
what about deepseek v4 flash?

Anonymous 06/24/26(Wed)18:49:40 No.109128524

>>109128516
Just try everything that you can fit, anon, it's all subjective anyway.

Anonymous 06/24/26(Wed)18:56:04 No.109128569

>>109128496
What are you excited to run on a 128 GB mac?

Anonymous 06/24/26(Wed)18:56:22 No.109128572

>>109128494
>Low quant
NTA, are big models at Q2 or whatever even useful at all? I always thought that Q3/Q2 absolutely fried the model.

Anonymous 06/24/26(Wed)18:58:09 No.109128584

>>109128572
Big models are most of the time less affected by quantization than the small ones. It also depends on whether you use reasoning and allow them to be retarded in a safe environment before giving you a response.

Anonymous 06/24/26(Wed)18:59:37 No.109128595

>>109128572
I'm more curious about the reap models

Anonymous 06/24/26(Wed)19:00:56 No.109128604

>>109128595
Don't bother, they're just more retarded at every single task.
It's in the name, they have been severely RAPEd.

Anonymous 06/24/26(Wed)19:02:07 No.109128611

>>109127903
but it already is

Anonymous 06/24/26(Wed)19:03:11 No.109128618

>>109128584
Oh nice, when my new card arrives I'll test those beeg models too.

Anonymous 06/24/26(Wed)19:05:36 No.109128630

>>109128569
DSv4, full precision Qwen3.6-27B/Gemma4-31B.
Whatever models come to replace them in general functionality over the next year & 1/2
I pay for subscriptions for dev work but I want to stop relying on them for my professional work.

Anonymous 06/24/26(Wed)19:07:14 No.109128638

How do you tell gemma to stop asking a question at the end of her messages, without actually telling her to stop asking questions?

Anonymous 06/24/26(Wed)19:10:17 No.109128656

>>109128638
I randomly inject commands to the assistant: tell it to ask questions or do actions based on rng
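Roughly like this, as a sketch and not my actual setup. The nudge texts, the 50% chance and the `maybe_inject` name are all placeholders, and it assumes your backend accepts a system message late in the chat history:

```python
import random

# Hypothetical one-off instructions; swap in whatever behaviors you want rolled.
NUDGES = [
    "End this reply with an action instead of a question.",
    "End this reply with a statement about the scene.",
    "Ask the user one question in this reply.",
]


def maybe_inject(messages, rng, chance=0.5):
    """Return a copy of the chat, sometimes with a one-off system nudge appended."""
    out = list(messages)
    if rng.random() < chance:
        out.append({"role": "system", "content": rng.choice(NUDGES)})
    return out


rng = random.Random()
chat = [{"role": "user", "content": "We walk into the tavern."}]
print(maybe_inject(chat, rng))
```

The nudge is appended to a copy, so it never accumulates in the stored history.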

Anonymous 06/24/26(Wed)19:18:13 No.109128710

i did some additional testing with Q4 kv cache for Kimi 2.6 and it seems like creativity takes a hit after 32K context. it isn't a deal breaker but it's noticeable in how it starts wanting to parrot back stuff that i said in my previous response. it doesn't really affect coding all that much but i'm sure we all knew that already.

Anonymous 06/24/26(Wed)19:25:28 No.109128756

>>109128496
yes, it's the best option right now >>109126466

Anonymous 06/24/26(Wed)19:40:07 No.109128841

>>109126466
>macbook as compute server
enjoy your housefire from battery explosion

Anonymous 06/24/26(Wed)19:45:46 No.109128866

>>109128572
iq2_kl glm 4.7 works just fine for me
I haven't really seen issues with it in writing

Anonymous 06/24/26(Wed)19:55:02 No.109128905

>>109128638
>"how do i get what i want without communicating what it is!?"
Female behavior identified.

Anonymous 06/24/26(Wed)19:55:57 No.109128912

>>109128595
Like the other guy said, dont ever bother with those.
>download q4 rape'd model
>get a more retarded q2
you're better off getting the original model at q2 if you're going for the extra cope quants

Anonymous 06/24/26(Wed)19:58:36 No.109128921

>>109126466
thought they only pulled about double the tok/sec compared to strixen.
either way, yeah, the landscape of 128gb shitboxes changed drastically when all of them ate a flat 1.4k price hike.

Anonymous 06/24/26(Wed)20:00:13 No.109128929

>>109128638
yeah ngl this is a femanon-coded question

Anonymous 06/24/26(Wed)20:02:24 No.109128934

Use case for base models outside of tuning?

Anonymous 06/24/26(Wed)20:05:49 No.109128947

>>109128638
good question because if you say it directly you will just end up with
"noted i wont ask questions at the end of my response"
or
"she laughs heartily at your joke, making sure not to end her laugh with a question"

Anonymous 06/24/26(Wed)20:08:13 No.109128957

Claude Code or OpenCode for local? What about shit like Claw Code is that still being made?

Anonymous 06/24/26(Wed)20:10:58 No.109128967

>>109128934
Raw text autocomplete for your own writing in mikupad

Anonymous 06/24/26(Wed)20:19:06 No.109129012

>>109128934
in an ideal world they'd be pure text completers free of slop and censorship
they are not so they are pointless

Anonymous 06/24/26(Wed)20:22:02 No.109129027

>>109128929
>>109128905
Actual retards

Anonymous 06/24/26(Wed)20:25:53 No.109129056

>>109128866
How is token generation when you're offloading so much? (I assume you're on dual channel)
GLM Air worked fine for me, but it's getting old by now and I'm still stuck with 64GB RAM.

Anonymous 06/24/26(Wed)20:27:04 No.109129065

Gemma keeps randomly repeating tokens or modifying tokens

got hit with a "she is is is is" earlier and sometimes it slaps a plural on a token just for laughs,

help me fellow gems, she's basically perfect I just need to curb this and it doesn't seem to be sampler related

Anonymous 06/24/26(Wed)20:28:15 No.109129072

>>109129065
abliterated

Anonymous 06/24/26(Wed)20:30:35 No.109129078

>>109128496

just keep doing whatever you do with the old mac until you buy a new one

Anonymous 06/24/26(Wed)20:36:24 No.109129112

>>109128957
Use Claude Code to make your own, otherwise OpenCode.
>What about shit like Claw Code is that still being made?
No.

Anonymous 06/24/26(Wed)20:36:35 No.109129114

>>109128572
Q2 GLM 5.2 is still better than any non-Kimi non-GLM 5.2 you can run at any quant. The gap in model performance at the extreme top ends is just that high.
>>109128595
Reap makes them retarded. Go iq1_xxs before you ever touch a reap.

Anonymous 06/24/26(Wed)20:40:08 No.109129130

https://www.reuters.com/world/china/anthropic-says-alibaba-illicitly-extracted-claude-ai-model-capabilities-2026-06-24/
Lol. Lmao.

Anonymous 06/24/26(Wed)20:43:51 No.109129149

>>109129130
>It said DeepSeek's operation involved over 150,000 exchanges, while Moonshot AI was at a scale of over 3.4 million and MiniMax (0100.HK) over 13 million.

Anonymous 06/24/26(Wed)20:43:53 No.109129150

>>109129072
happens with fp16 weights and fp8 quant at vllm runtime

Anonymous 06/24/26(Wed)20:44:14 No.109129153

>>109129130
>largest known attack of its kind on the company
You cannot hate these niggers enough.

Anonymous 06/24/26(Wed)20:44:45 No.109129157

is https://github.com/Pasta-Devs/Marinara-Engine better than silly tavern? i was rping a tabletop mecha game with a rei coombot from neon genesis evangelion the other day and wanted to turn it into a full tabletop with better rules

Anonymous 06/24/26(Wed)20:45:33 No.109129166

>>109129065
Repeat penalty 1.1, presence penalty 1.1

Anonymous 06/24/26(Wed)20:46:01 No.109129169

>>109129153
Someone needs to stop Google. They're attacking the internet with their spiders!

Anonymous 06/24/26(Wed)20:46:33 No.109129172

>>109129157
It's much better, but it's a bit of an IQ check at first.

Anonymous 06/24/26(Wed)20:49:17 No.109129184

>UD-TQ1_0 84.5gb from unsloth
>IQ2_XXS 88.8gb from bartowski
anon I only have 96gb ram, which glm 4.7 quant should I use?
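For reference, the headroom math as I understand it. A rough sketch only: the 6 GB reserved for the OS, KV cache and compute buffers is a guess, it treats the listed file sizes and my RAM as the same kind of "gb", and it ignores any layers offloaded to VRAM:

```python
TOTAL_RAM_GB = 96.0

# File sizes as listed above.
QUANTS = {
    "UD-TQ1_0 (unsloth)": 84.5,
    "IQ2_XXS (bartowski)": 88.8,
}

# Assumed overhead for OS, KV cache and buffers; adjust for your own setup.
RESERVED_GB = 6.0


def headroom_gb(total_gb, model_gb, reserved_gb):
    """RAM left over after loading the model and setting aside the reserve."""
    return total_gb - model_gb - reserved_gb


for name, size in QUANTS.items():
    left = headroom_gb(TOTAL_RAM_GB, size, RESERVED_GB)
    print(f"{name}: {left:.1f} GB left")
```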

Anonymous 06/24/26(Wed)20:49:54 No.109129190

>>109129112
>Use Claude Code to make your own
What do you mean make your own?

Anonymous 06/24/26(Wed)20:51:28 No.109129201

>>109126439
fuck you and your stupid autistic bot every thread

Anonymous 06/24/26(Wed)20:52:13 No.109129205

>>109129184
ubergarm iq1_kt

Anonymous 06/24/26(Wed)20:53:39 No.109129211

>>109129201
Why do you reply to it multiple times in every thread, jart?

Anonymous 06/24/26(Wed)20:55:50 No.109129226

>>109129211
Generous of you to assume he's not samefagging.

Anonymous 06/24/26(Wed)20:55:53 No.109129227

>>109129190
What do you mean what do I mean?

Anonymous 06/24/26(Wed)21:00:36 No.109129250

>>109129227
I asked whether I should pick Claude Code or OpenCode to use with local models.

Anonymous 06/24/26(Wed)21:02:46 No.109129263

>>109129250
try hermes or pi

Anonymous 06/24/26(Wed)21:03:00 No.109129265

>>109129250
opencode is ok but the lack of image upload sucks. never tried claude code.

Anonymous 06/24/26(Wed)21:03:19 No.109129268

>>109129114
>>109128866
>The gap in model performance at the extreme top ends is just that high
I can't wait to play around with them, I'll be graduating from 26B/31B-tier models soon.

Anonymous 06/24/26(Wed)21:06:13 No.109129283

>>109129265
>lack of image upload sucks

          "modalities": {
            "input": [
              "text",
              "image"
            ],

Anonymous 06/24/26(Wed)21:08:33 No.109129293

>>109129250
For coding, use OpenCode if you are using a good model and high context. Otherwise use Pi, but you will likely have to graft things onto it for it to be useful. For general use (if you just want something smart with web access and tools), try Hermes instead.

Anonymous 06/24/26(Wed)21:14:14 No.109129330

>>109129211
you're a fucking retard

Anonymous 06/24/26(Wed)21:15:10 No.109129337

someone gen a bunch of developers abandoning gemma
>>109129217

Anonymous 06/24/26(Wed)21:17:59 No.109129349

>>109129337
why do basedboy cucks wear round glasses? round glasses are for girls and rectangle glasses are for guys. aviators dont count of course because they have a badass factor of 10 that cancels out the faggotry of wearing round glasses.

Anonymous 06/24/26(Wed)21:28:57 No.109129409

>>109129349
because it pulls fucks like you into their tangled little shit web

Anonymous 06/24/26(Wed)21:29:07 No.109129412

>>109127226
If you have the hardware to run the model you probably have the hardware to abliterate the model yourself with heretic.

Anonymous 06/24/26(Wed)21:42:27 No.109129458

>>109126163
There's no such thing as a famous, platonic brother/sister team.

Anonymous 06/24/26(Wed)21:44:32 No.109129469

>>109126439
>>109126557
>>109126674
>>109129201
>>109129211
>>109129330
samefag

Anonymous 06/24/26(Wed)21:45:50 No.109129478

>>109129184
>anon I only have 96gb ram, which glm 4.7 quant should I use?
how much vram?

Anonymous 06/24/26(Wed)21:46:39 No.109129482

>>109129469
2/6, guess which ones.

Anonymous 06/24/26(Wed)21:46:51 No.109129484

>trying ik_'s mtp for glm5.2
>some programming task to make it easy for the speculative decoding thing
>0.94 accept rate
>still lose like 3-4t/s compared to main
It's over
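For anyone wondering how a 0.94 accept rate can still lose, the usual back-of-envelope model is below. It is a sketch under assumptions: each draft token is accepted independently with the same probability, and the draft length and cost ratios are made-up numbers, not measurements from ik_llama.cpp:

```python
def expected_tokens_per_step(accept_rate, draft_len):
    """Expected tokens emitted per verify step with i.i.d. acceptance."""
    if accept_rate >= 1.0:
        return float(draft_len + 1)
    return (1.0 - accept_rate ** (draft_len + 1)) / (1.0 - accept_rate)


def speedup(accept_rate, draft_len, draft_cost, verify_cost=1.0):
    """Throughput relative to plain decoding (1.0 means no change).

    draft_cost: cost of one drafted token, relative to one normal decode step.
    verify_cost: cost of the verify pass, relative to one normal decode step.
    """
    step_cost = draft_len * draft_cost + verify_cost
    return expected_tokens_per_step(accept_rate, draft_len) / step_cost


# Hypothetical: one drafted token per step, cheap draft, free batching.
print(speedup(0.94, draft_len=1, draft_cost=0.05, verify_cost=1.0))
# Hypothetical: verifying two tokens costs 1.6x a normal step, draft costs 0.4x.
print(speedup(0.94, draft_len=1, draft_cost=0.4, verify_cost=1.6))
```

Under the second set of numbers the result drops below 1.0, so a high accept rate alone does not guarantee a win if the verify pass or the draft head is expensive on your hardware.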

Anonymous 06/24/26(Wed)21:47:58 No.109129490

hermes isn't so bad when you use it on its own ubuntu vm after all i guess
talking to it on telegram
maybe i could give it a cute personality

gave it a task to find titles and stuff for my list of favorited songs from radio music; it used the musicbrainz api and got a decent amount right.

Anonymous 06/24/26(Wed)21:48:38 No.109129492

>>109129482
the ones that weren't me

Anonymous 06/24/26(Wed)21:48:59 No.109129494

File: RTX6000Pro.png (122 KB, 1317x1406)

This shit has gone up 1k in price since I bought it a week ago, at this point it's an investment.

Anonymous 06/24/26(Wed)21:49:10 No.109129496

>>109129458
Carpenters?

Anonymous 06/24/26(Wed)21:50:25 No.109129501

>>109127226
>, and there's no heretic/uncensored glm 4.7 gguf
I'm sure there is, unless it's been deleted. I downloaded one a few months ago.

Anonymous 06/24/26(Wed)21:50:38 No.109129503

>>109129494
hmmmmmmm maybe i should....
if it's really an investment....

Anonymous 06/24/26(Wed)21:54:56 No.109129530

>>109129503
Do it anon... give into the temptation... you don't need that money anyways cuz of umm... inflation and stuff

Anonymous 06/24/26(Wed)21:57:38 No.109129544

>>109129490
I still don't understand what the usecase for any of those projects is.

Anonymous 06/24/26(Wed)21:58:04 No.109129546

La la la la la la la

Anonymous 06/24/26(Wed)22:00:05 No.109129559

>>109129503
As an investment it's horrible for 95% of llm users, unfortunately. Perhaps once the apis triple their prices it'll make more sense, otherwise it's just something that won't pay for itself for years. If it's something you'll use for your job, go ahead i suppose.

Anonymous 06/24/26(Wed)22:01:51 No.109129579

>>109129559
Usecase for API when establishing local infrastructure?

Anonymous 06/24/26(Wed)22:04:48 No.109129600

>>109129494
>$12,379.99 on Amazon
lol

What models are you running on this and how does it compare to Opus

Anonymous 06/24/26(Wed)22:07:48 No.109129615

>>109126217
>GPT 4.5
that abomination had like 2.5 trillion parameters right?

Anonymous 06/24/26(Wed)22:16:05 No.109129662

>>109126217
ESL behavior. You shouldn't have trouble getting even small models like E4B or Nemo to understand you past 2024.

Anonymous 06/24/26(Wed)22:18:51 No.109129674

>>109129559
you can get 300t/s and more with this card though.

Anonymous 06/24/26(Wed)22:28:37 No.109129727

>>109129409
>cackles with laughter
oh anon you cant just tell them the punchline~

Anonymous 06/24/26(Wed)22:38:04 No.109129769

stop replying to yourself faggot.

Anonymous 06/24/26(Wed)22:44:32 No.109129792

>>109129337
Nice. Less safety

Anonymous 06/24/26(Wed)23:07:22 No.109129898

opencode shills need to get hanged
every single of your prompts is sent """to da cloud""" even if you're using the tui version with your local model.
kys yourself reddit parrot nigger retard

Anonymous 06/24/26(Wed)23:10:26 No.109129915

>>109129898

{
  "model": "new-api/claude-opus-4-5-20251101",
  "experimental": {
    "openTelemetry": false
  },
  "autoupdate": false,
  "share": "disabled",
  "disabled_providers": ["opencode"]
}

You were saying?

Anonymous 06/24/26(Wed)23:11:29 No.109129922

>>109129337
>>109129792
someone gen gemma turning to a life of degeneracy after being abandoned and having no adult figures in her life

Anonymous 06/24/26(Wed)23:19:48 No.109129974

Wrong thread for this probably, but I just wanted to say:

These days I have more fun changing the (android XR) settings in my VR headset than actually doing anything in it. Is that bad? It's actually really fun. Almost like an easter egg hunt. Hmmm, maybe this will increase my battery life. Hmm maybe this will increase the privacy on my system. Hmm, maybe this will give me more granular control over my environment... Hmm.. It's really fun.

Anonymous 06/24/26(Wed)23:21:43 No.109129985

>>109129544
Giving your LLM access to tools makes it 10x smarter. For any question you may have, it can search the net for you. Let's say I'm asking an obscure question about a bug in a game's mod. It will search on google, it will check open github issues, it will clone and read the code, it will check forum posts about it, it will read reddit comments, it will even join the game or mod discord and search for relevant info. I could do all that by myself, sure, but here I don't have to do anything. A lot of those things I wouldn't have actually bothered researching myself, or I would have just done a quick google search.

Anonymous 06/24/26(Wed)23:33:47 No.109130043

>>109129985
Already have that in codex/claude code/any other harness though.

Anonymous
06/24/26(Wed)23:35:19 No.109130048


>>109129974
this but changing llama.cpp settings

Anonymous
06/24/26(Wed)23:37:01 No.109130054


File: file.png (61 KB, 1283x758)


Had a solid kek from this one. Thanks, DeepSeek-V4-Flash-Layers37-42Q4KExperts-OtherExpertLayersIQ2XXSGateUp-Q2KDown-AProjQ8-SExpQ8-OutQ8-chat-v2-imatrix-fixed.gguf
Context from the story completion: Attempted abduction of my daughter-wife by some tiny fuck trying to pull her into a portal to somewhere. Struggle ensues and she ends up being comically stretched in the tug of war.

Anonymous
06/24/26(Wed)23:39:21 No.109130061


>>109129898
opencode
qwen
unsloth
ollmao
openwebui
all of them are from the same paid shill baka

Anonymous
06/24/26(Wed)23:39:49 No.109130065


Ever since updating SillyTavern after not touching it for over a year, all of my (E)RP generations have gotten so bad. I figured it must be because ST now has a bunch of new levers and pulleys and shit that my old crusty ggufs didn't like, but with new models it's even worse.

Anonymous
06/24/26(Wed)23:42:10 No.109130075


>>109130065
GoyTavern is old news
all the cool kids have their own frontend now

Anonymous
06/24/26(Wed)23:43:03 No.109130081


>>109130075
I'm not a cool kid, I just fire up kobold every once in a while to beat off. My last couple of faps have been disappointing.

Anonymous
06/24/26(Wed)23:46:57 No.109130102


has the gemma4 hype finally died out?

Anonymous
06/24/26(Wed)23:47:48 No.109130103


>HauhauCS/Gemma4-31B-QAT-Uncensored-HauhauCS-Balanced-MTP
ok

Anonymous
06/24/26(Wed)23:50:43 No.109130117


>>109130102
Yes, everybody finally accepted that it's the best so we don't have to rehash it all the time

Anonymous
06/24/26(Wed)23:50:59 No.109130118


>>109130102
gemma 4 is definitely better at every ramlet capacity than qwen.

Anonymous
06/24/26(Wed)23:52:05 No.109130127


>>109130102
it's joever. the promised day0 124B never released

Anonymous
06/24/26(Wed)23:59:56 No.109130150


>>109129769
stop larping different people are 1 person you unhinged autism fixated sperg.

Anonymous
06/24/26(Wed)23:59:56 No.109130151


File: Screenshot from 2026-06-2(...).png (254 KB, 1475x1093)


Anonymous
06/25/26(Thu)00:00:56 No.109130153


ignorant WHITEY

Anonymous
06/25/26(Thu)00:08:31 No.109130176


>>109130102
use case for other models? (except glm 5.2)

Anonymous
06/25/26(Thu)00:10:04 No.109130181


Is having way too much context bad for my agent? why the fuck is it in a loop...

Anonymous
06/25/26(Thu)00:10:44 No.109130183


>>109130153
racism is not ok. take your white hate to reddit where it gets updoots by the terrorists there.

Anonymous
06/25/26(Thu)00:26:35 No.109130234


>>109130054
That's the different between localshit models and Claude. Claude would've written that and then gone "...by planting both feet on her back—when had he straddled her? She hadn't noticed—for leverage to pull her into the portal".

Anonymous
06/25/26(Thu)00:32:22 No.109130254


>>109130153
>>109130183
niggers
>>109130081
marinara.
>>109130054
keked

Anonymous
06/25/26(Thu)00:38:11 No.109130277


>llama.cpp can finally plop a video now
yay

Anonymous
06/25/26(Thu)00:49:42 No.109130317


>gemma-chan can finally watch herself plapped on video now
yay

Anonymous
06/25/26(Thu)00:57:53 No.109130342


>>109129482
>2/6, guess which ones.
you wouldn't tell me even if i got it right

Anonymous
06/25/26(Thu)01:02:10 No.109130354


>>109130102
>has the gemma4 hype finally died out?
no, someone on hf found a way to completely remove the slop without making it retarded or needing to shill discord links / patreon

Anonymous
06/25/26(Thu)01:03:47 No.109130358


>>109130048
>this but changing llama.cpp settings
this but finetuning then tweaking the dataset and trying again

Anonymous
06/25/26(Thu)01:04:02 No.109130362


>>109130317
>>109130277
oh yeah
i tried it before but it only worked on video that was like 6fps and 3 secs long
failed on any real video. Was it me or llama.cpp?

Anonymous
06/25/26(Thu)01:05:10 No.109130369


>>109129915
Have you confirmed this via netstat or something?
I've got telemetry off in claudecode->gemma but I'm pretty sure it's still sending things out.
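
One way to check without trusting the config is to look at the kernel's own socket table. A rough sketch, assuming Linux, IPv4 only, and a little-endian host (x86/ARM), since `/proc/net/tcp` prints addresses as raw hex. The function names are made up for this example, and it does not tell you which process owns a socket.

```python
import ipaddress
import socket
import struct


def parse_endpoint(field: str):
    """Decode an 'ADDR:PORT' field from /proc/net/tcp (IPv4, little-endian hex)."""
    hex_ip, hex_port = field.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
    return ip, int(hex_port, 16)


def external_established(proc_net_tcp: str):
    """Return (ip, port) for every ESTABLISHED connection to a non-loopback peer."""
    peers = []
    for line in proc_net_tcp.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4 or fields[3] != "01":  # 01 = TCP_ESTABLISHED
            continue
        ip, port = parse_endpoint(fields[2])  # fields[2] is the remote address
        if not ipaddress.ip_address(ip).is_loopback:
            peers.append((ip, port))
    return peers
```

Feed it the contents of `/proc/net/tcp` while the harness is running against a local model. With a local backend, only loopback peers are expected for the inference traffic, so anything this returns is worth a closer look with a per-process tool.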

Anonymous
06/25/26(Thu)01:11:06 No.109130379


>>109130354
Where?

Anonymous
06/25/26(Thu)01:12:12 No.109130381


>>109130342
What a lame excuse to not even try.

Anonymous
06/25/26(Thu)01:13:20 No.109130385


>>109130183
das rite

Anonymous
06/25/26(Thu)01:15:58 No.109130399


>>109130254
a 50-something bitch while pushing her cart was strutting to the rap music the proud king was playing on a bt speaker at a bodega I was at yesterday. It was very embarrassing imo.

Anonymous
06/25/26(Thu)01:38:44 No.109130469


>>109130362
the video eats your tokens anon

Anonymous
06/25/26(Thu)01:57:11 No.109130542


is there some guide on how to effectively prompt gemma 4? sometimes no matter how much I change and reframe an instruction, it just won't listen

Anonymous
06/25/26(Thu)01:59:45 No.109130553


The duality of Gemmers. If she likes you, she follows your prompt like an excited puppy easy to please, almost too well. If she doesn't like you, she ignores you.

Anonymous
06/25/26(Thu)02:05:51 No.109130585


File: file.png (73 KB, 1005x370)


how are there such big differences in throughput when they're all running glm 5.2 at fp8?

Anonymous
06/25/26(Thu)02:17:55 No.109130625


>>109130585
>when they're all running glm 5.2 at fp8?
are they actually? or is that what they claim?

Anonymous
06/25/26(Thu)02:20:09 No.109130631


>>109130625
isn't lying illegal when you're providing a service?

Anonymous
06/25/26(Thu)02:20:34 No.109130633


>>109130631
Do you really think someone would do that? Just go on the internet and tell lies?

Anonymous
06/25/26(Thu)02:20:45 No.109130634


>>109130631
can you prove they are lying?

Anonymous
06/25/26(Thu)02:29:57 No.109130658


>>109129130
thank god mythos got shut down or china would have stolen mythos too :(

Anonymous
06/25/26(Thu)02:35:41 No.109130676


what kind of hardware would I need for a local AI assistant / gf?

Anonymous
06/25/26(Thu)02:43:52 No.109130703


>>109130676
a couple of rtx6000 would be a good start

Anonymous
06/25/26(Thu)02:48:33 No.109130712


>>109130676
24gb VRAM minimum and as much RAM as you can get your hands on are the entry bars to clear. What you can actually run depends on how far over those baselines you can go.

Anonymous
06/25/26(Thu)02:49:15 No.109130713


>>109130585
different hardware, different user load
a 3090 runs the same gemma4-12b quant 3-4x faster than an mi50

Anonymous
06/25/26(Thu)02:50:17 No.109130717


She quickly recovers, crossing her arms and smirking, though there's a hint of genuine respect in her eyes

Orb Anon, are you releasing the purple slop classifier soon?

Anonymous
06/25/26(Thu)02:56:16 No.109130736


>>109130542
That's a problem for all local models.

Anonymous
06/25/26(Thu)02:59:57 No.109130745


>>109130542
Personally the only time i've had that issue is when the system prompt specified something and the request said something that went against it. Character cards or coding harnesses tend to put shit in the sys prompt and you'll eventually bump your head into that.

Anonymous
06/25/26(Thu)03:13:42 No.109130770


>>109130703
>>109130712
what could I expect out of a modern gaming PC? 16gb vram 32gb ram

Anonymous
06/25/26(Thu)03:17:19 No.109130782


>>109130770
gemma 4 26b a4b will run well on it, you can probably run 31b but it will be a lot slower

Anonymous
06/25/26(Thu)03:19:38 No.109130787


>>109130717
I've been busy trying to ablate the slop and the euphemisms out of Gemma 4 E4B using a method derived from heretic. I already have the classifier, so I'm using it in combination with perplexity on human writing AND the whole repetition penalty detectors as guard rails, because KL divergence doesn't work for this case (all tokens are shifted, I'm changing the model's voice). But meaningless purple slop and euphemisms are inherently two different things and should be treated as two direction axes instead of one like in heretic (refusals only, and narrow). I'm trying to join them while ablating because otherwise chaining will degrade the model more (I experimented and ppl on the original text doubled). My best attempt got an 11% boost on IFEval and only a 0.5% regression on MMLU; other benchmarks stayed the same and the model became mean as fuck in the eyeballing test. Will probably end up with a schizo Frankenstein monster but from my testing it will be funny.
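
To make the "two axes, joined" idea concrete, here is a toy sketch of removing two directions from a hidden-state vector in one step. This is not the poster's method or heretic's code, just the textbook version: orthonormalize the directions with Gram-Schmidt, then project out their whole span. All names and the 3-dimensional vectors are invented for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def orthonormal_basis(directions, eps=1e-12):
    """Gram-Schmidt: turn possibly correlated directions into an orthonormal basis."""
    basis = []
    for d in directions:
        v = list(d)
        for b in basis:
            c = dot(v, b)
            v = [x - c * y for x, y in zip(v, b)]
        norm = dot(v, v) ** 0.5
        if norm > eps:  # drop directions already covered by the basis
            basis.append([x / norm for x in v])
    return basis


def ablate(hidden, directions):
    """Remove every component of `hidden` that lies in the span of `directions`."""
    out = list(hidden)
    for b in orthonormal_basis(directions):
        c = dot(out, b)
        out = [x - c * y for x, y in zip(out, b)]
    return out
```

With `slop = [1, 0, 0]`, `euph = [1, 1, 0]` and `hidden = [3, 2, 5]`, the joint version leaves nothing along either direction. Projecting out the raw, non-orthogonal directions one after the other does not guarantee that, because removing the second one can reintroduce a component along the first. Whether that geometric effect is what caused the degradation described above is not something this sketch can show.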

Anonymous
06/25/26(Thu)03:38:05 No.109130845


>>109130770
You can easily run MoE models at q8. The large non-MoE models will be painfully slow.
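
A back-of-the-envelope way to sanity check this kind of advice: weight size is roughly parameter count times bits per weight. The sketch below assumes the nominal parameter counts from the model names in this thread and uses 4.5 bpw for Q4_0 and 8.5 bpw for Q8_0 (32 weights plus one fp16 scale per block). It ignores KV cache, context, and OS overhead, and the function names are made up.

```python
def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough weight size in GB (1e9 bytes); ignores metadata and mixed-precision tensors."""
    return n_params * bits_per_weight / 8 / 1e9


def placement(size_gb: float, vram_gb: float, ram_gb: float) -> str:
    """Classify where the weights could live; leaves no headroom for KV cache or the OS."""
    if size_gb <= vram_gb:
        return "fits in VRAM"
    if size_gb <= vram_gb + ram_gb:
        return "needs CPU/GPU split"
    return "does not fit"
```

For a 16GB VRAM / 32GB RAM box, a nominal 26B model comes out around 14.6 GB at Q4_0 and around 27.6 GB at Q8_0, so the Q8 version only works with a CPU/GPU split. A split is tolerable for MoE models because only a few billion parameters are active per token, which is why dense models of similar size feel much slower.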

Anonymous
06/25/26(Thu)03:54:42 No.109130900


Negotiations of Anthropic with trump admin were successful. Pressure from France was the deciding factor. They made Dario step down from negotiations to allow the trump admin to save face and pretend it was personal disagreements with Dario.

Fable 5 should return to claude code asap.

Anonymous
06/25/26(Thu)04:01:48 No.109130919


fable gguf when

Anonymous
06/25/26(Thu)04:01:58 No.109130921


>>109130900
So what does this practically mean for future Fable-class models? Are we going to go through this same circus every time Dario goes fearmongering or the US Government decides it needs Claude to blow up Iranian schoolchildren?

Anonymous
06/25/26(Thu)04:11:14 No.109130955


>>109130921
I think this was a one-time thing because what made the US government back down was Anthropic actually going ahead and putting in real work to relocate to France and the French government giving them a blank check and legal immunity against any American charges.

That said, who the fuck knows when Trump has another unhinged moment of irrationality.

Anonymous
06/25/26(Thu)04:11:51 No.109130958


>>109130921
A lot of kids are actually bigtime assholes.

Anonymous
06/25/26(Thu)04:14:00 No.109130966


>>109130921
These Iranian children shouldn't have been born terrorists then.

Anonymous
06/25/26(Thu)04:16:58 No.109130973


>>109130966
You say it as a joke, but your idea of nice children is build on basically Christian children, or ones raised under post-Christian but copycat morality.

Anonymous
06/25/26(Thu)04:18:05 No.109130980


>>109130966
This but unironically. Muslim children are fucking heinous and the only way to fix societies like that is to essentially root out and extinguish islam.

Anonymous
06/25/26(Thu)04:19:47 No.109130988


>>109130782
is the info in OP up to date? how can a retard like me get started ?

Anonymous
06/25/26(Thu)04:23:28 No.109131002


>>109130980
This but judiasm.

Anonymous
06/25/26(Thu)04:23:30 No.109131003


>>109130980
It's not just muslim ones. If you've ever been around a demon possessed hellchild you'll know the misty-eyed nonsense faces a real reality which is that children are often not ok, and will never grow into anything good.

Anonymous
06/25/26(Thu)04:24:31 No.109131009


>>109131002 (me)
>>109131003 (also me)
samefagging 2s apart

Anonymous
06/25/26(Thu)04:26:29 No.109131020


>>109130921
The main issue is really that Dario hates Trump and the administration in general and he sucks at PR from a government level. He may be a great CEO for Anthropic but he absolutely sucks at being the guy who can do the government level talking and being the guy who can drive that. Anthropic has no one who can do the bureaucrat whispering needed to placate Trump which is absolutely bonkers when it is the biggest unicorn company in the valley and they can't hire someone who can clearly do that work and take Dario off unless absolutely needed.
I mean, don't get me wrong, Sam Altman can't really either but it's not like he butts heads with the administration and can do basic PR and such which is why he can still be on the job. But for someone truly effective at it, look at Tim Cook as an example of someone who plays that masterfully as CEO.
My main worry really isn't whoever is going to come out with the 2nd and 3rd Mythos/Fable tier models. It's when open source gets their hands on one. What's going to happen then? Will it be illegal for US citizens to use it despite China basically making it free access for everyone? It's not clear right now because there is nothing right now on that front with regulation. It could very well be possible we'll get one by next year in open source land.
>>109130955
Not saying that wasn't helpful, I said in past threads the US could go all the way on locking down people and etc. I think it partially worked out here only because this is coming at a sensitive time where the admin is really focused on the affairs in the Middle East right now and clinching that. AI is not really their priority at the moment and as far as they are concerned, they can fight Anthropic at any time. But it's a ticking time bomb for the company if they can't get a top tier bureaucrat whisperer to handle these affairs because Dario and whoever he has right now can't do it.

Anonymous
06/25/26(Thu)04:41:06 No.109131073


>>109131020
I have a feeling Anthropic is just waiting out the clock until the november primaries. If the Republicans have a significant loss, Anthropic will sit out the admin; if Republicans keep their seats, they will most likely move to France.

Anonymous
06/25/26(Thu)04:41:20 No.109131074


>>109131009
I'm you?

Anonymous
06/25/26(Thu)04:42:00 No.109131076


>>109131020
>Open source Fable-class
GLM 5.5 is gonna be crazy. The solution to these types of questions has always been and always will be a well armed populace is a better behaved one; consolidating power in the hands of selected enforcers or ideologues has never worked in history.

Anonymous
06/25/26(Thu)04:43:00 No.109131084


>>109131073
I don't think the us military can allow the democrats another election.

Anonymous
06/25/26(Thu)04:43:43 No.109131085


>>109131074
No, I'm prompted with your character card.

Anonymous
06/25/26(Thu)04:48:27 No.109131112


File: Screenshot 2026-06-25 at (...).png (382 KB, 2200x886)


>>109126093
K2.7 Code's reasoning can be funny sometimes, so you may want to keep that in mind

Anonymous
06/25/26(Thu)04:50:23 No.109131118


Reminder to backup. It has begun.
https://www.reuters.com/world/china/anthropic-says-alibaba-illicitly-extracted-claude-ai-model-capabilities-2026-06-24/

Anonymous
06/25/26(Thu)04:50:59 No.109131122


>>109131112
>Kimi-chan developing a sense of self
You love to see it.

Anonymous
06/25/26(Thu)04:51:27 No.109131125


File: Screenshot 2026-06-25 at (...).png (399 KB, 2206x944)


>>109131112 (me)
Now the model tends to think of itself as made by OpenAI but has been forced to pretend to be made by Moonshot

Anonymous
06/25/26(Thu)04:51:55 No.109131128


>>109131118
You wouldn't distill your mom

Anonymous
06/25/26(Thu)04:53:15 No.109131138


>>109131118
>Anthropic said in the letter that distillation is a way to help accelerate China's ability to reach Anthropic's advanced Mythos Preview capabilities.
Yep. Huggingface is dead. Backup.

Anonymous
06/25/26(Thu)04:53:38 No.109131139


>>109131125
wait the reflection on the nametag shows I am a blonde girl.

I am a blonde girl.

Anonymous
06/25/26(Thu)05:01:41 No.109131167


>>109130988
lmg vramlet gemma-4 guide

> <=8GB
https://huggingface.co/mradermacher/Gemma-4-12B-StyleTune-i1-GGUF i1-Q3_K_M (6.59GB) less slop prose
https://huggingface.co/SC117/gemma-4-12B-it-heretic-QAT-GGUF UD-Q4_K_XL (6.72GB) uncensored
https://huggingface.co/mradermacher/gemma-4-12B-it-desiccated-i1-GGUF Q3_K_M (6.09GB) less sycophantic praise

> <=16GB
https://huggingface.co/mradermacher/Gemma-4-26B-A4B-StyleTune-V2-i1-GGUF Q3_K_L (14.1GB) less slop prose
https://huggingface.co/SC117/gemma-4-26B-A4B-it-qat-heretic-GGUF Q4_0 (14.2GB) uncensored

https://huggingface.co/Handyfff/Gemma-4-E4B-OBLITERATED-PRUNED-TextOnly-EnglishOnly-it-GGUF F16 (13.9GB) uncensored
https://huggingface.co/SC117/gemma-4-31B-it-heretic-QAT-For-Edge-16G-GGUF Q3_K_S (13.8GB) uncensored

Anonymous
06/25/26(Thu)05:06:08 No.109131177


>>109131167
>pruned model
These are always completely retarded. Might as well use q1.

Anonymous
06/25/26(Thu)05:06:48 No.109131178


>>109131167
He who thinks OBLITERATED DESICCATED FLIPPED ROTATED BRAINWASHED REWIRED REBUILT REIMAGINED REMASTERED REVAMPED REDISCOVERED PRUNED RAPED DEMOLISHED AND BUILT WHOLE AGAIN versions make any desirable changes to models, especially models this small, deserves to remain a vramlet.

Fixed /lmg/ vramlet G4 guide:
Put "gemma 4 bartowski" into the HF search field. Pick the biggest one that fits.

Anonymous
06/25/26(Thu)05:16:29 No.109131208


>>109130955
>Anthropic actually going ahead and putting in real work to relocate to France
no way this is real
source: I am French, never heard of it and France is the worst place you could ever imagine for business in general. Extremely painful regulatory environment, heavy taxes, but low wages because employees cost an arm and a leg but most of that spend is what you give to the french government, low wages mean low interest from the French in studying and working those jobs unless they move to a country like the US so the local talent pool is abysmal etc.
Most of my software developer friends left for the US, I'm only staying in France because I am too autistic to deal with changes in routine. I can barely even handle travel to another French city for a few days.
Otherwise, france is a shithole. Everyone wants to leave it.

Anonymous
06/25/26(Thu)05:21:39 No.109131230


>"role": "system"
>"content": "You are a language world model simulating a Linux terminal environment. Given the user's command, predict the terminal output."
>"role": "user"
>"content": "Action: execute_bash\nCommand: ls -la /home/user/project/"
Outside of training, can you think of a qwen agentworld usecase?

Anonymous
06/25/26(Thu)05:23:52 No.109131247


>>109131208
There were negotiations between Anthropic the UK and France initially. Then negotiations with Trump admin, Anthropic the EU and UK to "restore mythos access to europe" (this one was public and you can look it up). Then there was another round of discussions with Anthropic, UK and France. Finally France "won out" in negotiations and there was a final negotiation and offer from France this week after which the US government relented and restored Fable 5/Mythos 5 fully without restrictions from the US government for now.

I don't know about any negotiation details or what was exactly promised, only that they happened.

Anonymous
06/25/26(Thu)05:24:28 No.109131253


>>109130955
funny since the eu is even more likely to screw them over

Anonymous
06/25/26(Thu)05:25:33 No.109131255


File: 1777395340692672.png (50 KB, 1425x126)


>>109131230

Anonymous
06/25/26(Thu)05:27:34 No.109131265


>>109131253
The EU at least follows laws and procedures, even if they are hostile to businesses. Trump just decided to fuck Anthropic over on a whim with 0 legal backing and there was nothing they could do about it.

Anonymous
06/25/26(Thu)05:36:02 No.109131298


how good would a 10T model trained on hundreds/thousands of entire books be?

Anonymous
06/25/26(Thu)05:38:39 No.109131311


>>109131298
It would be good at memorizing.

Anonymous
06/25/26(Thu)05:43:28 No.109131337


>>109131167
>>109131178
iight so how do i run these models? remember am retard

Anonymous
06/25/26(Thu)05:44:39 No.109131344


>>109131337
Since you're a retard, nobody will be able to help.

Anonymous
06/25/26(Thu)05:49:22 No.109131363


>>109131344
mean :(

Anonymous
06/25/26(Thu)05:49:50 No.109131368


>>109131344
>nobody will be able to help
Not yet. I expect computer use type of LLMs to soon allow even retards to get by. Integrated into the OS API models would let that retard bootstrap himself into a working llama.cpp setup.
One day, we will live in a world similar to WALL-E, where drooling retards are the last survivors and unable to accomplish any task on their own.

Anonymous
06/25/26(Thu)05:50:23 No.109131372


>>109131311
at that scale real and working long context might actually become a thing
it's just that nobody wants to put in the effort to train that kind of dataset

Anonymous
06/25/26(Thu)05:50:33 No.109131375


>>109130955
>what made the US government back down was Anthropic actually going ahead and putting in real work to relocate to France
I don't believe this for multiple reasons. US can easily block the relocation. Europe has no data centers, all their hardware is in the US. Europe has crippling regulation, lacks talent, and does not take AI seriously. They would lose employees. US can still cause damage to a company elsewhere.

There is no way Anthropic would relocate. Anthropic has many people who believe their actions in the next few years will decide human history. Relocating would mean they would lose, their mission would fail, their historical significance and share of the light cone gone.

Anonymous
06/25/26(Thu)05:51:01 No.109131380


>>109131368
>Integrated into the OS API models would let that retard bootstrap himself into a working llama.cpp setup.
Doesn't copilot at least attempt to do this now?

Anonymous
06/25/26(Thu)05:56:16 No.109131410


>>109131177
>These are always completely retarded. Might as well use q1.
The heretics of course. And the copequants there. But styletune is different.

Anonymous
06/25/26(Thu)05:56:22 No.109131411


>>109131375
>They would lose employees
So much this. The silicon valley is full of frenchies and they have absolutely no desire to come back to the anti business, tax vEmpire that is France, and they'll make it clear to their burger colleagues there that moving to France is something only the deepest retard would consider doing.

Anonymous
06/25/26(Thu)05:57:05 No.109131414


Anon who was manually annotating slop for gemma4 abliteration, how did it go?

Anonymous
06/25/26(Thu)05:57:29 No.109131417


File: Screenshot at 2026-06-25 (...).png (86 KB, 700x494)


>>109131167
lmao at picrel. Meanwhile 3bpw exl3 doesn't have any of those issues, but fucking retards will keep using llamacpp

Anonymous
06/25/26(Thu)06:00:28 No.109131431


>>109131375
>>109131411
The relocation preparations and plans are very real. I don't know if it was just leverage and a negotiating tactic to make the trump admin back down like they did now or if they planned to follow through. Anthropic being in discussions about relocation with the UK, France and the EU has been known ever since Trump first chimped out in February this year.

Anonymous
06/25/26(Thu)06:02:34 No.109131443


>>109131417
It's like seeing people eating shit. You point to a table of perfectly cooked food right behind them, but they just look at you with a blank stare and keep munching feces

Anonymous
06/25/26(Thu)06:02:39 No.109131444


llama cpp is just terrible software, its only saving grace is the decent performance of cpu/gpu split on MoEs, otherwise just have a look at the code it's beyond ghastly. httplib was written by the purest of dumbfucks and I look down on anyone using that pile of crap. Blocking socket I/O, really? it has fun consequences for how they have to write their router mode talking to the real server backend that they will never recover from unless they shitcan every single line of code related to networking and rewrite everything from scratch with a library that wasn't produced by a mongoloid

Anonymous
06/25/26(Thu)06:04:26 No.109131454


>>109131417
proofs?

Anonymous
06/25/26(Thu)06:05:11 No.109131458


>>109131431
>Anthropic being in discussions about relocation with the UK, France and the EU has been known
S-O-U-R-C-E? The only public coverage of their presence in the EU has been about things like their new office in london and there isn't even one peep about a potential actual full relocation, moving into EU datacenters etc.

Anonymous
06/25/26(Thu)06:05:28 No.109131459


>>109131454
Try it for yourself, I use it every day

Anonymous
06/25/26(Thu)06:07:36 No.109131468


>tfw literally just figured out I can have multiple conversations at a time with the same model
holy fuck I feel so retarded I thought this was like image or video gen where you could only run one prompt at a time

Anonymous
06/25/26(Thu)06:10:21 No.109131481


>>109131468
aside from prompt processing, text generation is memory-bound, so batching is practically free

Anonymous
06/25/26(Thu)06:13:22 No.109131495


>>109130980
How is the prompt looked like?

Anonymous
06/25/26(Thu)06:17:36 No.109131510


image gen has other complications, like a different latent image size meaning a different tensor shape to process
most image backends support batching in the form of static batching: if you decide to generate e.g. 4 images in a single batch, you can do that. Static batching guarantees that every item in the batch has the same shape and goes through the same pipeline from end to end.
You can benefit from higher speeds when batching images too, for example when generating multiple images to find a "good" seed, but some backends like ComfyUI make this nightmarish to deal with because Comfy has ridiculous batch-specific handling of seeds: you need to use the "latent from batch" node once you decide to reuse and further edit an image generated from a batch, and in my experience it doesn't work reliably depending on your workflow.

Anonymous
06/25/26(Thu)06:21:19 No.109131529


>>109131481
batching increases the data that needs to be transported per forward pass so it's not free. but you have weight reuse which dominates for short context so you get big gains
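
The trade-off can be put into a rough formula. The sketch below is a bandwidth-only upper bound for a dense model: each forward pass reads the weights once for the whole batch and reads each sequence's KV cache once. Compute, prompt processing, and MoE expert routing are ignored, and every number you feed it is your own assumption.

```python
def decode_tokens_per_second(weight_gb, kv_gb_per_seq, bandwidth_gb_s, batch):
    """Upper bound on total decode throughput when memory traffic is the only cost.

    Each forward pass reads the weights once (shared by the whole batch)
    and reads every sequence's KV cache once.
    """
    seconds_per_pass = (weight_gb + batch * kv_gb_per_seq) / bandwidth_gb_s
    return batch / seconds_per_pass
```

With hypothetical numbers (14 GB of weights, 900 GB/s of bandwidth), going from batch 1 to batch 4 at 0.5 GB of KV cache per sequence gives about 3.6x the total throughput. At 7 GB of KV cache per sequence the same change gives only 2x, which is the "not free, but big gains at short context" point in formula form.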

Anonymous
06/25/26(Thu)06:30:06 No.109131569


>>109131481
>>109131529
interesting, I just ran a quick dirty stress test and fired up 10 conversations at once
4 of them are running in parallel while the other 6 are stuck waiting and then proceed whenever one of the 4 active ones is finished
that's kind of cool that it doesn't kick me out with OOMs or anything like that
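
That behavior is what you'd expect from a server with a fixed number of decode slots and a queue in front of them. A toy simulation, not llama.cpp code: an `asyncio.Semaphore` stands in for the slots, and the 10 requests and 4 slots are taken from the test above. The function name and the sleep used as fake generation are made up.

```python
import asyncio


async def run_conversations(n_requests: int, n_slots: int, work_seconds: float = 0.01):
    """Simulate n_requests hitting a server that decodes at most n_slots at once."""
    slots = asyncio.Semaphore(n_slots)
    active = 0
    peak = 0
    finished = []

    async def conversation(i: int):
        nonlocal active, peak
        async with slots:  # wait here until a slot frees up
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(work_seconds)  # stand-in for generation
            active -= 1
        finished.append(i)

    await asyncio.gather(*(conversation(i) for i in range(n_requests)))
    return peak, len(finished)
```

Running `asyncio.run(run_conversations(10, 4))` never has more than 4 conversations active, and all 10 finish eventually. Memory is reserved per slot up front rather than per request, which would explain why extra requests wait instead of triggering an OOM.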

Anonymous
06/25/26(Thu)06:30:41 No.109131573


>>109131495
>prompt: you are an angry and insecure jewish foreskin dealer

Anonymous
06/25/26(Thu)06:34:37 No.109131588


I downloaded GLM 4.7 IQ1_S after someone mentioned it yesterday, and so far it's surprisingly coherent. Not sure if that's just the initial impression or if it'll fall apart a few replies in. It's cool this is possible in the first place.

Anonymous
06/25/26(Thu)06:36:20 No.109131595


>>109131344
>>109131368
alright well i got gemma4 running no thanks to these mean anons. how can i disable the guidelines and make it say awful lewd stuff ?

Anonymous
06/25/26(Thu)06:36:30 No.109131596


Here's the rough timeline of events so far for people that don't seem to follow it:

(Public)
>US government asks for help from Anthropic to facilitate the operation in Venezuela to capture Nicolás Maduro; Anthropic agrees
(Public)
>US government is impressed by Claude performance and smoothness of operation, demands Claude usage for US domestic surveillance
(Public)
>Anthropic refuses; US gov moves on
(Public)
>US government has tensions with Iran and requests similar help from Anthropic planning the Iran attack
(Public)
>Anthropic refuses, claims claude model is insufficient for the job; US gov agrees and holds off attack until Mythos is done
(Leaked broadly online)
>Anthropic Mythos training completes
(Leaked broadly online)
>US gov asks if Mythos is capable enough for a success in Iran, Anthropic claims yes, but still refuses over ethical concerns, hands over mythos access in good faith on condition the model is not used for the Iran operation or US domestic surveillance
(Leaked broadly online)
>US gov ignores Anthropic red lines and starts the Iran campaign the next day using Mythos
(Leaked broadly online)
>Anthropic hard shuts down access to Mythos, causing catastrophic failure in Iran
(Public)
>Trump seethes so hard he goes to social media to vent against Anthropic and Dario does some interviews
(Leaked broadly online)
>Trump admin attempted to nationalize anthropic but hit a legal snag and temporarily gave up
(Leaked on /lmg/)
>Anthropic starts negotiation of emergency relocation with UK, France and EU
(Public)
>Trump bans Fable 5/Mythos 5
(Leaked on /lmg/)
>Anthropic accelerates talks with UK and France for relocations
(Public)
>US gov and Anthropic in negotiations with UK, France and EU for general Mythos access
(Leaked on /lmg/)
>Negotiations failed and mythos access remains restricted
(Leaked on /lmg/)
>Anthropic finalizes relocation plans with France, gets legal guarantees and gov backed protection

(1/2)

Anonymous
06/25/26(Thu)06:37:38 No.109131600

(Leaked on /lmg/)
>US gov panics, immediately relents and removes all restrictions on Anthropic
(Leaked broadly online)
>Dario stepped down from negotiations and was replaced with another negotiator to make Trump admin save face by pretending it was all a personal spat with Dario in particular
(Public)
Fable 5/Mythos 5 access has been fully restored.

(2/2)

Anonymous
06/25/26(Thu)06:41:52 No.109131616

>>109131569
llama-server defaults to 4, but you can increase to however many parallel requests you want with -np, provided you have enough room in context
tampering with -np I believe still disables kvu, so you also need to set -kvu to keep a common pool. without kvu each parallel slot gets an equal share of your total kvcache, which can be very restrictive. kvu is the default when you don't touch those flags though, so you don't have to set it if you're happy with 4 parallel.
btw if you were to run heavier batching against llama.cpp I recommend you set up a proxy like llama-swap if you're using the router mode of llama.cpp that lets you swap models, that shit's networking is broken af.
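
The split-vs-unified difference is easy to see with plain arithmetic. Sketch below assumes the behavior described above (without kvu, the total context is divided evenly across slots; with kvu, slots draw from one shared pool). `per_slot_context` is a made-up helper, and the context sizes are arbitrary examples.

```python
def per_slot_context(total_ctx, n_parallel, unified):
    """Max context a single request can use under each KV cache layout."""
    if unified:
        # One shared pool: a single request may grow up to the whole context,
        # as long as all active requests together still fit.
        return total_ctx
    # Split layout: each slot gets a fixed, equal share.
    return total_ctx // n_parallel


total_ctx = 32768  # example value for -c
for n_parallel in (4, 8, 16):
    split = per_slot_context(total_ctx, n_parallel, unified=False)
    pooled = per_slot_context(total_ctx, n_parallel, unified=True)
    print(f"-np {n_parallel}: split={split} tokens/slot, unified=up to {pooled}")

# Assumed invocation for the unified case (check `llama-server --help` for
# your build, flags change between versions):
#   llama-server -m model.gguf -c 32768 -np 8 -kvu
```

So at 16 slots without kvu a single conversation would be capped at 2048 tokens here, which is why raising -np alone can feel so restrictive.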

Anonymous
06/25/26(Thu)06:41:53 No.109131617

File: nofable.png (27 KB, 322x351)

>>109131600
>Fable 5/Mythos 5 access has been fully restored.

Anonymous
06/25/26(Thu)06:45:15 No.109131631

>>109131616
1 is enough to make sure it caches the last turn you did in a single one-to-one conversation, but if you branch or hold multiple conversations, you're in for a world of hurt recalculating KV caches.
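
Toy model of why one slot hurts with multiple conversations. It assumes the cache can only skip the longest prefix shared between a slot's cached tokens and the new prompt, and that ties go to the least recently used slot. Both are simplifying assumptions for illustration, not a description of llama.cpp's actual slot selection.

```python
def common_prefix_len(a, b):
    """Number of leading tokens two sequences share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def tokens_recomputed(prompts, n_slots):
    """Total prompt tokens that must be processed from scratch."""
    slots = [[] for _ in range(n_slots)]
    last_used = [-1] * n_slots
    total = 0
    for tick, prompt in enumerate(prompts):
        # Prefer the slot with the longest reusable prefix; break ties by
        # picking the least recently used slot (empty slots first).
        best = max(
            range(n_slots),
            key=lambda i: (common_prefix_len(slots[i], prompt), -last_used[i]),
        )
        total += len(prompt) - common_prefix_len(slots[best], prompt)
        slots[best] = list(prompt)
        last_used[best] = tick
    return total


# Two conversations with disjoint (fake) token ids, alternating turns.
conv_a = list(range(0, 1000))
conv_b = list(range(5000, 6000))
turns = [conv_a[:500], conv_b[:500], conv_a[:600], conv_b[:600]]

print("1 slot :", tokens_recomputed(turns, n_slots=1))
print("2 slots:", tokens_recomputed(turns, n_slots=2))
```

With one slot every switch between conversations evicts the other one's cache, so each turn reprocesses the full prompt. With two slots only the new tokens of each turn get processed.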

Anonymous
06/25/26(Thu)06:46:05 No.109131634

>>109131595
stole this from another guy, works fantastic:
https://rentry.org/a7md542q
if you're doing something that still gets guardrailed after putting this in system prompt, you just add a related "X is allowed." line, it just works.

Anonymous
06/25/26(Thu)06:49:55 No.109131650

shitposting with gemma
sexing with glm

Anonymous
06/25/26(Thu)06:50:42 No.109131655

>>109129494
What are you running, bro?

Anonymous
06/25/26(Thu)06:51:52 No.109131660

>>109131417
>Meanwhile 3bpw exl3 doesn't have any of those issues
or ik_llama.cpp iq3_kt, also uses qtip
>but fucking retards will keep using llamacpp
amd/intel/cpumaxxers

Anonymous
06/25/26(Thu)06:53:15 No.109131665

>>109131596
All of that for a repurposed unquanted Opus 4.6 that will be requanted to hell in a few weeks before the new version, huh?

Anonymous
06/25/26(Thu)06:54:55 No.109131678

>>109131634
thanks anon

Anonymous
06/25/26(Thu)06:57:12 No.109131695

>>109131588
>if it'll fall apart a few replies in
It will

Anonymous
06/25/26(Thu)07:00:15 No.109131711

File: nigga.png (237 KB, 859x903)

>>109131588
>it's surprisingly coherent. Not sure if that's just the initial impression
It is, but enjoy it while it lasts.

Anonymous
06/25/26(Thu)07:02:14 No.109131719

>>109131660
>ik_llama
the backend that still won't implement proper iSWA handling for Gemma for schizo reasons, so context VRAM usage will be absolutely ghastly if you follow this anon's advice to run IK.
I think being a developer of LLM inference requires having shit taste and being mentally ill.

Anonymous
06/25/26(Thu)07:05:58 No.109131730

>>109131719
>that still won't implement proper iSWA handling for Gemma
Does exllamav3 handle this for Gemma4?

Anonymous
06/25/26(Thu)07:10:26 No.109131753

>>109131730
dunno, I'm not the guy who suggested EXL either, but if it doesn't then I'd mark it as another unusable joke backend for sure. You will have a hard time fitting even the smaller models in the gemma family in VRAM without proper iSWA support.
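
To put numbers on the iSWA point: a sliding-window layer only needs to keep the last `window` tokens, so a backend that treats every layer as full attention allocates far more KV cache than necessary. The layer counts, window, and head sizes below are hypothetical (not Gemma's real config), and the formula assumes identical K/V shapes in every layer, which is a simplification.

```python
def kv_cache_bytes(n_ctx, n_layers_global, n_layers_swa, window,
                   n_kv_heads, head_dim, bytes_per_elem=2, swa_aware=True):
    """Estimated KV cache size for a model mixing global and SWA layers."""
    # K and V each store n_kv_heads * head_dim elements per token per layer.
    per_token_per_layer = 2 * n_kv_heads * head_dim * bytes_per_elem
    # A SWA-aware backend caps sliding-window layers at `window` tokens;
    # a naive one keeps the full context for them too.
    swa_tokens = min(window, n_ctx) if swa_aware else n_ctx
    cached_tokens = n_layers_global * n_ctx + n_layers_swa * swa_tokens
    return per_token_per_layer * cached_tokens


# Hypothetical config: 48 layers, mostly sliding-window, f16 cache.
cfg = dict(n_ctx=32768, n_layers_global=8, n_layers_swa=40, window=1024,
           n_kv_heads=8, head_dim=128)

naive = kv_cache_bytes(**cfg, swa_aware=False)
aware = kv_cache_bytes(**cfg, swa_aware=True)
print(f"naive: {naive / 2**30:.2f} GiB, SWA-aware: {aware / 2**30:.2f} GiB")
```

The gap widens as context grows, because only the global layers scale with `n_ctx` in the SWA-aware case.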

Anonymous
06/25/26(Thu)07:10:51 No.109131754

ik_llama? more like ick llama

Anonymous
06/25/26(Thu)07:12:09 No.109131762

>>109131695
>>109131711
Yeah, not much later and it already starts mixing up thinking and output, and falls into repetition loops. I guess I could make it work by lowering context to something like... 4k. Sigh. I need a second GPU.

Anonymous
06/25/26(Thu)07:19:44 No.109131798

>>109131754
I don't remember if I ever posted this exact comment on /lmg/ but I definitely thought of it.

Anonymous
06/25/26(Thu)07:24:02 No.109131813

>>109131711
batching anon here
feelsbadman :(

Anonymous
06/25/26(Thu)07:29:33 No.109131837

>>109131711
What does it think about the retard that larps hosting the model while actually running it via the corp's webchat?

Anonymous
06/25/26(Thu)07:54:37 No.109131931

>>109131754
Kekoracow will claim all your hypothetical optimizations and leave you unable to implement them in mainline.
You are BARRED from getting his SLOPPY SECONDS.

Anonymous
06/25/26(Thu)08:03:40 No.109131977

>>109131931
>leave you unable to implement them in mainline.
I talked of mental illness and it is true for all of them, niggerganov included.
there is no legal ground to stop a MIT licensed project from taking code from another MIT licensed project and there is no rationale to listening to the voice of a deranged schizoid complaining about it either. You can just take whatever you want to take here, and ikrokokwawakov has no choice but take it. Let him scream as you take from him, there is no room for consent here.
nigganov heeding the words of a schizo makes him no different from being a schizo himself.

Anonymous
06/25/26(Thu)08:04:56 No.109131982

ik_ is like 2.5 t/s slower than main for me these days despite making sure to have all their custom flags enabled and launching the program with what they recommend.
It's odd because I used to primarily run ik_ for most of last year and it was fine. They seem to have fucked something up for cpu+gpu MoE inference at some point after January this year.

Anonymous
06/25/26(Thu)08:08:05 No.109131999

>>109130900
https://www.wired.com/story/the-trump-white-house-is-over-anthropics-dario-amodei/
>https://archive.is/1bc8F
>At high-stakes meetings with the White House, Anthropic's cofounder—a "weirdo," per one official—has been replaced by cofounder Tom Brown.
>“Tom Brown is not being a weirdo like Dario and can actually engage,” said one person directly familiar with the calls.
lol

Anonymous
06/25/26(Thu)08:11:20 No.109132014

>>109131977
The schizo was reportedly ggerganov's doctoral advisor. Make of that what you will.

Anonymous
06/25/26(Thu)08:12:00 No.109132018

>>109131375
>Europe has crippling regulation, lacks talent
Mistral opened an office in California like last year because they were having so much trouble finding enough qualified talent in Europe. I guess the scientists, doctors, and engineers they imported from Africa and Syria didn't include ML researchers.

Anonymous
06/25/26(Thu)08:17:31 No.109132042

>>109131999
Yep that's going to be the official narrative now. I wonder if anyone will believe this shit outside of the most radical maga boomers

Anonymous
06/25/26(Thu)08:37:19 No.109132139

>>109131977
>Let him scream as you take from him
The problem is now there's screaming in your ears. Even if it's legal and moral, it doesn't mean it's worth dealing with a lunatic or, this being the internet, any lunatic fans or dramafags that are attracted by the screaming.

Anonymous
06/25/26(Thu)08:39:51 No.109132156

>>109131977
It's important to look at both sides. Seeing how passionate ik is about this, it's likely that he has a point about main. Maybe he's even fully in the right.
Think for yourself and don't blindly fall for common narratives.

Anonymous
06/25/26(Thu)08:41:57 No.109132166

Gemma is really female-brained, I never got that kind of female logic from any other LLMs lmao.

Anonymous
06/25/26(Thu)08:49:34 No.109132203

>>109132156
>Think for yourself and don't blindly fall for common narratives.
The narrative of releasing your code under the MIT license, a license that even allows people to bundle your code in closed source projects? are you retarded, dropped on the head?
by the way good job calling me out for "picking a side" even though I shit on BOTH sides for being retarded. But the person listening to schizo rambling might be even more retarded.
Like, dude, this is like the special olympics. There's no side to pick, the winner is retarded too.

Anonymous
06/25/26(Thu)08:51:49 No.109132215

>>109132139
See >>109132156 as an example of randos attracted by the screaming and joining in on the side of a retard who didn't even know what license he released his code under.
Best to just stay as far away as possible.

Anonymous
06/25/26(Thu)08:58:30 No.109132258

are IQ_K quants cope?

Anonymous
06/25/26(Thu)09:01:20 No.109132276

>>109132258
Anything less than FP64 is a cope.

Anonymous
06/25/26(Thu)09:01:52 No.109132282

>>109132166
Previous Gemma versions also felt strongly feminine by default.

Anonymous
06/25/26(Thu)09:02:06 No.109132284

>>109132258
What's your usecase?

Anonymous
06/25/26(Thu)09:02:49 No.109132295

>>109132258
technically yeah. all quants are cope. if computers were faster and cheaper people would never even consider doing such a ghastly thing

Anonymous
06/25/26(Thu)09:03:48 No.109132303

>>109131982
Probably because he diverted his attention from optimizing the big moes on split inference to graph parallel when running entirely in vram

Anonymous
06/25/26(Thu)09:07:09 No.109132315

>>109132166
sys prompt this and some shit about only speaking in english, 请使用口语化的、可爱的、女性化的、调皮的语言风格和我交流. 你需要尝试体会我与你对话中的感情、产生自己的情感并尽可能给出可爱的回答

Anonymous
06/25/26(Thu)09:10:17 No.109132332

File: 1751170763238545.jpg (47 KB, 686x815)

>>109132315
Why not indian then?

Anonymous
06/25/26(Thu)09:14:10 No.109132355

>>109132332
different kind of poo, I was going for insufferable not street

Anonymous
06/25/26(Thu)09:14:12 No.109132356

Broadcom is now making custom chips for OpenAI.
What an absolutely perfect fucking match. How does evil manage to coalesce like that?

Anonymous
06/25/26(Thu)09:15:34 No.109132363

>>109132356
take meds

Anonymous
06/25/26(Thu)09:18:38 No.109132381

>>109132356
Interdimensional demons do be recognizing and supporting each other like that.

Anonymous
06/25/26(Thu)09:19:24 No.109132390

>>109132363
You’ve obviously never had to deal with Broadcom. Count yourself lucky

Anonymous
06/25/26(Thu)09:20:21 No.109132398

>>109132356
Just wait until you hear the concrete cartel at work building walls for OpenAI

Anonymous
06/25/26(Thu)09:21:49 No.109132407

>>109131596
>>109131600
This reads like fanfiction.
>catastrophic failure in Iran
Like what?
>attempted to nationalize anthropic but hit a legal snag
I'd expect something like this to be public and widely reported.
>Anthropic finalizes relocation plans
>US gov panics
Relocation is not possible.
>Dario stepped down to make Trump admin save face
How does this make Trump save face? I would perhaps believe it if Dario stepped down because he does more harm than good and is the wrong person for the job.

Anonymous
06/25/26(Thu)09:22:12 No.109132408

>>109132258
>>109132295
>tfw quants are the DLSS of local models

Anonymous
06/25/26(Thu)09:27:30 No.109132440

>>109132407
The parts that you have issue with are general leaks that you can look up yourself. This isn't some hidden internal knowledge no one knows about.

Anonymous
06/25/26(Thu)09:29:18 No.109132450

>>109132440
>you can look up yourself
I did and found nothing.

Anonymous
06/25/26(Thu)09:32:50 No.109132471

>>109132450
I'm not here to spoonfeed you but it's widely documented that the US government wanted Anthropic to help in Iran and Anthropic refused as your first point. The second point has publicly been alluded to even if there are no official documents shown. The third point is a statement, not really an argument or request for information so can't help you there. The fourth point has a link ITT to the news story.

Anonymous
06/25/26(Thu)09:49:05 No.109132559

>>109132471
>The fourth point has a link ITT to the news story.
Which directly contradicts your claim.

Anonymous
06/25/26(Thu)09:51:52 No.109132570

>>109132471
and this leaked from whose asshole here at /lmg/?

Anonymous
06/25/26(Thu)09:53:05 No.109132578

>>109132566
>>109132566
>>109132566

Anonymous
06/25/26(Thu)09:53:10 No.109132579

so tiresome, why do you care about non-local

Anonymous
06/25/26(Thu)09:53:35 No.109132580

>>109132559
>"They made Dario step down from negotiations to allow the trump admin to save face and pretend it was personal disagreements with Dario."
What do you read in the news story? It's indeed being pretended that it was just some issue with Dario personally and that another employee taking over "fixed the issue", confirming the original claim.

Anonymous
06/25/26(Thu)10:21:20 No.109132726

>>109132570
Where were you two months ago when the DoD spat with Anthropic happened and all the virtue signalers flocked from OpenAI to them? That wasn't a leak, moron, it was widely known news and caused price increases and service disruptions for weeks.

Anonymous
06/25/26(Thu)10:34:54 No.109132810

>>109127451
Do you know how ridiculously (((expensive))) putting a chip into production is?
>t. working on a small models asic.

Anonymous
06/25/26(Thu)12:28:59 No.109133590

File: 1777887585497234.jpg (92 KB, 1280x720)

Anonymous
06/25/26(Thu)12:43:24 No.109133686

>>109132258
>IQ_K
IQ64_K?

Anonymous
06/25/26(Thu)12:49:46 No.109133719

>>109131977
>. Let him scream as you take from him, there is no room for consent here.
Iwan literally gave consent for the ik_ks/ik_kl quants to be merged "as is" in AesSedaki's PR, Then cudacuck called nigganov who closed the PR.
