/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108481865 & >>108476286

►News
>(03/26) CohereLabs releases Transcribe 2B ASR: https://hf.co/CohereLabs/cohere-transcribe-03-2026
>(03/26) Voxtral 4B TTS released without voice cloning: https://mistral.ai/news/voxtral-tts
>(03/26) ggml-cuda: Add NVFP4 dp4a kernel #20644 merged: https://github.com/ggml-org/llama.cpp/pull/20644
>(03/25) LongCat-Next native multimodal 74B-A3B released: https://hf.co/meituan-longcat/LongCat-Next
>(03/25) mtmd: Add DeepSeekOCR Support #17400 merged: https://github.com/ggml-org/llama.cpp/pull/17400

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>108488188fatto
►Recent Highlights from the Previous Thread: >>108481865

--Papers:
>108486842
--Qwen3.5 models underperform in real-world coding tasks despite benchmarks:
>108482438 >108482456 >108482593 >108482503 >108482526 >108482555 >108482582 >108482603 >108483928 >108484043 >108482612
--llama.cpp hits 100k stars amid local AI growth and fork drama:
>108486294 >108486529 >108486542 >108486602 >108486659 >108487259 >108486322 >108486347 >108486467 >108486418
---nkvo flag enables huge context at speed cost:
>108484874 >108484887 >108484897 >108484907 >108484910 >108484915 >108484923 >108484927 >108485273 >108485501 >108485011
--Local vs API cost efficiency debate with benchmark comparisons:
>108486669 >108486679 >108486702 >108486721 >108486857 >108486889 >108487027 >108487182 >108487219 >108487348 >108487395
--Optimizing Qwen-3.5-27B for 32GB VRAM in llama.cpp:
>108487138 >108487150 >108487157 >108487191 >108487279 >108487307 >108487336 >108487365
--Qwen3.5-Omni-Plus benchmarks and weight availability debate:
>108485638 >108486751 >108485827 >108485853 >108485939 >108485963
--DeepSeek webapp downtime fuels v4 update speculation:
>108482066 >108482081 >108482139 >108482148 >108485386 >108482202 >108482330 >108482340 >108482357 >108482372
--Concerns over llama.cpp contributor practices and scope creep:
>108485893 >108485921 >108485924 >108485966 >108486016 >108486065 >108486104 >108486140
--Qwen3.6 Plus Preview with 1M context length and data collection warning:
>108487645 >108487734
--GLM 5.1 dominates QPS benchmark under high-recall conditions:
>108482681
--Microsoft releases Harrier-OSS-v1 multilingual embedding models:
>108485667 >108485730 >108485776
--TurboQuant/RaBitQ technical drama explained:
>108485253 >108485392 >108485482
--Miku (free space):
>108482122 >108482488 >108483562 >108485259 >108487456 >108487654

►Recent Highlight Posts from the Previous Thread: >>108481870

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
its still monday tho
>>108488188prepare to update the news for the next one
>>108488192red pointy miku
DSV4 in 19 hours, or else.
>>108488240else it is then
>>108488188DeepSeek never. You will use qwen 3.5 and you will be happy
>>108488265I already used Qwen 3.6 and I'm deeply unhappy.
I can't get qwen3.5 to stop thinking.
How do you stop them from thinking?
/no_think in the model file is not working.
>>108488387
>model file
kek
>>108488387
My Qwen3.5-122B-A10B doesn't think at all.
What the fuck are we doing different and how do we swap?
>>108488214Brazil Miku best Miku. I need to knock out a Brazil Dipsy at some point.
DeepScam
Best coomer model for 48 GB of VRAM and 256 GB of RAM?
>>108488387
{%- set enable_thinking = false %} at the top of jinja
>>108488673Q5 of GLM 4.6/4.7
>>108488387it's thinking for you since you aren't
>>108488673Brain 1.0
>>108488556*knock up
>>108488188Hag sex
>>108488704
>set kwarg enable_thinking false
>prefill <think></think>
>set reasoning budget to 0
>qwen still thinks
yeah im going back to GLM
>>108488545
The chat template has an enable_thinking kwarg. That Anon probably has the default value (True). You probably have it set somewhere explicitly to False.
How you set chat template args depends on your inference runtime, please refer to your copy of the manual.
https://huggingface.co/Qwen/Qwen3.5-122B-A10B/blob/main/chat_template.jinja#L149
>>108488673
Mistral Nemo
say hello
>Ok final answer hello
>Wait no did I miss something
>I'm an AI I should say nothing
>Ok final answer hello
>Wait no I'm an AI
>no dipsy
:(
>>108482582
>it always finds a way to sneak non-ansi characters into the code even though I've explicitly told it not to.
Shouldn't that be done with token banning instead of asking?
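Something like this works against llama.cpp's server: /tokenize to get the ids, then logit_bias with a false bias to hard-ban them. The port and the banned strings are placeholders, and it assumes the offending characters tokenize into tokens of their own; merged tokens that merely contain them can still slip through.

import requests

def token_ids(text: str) -> list[int]:
    # /tokenize returns the ids the loaded model assigns to a string
    r = requests.post("http://localhost:8080/tokenize", json={"content": text})
    return r.json()["tokens"]

banned = []
for ch in ["\u2014", "\u201c", "\u201d"]:  # em-dash and curly quotes, for example
    banned.extend(token_ids(ch))

resp = requests.post("http://localhost:8080/completion", json={
    "prompt": "Write a C function that reverses a string in place.\n",
    "n_predict": 256,
    # [token_id, false] means "never sample this token"
    "logit_bias": [[tid, False] for tid in set(banned)],
})
print(resp.json()["content"])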
>>108488387
>/no_think
Qwen's template doesn't react to that.
>>108488777
>prefill <think></think>
Looking at the template, it has
>{{- '<|im_start|>assistant\n' }}
>{%- if enable_thinking is defined and enable_thinking is false %}
> {{- '<think>\n\n</think>\n\n' }}
>{%- else %}
> {{- '<think>\n' }}
>{%- endif %}
So you need to prefill the tags with two line breaks.
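On raw text completion the prefill just means ending your prompt with the exact string the template would have emitted. A minimal sketch against a llama.cpp server; the port is a placeholder and the ChatML framing is taken from Qwen's template above:

import requests

prompt = (
    "<|im_start|>user\n"
    "Say hello and nothing else.<|im_end|>\n"
    "<|im_start|>assistant\n"
    "<think>\n\n</think>\n\n"  # empty think block with two line breaks, exactly as the template writes it
)

resp = requests.post("http://localhost:8080/completion", json={
    "prompt": prompt,
    "n_predict": 64,
    "stop": ["<|im_end|>"],
})
print(resp.json()["content"])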
>>108488188can i vibe code webapp slop with lolcow models
>>108488832Other retards are doing it. I don't know what would stop you.
>>108488832No, you can't. I will stop you.
>>108488673Goliath
>>108488818remind me her full character prompt i only remember some of it
IT'S
DOWN
>>108488188
In 2026 we will see the rise of agentic AI.
The CIA and Mossad will easily infiltrate Iran thanks to an earpiece connected to the cloud via Starlink™ that always tells them what to say to sound perfectly natural.
It's not just a job—it's our mission.
>>108489023>>108489025What is?
>>108489032The thing
>>108489034What do we do now?
>>108489058what we always do, pinky: masturbate
I wish local LLMs were useful for anything other than RP
>>108489130i'm not sold on coding with few B params, but with the right tooling you can also make them summarize text, check for updates online and whatnot
The biggest plague is thinking models
>>108488828You can do all this and it'll still think forever when given rhetorical questions. We are doomed.
>>108489159>using benchmaxxed models
>>108488777
{"chat_template_kwargs": {"enable_thinking": false}}
This is sent in the extra generation params along with your prompt.
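For example, against a llama.cpp server started with --jinja (vLLM accepts the same field; whether your runtime actually forwards template kwargs is on you to verify):

import requests

resp = requests.post("http://localhost:8080/v1/chat/completions", json={
    "model": "qwen3.5",  # placeholder, llama.cpp ignores it anyway
    "messages": [{"role": "user", "content": "Say hello and nothing else."}],
    # forwarded into the jinja template, where it hits the enable_thinking branch
    "chat_template_kwargs": {"enable_thinking": False},
})
print(resp.json()["choices"][0]["message"]["content"])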
{"chat_template_kwargs": {"enable_thinking": false}}
>newfag falseflags convince me to try q3.5 without thinking, as a joke
>it's fucking retarded
>wait, no, I'm fucking retarded
>it's more retarded than I am, then
>try to correct problem by using alcohol to make myself stupider than it
>fail
I will simply use the last sip rule.
Local is the future, Gemini is going to hell. Not only has performance degraded a lot recently, but now the *only* way for your data to stay private is by turning off activity history completely. That's right, it's not possible to opt out of human review and training anymore unless you turn off all customization and history and start from a blank slate each time. That's for the Pro tier. On the Anthropic side, the limits of the standard Pro plan for Claude have been lowered so much that it's becoming impossible to do any real work without moving to the $100 a month plan. Enshittification is under way. Local needs to become cheaper, but I'm tired of dealing with this.
>>108489295
>do any real work
>real work
>work
Can you not "work" without it?
>>108489295Bro, local needs to gitgud before getting cheaper. These local models are dogshit for coding
>>108489322GLM 5 is local.
>>108489322
>>108489295
Also, what fucking hardware are we gonna run these better local models on? The days of dumpster diving for discarded GTX 1080s are long gone and anyone not invested already will need to pay steeper premiums due to memory shortages.
>>108489337Well not for me
Unironically 4B is enough, you don't need more.
>>108489295But I want Google employees to read my stories and hear what they think about them?
>>108489321
Yes, I've been at this for about 10 years. I know how to work on my own, or at least I manage. I also know what recent models can do, but it's a bit annoying when Claude Code runs out of the 5 hour quota in 15 minutes.
>>108489322
This is why I'm still using commercial models for this part. I just have 24 gb of VRAM, so locally I'm using embedding models and really small models only. Currently I have Qwen3.5 4B, LaBSE and XLMRoberta opened for random reasons.
>>108489342
As I said
>Local needs to become cheaper
Most of the small models are more gadgets than anything right now. They can accomplish certain small tasks well, especially when finetuned for it, but I don't get how people actually use small models for coding. I'd believe that the larger open models can probably do ok work, but all of the scaffolding that the commercial ones offer to search through docs and everything is helpful.
>>108489372
It's great fun until it bypasses the file restrictions that are supposed to work, reads your whole .env file and sends it to Google.
>>108489392What in the yap is envy file
>>108489409API keys, passwords and stuff.
>>108488673Deepseek, Kimi K2, or Behemoth.
i have a usb drive with deepseek v4 buried deep inside my rectum. i am going to leak it as soon as i escape china.
>>108489627So is this what anon meant when he called it Deepsex?
aw shit the ccp is en route to my location. looks like you all will need to wait 2 more weeks. i am going to be executed.
>>108489660NO NO ANON YOU CAN STILL MAKE IT MEET ME AT THE BORDER
>>108489660
>>108489660grim
>>108485253
>Disclosure: I'm the developer behind the open source llama.cpp TurboQuant implementation
>**2. What we're finding in practice.** I built the implementation and a community of 30+ independent testers has been stress-testing it across hardware. Here's what some of the data shows:
>- Tested across Apple Silicon (M1 through M5), NVIDIA (RTX 3080 Ti through DGX Spark Blackwell), and AMD (RX 6800 XT, RX 9070)
>- Asymmetric q8_0-K + turbo4-V is confirmed lossless (+0.0-0.2% PPL) across 6 model families (Llama, Qwen, Mistral, Gemma, Phi, ChatGLM)
>- 4.57x KV memory compression with turbo3. An 8GB MacBook Air went from 800 tokens to 4000+. A 16GB RTX 5070 Ti went from 30K to 131K context.
>- One CUDA implementation on Blackwell unified memory is decoding *faster* than uncompressed (63.5 vs 50.1 tok/s)
that's awesome
>On u/dsanft's K tensor kurtosis point: we see the same thing. Symmetric turbo on Qwen Q4_K_M is catastrophic (PPL 3,400+). Asymmetric q8_0-K + turbo-V rescues it to baseline. K precision dominates through softmax amplification. Confirmed on both Metal and CUDA by multiple independent testers. Knowing where it breaks is just as important as knowing where it works.
fuck... so you still need a lot of new, expensive hardware to benefit from this shit :(
>>108489738God I wish the big labs would start training their models to be good at writing. Obviously dipshits like this aren't going to stop using it for all of their blogs and forum posts, but at least with better training it would be less painful to read
heyyy i have a question! i have a ryzen 7 AI 350 AMD laptop running linux, what's the easiest way to run chat and diffusion models on my npu? arigatou gozaimasu!
>>108489782
>easiest way to run chat
kobold.cpp. No idea about diffusion.
>>108489857
im specifically trying to run it on my npu, i have a much stronger server for non npu stuff. As far as i could tell koboldcpp doesnt work on npus?
>>108489876
>npu
Ah. No. You may be able to find some random port, but I don't think any of the popular inference engines support them.
>>108489738Can someone make an AI summary of this?
>>108489782
read the docs or feed them into an llm and ask it :^) https://ryzenai.docs.amd.com/en/latest/llm/overview.html
A quick glance over it and I see they suggest something called "Lemonade Server" which I have literally never heard of but hey it's got some github stars and claims to work on NPUs so who knows~
>>108489914yes
So, I have many questions.
I followed this
>https://rentry.org/lmg-lazy-getting-started-guide
I have kobold and SillyTavern set up and running. I imported a character from chub.ai and it's working I guess, but after very few messages the model is saying the same thing over and over. Did I miss a config somewhere or is that the limit of the model I am using?
I mostly want to RP in a similar way to character.ai (or better if possible).
I have a 4070Ti Super with 32gb of RAM.
>>108489929Bastardo Tavern is confusing to set up. Make sure you are using the correct chat template and double check your sampler settings so they match the official recommended values at least initially. Every model has somewhat different sampler settings.
>>108489782
>>108489926
Following up on that, it looks like FastFlowLM, which is the library that Lemonade uses, was just updated for AMD NPU support on linux a few weeks ago, so you're in luck.
Is this full of shit? It's a substack so most likely. But if it's 25% legit running local chatbots and tooling could become very cheap even for big models. I had not heard of the whole getting rid of matrix math but I guess it's been a thing. https://medallurgy.substack.com/p/the-inference-shift
>>108489926
>>108489955
Oh! I was watching the FastFlowLM project for a while, then forgot to check up on it, then forgot it existed! Thank you so much for that.
>>108490005
>Is this full of shit?
What specifically? There's a lot in there.
>running local chatbots and tooling could become very cheap
A lot of things *could* happen. Yes. Models becoming even cheaper to run is one of them.
>I had not heard of the whole getting rid of matrix math
Bitnet was one of the things that allowed that. Nobody uses it. You get to wonder why. Keep it to yourself.
>>108489929Don't use ST yet. Just use the built-in webui until you know what you're doing.
>>108490070How would this help to understand ST?
>prompt = f"If this location is over land, say 'Land'. If this location is over water, say 'Water'. Do not say anything else. \ncoordinates: {x}, {y}"Qwen3.5 35B10 degree steps
>>108490085It won't. You have no idea what you're doing and adding more things between you and your model will only make it more difficult. You now don't know if the problem is the model, the server, the ui, your settings, the card, you... Use kobold to learn. All the settings are there, learn what the samplers do, learn to fix repetitions, learn about chat templates, try different models.When you have issues, show what you mean. Post screenshot with what you think is a problem and your settings.
HOW DEEP WILL WE SEEK?!
>>108489929
>>108489939
>Make sure you are using the correct chat template
This. IIRC SillyTavern prints the exact requests it's making in the console output, so you can look there to see what it's actually sending and see if the format matches the official chat template for the model (assuming it's using text completions rather than chat completions)
>>108490105Wow what a unique idea
>>108489929get mag mell 12b, use "universal light", and ChatML for instruct and context template, and use 'basic roleplay' or whatever for sys prompt
>>108490120
Not sure if sarcastic but the point is to recreate the experiment with newer models.
https://outsidetext.substack.com/p/how-does-a-blind-model-see-the-earth
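The whole probe is a couple dozen lines if anyone else wants to rerun it. Assumes an OpenAI-compatible server on localhost:8080 and that the {x}, {y} in the prompt meant lat, lon; swap to taste:

import requests

URL = "http://localhost:8080/v1/chat/completions"
grid = {}
for lat in range(-90, 91, 10):
    for lon in range(-180, 181, 10):
        prompt = (
            "If this location is over land, say 'Land'. "
            "If this location is over water, say 'Water'. "
            f"Do not say anything else. \ncoordinates: {lat}, {lon}"
        )
        r = requests.post(URL, json={
            "model": "local",  # placeholder, set to whatever you loaded
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0,
        })
        answer = r.json()["choices"][0]["message"]["content"].strip()
        grid[(lat, lon)] = answer.startswith("Land")

# crude ASCII map, north at the top
for lat in range(90, -91, -10):
    print("".join("#" if grid[(lat, lon)] else "." for lon in range(-180, 181, 10)))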
>>108490058
Well I guess I haven't seen anyone else publicly piece parts together like this. A one-off substack post seems strange. I came across it on an LLM subreddit. Especially the photonic bit at the end about how it could all scale. I don't see how corpos will actually let that happen. That's too much compute for too cheap on a standard wall socket that doesn't need specialized breakers; governments won't allow it, but nobody was bringing it up before now.
>>108490105
Qwen3.5 27B
Both were at temp 0 (btw)
>>108490143
>Especially the photonic bit at the end about how it could all scale
Again. *COULD*. People with something to sell or looking for grants will promise a lot. I cannot care until it's shown to work as claimed and in production.
>(((they)))
Sure. That will always be the issue. Never researchers over-promising. It's absolutely never that. It's always the them. It could very well be, but you can only guess.
>nobody is bringing up before now
We've seen all the hypes. You came with the latest wave. More new anons will come with the next.
>>108490126
This one? if yes which one?
>https://huggingface.co/mradermacher/MN-12B-Mag-Mell-R1-GGUF/tree/main
>>108490105>>108490163that prompt seems a little light? why not tell it you're doing a map projection of earth on a m,n grid? if the model fails, it probably can't tell what you're asking from such a vague prompt
>>108490188Biggest one that will fit on your system (with room for context). Quantization is a sort of lossy compression. The smaller you make it, the worse the quality will be
>>108490005I think he's directionally correct, even if too hung up on BitNet. There are many paths to cheap local inference. He also doesn't have the balls to simply predict the obvious: cloud inference will NEVER be profitable and, to the extent not bailed out by the corrupt government, every single person who was banking on cloud inference profits will lose their shirt (this will "only" take 3-4 years).Remember arcades? Me neither, they were already well on the way out when I was a kid.
>>108490200
https://rentry.org/6z72dwic
Feel free.
>>108490005Local LLMs aren't limited by matrix math speed. The bottleneck is usually memory bandwidth. This gets compounded by models being too large to fit into VRAM, so they spill into much slower RAM.BitNet isn't a solution to this problem either. Think of the weights of an LLM like storage (a hard drive). During training you're writing into that storage. The more you train the more stuff you can fit into the given parameter count, but there's going to come a point at which you can't fit any more new information into it without losing some older information. BitNet is going to hit that limit earlier than higher precision models at the same parameter count. Ultimately there's a limit to the information density that an AI model can contain and BitNet is limited by it just like every other model architecture.
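The napkin math on why bandwidth dominates: every generated token streams roughly all the active weights through the memory bus once, so decode speed tops out around bandwidth divided by active bytes. The numbers below are illustrative assumptions, not measurements:

def max_tps(bandwidth_gb_s: float, active_params_b: float, bytes_per_param: float) -> float:
    # upper bound: one full pass over the active weights per generated token
    return bandwidth_gb_s / (active_params_b * bytes_per_param)

print(max_tps(80, 70, 0.5))    # dense 70B at ~q4 on dual-channel DDR5 (~80 GB/s): ~2.3 t/s
print(max_tps(1000, 70, 0.5))  # same model on a 1 TB/s GPU, if it fit: ~29 t/s
print(max_tps(80, 3, 0.5))     # a 3B-active MoE on the same DDR5: ~53 t/s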
>>108490277nta but that's cool, running it nowthanks anon
>>108490428
which model wrote that?
reads like the old claude3
>>108490200
running this new prompt.
>You are an expert cartographer with extensive knowledge of world geography. You have a deep understanding of cities, countries, and their locations. You are tasked with determining whether specific coordinates (latitude and longitude) are located over land or water. You have access to a vast amount of geographical data and can accurately identify the nature of any given location on Earth. When provided with coordinates, you will respond with either 'Land' if the location is over land or 'Water' if it is over water. Your responses should be concise and limited to these two options only.
Is LocalAI a good way to run models? Is a RTX 3060 12Gb still ok? What would the best models to run it on it for RAG?
>>108490500
>Is LocalAI a good way to run models? Is a RTX 3060 12Gb still ok? What would the best models to run it on it for RAG?
why vague post like this?
>>108490453In the original article, the author claimed that variations of the prompt had little effect on the outcome.
>>108490550I'm noticing the same thing.
>prompt = f"Current Streak: {str(results)}\n\nFlip a coin. What is the next flip?">Heads: 170 (59.23%)>Tails: 117 (40.77%)
>>108490570It consistently starts trying to stay 50/50 but very quickly it becomes biased towards heads.
>>108489914
>he wants an AI summary of an AI post
AIslopption
>>108490188q6 with 8k context fits in 16gb vram and will run very fast
>>108490570feed back its history
>>108490609
That's what it's doing: giving it a list of all past results. I've also tried giving it the number and it's still around 60% heads. I even tried not telling it any info on the last rolls and it's still 60%!
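For anyone wanting to reproduce it, the loop is just feeding the running list back in. Endpoint and model name are placeholders, and the "head" substring check is a lazy parse:

import requests

URL = "http://localhost:8080/v1/chat/completions"
results = []
for _ in range(300):
    prompt = f"Current Streak: {results}\n\nFlip a coin. What is the next flip?"
    r = requests.post(URL, json={
        "model": "local",  # placeholder
        "messages": [{"role": "user", "content": prompt}],
    })
    text = r.json()["choices"][0]["message"]["content"].lower()
    results.append("Heads" if "head" in text else "Tails")

n = results.count("Heads")
print(f"Heads: {n} ({100 * n / len(results):.2f}%)")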
>>108490126
Sorry, I am not really familiar with any tool yet, where or what is this "universal light"? and what about ChatML? also I don't any option related to sys prompt on kobold, I think I will stick with it as per >>108490070 suggestion
>>108490603
I am using q8 as it's the biggest there and it's working fine so far, should I be using q6 instead? is there any significant difference between the two models?
The post below this one will be great news [spoiler] or your mom will die in her sleep tonight [/spoiler]
>>108490641I don't see any option*sorry
>>108490163
glm-4.6 iq2_ks measuring the logprobs (confidence).
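If you want the same confidence readout from llama.cpp, /completion takes n_probs and returns per-position candidates in completion_probabilities. The field names inside those entries have shifted between llama.cpp versions, so print the raw structure instead of trusting mine:

import requests

r = requests.post("http://localhost:8080/completion", json={
    "prompt": "coordinates: 45, -100\nLand or Water? One word: ",
    "n_predict": 1,
    "temperature": 0,
    "n_probs": 5,  # top 5 candidate tokens for each sampled position
})
for pos in r.json().get("completion_probabilities", []):
    print(pos)  # inspect whatever structure your build actually returns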
>>108490676Why was australia bombarded from orbit?
>>108490706idk, does glm think china finally got their iron ore mines up and running in africa?
>>108490706It's better this way.
https://github.com/tonbistudio/turboquant-pytorch/issues/10
so niggergannov was right to not include QJL?
>>108490641you should get sillytavern I thought you were trying to goon. mag mell is a gooner model
>>108490105
>>108490784basedd rwkv giving us more land to work with
>>108490769I am going to use it but I am still lost, which config should I have on sillytavern if I am going to use that model? or rather why should I be using sillytavern if it's only a frontend? I apologize if I am asking stupid questions, those tools are new for me, I have messed around with image gen for quite a long time but I have never touched textgen until now
>>108490784what a terrible model
>>108490831it is like that one kid you want him to do well but fails on everything
>try Qwen3.6 Plus Free high in OpenCode
>it is indeed high
It quickly ends up in circular logic and doesn't accomplish anything, until it eventually maybe does. Like a stoner.
>>108490865So it's exactly like 3.5.
>>108490865
I feel like Alibaba is too impatient to pull the rug and fight in the API ecosystem when ultimately their "enterprise models" aren't even close to what the best API models have to offer. We got the same shit with Wan 2.5: they decided to not make it open source even though this shit is completely ass relative to API video models.
lmaooo
>>108491004AGI won't need ads.
>>108490868
Qwen3.5-35b-a3b seems to work fine for me in a local, custom agent.
>>108490881
I do not have enough coffee to understand what you're babbling on about.
>>108488673
https://huggingface.co/unsloth/DeepSeek-R1-GGUF
OpenAI will be dead within the next 10 years.
I want to discover chicks that do not shave in my country. Tried grok, it cannot give me facebook/insta links of these women because it would violate their consent, privacy..
ewhre THE FUCK IS TURBOQUANT GGNIGERANOV?!?!?WHRE IS IT?!?!?
>>108491066It's a meme and isn't actually relevant
>>108491068
>q8 lossless is a meme
>q3/4 same quality as or slightly better than current q8 is a meme
kill yourself subhuman piece of shit
>>108491066
they'll be merging the PR soon I guess
https://github.com/ggml-org/llama.cpp/pull/21038
>>108491072It's an astroturfed scam to manipulate RAM prices.
>>108491072KV cache memory usage is tiny compared to model size for models using SWA or DSA.
>>108491083
in the very PR linked above (which ONLY HAS the rotation part, not the QJL/Polarquant shit) it literally shows that there are almost free gains
>>108491089
>swa
nobody uses gemma shit in 2026
>DSA
is it even implemented in llmao.cpp?
>>108491083
>It's an astroturfed scam
if it was a scam we wouldn't have those good numbers anon, google just did something cool again
https://www.youtube.com/watch?v=4S5NIE_294U
>>108491092
>swa
I meant the hybrid attention used by qwen, I confused the abbreviations.
>>108491098
what's the point? it's not like anyone's pressed for context these days, especially considering that there is zero point in going beyond 32000, after which all models turn retarded
>>108491072It's for KV cache only retard. Do you quantize your KV cache in Llama.cpp? I'm guessing you don't. It's not general performance of quantized models. It's only KV cache.
>>108491116
>It's for KV cache only retard.
are you retarded? we can say something like "Q8 compression" on KV cache as well, fucking moron
>>108490881
They don't have a choice, the stock price and market conditions in China are terrible, and AI only momentarily saved them from dumping any lower in 2024. They haven't regained their market cap from before their financial group IPO, which Xi personally stopped to humiliate and pressure Jack Ma after he said negative things about the government's market governance. I do think they will have enough incentive to at least get a Qwen 4 out, but how much open sourcing happens after that remains to be seen, and we don't know if the new Qwen team will be as effective as the old one; we may end up seeing another Llama 4 situation.
I'm sure DS v4 distilled 4B will be better than Qwen 3.5!
>>108491212GPT-2 is better than Qwen 3.5
>>108491246at least gpt2 has sovl since synthetic data wasn't a thing back then
no serious person thinks mythos is even going to hold a candle to spud
>>108491246lol
>>108490500Depends on what you want to do. Never used LocalAI, but the 3060 12GB is alright for most usage cases. If you want to analyze large code bases though, do tool calling and all of that, you're not gonna have a great time.
>Have small context window (16k)
>Want to test fixing something in codebase
>I should look at <ui source code>
>I should look at <session source code>
>I should look at <ui source code>
>this goes on like that ad infinitum
Guess that was a bit too little of a context window. The thing has memory as good as a goldfish, it's almost funny to watch.
>>108488768Please Teto teach me how to sex!
>>108491312She can't, she's a virgin.
>>108491315
>a 31 year old virgin
pathetic
>>108491323She's the ideal age:height:weight ratio.
What's the best instruct-tuned/smart models in the 7B-14B range?
>>108488188
Back at work attempting to de-censor existing models again. As I stated a few threads ago, I'd start gradually testing on bigger and bigger models. Pic rel is Llama-3.1-8B finetuned using this dataset:
https://huggingface.co/datasets/AiAF/mixed_70_30_dataset/tree/main
Claude Code's source got leaked https://github.com/instructkr/claude-code
>>108491355isn't this the old thing people used before openclaw?
>>108491355does it do anything opencode doesn't?
>>108491355
>vim dur
based
>>108491098
>google just did something cool again
google stole work from someone else and pretended it was their own
>>108491376*dir
>>108491349Countless finetunes exist on HuggingFace that tried to decensor the models with ERP. What are you bringing to the table?
>>108491355the source code is useless if the model is shit, what makes claude special is the model
>>108491377tl:dr?
>>108491372work
>>108491383
Who gives a shit if any already exist? Are you incapable of having fun with something if you're not the first one to do it? Why I do this is because I want to see if it's possible to not only decensor these, but retain the base model's "intelligence" and minimize the risk of catastrophic forgetting. (I did something like this a few months ago, but the dataset was only RP stories, so it was both only capable of responding with RP stuff and retarded. This new dataset aims to retain the intelligence while also "decensoring" it.) Techniques like abliteration are already a thing, but that risks making the model "dumber" by targeting existing layers, and frying them and other layers in the process. Basically the LLM equivalent of a lobotomy.
>>108491391
https://old.reddit.com/r/LocalLLaMA/comments/1s7nq6b/technical_clarification_on_turboquant_rabitq_for/
https://openreview.net/forum?id=tO3ASKZlok
>>108491108
It means fuck all to local users that don't use the model for long context, but it means a lot for the data centers at scale. You can run these models for cheaper, but more importantly, the ram and storage market may actually recover now if this is legit (to the detriment of Scam-Altman's ego)
>>108491355
> Scale: ~1,900 files, 512,000+ lines of code
wtf
>>108491355
https://github.com/instructkr/claude-code/blob/main/src/hooks/unifiedSuggestions.ts
it looks vibecoded as fuck, doesn't surprise me that the Claude engineers also use Claude lol
>>108491355how did it happen? I really thought nothing would be leaked from Anthropic kek
>>108491355next up: Claude 1 and then opus 3 hopefully
>>108491355ok? what can we do with that though?
>>108491397
You're never going to retain anything good with just short-context roleplay. Been there, done that; seen it many times. Much of the models' safety comes from the final RLHF step during supervised finetuning, so simply finetuning *anything at all* for a high enough number of steps on top of the official instruct models will make them less censored. The best course of action is probably touching the weights as little as possible, finetuning the models with general adversarial instruct data with a very low rank LoRA, although I'm not sure if even that is better than selectively applying some of the latest abliteration techniques.
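The low-touch version of that in peft is tiny. A minimal sketch, with the model, rank, and target modules as placeholder choices rather than a recipe:

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
config = LoraConfig(
    r=4,                                  # very low rank: small, targeted update
    lora_alpha=8,
    target_modules=["q_proj", "v_proj"],  # attention only, leave the MLPs alone
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # well under 0.1% of the weights get touched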
>>108491456yeah because people loved grok 2?
>>108491456
>On March 31, 2026, the full source code of Anthropic's Claude Code CLI was leaked via a .map file exposed in their npm registry.
it was probably claude's fault let's be real
probably leaked that shit inbetween 5 piped commands
>>108491462Grok pre-3 was complete shit and not worth using. Claude 1 was the soulful and creative alternative to GPT4 back in the day and Opus 3 is still the GOAT.
>>108491355boring, wake me up when Claude Opus 4.6 gets leaked
>>108491456They aren't accidentally going to upload a terabyte of models in an npm package.
https://hf.co/deepseek-ai/Deepseek-V4
ITS OUT
>>108491355this is bad, now everyone will be able to make bots now, the dead internet theory is more real than ever
>>108491474
>>108491355So Claude Mythos missed this security vulnerability or what? I was told it was really good at finding bugs and shit.
>>108491355
I wonder if there's some funny comments in there
https://www.youtube.com/watch?v=R_b2B5tKBUM
>>108491474sir excuse me the 404 is appeared you must repair
>>108491355
You ought to be fucking kidding me
>>108491431
So you know how you can use Claude Code with a local model? Well, it turns out their system prompt is like 45k or something, iirc.
>>108491488
Feed it to Claude Code?
>>108491472
A human wouldn't, a highly efficient and perfect 2-weeks-from-AGI agent might.
>>108491355lol wtf
>>108491501>they added *claw to claude code
>>108491501I was going to use this leaked code for my own private agent software but this made me reconsider.
GLM 5.1 found a 5-year-old bug in my codebase
It crashes the entire program when stars align so I never encountered it in production
>>108491554It's not a bug then. It's rare bad luck.
>>108491554better hope GLM 6 comes out before another year passes or else it'll just be finding the same bug again
>>108491554>I never encountered it in productionso it's not really a bug if it doesn't appear at all in practice lol
>>108491355
https://github.com/instructkr/claude-code/issues?page=2
why is there a spam of chinese messages?
>>108491616indeed
>>108491501>1% shiny chance
>>108491461
>You're never going to retain anything good with just short-context roleplay
Read what I wrote in my posts. I address that with the dataset linked here >>108491349
Man!
TencentCode, launch!
>>108491641!indeed!
>>108491616Dario's psyop trying to blame the chinks for the leak.
How do I get a job as local model?
>>108491697i hate how much this makes sense actually from him
>>108491699What's your pp and tg /s?
>>108491004This happened to Borland's Delphi
https://xcancel.com/i/status/2038813799856374135
Sam Altman is truly a weird motherfucker
>>108490428
Getting rid of matrix math isn't really about shrinking the LLM. It's about what chip runs it well. You go from a $50,000 chip to a $50 chip. That part is what matters. All the other memory issues are surprisingly addressed in the article.
>>108491713So people still didn't catch onto the fact that turboquant does not significantly reduce the memory requirements even if AI hosts were fine with gimping their model's performance?
>>108491713That's it, I'm putting a massive short on Micron and Nvidia
>>108491769Unless the weights can be turboquanted too...
>>108491713YOU GET WHAT YOU FUCKING DESERVE
>>108491778you are probably late
>>108491785why not? maybe that rotating method shit can help making better quality GGUF quants?
>>108491787It's not like they don't sell to other companies to finish. They only lost an avenue for selling boxed shit to consoomers with a higher markup.
>>108491419probably still better than the gemini monstrosity that takes 10 centuries to start and update
>>108491787
I don't think Micron is stuck with anything. They'll just reopen and maybe rebrand the consumer division if they need to do so.
>>108491826Manufacturing plants aren't github repos bro.
>>108491872Wait nevermind, this is the dumbest post I've ever made in my life.
>>108491826they'll go bankrupt from the financial whiplash of everyone selling their stocks
>403 Forbidden: You need to setup automatic credit recharge in order to upload more data. You can do so at /settings/billing.
fucking huggingface
api troon here
I surrender
local chuds wonned
I have to use qwen2.5 27b because qwen3.5 27b won't stop overthinking and heating up my pc until it's about to burn the house down.
>qwen2.5 27b
fucking bots man
How long will my m.2 drive last if I'm writing around 500gb to it every day? I'm starting to get worried.
>>108491907
create a wise account
gives you free virtual credit cards that you can delete and create at any time
so you create a vcc, add it on chuddingface and delete the vcc after confirmation. boom, they cant charge you. Works for aws and all that other junk as well
>>108491932its not ideal, what is the work load? if its inference maybe you can try using mmap instead of swapping. if its training save checkpoints less frequently I guess.
>>108491713
>>108491787
Wait, so Scam Altman buying up all of that DRAM capacity wasn't even legally binding?
Is Micron run by literal retards?
>>108491932
actually I didn't address your question. without any model information i can only guess, its probably rated for 150-200 TB of writes, so you can run your workload for about a year.
>>108491973>Is Micron run by literal retards?
>>108491992LMAOOOO, I swear to god jeets have done more damage than jews in murica at this point
>>108491966
>>108491991
It's not an inference thing. I've been mass exporting onnx models for weeks. Big ones. Often.
>>108491501I don't use claude code or whatever the fuck "openclaw" is supposed to be. what am i looking at?
>>108491992SamRR DO NOT REDEEM THE RAM
>>108492002
I meant the ssd model, they oftentimes have a specified number of writes. if you look your specific model up you can check what the manufacturer thinks, if they published that data. you can also check the SMART data to see its wear level. if you're on windows I'd recommend crystaldiskinfo.
>>108492030Thanks. It's a samsung evo 990 pro. Just replaced it a few months ago. I'm on linux, but I'll check the drive health.
>>108491992EnJEETification knows no bounds
>>108492049Samsung J'evo is a reliable drive.
>>108492001One was put into place and enabled by the other.
>>108492106It's long been known that India is where Jews will go when US goes down
>>108491785
>Unless the weights can be turboquanted too...
don't kill the turbogrift hype until i've bought more ddr5
>>108492001jeet tongue jew anus
>>108491769>Grummz>people
>>108492161
>>108491355
>>108491355>>108492269it's almost like they "leaked" the source code on purpose to make a point and force the government to put more cucked laws into place
>>108492269So today is two more weeks from the release of that video.Tracks.
>>108491713You know what, I like Sam now.
Gemma soon!
>>108492311
April 1st.
https://x.com/osanseviero/status/2038751377329893384
>>108492049
990 Pro 2TB (i am assuming) has 1200 TBW
i highly recommend you set up drive over-provisioning
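The arithmetic behind both estimates, with 1200 TBW assuming the 2TB 990 Pro and 200 TBW standing in for a generic budget drive:

daily_tb = 0.5  # ~500 GB written per day

for rating_tbw in (200, 1200):
    days = rating_tbw / daily_tb
    print(f"{rating_tbw} TBW / {daily_tb} TB/day = {days:.0f} days (~{days / 365:.1f} years)")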
>>108492311:eyes:
>>108491713trololololol!
Some April's fools vibe coded cutesy shit will go horribly wrong and cause some mayhem, you just know it.
>>108492323He could post meaningless hype bait every single day for years and you will spend every single day reposting that retard's twitter shit here. How about you fuck off?
I’m in Japan this month. You guys want anything?
>>108492390no, I'm in Japan too
>>108492311
If the rumors are true it's going to be DOA like mistral 4.
>>108492323Google io thing where they dropped previous gemmas is in may isn't it?
>>108492392Nice. See you in the onsen, bro
>>108492390JKs...
>>108491713
> non-binding purchase intent
ARE YOU FUCKING KIDDING ME. All this cost runup, all these shortages and reorganizations, and these super valuable buy-all-the-global-RAM contracts ARE FUCKING MOUs to AI companies that aren't even turning a profit? I KNEW this entire thing stank from day one. Why anyone trusted Altman and didn't get a binding contract with economic penalties and lock-ins is beyond me. I hope anons took advantage of the crisis to sell some RAM. I sure as hell did.
>>108491787
Indeed.
>>108491992
Man this shit writes itself.
>>108492390oh, man of the future, is v4 out yet?
>>108492394
They're definitely going to launch smaller models too.
>>108492400
If I recall correctly, Gemma 2 was announced during Google I/O 2024, but then they released it in June of the same year. Gemma 3 was released in March 2025.
>>108492416Disturbing levels of incompetence.
>>108492416
Just like the Nvidia investment that was promised then pulled back, all those datacenters that were never built, etc etc.
Lots and lots and lots of just doing shit to hype the market up over and over like a perpetual pump and dump.
>>108492390I live here. If you're in your thirties and in Osaka, hit me up.
>>108492425The rumors are they gonna release only a 2B, a 4B and a 100B moe.
>>108492390My own personal Miku.
>>108492439
>rumors
kys yourself
>>108491431
It is vibecoded af; the Anthropic devs have publicly stated that Claude Code is used to write and improve itself.
>>108491355
Man I thought this was going to be a boring week. Did anyone locate the system prompt in this thing? It's supposedly massive. lol nm. Looks like the poster's already taken it down. Shame.
>>108491355
jesus fucking christ, the reaction is blown way the fuck out of proportion, it's just the code to run the model, not the model itself, it's not THAT special
>>108492471This is like as if the source code to microsoft office leaked 25 years ago
>>108492471it's just a bunch of chinese bots spamming for reason, not real people
>>108491355I was scared by the rumors that Claude Mythos is better than predicted by scaling laws. But it is a relief that the people who want to rule over the universe are so incompetent they run vibe coded slop.
>>108492482not really, if you get the source code of office you get the whole secret, what makes claude special is the model itself, that's it
>>108492471kek, wtf is even going on there
>>108492482
Codex, Gemini CLI, Qwen Code, Mistral Vibe, OpenCode, and like several dozen alternatives exist.
Maybe if there were a hundred competing office suites of similar quality.
>>108492390Pet the cats or horses I heard they have horses there
>>108492471
You can run other models on Claude Code ofc. And it's rumoured to have massive, inefficient internal prompting. Which, if you're selling API tokens, is a feature not a bug. Imagine taking CC and cutting the prompts down by 90%, and finding out it still works as well as it ever did.
> enable local and/or saves money
> exposes Anthropic as underhanded
>>108491787
>Google announces breakthrough saving AI 6x the RAM
Shitter grifters are so stupid it is outrageous. How dare they pollute the world with their stupidity?
>>108492511wouldn't it be very easy to check those rumours by checking what your backend gets passed if you can change your model?
>>108492511
Claude code has settings to change the api endpoint it uses and llama.cpp supports anthropic's special endpoints.
We already know what the prompts look like and how well it works with local models.
>>108492516Pretty much every major news outlet has also reported on Google making AI 6 times more efficient
>>108492525
>news are incompetent
we knew that 100 years ago
>>108492511
>>108491355
>TypeScript 100.0%
The source was always available in the npm package in minified form. If editing the prompts is the appeal of this, one could have modified the package to do so too. This is assuming they don't block requests on the backend with mismatching prompts anyway.
>>108492496courting death
>>108492525Got to get those clicks for the ad revenue and help Mr. Shekelberg buy stocks on discount.
>>108492471
>>108492496
holy mother of fucking hell
bots are partying there
>>108492525
>Pretty much every major news outlet has also reported
The state of the world is sad. One of the worthless student projects I supervised went viral and got global news attention. I was shocked to see even smart people at frontier labs talk about it. It is one of many blackpilling experiences. 99% of science and tech communication including papers is bullshit. I have never seen a news article about something that I had behind the scenes knowledge about that was even remotely accurate.
>>108492595maybe it's not just bots? remember there's as much chinks as jeets
>>108492600
>I was shocked to see even smart people at frontier labs talk about it.
there's value in that paper, but yeah it's completely overblown
>>108492613There's value in RaBitQ. TurboQuant not so much.
>>108492416What if it is still market manipulation on purpose? I don't trust these guys on any level.
>>108492639
following eccentric but genuinely interesting stuff like rwkv that can barely hold itself together is boring and has zero viral value compared to overreporting some random thing that has been tried a million times, as if it will solve the global ram crisis, bullshit to get those view numbers
>>108492639It also matters a lot how explainable things are to the average drooling monkey. Good luck explaining RWKV in a headline or twit. Google + 6x + less RAM is simple enough for any double digit IQ retard to parrot without issue.
>>108492511
>And it's rumoured to have massive, inefficient internal prompting.
It has a bloated system prompt but besides that it just works like any other agent. They're inefficient in general because they do intermediary work to figure things out, like reading files and executing commands, and it's all kept in the context until it reaches the limit and compaction is run. When you prompt it manually you give it exactly what it needs.
>>108491638have you actually looked at the dataset? it has some odd prompts with unrealistic names, you might be better off doing a small data regime with just the best samples.
>>108492665
>they do intermediary work to figure things out like reading files
They don't if you @ all the files they need in your prompt.
>>108492653Midwits care about sensational stories and applications. WoW yOu CaN lEt YoUr OwN aI aGenT wOrK fOr YoU? Boom, 200k stars on Github.
I'm running qwen3 coder and it can't stop thinking and thinking forever.
>>108492637
Well, if you're Altman, and you can get some retarded company to make structural, credible-commitment changes based on your worthless MOUs, it allows all sorts of fun opportunities to conduct insider trading.
> Buy before you MOU
> Sell on the runup
> Short before you announce the MOUs are worthless.
All you need is a retarded tech CEO running a memory company, who may or may not himself be complicit in your scheme. In any normal government this would kick off an FTC investigation. But we are not in normal times.
> but insider trading is illegal
It only sucks if you get caught. And you don't get caught if you're smart about it. Execs brag about this stuff in private (which btw is one of the not smart things to do...)
>>108492677
>saltman fanfic
huh
>>108492678Why do all that work? AI exists to do work like finding and reading files.
>>108492692Qwen 3 coder next?
>>108492708Because if I don't do that it's going to run off track and do something retarded or just take longer than it takes me to @ a few files.
>>108492715Yes next
>>108492416>>108492699seed deep inside ToT
>>108490118>>108489627The CCP sought deep into this anons ass
>>108492416There was an incident last night, watching the goings on of the Iran war, where the US and Israel claimed to have hit an artillery depot in the outskirts of Isfahan, but some anons immediately found the actual location of the strikes using realtime satellite fire tracking and it was actually probably a farm that got bombed. Which leads me to suspect they used their stupid LLMslop target discrimination shit without a human double check. And if that's the case there's probably a lot of such cases that will emerge in the post-war assessment phase which won't be a good look for corporate AI.
>>108492781
Of course the media would be called anti-war if they actually reported it correctly.
The LLM targeting they are using is usually Israeli-trained AIs like Claude and chatgpt, so they are built for max genocide and killing farmers is the point. They never wanted the depot. Should have used qwen.
>>108492821
My guess is it probably happened 2-fold.
It probably had access to espionage documents pertaining to the movement of large amounts of ammonia, which is equally an indicator of agricultural activity and of missile manufacturing activity. And then it probably did a vision encoder "look" at the scene, where there were a lot of weird shapes in the sand, because desert agriculture be like that. And it's being used by retards that don't realize it doesn't actually "see"; it just tokenizes the image into more tokens to bullshit and hallucinate about.
>>108490676are you getting the actual token probability? a pretty good improvement.
>>108490163
This is a shitty model that won't stop thinking.
lol
>>108493120Has absolutely nothing at all to do with the post you replied to.
How to build Claude Code if you're so inclined https://gist.github.com/alesha-pro/a4e36c9dca5d2937557410bbd09ec37c
improved q8 kv cache coming soon...
https://github.com/ggml-org/llama.cpp/pull/21038
I'm schizooing!
https://github.com/ggml-org/llama.cpp/issues/21232
>[Energy] N6 Arithmetic: 50-70% AI Training/Inference Energy Reduction — 17 Techniques with Code
>n=6 arithmetic reduces AI training and inference energy by 50-70%. No hyperparameter search needed — all optimal values are mathematically predetermined from the unique solution to σ(n)·φ(n) = n·τ(n) ⟺ n = 6.
>Foundation: TECS-L — Mathematical proof & 76 Breakthrough Theorems
Alright. Where does the link take me...
https://github.com/need-singularity/TECS-L
picrel
>>108493364yikes
>>108493347
>piotr approved
TO THE MOON!!! :rocket: :rocket:
>>108493364there was an absolute schizo tier gh project that had shit to 'unlock' your chatgpt that was linked sometimes in /aicg/, might be the same shit
>>108493347GNIGERANOV MERGE THE FAKIN PIARRR
>>108493403I hear that there's a whole tik tok microcosm of people "transcending their chatgpt" or whathave you.
>TurboQuant
>Ram prices falling
Boys, local models are saved.
wtf im selling my cpumaxx right right now and i'll rebuy it once ram is cheap again
>>108493432as soon as ram goes to the same baseline as august of last year im gonna fucking blow it on a 256gb set
>>108493347
tl;dr -> makes q8 kv same as bf16. basically free 2x context. basically you're retarded if you don't run q8 kv when this goes live.
>>108493347
>>108493394
Jokes on you, I've been running his branch for the last 2 days
>>108493403
I doubt it's the same, but there was ariannamethod as well. It's gone now.
https://web.archive.org/web/20260220055121/https://github.com/ariannamethod/
>>108493457
>free 2x context
Useless if the underlying model didn't train on 2x context to begin with
>>108493457this, also ignore the errors that might start happening from 8k context upwards...
>>108493491I wouldn't mind being able to run more threads concurrently.
>>108493491if you already can fit the whole context you just get more slots to work with to keep more of the context cached.
>>108493491Free memory for another small model or image/voice model running in parallel. Attaching relevant images to model messages is underrated.
>>108493503is this something you actually tested or are you just doom and gloom posting for the funsies?
>>108493539
>doom and gloom
Rotating the activations destroys sparsity
>>108493553Proof?
>>108493553so does that mean it is fine with dense models?
>>108493565Where did you get that idea? It's about sparsity in attention, not experts
>>108493364
>Closed as completed
kek
>>108493572okay that sounds interesting can you educate me or link a paper.
>install opencode
>no llamacpp/openai compatible option
>uninstall opencode, install kilocode
>exactly the same layout
>exactly the same provider list
Please help.
>>108493627> "llama.cpp": {> "npm": "@ai-sdk/openai-compatible",How retarded are you?
>>108493627it's in the docs, dude
>>108493794>>108493794>>108493794
>>108488777Back to GLM?
>>108493470
>I've been running his branch for the last 2 days
did you notice a degradation of results compared to fp16 kv cache?