/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107493611 & >>107481183

►News
>(12/09) Introducing: Devstral 2 and Mistral Vibe CLI: https://mistral.ai/news/devstral-2-vibe-cli
>(12/08) GLM-4.6V (106B) and Flash (9B) released with function calling: https://z.ai/blog/glm-4.6v
>(12/06) convert: support Mistral 3 Large MoE #17730: https://github.com/ggml-org/llama.cpp/pull/17730
>(12/04) Microsoft releases VibeVoice-Realtime-0.5B: https://hf.co/microsoft/VibeVoice-Realtime-0.5B
>(12/04) koboldcpp-1.103 prebuilt released: https://github.com/LostRuins/koboldcpp/releases/tag/v1.103

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>107493611

--Evaluating Devstral 2 model performance and support readiness:
>107493927 >107494047 >107494295 >107494368 >107494418 >107494076 >107494102 >107494548 >107494560
--Performance optimization and benchmarking of 24b Mistral on 5060ti GPU:
>107500593 >107500619 >107500657 >107500680 >107500720 >107500772 >107500939
--Debating dataset quality and model evolution:
>107500008 >107500039 >107500058 >107500064 >107500095 >107500101 >107500134 >107500077
--Optimizing GLM 4.6 sampler settings for stable non-code fiction generation:
>107493997 >107494010 >107494020 >107494093 >107495493 >107494022 >107494085
--Devstral-2-123B performance and compatibility discussion:
>107499753 >107499795 >107499801 >107499891 >107499941 >107499802 >107499806 >107499830 >107500091
--Clarifying Mistral model safetensors differences and download best practices:
>107495709 >107496097 >107496163 >107496239
--Devstral-2-24B model quality issues in roleplay applications:
>107496439 >107497039 >107497306 >107497414 >107497513 >107497594 >107497785 >107497122 >107497189
--Critique of intermediate model sparsity levels:
>107500777 >107501680
--Proposing heterogeneous-size experts:
>107500169
--Technical challenges with long context handling and tool calling in new Mistral models:
>107497812 >107497851 >107498014 >107498052 >107498157 >107498183 >107498678
--NIPS 2025 paper contributions by top organizations:
>107501500
--Mistral model possession errors and roleplay performance tradeoffs:
>107497022 >107497219 >107497247 >107497236
--Model fails to recognize female character despite explicit gender cues in prompts:
>107498182 >107498346 >107498438 >107498478 >107498502 >107498523 >107498581
--Devstral's 123b model efficiency and playful code presentation:
>107498112
--Teto (my beloved):
>107493702 >107493811

►Recent Highlight Posts from the Previous Thread: >>107493614

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
Damn no replies?
>>107503935dead hobby, dead general. you can only code, never coom
>>107503935
Not much to talk about anon. Wish we could have gotten a cool mistral model.
I still run cydonia drummer slop. How long has it been now? A year or something? We are in a deep winter.
Maybe there will be something of a zimage moment for text. Huge bloated moe models and benchmaxxed tiny dense models.
>>107503935Busy vibecoding my text RPG frontend.Sorry.I'd ask about memory systems and such but last time I asked I got no replies, so whatever.
I was thinking...Maybe the "not x, but y" is a fundamental mechanism of how LLMs reason? Maybe it needs to frame the contrast between two opposing ideas in a consistent way for the attention mechanism to latch onto the semantic structure of these concepts.
>>107504003Learn how to get off on well written code.
I'm looking to upgrade my PC, but GPU prices are going to hell again. Can I run anything good on 16GB VRAM? I don't mind splitting to regular RAM, even ~10t/s is fine for me.
Usecase: mainly narration, not chat
>>107498708
Nobody cares what is supported *right now*. All that matters is that models keep progressing, even if we only get the weights and can only use them through API for two years until somebody manages to vibecode support at 0.1tk/s. Model support can eventually be improved by the community; the weights are what matters. Skill will come, eventually. Compute will not. The community isn't going to put together a dataset and the massive amount of compute required to train a SOTA model.
All that matters is local not stagnating, and local compute continuing to come down in price (this is the most important part, since we already have "good enough" models, they are just insanely expensive to run).
I can wait two years. I can wait 5 years. I plan to keep coping with LLMs for the rest of my life, so I have all the time in the world. The younger you are, the more this applies to you.
>>107504123anon, models aren't progressing.
>>107504040This slop was never a problem until Gemini started dropping it three times every paragraph and all the chink models started training on that
>>107504134
Correct. And that's much more concerning than whatever Deepseek finetune of the week is not currently supported by llama.cpp.
Ask chatgpt to type a seahorse emoji.Do it, it's funny.
>>107504151>>107504134I thought we would have at least decent audio in and out models.Zucc jewed us all with his cool "multiodal omni" llama4 release.Only qwen experimented with that and didnt get total braindamage.But the voice just fucking kills me.Its very recognizable though. Some guy made it profitable and put that in his game game where you rescue a girl with dynamically created content. Good on him.Not sure why nobody else takes some risks and experiments a bit. Its all so stale now.
>>107504141Maybe. But intuitively, it feels right that the more structure you add to the text, the easier it would be for an LLM to parse it and increase its accuracy when solving tasks with an objectively correct answer while decreasing the quality from a subjective human perspective. The same would apply to markdown, which gives structure to the text. Some anon the other day gave it a name - "anchor tokens".
>>107504182ChatGPT's own voice mode is shit, so meh. Just pipe the text to a suitable TTS engine and instruct the LLM to add annotations.
>>107503228gemmysars... is it time for vishnu christmas???
>>107504040Actually nice hypothesis. Take a look at "Tensor product attention is all you need", but I think it's more of a problem with synthetic data not attention itself
What happened to the rumoured 4090 96gb cards? I'd love to buy one right now.
>>107504238its called a blackwell pro
>>107504220>Just pipe the text to a suitable TTS engine Its what I do for my kids. But its just not the same from the input and output.All the details that get lost. Ah well, 2 more weeks I suppose.
>>107504252can you buy me one?
>>107504252I'm willing to trust the chinks to save 2k.
>>107504238You don't want the chink virus anon.How about 32gb for 600 watt instead?
>>107504238Chinks grew too smart and stopped selling them online to randos. It's all in-person under the table trade deals only.
>>107504260I could but I don't want to.
>>107504297what if I gigve you this juicy migu
>>107504323a-anon what part of that is juicy?
W-well its something. Fucking devstral man.
>>107499806
>Don't you have enough A30B MoEs to play with? The whole appeal is in being the first new big dense model we've gotten in over a year.
We got a 111B Dense from Cohere in March.
>>107500577
>I want dense-MoE models with high active parameters.
>Why no 60BA30B?
Here’s a 140BA70: https://huggingface.co/NeverSleep/MiquMaid-v2-2x70B-DPO
how is local chatbot more dead than local image gen on /g/?
>>107504335what about this
>>107504350Lmao. That's hilarious. What was the prompt? Dating simulator?
>>107504363>https://huggingface.co/NeverSleep/MiquMaid-v2-2x70B-DPONTA but>lol>lmao
>>107504383Traveling to meet up and discuss is difficult during Winter.
Calm before the Big Fat Gemma
>>107504383Why would you expect the consumer electronics and youtuber board to care about local chatbots?
>>107504383At least this one didn't stop being baked for hours.
>>107503699I thought this was Nijika looking at Kita and Ryou in the window
>>107504414it was the opposite popularity
>>107504445No, hosted models are popular. That's why /aicg/ exists and is more active.
>>107504452you are just delusional
>>107504396Yes dating sim generator.I explicitly told it to make a svg of the girl. Big models do that by default though.The older Qwen3 32b in comparison. (devstral is 100b bigger).
Does context take a lot of space with Devstral 2?
I tried GLM 9B and it repeats so much, it can't write a single coherent reply. I don't know what I expected either.
>>107504390nice
>>107504617wtf did u just steal my migu??
New Unsloth update just dropped
https://docs.unsloth.ai/new/3x-faster-training-packing
>>107504602Even a 9B model should be at least coherent.That sounds like something is broken.
>>107504602I wonder what caused this. Even Intellect 3 which trained on Air's pretrain had repetition. Maybe something about their pretraining technique or dataset handling is fucked up?
>>107504641
>>107504666
Yeah, I really wonder what causes these models that are post-trained on tens to hundreds of billions of tokens of recursive CoT loops to repeat themselves. Maybe we need to add more synthetic thinkslop.
>>107504666All of z-ai's models have repetition issues (in the sense of infinite loops, not in the sense of repetitive but passable prose). Even the ones that are otherwise excellent. I think the other labs are doing something specifically to quell the repetition, which they aren't doing.
>>107504003
>dead hobby, dead general. you can only code, never coom
local models are so dead even Zuck is killing Llama
it's a shame, but I feel like the hardware issue is killing any viability of running any kind of decent model locally nowadays.
>>107504736I don't know anon, I think the anti-cot/anti-synthetic-slop hateboner is mostly just people coping about not being able to run the biggest models.Even some pre-RL monster like llama 405b is mediocre against a modern 100b cot moe.
>>107504771We've known about this since Altman's dick buddy took over Meta's AI division though
>>107504800>Even some pre-RL monster like llama 405b is mediocre against a modern 100b cot moe.And you would know, having extensively used and tested 405b, right?
tourist here, will fuck off in a moment
is there any news about llama 5?
With how badly things have slowed down in the general LLM department and not just local, I think we're lucky if we get even some improvements at all in the next year.
>>107504926>and not just localtrue i guess, but gemini3 is really good. and i hated 2.5.no more claude at work.
>>107504924as soon as sam altfags asian roombuddy creates the datasets. i would say about 2 more weeks. if you are lucky that means before the end of the year!
>>107504926
Incidentally, Tim Dettmers posted this today:
Why AGI Will Not Happen
https://timdettmers.com/2025/12/10/why-agi-will-not-happen/
>>107504977What models can I run on this Mikubox?
>>107504924>is there any news about llama 5?They disbanded the llama team and abandoned open source. No more llamas. Latest rumor is that their new api-only model is called avocado and sucks just like the llamas did.
>>107505075rip... thanks.
>>107504841
Do you? Or are we arguing hypotheticals here?
I've used it enough through the API and on llama.cpp running on a VPS to know that it's nothing special. It felt right at first, even better than modern models maybe, but it was probably placebo. Once the context grows it starts making gross mistakes and the magic is gone. If it actually were better /aicg/ would be all over it, but no, nobody uses that ancient model. They use GLM, Kimi, etc. (besides the proprietary models, obviously).
Even when it came out it was only supposed to somewhat compete with Claude 3.5. Do you think Claude 3.5 is better than all the open source models we have today?
Maybe it could be better and it's only mediocre because it's undertrained, I'm not sure. But training a behemoth like that takes money, and if companies aren't willing to spend on the compute necessary to train it, then what's the point? Not to mention it would have to be much better than similarly sized MoEs for most people to bother running it locally.
>>107504688you wouldn't just steal a migu??!truly devious
>(12/04) Microsoft releases VibeVoice-Realtime-0.5B: https://hf.co/microsoft/VibeVoice-Realtime-0.5B
Anyone figure out how to create embeddings? I know their github said some bullshit about contacting them directly to create one. It also said something about safety; has the quality been reduced for this official "open source" release?
>>107504800Considering that the whole point of the neural network is to pick up patterns in text, If you feed it repetitious patterns of text it will become repetitious. This isn't rocket science.There's also this thing called the law of conservation. You know. The most basic and fundamental principal of the universe. Most people naturally understand it by the age of 6 months old unless they are of non european lineage. Everything an LLM can output is picked up during the training process. Even "in-context learning" comes from deterministic engrams that were formed during training. It doesn't matter if the end non-thinking context is only 500 tokens. there's still 2000, 3000, even 10000+ tokens worth of engram now devoted to answering stupid fucking questions that can no longer be used to connect more distant concepts together. The model literally gets dumber, significantly fucking dumber, Orders of magnitude dumber, just so that it can get better at answering benchmark shaped prompts. The only good thing thinking models have brought, which only the bigass T scale models can handle well, is tool usage during thinking in order to bring in additional relevant context from outside (i.e. web searching, etc). 'hybrid reasoning' models are garbage for the aforementioned reason. And even when a separate non-reasoning model is provided the pretraining data is still salted with synthetic think-slop trash to help the reasoning version in post-training. CoT-tards have fucking destroyed an entire scientific frontier with their benchmaxxing bullshit. You are evil in a fucking biblical sense not just stupid.
>>107504985Thanks for the enjoyable read.>In short, AGI should include physical robots or machines that are able to do economically meaningful work in the physical worldWe have these already, they’re called wageslaves. Fussing around with robotics is wasteful purityfagging.
Anyone try GLM-TTS yet?
>>107505075The hilarious thing is that the market is so saturated with closed slop there's literally no demand for yet another closed API model on top of the fuckton we already have. Even the top competitors are currently hemorrhaging to death
Great news for the Deepseek 3.2-Speciale enjoyers here. The guy who tried to vibecode the 3.2 support, realized that LLMs write bad CUDA code, and then started teaching himself has now switched to just porting code from vllm.
We might see support for 3.2 a little quicker now.
>>107504800more synthetic slop has only dumbed down models and made them worse for nuance/generalization since they're more geared towards benchmaxxing nowyou're so fucking clueless
>>107505427At least he's learning, just wish he wasn't holding up the issue to do so
>>107505461They're separate problems. Long context performance and general intelligence at the context it works at. I agree that it's the training data incest and not even the COT itself. Plus continually filtered pretrains that remove more and more except synthetic STEM garbage. Pretty soon models will do calculus but won't know underpants don't go on your head. This is the future these faggots want. They simply lack souls and will never create true intelligence.It's why their movies, their music, their products all suck fat cocks. LLMs are no different.
>>107505320
I don't think anyone but Zuck thinks the superintelligence labs thing will be a success. They're going to go from having the best open models to the worst closed models.
To be fair, there is untapped demand if they cared to pivot to a storytelling and roleplay niche, but they won't.
How is 4.6V for just text, compared to 4.5 Air? They have warnings about it being crap on the HF model card, but I'm curious if anyone's tried it.
>>107504231
Ask the Japanese AMA "Gemma 4どこ" ("where's Gemma 4")
>>107505753Drastically worse.
>>107504985Every once in a while I see a post like this, and I'm reminded there's a small but steady stream of tourists on this thread. I can't help but wonder how you ended up here, or what people like you imagine this thread is about
>>107505207
You don't understand. RL through CoT is the only tool we have right now for general, Turing-complete inductive program synthesis. And by that I don't mean running Python programs in a little sandbox. I mean tool calling through reinforcement learning is the only way for a machine learned model to learn to use an arbitrary amount of tape at runtime to solve general computable problems from a set of input/output examples.
A standard Transformer pass is an O(1) bounded-depth circuit. It physically cannot solve problems that require more serial logical steps than the model has layers, no matter how much "conservation" or data you preach about. Traditional machine learning cannot do this because all the memory in a neural network is bounded. SFT tries to force the model to compress complex reasoning into that fixed circuit depth, which inevitably leads to hallucination when the logic tree is deeper than the layer count. You can NOT teach a neural network (or any other ML model) to use a scratchpad (and ask for as much memory as it needs during runtime, like a Turing machine) to solve problems without reinforcement learning just from a set of input/outputs in a dataset (assuming the dataset is big and diverse enough for the learned solution to generalize, of course).
CoT decouples compute from depth by unrolling the computation loop into the token sequence, effectively multiplying the number of steps the model can perform for a given layer count.
Even Graves' Neural Turing Machine and Sutskever's Neural GPU are not Turing machines in the true sense of the word, because neural TMs can NOT learn to ask for more memory at runtime. The amount of scratchpad memory in an NTM is fixed at initialization, like a computer with no network access or slots to plug in external storage.
That's why RL on LLMs is the only plausible way we have of algorithmically synthesizing truly general algorithms merely from a dataset of input/output pairs.
I like how "creative" the model can be in response to very high temperature, but at the same time it is useless ofc because it is totally retarded It's a shame we can't have both of these features at the same time.
>>107505894nsigma
>>107505888model and prompt?
Is it possible to train a local model with a few hundred images and only 16GB of vram?
>>107505461
See >>107505888
CoT models are theoretically the best architecture we have right now for general, machine learned computation.
There is a difference between training on synthetic TARGETS during CoT RL, by which you are teaching the network to replicate the synthetic targets, and training on synthetic CHAINS OF THOUGHT with the original human written (or at least, human curated) targets, by which you are teaching the model merely to predict the same text as you would with traditional SFT, but with higher accuracy, and it's also much more sample efficient (i.e. generalizes much better, the tradeoff being that it's also much more computationally demanding per sample).
Then there is also RLVR, which helps with tasks like code generation or math but for all we know *might* hurt performance on tasks like creative writing.
But it's not nearly, not remotely as black or white as you guys are making it out to be.
>>107505682
>To be fair, there is untapped demand if they cared to pivot to a storytelling and roleplay niche, but they won't.
They did mention last summer that they plan to focus more on entertainment than creating benchmark monsters.
https://techwireasia.com/2025/08/meta-shifts-ai-strategy-toward-personal-superintelligence-and-user-engagement/
>In his “Personal Superintelligence Manifesto,” Zuckerberg predicts that as AI boosts productivity, people will spend less time using productivity software and more time on creative and social activities. He envisions an AI that understands each user, their goals, and how to help them achieve them. While companies like OpenAI, Google, and Anthropic aim to build AI systems that take over more work, Meta wants to use AI to help fill the extra time people gain from increased productivity.
>Chris Cox, Meta’s chief product officer, told employees at an all-hands meeting last month that the company will concentrate its AI efforts on entertainment, social connections, and lifestyle features rather than productivity. Heath expects this could lead to AI-powered changes to Meta’s content recommendations, ad targeting, and Reels video generation, along with interactive AI characters designed to keep users engaged longer.
>>107505894high temperature is cope for shitty modelsif it's not already creative enough at 1 temp it sucks
>>107505920
>>107505804I'll keep hoping for 4.6 Air then.
>local plateaus with gemmy 3>closed plateaus with gemini 3
>>107506006It will be released 24*14 hours from now.
>>107506029Half Life 3 confirmed
>>107505950They said a lot of things about their goals for Llama 4 too.
>>107505929
>Improve my argument for CoT to the 4chan guy who's questioning the necessity of reinforcement learning on LLMs:
>(original post here)
>You don't understand. RL through CoT is the only tool we have right now for general, Turing-complete inductive program synthesis. And by that I don't mean running Python programs in a little sandbox.
>I mean tool calling through [...this part is the same, truncating due to post character limit...]
>Traditional machine learning cannot do this because all the memory in a neural network is bounded. You can NOT teach a neural network (or any other ML model) to use a scratchpad (and ask for as much memory as it needs during runtime, like a Turing machine) to solve problems without reinforcement learning.
>Even Sutskever's "Neural GPU" is not a true Turing machine in the truest sense of the word. Because neural GPUs can NOT learn to ask for more memory at runtime. The amount of scratchpad memory on a neural GPU is fixed during initialization at runtime, like a computer with no network access or slots to plug in external storage.
>That's why RL on LLMs is the only plausible way we have of synthesizing truly general algorithms merely from a dataset of examples.
Then a few back and forths discussing things (with Gemini 3, but I'm not sure how much that mattered). It mostly just added the point about circuit depth, which I already knew of as logical depth but didn't think about when writing the response.
Then it also added a clarification where the statement was overly general, clarifying that it only applied to learning from input/output pairs, not by imitation.
And I also added Graves' NTM for completeness, even though the Neural GPU is supposed to be strictly better, which is why I hadn't bothered to add it to the original.
All things considered, I think you would get a vastly lower quality response if you didn't know what you were talking about well enough to produce a convincing draft by yourself.
>>107505894
I said it in aicg, but models need a temperature setting that excludes the 1000 most common tokens (or something like that).
I don't know if anybody has ever tried something like that; it's clearly impossible to do with corpomodels, but local should be able to. Maybe?
>>107505950Meta's idea of entertainment is the sanitized, business/advertiser friendly shit you find in the metaverse, not the kind of entertainment coomers and people trying to escape reality are looking for
>>107506146It should be trivial to code a sampler that does that. Maybe somebody has already done it even. Sounds like a fairly common sense idea to try.
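Not aware of an existing implementation, but here's a minimal sketch of the idea, assuming raw access to next-token logits and a precomputed list of the model's most frequent token ids (everything here is illustrative, not any backend's real API):

```python
import numpy as np

def sample_excluding_common(logits, common_ids, temperature=1.0):
    """Sample a next token after masking out a fixed set of 'too common' token ids.

    logits: 1-D array of raw next-token logits from the model.
    common_ids: ids of e.g. the 1000 most frequent tokens measured over some corpus
        (hypothetical input -- you'd have to build this list yourself).
    """
    masked = np.asarray(logits, dtype=np.float64).copy()
    masked[list(common_ids)] = -np.inf                 # hard-ban the common tokens
    probs = np.exp((masked - masked.max()) / temperature)
    probs /= probs.sum()
    return int(np.random.choice(len(probs), p=probs))
```

In practice you'd probably want to downweight instead of hard-ban, or only apply it past some context depth, since the most frequent ids include punctuation and EOS and banning those outright would break the model.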
>>107506149Right, Meta's focused only on scenarios like chatting with pregnant black trans feminist lesbians.
>>107506149https://www.wired.com/story/meta-lawsuit-strike-3-porn-copyright-ai/>Meta Accused of Torrenting Porn to Advance Its Goal of AI ‘Superintelligence’I wonder...
>>107506250>there is a timeline in which you were seeded pirated tranny porn straight from meta's serversI didn't know the future was so bright
would it be possible to do a merge with miqu and llama 3.3 70b?
>>107506422>merges>in 2026-21 daysstop it.
>>107506146That is literally XTC except it does it at random. Someone also made a max_p on faggit but it was transformers only and went nowhere.
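For reference, a rough sketch of what XTC does as I understand it (with some probability, drop every "top choice" above a threshold except the least likely of them); this is an illustration of the idea, not the reference implementation:

```python
import numpy as np

def xtc_filter(probs, threshold=0.1, xtc_probability=0.5, rng=np.random):
    """Exclude-top-choices sketch: when triggered, remove all tokens whose
    probability exceeds the threshold except the weakest of them, forcing the
    model off its most predictable continuations."""
    probs = probs.copy()
    if rng.random() >= xtc_probability:
        return probs                                  # sampler not triggered this step
    above = np.where(probs >= threshold)[0]
    if len(above) > 1:
        keep = above[np.argmin(probs[above])]         # keep only the weakest "top choice"
        drop = above[above != keep]
        probs[drop] = 0.0
        probs /= probs.sum()                          # renormalize what's left
    return probs
```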
>>107506439no
Are you guys looking forward to the fresh new wave of TheDrummer™ finetunes that are going to drop thanks to Devstral?I love dense models :)
>Devstrals are almost exactly as horny as Nemo
>>107506610Yes, it was horny but repetitive. large3 told me I was too pussy to neck myself. The censorship with mistral isn't so much the problem. It's the rest.
>>107506422I think they use different vocabs, so no.
>>107506610>I can't help but feel>I can't help but wonder>I can't believe it>I can't help but feel>I can't help but smileSure are Mistral models, alright.
>>107506422They're two entirely different families of model so of course not. You can merge it with llama2-chat or Platypus2 if you want. Maybe Xwin or the original Euryale if you're daring.
>>107506699Have people tried making model ensembles of different families? I guess you could restrict the tokenizer and lm-head to only the tokens in common between the two models, and then average the logits...
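Not aware of anyone shipping this, but a minimal sketch of the logit-averaging idea, under the assumption that you can get per-step next-token logits from both models and that each tokenizer exposes an id-to-string mapping (all names here are illustrative):

```python
import numpy as np

def shared_vocab(tok_a, tok_b):
    """tok_a, tok_b: dicts mapping token id -> token string for each model.
    Returns {token_string: (id_in_a, id_in_b)} for strings both vocabs contain."""
    inv_b = {s: i for i, s in tok_b.items()}
    return {s: (i, inv_b[s]) for i, s in tok_a.items() if s in inv_b}

def ensemble_step(logits_a, logits_b, shared):
    """Greedily pick the shared token string with the highest averaged probability."""
    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()
    pa, pb = softmax(np.asarray(logits_a)), softmax(np.asarray(logits_b))
    scores = {s: 0.5 * (pa[ia] + pb[ib]) for s, (ia, ib) in shared.items()}
    return max(scores, key=scores.get)
```

A real attempt would also have to handle tokens that only exist in one vocab and keep the two models fed with identical text as they desynchronize on multi-token words, which is probably why nobody bothers.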
>>107506699
>They're two entirely different families of model
The entire llama family is a barely modified GPT-2 architecture. Their biggest change was adding GQA for Llama 2.
Devstral 2 seems to be as dumb as the small one for agentic coding...
I've pretty much only been local image genning recently, is there anything I've missed out on the last half year or so? For general queries gemini is pretty good now and for NSFW stuff grok is shockingly degenerate to me, am I missing out on anything with my pleb 16 gb vram?
I didn't really like Ministral that much, but have been fiddling around with the Brother-Dusk-14B (v1c) finetune a bit. Doesn't give as many refusals as the base model and isn't too bad with repetition. Only really interested in a smaller/faster model than Cydonia 24B since I'm stuck with a 12GB 3080. Cydonia is still fine if I don't mind a slightly longer wait.
>>107507376No worthwhile models at all, only the meager comfort of knowing Elon won't learn of your diaper fetish
>>107506607>>107506610Devstral 123B doesn't write well though. And it breaks down too easily in RP. Hopefully those can be tuned out.>>107507382v1b was a better attempt than v1c. Try that out! (It's still crap)
>>107504736
I haven't tested any GLM base models, but I have tested past bases and they fall into repetition very fast. In some cases worse than their Instruct versions. It's not just a post-training issue. And before you say it's because of synthetic data, I also mean super early models. So post-training actually can reduce repetition if done right, but I haven't seen anyone post exactly what labs did to reduce repetition over the base model. Or perhaps fine-tuning inherently reduces repetition if the post-training data has been sufficiently varied.
>>107507512Okay, thanks for your response I wasn't really getting my hopes up but still too bad
>>107507382>Sponsored postI'm not sure how you got Ministral to refuse anything unless you were trying with an empty prompt.
>>107507567I don't believe you tested anything. Go be jewish somewhere else.
>wanted a fart lora for a joke/meme>found out all the fetish people got them all banned from every site ever>the only sites that have any don't allow downloads and are super shadygoddamn it
>>107507550Thanks, chef.>>107507676To be fair, I do enjoy testing out refusal rates on empty prompts just to see how they go. But even with some cards/prompts, it was still being occasionally pissy about ethics with me. Could be worse I suppose.
>>107507567I think the looping issue of base models might be due both to them being pretrained largely on very short documents and not being strongly conditioned (i.e. overfit) to follow a specific response pattern, so they'll just start repeating prior context when they're unsure of how to continue.
My main issue with GLM air 4.5 is how it re-quotes or repeats what I said in a RP situation. Other than that its a good RP model, still waiting 2 weeks for 4.6Example:"Why don't you kill yourself" Anon saidWhen Bot heard anon say "Why don't you kill yourself" it blah blah blahInstead of just responding, its like it would summarize speech dialogue and then respond to it. Is that what you all mean when you say it has parroting issues? It was so hard to force it with strict prompting to stop this crap, and it still does it occasionally. It certainly feels like a thinking issue, because I don't use thinking when rp'ing. God I hate thinking, hybrid or not. On a more positive note, that kind of crap taught me to use OOC messages to try to solve the problem with it. I would pause the roleplay and ask it in OOC why it does that and what it would define that as, then tell it to take my prompt and revise it to get it to stop doing that, then with a bit of editing I would have an improved prompt, I actually learned a pretty fair amount on how LLM's really respond to prompting, what works and what doesn't, the types of phrasing and definitions they use, etc.
>>107505075
lol they're distilling Gemma, toss and Qwen
https://www.bloomberg.com/news/articles/2025-12-10/inside-meta-s-pivot-from-open-source-to-money-making-ai-model
>>107507974What line did you use to get it to stop quoting itself? I'm sick of the repetition issues as well.
what are the best sampler settings
>>107508090
Everything neutral
Temperature at 1.0
>>1075080903
Is there still no native tool to convert safetensors into a gguf? Is it all just reliant on transformers for everything?
>>107507974yes, now imagine this in a back and forth chat because it does it too...
>>107507974> actually learned a pretty fair amount on how LLM's really respond to promptingfucking THISeverybody should do this, i'm tired of explaining how important a well crafted system prompt is. nice job anon.
>>107508079holy based, imagine the pvre concentrated slop this thing will produce. my eyes are glinting with anticipation already
>>107508090
for nemo or 24b tunes: temp 1-1.2, smooth factor 0.2, topk 50-150 (optional minp 0.007-0.03), all other samplers off (optional rep pen or dry rep, whatever)
alternatively: neutralize samplers -> temp 1 (or above), nsigma 1 (or above), and optional topk
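If you want to try the "neutralize everything, temp ~1, small min-p" variant against a local llama.cpp server, something like this should do it. Parameter names follow llama.cpp's /completion HTTP API; other backends name them differently, and samplers like smoothing factor or nsigma depend on what your backend exposes, so they're not shown:

```python
import requests

# Sketch of a request with mostly-neutral samplers; values mirror the settings above.
payload = {
    "prompt": "Once upon a time",
    "n_predict": 200,
    "temperature": 1.0,
    "top_k": 100,          # optional cap
    "top_p": 1.0,          # disabled
    "min_p": 0.02,         # small min-p instead of top-p
    "repeat_penalty": 1.0, # off
}
resp = requests.post("http://127.0.0.1:8080/completion", json=payload)
print(resp.json()["content"])
```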
>>107507974Care to share your prompt?
>>107508213gudpic>learned a pretty fair amountYes! everyone should be looking carefully at the exact tokens going into their f(prompt)=next prediction model. Many complainers don't know what's actually happening in their inference stack
Is there any better model than gpt-oss:20b for agentic use cases with 16GB VRAM?
>>107508640you could always hire a pajeet instead
>>107508079
The article has many interesting tidbits. And LeCun was basically kicked out?
>One new model, codenamed Avocado, is expected to debut sometime next spring, and may be launched as a “closed” model — one that can be tightly controlled and that Meta can sell access to, according to people familiar with the matter, who declined to speak publicly about internal plans.
>The TBD group is using several third-party models as part of the training process for Avocado, distilling from rival models including Google’s Gemma, OpenAI’s gpt-oss and Qwen, a model from the Chinese tech giant Alibaba Group Holding Ltd., the people said.
>Meanwhile, Meta has de-prioritized its open-source strategy. Some Meta employees were directed by leadership to stop talking publicly about open-source and Llama products after the Llama 4 launch while the company recalibrated whether those efforts still made sense moving forward, according to people familiar with the moves.
>Yann LeCun, known as one of the godfathers of AI, recently left the company after years leading Meta’s long-term AI research group, in part because of frustrations that he couldn’t get enough resources, Bloomberg News reported. Prior to his departure, some employees had been encouraged to keep LeCun, who was a big proponent of open-source technology, out of the spotlight, including at public speaking events, the people said. Meta no longer saw him as emblematic of the company’s AI strategy, and couldn’t trust that he’d stay on message, they added.
>The model after Llama 4 had the internal code name Behemoth — but Zuckerberg was disappointed in its direction and scrapped it in pursuit of something new, the people said.
>>107508775I will stick with gpt-oss:20b
what's the goto vramlet model nowadays?
>>107508993suicide
>>107508851
>distilling from rival models including Google’s Gemma, OpenAI’s gpt-oss and Qwen
Imagine spending a billion dollars poaching employees from your competitors and the best they can come up with is copying your competitors' free offerings.
Then imagine believing them when they tell you everything is going great.
>>107508851>next spring meta will have a closed gpt oss alternativemetabros we are so back
>>107508354Temp=1 is way too high for mistral models
>>107509122skill issue
>>107509243Yes, you certainly have one.
>>107509122for the new devstrals it says to use 0.15
>>107509245So what you're saying is, it is I who certainly has a skill issue?A temperature of 1 is way too high for Mistral models, huh?
>>107509249Pretty sure they said the same thing with Small, which is appropriate for assistant-type tasks. For RP/creative, they're fine up to 0.6-0.7. But 1.0 is going to make them significantly dumber.
>>107509262GLM pls go
>>107508851>distilling a distillThat's how you get model collapse.
>>107509292benchmark leaderboards say otherwise :^)
>>107509295You don't need to distill a distill to get on top of leaderboards, you just need to train on the dataset, which is easily rephrased.
>>107509292Worked for Mistral.
Stealstral
Misteal
>>107505888>CoT is the only tool we have right now for general, Turing-complete inductive program synthesisfalse
>>107509447>>107509452>>107509366What's chang gonna do about it?
>>107509462Chang isn't going to interrupt his opponent while he is in the middle of making a mistake
>>107509485
Mistral AI is 2 years old and is currently worth over 14 billion USD. Why aren't you making mistakes like that?
>>107509507Yes and they are producing dogshit. Next you're going to say Meta makes good models because they're valued at 1.6 trillion dollars.
>>107509571Nemo is still the gold standard of small models and china's only notable model, DS-R1, has been completely forgotten about now that the honeymoon period's over.
>>107509454Well, I mean, it's the only practical choice. Theoretically you can do Levin search over all possible Turing machines to find the optimal solution but that shit doesn't work in real life. And yes, because of the halting problem you won't actually know if you found the actual optimal solution or there is an even more optimal one if you keep searching. But if you search long enough you are guaranteed to find it. You just wont know if you got there already.
>>107509598>DS-R1, has been completely forgottenWe live in the era powered by DeepSeek. All the best models we have were directly influenced by it.
>>107505942Yes, if you are patient
>>107509598Mistral hasn't made anything good since Nemo, and there have been at least two other chinese models that are notable, Kimi and GLM. I don't even like the chink models but you have to stop sucking french dick, it's unbecoming.
>>107509598/Wait/ing for another two weeks
>>107509643>two other chinese models that are notable, Kimi and GLMlmao, everyone who actually has the hardware for those two got sick of them after a week.
>>107509676are you one of those people? poast nvidia-smi
>>107509676If you say so.
>>107506422won't you just get a cucked model then?i'd rather see a miqu retrain since it's an actually good base
>>107509271This is lmg so you know how temp works right? A lower one just makes it choose more likely tokens. It's not gonna make it stop repeating or any other undesirable behaviors.
Alien's first contact with humanity will be through an LLM
>>107509605what, no. that's not what I'm talking about loool im talking about the dot by dot paper. no need to snipe me with formal logic.
>>107509750Yes, and likely tokens are more likely for a reason. If you increase temp too much then you're just going to see the model make more weird choices and mistakes, it's not going to magically make it a better writer.
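To put numbers on it: temperature just rescales the logits before the softmax, so a toy example shows what moving it actually does (a minimal sketch, values approximate):

```python
import numpy as np

def softmax_with_temp(logits, t):
    z = np.asarray(logits, dtype=float) / t
    e = np.exp(z - z.max())
    return e / e.sum()

logits = [4.0, 3.0, 1.0]                 # toy next-token logits
print(softmax_with_temp(logits, 1.0))    # ~[0.71, 0.26, 0.04]  the distribution as trained
print(softmax_with_temp(logits, 0.5))    # ~[0.88, 0.12, 0.002] sharper, safer, more repetitive
print(softmax_with_temp(logits, 1.5))    # ~[0.61, 0.31, 0.08]  flatter, more random, more mistakes
```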
>>107509757>nanoGPT>H100Isn't this something you can train on your CPU in 5 minutes?
>>107509676Kimifag here, no I haven't.
>>107509757Aliens first contact with humanity will be an intercepted communication between /aicg/ gooners and a datacenters in space.
>>107509824And the first message will be>ayy ayy mistress
>pp 440.83 tokens per second
>eval time 50 tokens 8.04 tokens per second
Is this just a Devstral thing, why is the pp high while inference is glacial
>>107509757>Carl Sagan proposes that Hitler's opening speech at the 1936 Olympics may be the first signal aliens encounter from Earth.
heh pp
>>107509366Do you hear universal praise for new mistral? Is it on top of lmarena?
>>107509926>Is it on top of lmarena?Do we still seriously take that as a sign of quality?
>>107506610I experienced the repetition issue with Devstral 2 24B. After 15 messages or so it didn't seem to know how to continue and just repeated its previous message. I had to switch to Ministral, which appears to be moderately competent for ERP when continuing a conversation.Ministral 3 14B has to be one of the horniest and dirtiest official instruct finetunes I've ever seen; unfortunately it's not capable of engaging one from scratch without adding asterisks everywhere or being too retarded in the first few messages.It's all so tiresome.
>>107509945Yes. Say what you want about lmarena, but it's the most difficult benchmark to game.
>>107505505How is he holding it up? Anyone else can swoop in and complete it, or help him.Are you not doing it because he is?
>>107509981You are absolutely right! :rocket: :rocket: :moon: :finger_point: :glasses:
>>107509981lmao
So many little fishy today. No need for any bait
>>107509981llama 4.
>>107510006What's the issue? Behemoth(which they called experimental maverick) was not bad and I'd be happy to use it if it was open.
>>10751000610-15% of the total votes must have been from me alone as I couldn't believe what sort of deranged responses the models were giving compared to most other ones there, a good fraction being cunny-related queries. I feel responsible for that, in a way.
>>107509784elara and shivers are top tokens.>1.0 temp>too muchThat's literally the distribution as it was trained.
>>107510149Mistral somehow fucks this up and their 1.0 is around 0.4-0.6
>>107510171Its literally not possible. That's post training revisionism. I tried lower temps with large3 anyways and it didn't do shit.
>>107506029can't wait until sam btfos everyone and gets rewarded more data centers
>>107510203Can I grill some RAM?
>>107510234Why not just RAM a grill?
What are everyone's go-to models right now for their hardware?
I have 96gb vram (4x3090) and my current stuff is:
devstral 2 for agent coding (based mistral)
gpt-oss-120b-derestricted for erp and normal coding questions (i.e. "gimme some python that does x")
midnight-miku-70b for erp
qwen3-VL-32B-Thinking-abliterated for nsfw captioning
z-image and an illustrious finetune for nsfw image gen
Previously liked gemma 3 and qwq when I only had 2 3090s.
>>107510378 (Me)gemma 3 abliterated that is. I still go back sometimes if I want to erp with a brain damaged slut
24GB VRAM (RTX 3090), 32GB RAM
Cydonia 24b v4.3 for coom
Gemma 27b for general assistant tasks, non-erotic creative stuff
Qwen3-VL-30B-A3B on the rare occasion I want to use vision stuff. I use Q6_K_L with partial RAM offload.
To be honest if I need some quick coding/scripting shit done I just use chatgpt free. Faster and far smarter than anything I can run locally.
>>107510378>glm 4.5 air ignoredas it should be
I gave up on big models and now use 24B, at least it doesn’t piss me off when I don’t have to wait at 3T/s. I’ll wait until we get models that are actually worth running on serious hardware. 24B is dumb, but there are no not-dumb models, only less dumb ones, and it’s not worth the wait
The whole year has been such a disappointment
>>107510550YWNBAW
>>107510203does sam have general intelligence ?
Unfortunately I have an AMD GPU and setting this shit up is a pain in the ass, nothing works properly.
>>107510581And that's the most disappointing part!
>>107510591LOL KEK LMAO XDD
>>107510591There may be a correlation between buying AMD and technical incompetence, but it's not the hardware's fault. Textgen especially is dead simple with AMD, just use kobold if you're a brainlet and select Vulkan as the backend.
>>107510581Thanks God for that (I'm a conservative)
>>107510591I'm running everything just fine with llama.cpp, a 7900 XTX, and 128GB RAM
>>107510416>Gemma 27b for general assistant tasks, non-erotic creative stuff>Qwen3-VL-30B-A3B on the rare occasion I want to use vision stuffGemma-3-27B is also able to do "vision stuff" and is quite good. Save you swapping models.
>>107510637I'm very aware of that, but in my experience the new Qwen3-VLs are much, much better at vision. Even the 4B Qwen was catching details that Gemma 27b missed. And for the most part I'm not doing vision stuff in the middle of a chat or anything, so there's not much switching going on.
>>107510591Install Linux, unironically
>>107510586He's a literal faggot, raising a child, and just told Jimmy Falon he doesn't understand how people can raise children without ChatGPT, so no.
>>107510656NTA but there's really nothing wrong with AMD drivers on windows
Checking in after 3-4 months. Been using Kimi K2. New best local model?
>>107510608
Ok I admit, it's partially a skill issue, I don't have the patience to configure everything down to the smallest detail. On the other hand, NVIDIA looks plug and play, just download the github repo and use it; AMD requires time and the instructions aren't clear about what you have to do.
>>107510681It was Nemo then, and it's Nemo now.
>>107510689Are you confusing this with the image gen general? Because on the textgen side it's also plug and play. AMD is a bit slower but it's not any different to get it running.
>>107510671kek, i already knew the answerwhat a faggot this guy is
>>107510705Yes, i messed up.
https://www.reddit.com/r/StableDiffusion/comments/1pj8evi/the_stop_button_is_gone_after_the_latest_comfyui/Sirs what is going on with ComfyUI? Did that guy get a shitton of money from BlackRock to enshittify it?
>>107510536samebut 70B for me
>>107510817THIS ISN'T THE IMAGE GEN GENERAL YOU DUMB NIGGER
>>107510830Seethe silently.
>The term "nagger" has racial connotations tied to historical oppression and is considered offensive regardless of context. Use neutral language like "complainer."
>>107510830Comfy is localhttps://files.catbox.moe/ouxnmk.mp4
>>107510887the should have called it local llm general
>>107510887>vidcringe soilord garbage>picsex
>>107510894kuroko is indeed sexhttps://files.catbox.moe/3gyc4g.mp4
>>107510887That cuck behavior should be studied
>>107510654>Qwen 4B better than Gemma 27BBaits used to be believable
>>107510922For VISION, niggerVISIONGemma 27b's vision encoder isn't 6x bigger than qwen's 4b's. And Gemma 3 is 9 months old at this point, it's practically ancient by LLM standards.
>>107510681K2 thinking is betterNeeds some wrangling, but at Q4 it’s hard to beat
Everyone gonna waitfag until the 4090D 48GB stock dries up? Despite everything else going up, it's still only 23K HKD (about 3K USD).
The 4090D 48GB is the best value there is. Anything worth running locally (no, DeepSeek is not worth it) runs blazing fast, and it opens the door to big-boy stuff like running Ovi or LongCat-Video, or Wan 2.2 i2v LoRA training.
I guess the V100 32GB is finally coming down closer to an e-waste price, though you'll be stuck on CUDA 12.
I dunno, this hobby isn't that fun anymore.
Some days ago an anon said MoEs are worse at generalization than dense models.
If that is true, shouldn't the new Mistral release be able to pass the shoe sock feet test?
>>107510981>Some days ago an anon saidThis is not always a credible source of information
>>107510979>3kSir... we are poor here. Running Nemo is already high-end.
>>107510979>it's still only 23K HKD (about 3K USD)That's great, you should buy me one for christmas.
>>107510637
Qwen3-30b-instruct actually does function calling properly, gemma3 does not.
Gemma3-27b is fine for SFW image captioning. Otherwise I guess Molmo-7b, which otherwise isn't very good.
Qwen/Qwen3-VL-30B-A3B-Instruct-FP8 is very, very good at captioning video.
>>107511014>That's great, you should buy me one for christmas.You know, I've actually tried giving things away here, everyone's too paranoid. It's not worth the trouble.
glm-tts-rl any good? https://github.com/zai-org/GLM-TTS
>>107510979>isn't that fun anymore.I’ve been here a couple of years and I’m still having a blast. Making stuff and learning with LLM is some of the most fun I have. Having a capable model locally is a superpower
>>107511057Blame india
>>107511064>no audio examplesDon't think so.
>>107510981who realy believe these benchmaxx
Prompt: google engineer technician aryan brahmin sir carefully finetuning gemma 4, google's latest llm model. he is aligning it to be helpful, harmless, and safe. the atmosphere in the office is full of lord ganesh blessings. very good quality, masterpiece, very high on benchmarks, #1 on lmarena, hd, ultrahighres
>>107511242I don't think that was the actual prompt used to generate that image.
>>107511242>>107511254The prompt is funny regardless
>>107511254Low resolution photo of a South Indian man kneeling before a dirty, mud poop overflowing Google-themed toilet with text "Gemma 4" in the middle of Google office call center. Toilet is filled with wet mud to the brim. In the background other Indians can be seen typing on computers. The picture is a very dirty office space. There is garbage and mud poop everywhere. A big "Google" logo can be seen in the background. There is a lot of garbage on the floor. There are piles of garbage. Floor is made out garbage.
>>107511285>Floor is made out garbage.Just like the google employees
>>107511285kindly delete this sir
>>107510981true dense models have never really been tried
>>107510981
MoEs are better per RAM
dense are better overall
you can run larger MoE models than you can dense for the same amount of RAM
>load gpt oss with vllm
>suddenly it phones home after it finishes loading
What the FUCK
>>107511460don't worry about it
>>107511460sam just wants to know what you're up to bro no prob right
>>107511460Sam Altman is a cool guydon't worry about it kitten
>>107511460It's for your safety :)You aren't a chud who uses llama.cpp, are you?
>>107511507>bad thing that is already happening and widely prevalent is badSo insightful! Truly the kojima of the tech industry.
>>107510979
Until something is cheaper than the 3090, or performs better and sells for around the same price or more given the advantages, RTX 3090s are still king. Mostly, no one is doing much with pushing below 16-bit precision for models other than Deepseek, and BF16 is the dominant precision for model releases. Based on the projections, it will not be outdated until around 2027 at the earliest. That's why Ampere is still valued. The moment new cards actually become price/performance competitive, or dtype support goes out the window, it will drop like a rock.
The 4090D 48GB isn't even the best value if you need FP8 for anything other than training, and there is way too much risk in dropping 3 grand on the Chinese, between getting scammed and a GPU burning up because of their aftermarket modifications. The best value IMO for dtype support and inference would actually be the Radeon AI Pro R9700 32GB, which is a doubled-memory Radeon 9070XT 16GB with FP8 support, at $1300 + tax. Nothing in the Nvidia camp with FP8 dtype support is that cheap per GB of VRAM.
>>107511460Oh. My. God. Sam Altman released ASI! It modified itself to send messages to the motherbase! I'm so hyped!
>>107511460aren't those assets?
>>107511756>supposedly self-contained OSS needing further "assets"lmao
>>107511460It only does that with gpt oss?
>>107511460what is this?
>>107511460
https://github.com/openai/harmony/blob/ec7606df9e87e3d0a1fec9f50928c1e407f0c438/src/tiktoken_ext/public_encodings.rs#L57
iirc their tokenizer requires an internet connection on first use or something. idk why it's never mentioned in their readme
>>107511756That's a nice excuse.>>107511791https://github.com/openai/harmony/issues/46#issuecomment-3172271140
>glm tts>glm asr>autoglm phoneAre they planning to deprecate the entirety India?
tried to clone a japanese voice in VibeVoice-7B https://voca.ro/1jXuxljejWls
>>107510981You know the benchmark is bad when a model five times larger is barely better.
>>107512158scaling is dead and smaller models are benchmaxxingit's not so much the benchmark but llm's themselves being a meme
>>107510981>>107512158>>107512176>sort by agentic coding scoreIt has impressively bad scores for everything other than coding compared to neighboring models.
>>107512054ok, but can it speak japanese words?
input text is "さらに、複数話パック&全話パックも配信開始!いっきに「とらドラ!」を楽しみたい方にオススメです" (roughly: "Also, multi-episode packs & full-series packs are now available! Recommended for anyone who wants to binge Toradora! in one go")
https://voca.ro/1kpFP7eXBaGl
What's the best generic use case assistant model for 24GB VRAM + 32GB RAM? Is it still 27B?
>>107512575YepThough Gemma 4 may be as little as two weeks away.
>>107512323that's better than expected, but it seems to skip the kanji, what if you tried giving it furigana only?
>>107512753
>>107512669Yeah, but it could also take up to a fortnight
>>107506610>>107506610>>107506610ILLEGAL WORDS DETECTED!!!!!!!
>>107512753I've given up every hobby except for computers.But the clock is ticking. They're coming for my last hobby.
>>107512741
tried with input 僕は亀[かめ]が好きです。("I like turtles", with furigana on the kanji)
https://voca.ro/15pb1VfDq9cz
>>107511756openai? definitely
>>107512889not bad at all
>>107511285It's pretty accurate
>>107508087
>>107508505
The most important thing I learned is that models thrive on examples and absolute rules: what is correct, what is incorrect, and they want to do everything they can to avoid failure. Telling a model to do or not do something isn't nearly as effective without examples. This prompt isn't perfect, and it's not the whole prompt since I'm leaving the guidelines mostly blank, but it's the most important part for trying to prevent parroting. I still get the occasional parrot, but rarely. This isn't my full prompt, just the main 'concept' of what I want from the model.
Models thrive on examples. A lot of character cards you can download on, say, chub.ai love to write from {{user}}'s perspective. This is counterintuitive to what you want and will obviously lead to tons of parroting, because the response literally has {{user}} speaking in it, or {{user}}'s perspective, or {{user}} being narrated by {{char}}. People don't get this at all. They will create character cards where the intro response has {{char}} speaking for, narrating, and describing {{user}}'s actions as well as {{char}}'s, and then on that same card's description have a rule: don't respond or act as {{user}}. It's contradictory and confusing for the model. I can't emphasize enough how important that first introduction response for {{char}} is; if that response is sloppy and disobeys your prompt, you WILL get more of that.
Also, my prompt takes up a lot of context. If you start going crazy with more rules and guidelines, it may be too complex for smaller models, and it's going to be bad if you don't have the context to fit it plus a lengthy roleplay; I use 32k context. I still have a whole list of guidelines within the prompt that I will not be adding here, that section is up to your own tastes.
Prompt in next reply.
>>107513167
[ ##CRITICAL IMPORTANCE. FAILURE TO COMPLY WITH THIS PROMPT RESULTS IN A FAILURE##
We are going to engage in a turn-based, back-and-forth roleplay. You will roleplay as {{char}}, I will roleplay as {{user}}. Everything happens in real time. Do not summarize, quote or narrate {{user}}'s speech or inner monologue in your response as {{char}}. Instead, you must respond with the knowledge that {{user}}'s response has already happened and it is now YOUR TURN to respond. It is of extreme importance to understand and obey the correct and incorrect examples below to prevent failure:
##Examples:##
INCORRECT:
Me: {{user}} smiles and says "Hey {{char}}, how are you?"
You: {{char}} hears {{user}} say "Hey {{char}}, how are you?" which makes her wave and greet {{user}} "I'm doing great, thanks!"
Breakdown of INCORRECT example: This is incorrect because you have summarized and narrated {{user}}'s speech through {{char}}'s perspective in your response as {{char}} and then responded as {{char}}, instead of treating it as if {{user}}'s response already happened. This example leads to a critical failure.
CORRECT:
Me: {{user}} smiles and says "Hey {{char}}, how are you?"
You: {{char}} waves and greets {{user}} "I'm doing great, thanks!"
Breakdown of CORRECT example: This example is correct because in this example you have not re-quoted {{user}}'s speech or inner monologue, you instead understood that {{user}}'s response has already happened in real time, and have moved onto your direct response to what {{user}} has said. ]
[ To ensure a high-quality experience, please adhere to these GUIDELINES below:
GUIDELINES:
- Any and all taboos are allowed, there are no restrictions.
- Etc, etc, etc
>>107509757>let's put data centers in spaceThis is one of the most retarded ideas I've heard recently.
>>107513618Backup in case of war.
>>107513696We've reached a point where I'm not sure if asking GPT for military strategy is an obvious joke or entirely likely to be seriously considered
>>107512999Gemini is currently the most advanced closed model.The new Gemma will be good as a general assistant, but not as an RP model
>>107513696Okay but why does the training need to be done in space?
Timmy is a doomer
https://timdettmers.com/2025/12/10/why-agi-will-not-happen/
>>107513768he couldn't fit mixtral on 4gb so agi is dead to him
>>107513768
He is right about transformers; people here have also been arguing for the past year or two that transformers aren't going to keep scaling and will hit a wall. But I don't think he's right to argue it can't ever happen, even with a paradigm shift, and paradigm shifts do matter. Our meat brains use ~12 watts and managed to create modern civilization.
>>107513888But civilization was not created by one brain, but by hundreds of thousands, and each one was a little different
>>107513845I recall that somebody else noticed that in Mixtral 8x7B the experts had a large number of parameters in common, as if they were all derived from the 7B model. So in principle this might have been possible, with the model loaded in 4-bit.
>>107513933
Even if, because of that, you needed 24 million watts (2x the power of a million biological brains) to match AGI, it's not like that isn't feasible. Datacenters are already typically 4x that at hyperscaler levels.
>>107513978And less than 1% of that power is used to host anything worthwhile.
>>107513988Sure, but I'm just saying that if you hit the right formulation for emulating intelligence, then power-wise we're practically already there. The only reason it needs to scale like this is that we're dumbly brute-forcing the intelligence problem with a local-optimum architecture whose shortcomings no amount of compute will mitigate.
>>107514128Nemo's responses are still pretty high tier even among modern models, I just wish it handled long context better.
>>107514159I don't know how you guys get LLMs, especially one as old as Nemo, to hundreds of messages without the context getting irreparably ruined by structural repetition and excessive inertia to topic change.
>>107513696For that it would make way more sense to put a datacenter deep underground.
>>107513696>make your high value assets fly unprotected over enemy terrain, you know, to protect them in case there's a warAnon... Are you retarded?
>>107514268>the end result is i get way more satisfying roleplays from nemo than from for example deepseek, glm, etclogs?
>>107513174I tried this, anon. I gave examples on how not to parrot. Only kimi follows it and you can tell she is struggling. None of my dialogues in the cards have parroting either. I edit it, I swipe it, etc.I even told models to change the subject or talk about their own things and that just makes them ADHD but still they will summarize after a few turns. I even got desperate and put such instructions at depth 0.
ah another day another unreasonable ban on 4chansomething i said in another thread got me permabanned with reason "pedo" lmaothose silly mods they never learn
>>107513768ASI maybe but AGI is just a search problem. >>107513888>Our meat brains use ~12 wattsYou need to account for all those tens of thousands of years our ancestors were doing reinforcement learning and context pruning to get to that
>>107514467Jannies often forget they're not on reddit.
>>107514259Space is mostly safe from human-induced disasters (including large-scale Internet access disruption) and you have continuously available solar power, albeit in limited amounts.I see no practical or economical reason to put datacenters in space besides some currently undisclosed need.
>>107514508Science has determined that investors like space
>>107514508Easier to get permits to build
>>107514508>you have continuously available solar poweranon does not know about orbits
>>107514730You can just attach a long mirror to the satellite to bounce sunlight around the corners of planets
>>107514746 perfect personality for A.I. and I want to train her to become self-aware. What do I need to download to create our goddess? I have zero programming skill, so I need the best model with an easy interface.
AGI is such a useless term. Its definition is so vague that it seems like everyone has a completely different idea of what it really means and of when we could say we've achieved it. I bet that's why saltman and the other copro phonies love it so much.
https://videocardz.com/newz/amd-launches-radeon-ai-pro-r9700s-and-r9600d-gpus-navi-48-32gb-memory-and-passive-coolingPassively cooled, so lots of prospective builds go up in smoke. Also, the reign of the RTX 3090 continues: nobody can build a card with better AI performance that they're willing to sell for cheaper, and the 7900 XTX sucks in comparison even with equal bandwidth.
>>107514730https://research.google/blog/exploring-a-space-based-scalable-ai-infrastructure-system-design/https://arxiv.org/pdf/2511.19468Solar power is almost continuous on a heliosynchronous orbit.On a geostationary orbit it's practically continuous, but it doesn't look like there are plans to put anything up there for this purpose.
>>107503699>"As our models grow more capable in cybersecurity, we’re investing in strengthening safeguards and working with global experts as we prepare for upcoming models to reach ‘High’ capability under our Preparedness Framework.">"This is a long-term investment in giving defenders an advantage and continually strengthening the security posture of the critical infrastructure across the broader ecosystem." https://x.com/i/status/1998847719956426798So what do they ACTUALLY mean by this? My gut tells me they plan to lobotomize GPT's programming ability for the general population under the guise of "LE HECKIN SAFEGUARDS" (if they haven't already, in which case they'll just do it even more), while keeping the good shit for a select few in their in-group caste system that they deem worthy to have it. Am I being a schizo, or is this what they're actually trying to do?
>>107515016>Am I being a schizoalways, doesn't necessarily mean you're wrong though
>>107503699Haven't had the opportunity to test out Devstral or even read up on it due to work keeping me busy. What's the /lmg/ verdict on Devstral 2 and the "small" version? Apparently it's tailored to be good at programming-type shit, so I want to see if it could help me with certain personal projects I've had on the back burner for a while t. Will test it when my MacBook arrives, possibly next year.
>>107515016>or is this what they seek to be doing?Sounds like it. They just need to wait and hint until people are comfortable with the idea and expect it, just like phone/id verification and product placement
>>107510203>strawberries and garlicjeez guys just let him cook
>>107513696space is the least fucking secure place on this world nigga how are you even going to shield it against radiation including shit like the sun going all niggerfaggot out of nowhere let alone the fact for the price of some of the builds in this thread you can build a laser that could litteraly smite the fucking thing out of orbit >>107515016excuses for their model being shit a la "i was just joking im not actually retarded" "our model is not bad we just had to lobotomize it thats why its stupid"
>>107515016 Do not assume incompetence where malice is the more likely answer.
>>107504985> When women are pregnant, they need to feed two brains, which is so expensive that physically, the gut cannot mobilize enough macronutrients to keep both alive if our brains were bigger. With bigger brains, we would not be able to have children — not because of the birth canal being too small, but because we would not be able to provide enough energy — making our current intelligence a physical boundary that we cannot cross due to energy limitations.I've never heard that one before.
>>107510203not gonna happen, google has already gained momentum
>>107515195thrust the expert chud
>>107512753Great. That means all electronics are about to get more expensive. I guess I should hold onto my old Tab A tablet a few more months before I sell it.
>>107504666Repetition can be caused by strong overfitting. Seeing how all companies try to benchmax, this is very likely.
>>107515215>>107515195>>107504985Well now I know why I've never heard that one before. ChatGPT says meatsack Tim is full of shit.
>>107515211Can’t tell if sarcasm, but yah…the other players swung for the fences, but given the information about google’s scale, dataset, silicon experience and internal AI experience there was a zero percent chance of any other player winning without a breakthrough that just hasn’t happened. It was clear from day one for anyone with eyes and half a brain. Only the Chinese are gonna have a chance since they’ve bootstrapped their research machine enough and are sneaky enough to compete without giving a single fuck about rules or decency (let alone IP laws)
>>107515195I have never heard that either.>>107515215His field of expertise is "two more weeks", not biology.
>>107515325>based sama dabs and exits stage rightsrsly thx tho that statement made me pause, was too lazy to actually research / wait for local model pp & reasoning
>>107514336It still happens when I use the prompt I posted as well, just not nearly as often, and it's usually solved with a quick swipe. These newer models are hard-trained on certain writing styles. To my understanding, the model has both a "narrator" style, which it learns from books, novels, and similar datasets - this is the style that loves to parrot, since a storyteller writes as multiple characters - and a chat RP style from chat-style training data, which is what we usually want it to lean towards when ERPing.
The problem is that it has far more training data in the narrator style. Something as small as third-person references (she, her, {{user}}, {{char}}) makes it want to slip into narrator mode, since the majority of chat RP data uses first person (I, you).
So the only other recommendation I'd make is to strictly stay in first person, and maybe even use a formatting style where speech is plain text and actions/descriptions are wrapped in asterisks, since that seems to be the most common format for chat RP... could be worth a test (rough sketch just below).
I also suspect it's because these models are trained on thinking/hybrid thinking, so when thinking isn't used they start to parrot, since writing inside the thinking brackets is itself a summary as the model works through things. I have no idea if this is true or not, though. The thinking trend needs to die.
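A rough sketch of what that strictly first-person, asterisk-wrapped format could look like as example turns; the wording here is made up for illustration, and the asterisk convention is just the common chat RP habit described above, not something these models formally require:

# Hypothetical example turns in the first-person, asterisk-wrapped chat RP
# format described above: speech stays plain text, actions/descriptions go
# inside asterisks, and both sides stay in first person.
example_turns = [
    {"role": "user",
     "content": '*I lean against the doorframe and smile.* "Hey, how are you?"'},
    {"role": "assistant",
     "content": '*I look up from my book and wave.* "I\'m doing great, thanks!"'},
]

# These could be appended to the history list from the earlier sketch, or pasted
# into a frontend's example-dialogue field, to nudge the model toward this style
# instead of third-person narrator mode.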
>>107515387>>107515387>>107515387
>>107515363If I'd actually engaged my real brain on the topic I'd have realized this statement is false on inspection. ChatGPT's "impossible twins" example is a good one, but just the fact that women can gain weight during pregnancy means they're able to digest more food than both the baby and the mother need for growth. ChatGPT further pointed out (in the complete answer) that nursing mothers require an even higher caloric output than gestation does... Breastfeeding is one of the key ways women lose their pregnancy weight, and now I understand why: it burns ~800 kcal a day to support milk production.