/lmg/ - a general dedicated to the discussion and development of local language models.

Miku Day Edition

Previous threads: >>108321632 & >>108316141

►News
>(03/04) Yuan3.0 Ultra 1010B-A68.8B released: https://hf.co/YuanLabAI/Yuan3.0-Ultra
>(03/03) WizardLM publishes "Beyond Length Scaling" GRM paper: https://hf.co/papers/2603.01571
>(03/03) Junyang Lin leaves Qwen: https://xcancel.com/JustinLin610/status/2028865835373359513
>(03/02) Step 3.5 Flash Base, Midtrain, and SteptronOSS released: https://xcancel.com/StepFun_ai/status/2028551435290554450

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>108321632

--A16 vs 3090 performance benchmarks with llama.cpp tensor parallelism:
>108322578 >108322596 >108322610 >108322640 >108322679 >108322861 >108323005 >108323564 >108323613 >108323688 >108323721 >108323754 >108323790 >108323811
--The Synthetic Data Playbook: Generating Trillions of the Finest Tokens:
>108323497 >108323508 >108323519 >108323530 >108323831 >108323962 >108323971 >108323557 >108323551 >108323565 >108323599 >108323872 >108323884
--Qwen3.5 27B dense model matches 122B MoE performance in benchmarks:
>108326810 >108326837 >108326876 >108326878 >108326893
--Debating MoE's speed-memory tradeoffs vs dense models:
>108326854 >108326888 >108326931 >108326959 >108327002 >108327041 >108327054
--Optimizing 256GB RAM setup for large model inference:
>108321871 >108321876 >108321927 >108321884 >108321984
--Qwen 3.5 quantization performance differences between dense and MoE models:
>108323521 >108325440
--TTS options for SillyTavern voice output:
>108321822 >108322058
--Claude Opus 4.6 benchmark contamination claims spark skepticism:
>108322721 >108322782 >108322804 >108322819 >108322920 >108322865
--Debate over AI emotional support ethics after poetic suicide response:
>108321732 >108321749 >108321837 >108321948 >108323070 >108322594 >108322624 >108322916 >108322948 >108322967 >108323139 >108323151 >108323199 >108323274 >108323406 >108324518
--Miku (free space):
>108322482 >108323847 >108323976 >108326678 >108326684 >108326942 >108327209 >108327668

►Recent Highlight Posts from the Previous Thread: >>108321820

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
based thread
Miku is BBC coded
>>108328174
>>108328170
Happy Miku Day (3/9 = SanKyuu = MiKu)
https://sonicwire.com/product/virtualsinger/special/miku-v6
Mikulove
>>108328174
V4 on March 90th
>>108328183
we're so back
Haven't been on here in a while. Are the new qwen3.5-30b and glm4.7 flash models worth a shit or just a tiny 5% improvement
>>108328185
Why do you both love it so much?
dead hobby
>>108328376
the new qwen is ok for boring things, and worse for sexo
This one is going to be fun.
https://github.com/ggml-org/llama.cpp/pull/20266
>>108328376
qwen 35b is a big jump
>>108328403
>Files changed: 102
jesus christ
>>108328423
Somehow he missed a few. I'm sure he'll rectify it immediately.
would it be possible to run a local model on my phone with 12gb of ram?
>>108328403
>PKU
Gonna give them benefit of the doubt
>>108328452
yes in theory, the new qwen 3.5 small models should
https://x.com/Alibaba_Qwen/status/2028460046510965160
i'm not sure if there's actually an app on the app store that does it yet though
>>108328452
I think you can compile llama.cpp on termux.
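For anyone wanting to try, the usual route is roughly the following. This is an untested sketch: the package names assume current Termux repos, the build steps are the standard llama.cpp CMake flow, and the model path/filename at the end is just a placeholder for whatever GGUF you actually download (with 12GB of RAM you'd want a small quant of a small model).

```shell
# Inside Termux on Android (sketch, not verified on-device):
pkg install -y git cmake clang

# Standard llama.cpp CMake build
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release -j

# Run against a GGUF you've downloaded (placeholder path/model name)
./build/bin/llama-cli -m /sdcard/Download/some-small-model-q4_k_m.gguf -p "hello"
```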
which local model is best for divine revelation and interpreting the word of god?
>>108328500
https://huggingface.co/PleIAs/Monad
https://huggingface.co/PleIAs/Baguettotron
>>108328402
they are literally all bad for gooning
text based cooming is for losers ngl tbqhwyfam
>>108328170
It reads Sankyu retard
DeepSeek V2->V3 felt like a really special moment because I could/do trust V3 to move my projects along reliably, and I personally don't have that same feeling with V4, so I guess I'm kinda disappointed.

It's definitely a stronger coder / problem solver. It is more likely to fix a problem, and fix it quickly, compared to V3 (i have no doubt about this), but it just feels like it's more likely to break my code without realizing. There's a laziness about it - even when i have explicit end criteria for a turn, it will end prematurely. When it claims success, i don't believe it. Often i challenge it and it's like, 'i should have led with...' or 'what i said was partially true...' (no, mate, it was totally false). I don't trust it.

I certainly am surprised to see the model being glazed as much as it has been. I, for one, was hoping for a bit more.
is it safe to pull now? did they fix all piotr's autoparser bugs?
>>108328606
V4 isn't out bwo
>>108328618
>fix all piotr's autoparser bugs
kek
agent people deserve him
>>108328622
I just want to use MCP to do really relevant web searches like 'are 4chan memes relevant in 2026'
>>108328619
dumbass
>>108328635
mcp is an outdated concept unc
>>108328638
agent swarms are cringe
What's the best gooning local llm now
Qwen 3.5 heretic is so disappointing
>>108328653
general move towards curated pretraining means there is nothing much to reveal with uncensoring
>>108328665
And that's a good thing!
>>108328606
Are you talking about V4 full or V4 lite? I thought lite was the only one they were demoing?
>>108328678
nta but I heard they are demoing it in some select circles, mostly people who work in ml
>>108328665
Damn so it's better to go back to 2025 models like stheno or tiger gemma?