/g/ - Technology

/lmg/ - a general dedicated to the discussion and development of local language models.

Miku Day Edition

Previous threads: >>108321632 & >>108316141

►News
>(03/04) Yuan3.0 Ultra 1010B-A68.8B released: https://hf.co/YuanLabAI/Yuan3.0-Ultra
>(03/03) WizardLM publishes "Beyond Length Scaling" GRM paper: https://hf.co/papers/2603.01571
>(03/03) Junyang Lin leaves Qwen: https://xcancel.com/JustinLin610/status/2028865835373359513
>(03/02) Step 3.5 Flash Base, Midtrain, and SteptronOSS released: https://xcancel.com/StepFun_ai/status/2028551435290554450

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second
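For a rough sense of what the GGUF VRAM calculator above is doing, here is a back-of-the-envelope sketch. The formula and the example numbers (layer count, KV heads, head dim) are illustrative assumptions, not the calculator's exact method, and it ignores overhead like compute buffers:

```python
def estimate_vram_gb(n_params_b, bits_per_weight, ctx_len, n_layers, n_kv_heads, head_dim):
    """Rough VRAM needed for weights + fp16 KV cache, in GiB."""
    # quantized weights: params * bits / 8 bytes
    weights = n_params_b * 1e9 * bits_per_weight / 8
    # KV cache: 2 tensors (K and V) * 2 bytes (fp16) per layer, per KV head, per head dim, per token
    kv = 2 * 2 * n_layers * n_kv_heads * head_dim * ctx_len
    return (weights + kv) / 1024**3

# e.g. a hypothetical 8B model at ~4.5 bpw (Q4-ish), 8k context, 32 layers, 8 KV heads, head dim 128
print(round(estimate_vram_gb(8, 4.5, 8192, 32, 8, 128), 1))  # → 5.2
```

GQA models (fewer KV heads than attention heads) are why long context is cheaper on recent models: the KV term scales with n_kv_heads, not total heads.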

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: MikuLove.png (1.32 MB, 1024x1024)
►Recent Highlights from the Previous Thread: >>108321632

--A16 vs 3090 performance benchmarks with llama.cpp tensor parallelism:
>108322578 >108322596 >108322610 >108322640 >108322679 >108322861 >108323005 >108323564 >108323613 >108323688 >108323721 >108323754 >108323790 >108323811
--The Synthetic Data Playbook: Generating Trillions of the Finest Tokens:
>108323497 >108323508 >108323519 >108323530 >108323831 >108323962 >108323971 >108323557 >108323551 >108323565 >108323599 >108323872 >108323884
--Qwen3.5 27B dense model matches 122B MoE performance in benchmarks:
>108326810 >108326837 >108326876 >108326878 >108326893
--Debating MoE's speed-memory tradeoffs vs dense models:
>108326854 >108326888 >108326931 >108326959 >108327002 >108327041 >108327054
--Optimizing 256GB RAM setup for large model inference:
>108321871 >108321876 >108321927 >108321884 >108321984
--Qwen 3.5 quantization performance differences between dense and MoE models:
>108323521 >108325440
--TTS options for SillyTavern voice output:
>108321822 >108322058
--Claude Opus 4.6 benchmark contamination claims spark skepticism:
>108322721 >108322782 >108322804 >108322819 >108322920 >108322865
--Debate over AI emotional support ethics after poetic suicide response:
>108321732 >108321749 >108321837 >108321948 >108323070 >108322594 >108322624 >108322916 >108322948 >108322967 >108323139 >108323151 >108323199 >108323274 >108323406 >108324518
--Miku (free space):
>108322482 >108323847 >108323976 >108326678 >108326684 >108326942 >108327209 >108327668

►Recent Highlight Posts from the Previous Thread: >>108321820

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
based thread
>>
Miku is BBC coded
>>
File: 1742767792728448.jpg (2.14 MB, 3000x4000)
>>108328174
>>108328170
Happy Miku Day (3/9 = SanKyuu = MiKu)
https://sonicwire.com/product/virtualsinger/special/miku-v6
>>
Mikulove
>>
>>108328174
V4 on March 90th
>>
>>108328183
we're so back
>>
Haven't been on here in a while. Are the new qwen3.5-30b and glm4.7 flash models worth a shit or just a tiny 5% improvement?
>>
>>108328185
Why do you both love it so much?
>>
dead hobby
>>
>>108328376
the new qwen is ok for boring things, and worse for sexo
>>
File: vcj.png (221 KB, 1524x935)
This one is going to be fun.
https://github.com/ggml-org/llama.cpp/pull/20266
>>
>>108328376
qwen 35b is a big jump
>>
>>108328403
>Files changed: 102
jesus christ
>>
File: vcj_02.png (4 KB, 918x166)
>>108328423
Somehow he missed a few. I'm sure he'll rectify it immediately.
>>
would it be possible to run a local model on my phone with 12gb of ram?
>>
>>108328403
>PKU
Gonna give them benefit of the doubt
>>
>>108328452
yes in theory, the new qwen 3.5 small models should
https://x.com/Alibaba_Qwen/status/2028460046510965160
i'm not sure if there's actually an app on the app store that does it yet though
>>
>>108328452
I think you can compile llama.cpp on termux.
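Something like this should work inside termux (a sketch from memory, not verified on-device; package names and build steps may have drifted):

```shell
# inside termux on android
pkg install -y git cmake clang
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build -j
# then point it at a small gguf, e.g.:
# ./build/bin/llama-cli -m model.gguf -p "hello"
```

With 12GB of RAM you'd want a small quantized model; anything much bigger will get killed by the OS.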
>>
which local model is best for divine revelation and interpreting the word of god?
>>
>>108328500
https://huggingface.co/PleIAs/Monad
https://huggingface.co/PleIAs/Baguettotron
>>
>>108328402
they are literally all bad for gooning
text based cooming is for losers ngltbqhwyfam
>>
>>108328170
It reads Sankyu retard
>>
DeepSeek V2->V3 felt like a really special moment because I could/do trust V3 to move my projects along reliably, and I personally don't have that same feeling with V4, so I guess I'm kinda disappointed.
It's definitely a stronger coder / problem solver. It's more likely to fix a problem, and fix it quickly, compared to V3 (I have no doubt about this), but it just feels more likely to break my code without realizing it. There's a laziness about it: even when I have explicit end criteria for a turn, it will end prematurely. When it claims success, I don't believe it. Often I challenge it and it's like, 'I should have led with...' or 'what I said was partially true...' (no, mate, it was totally false). I don't trust it.
I'm certainly surprised to see the model getting glazed as much as it has been. I, for one, was hoping for a bit more.
>>
is it safe to pull now? did they fix all piotr's autoparser bugs?
>>
>>108328606
V4 isn't out bwo
>>
>>108328618
>fix all piotr's autoparser bugs
kek
agent people deserve him
>>
File: 1764224308189880.png (110 KB, 823x1109)
>>108328622
I just want to use MCP to do really relevant web searches like 'are 4chan memes relevant in 2026'
>>
>>108328619
dumbass
>>
>>108328635
mcp is an outdated concept unc
>>
>>108328638
agent swarms are cringe
>>
Whats the best gooning local llm now
Qwen 3.5 heretic is so disappointing
>>
>>108328653
general move towards curated pretraining means there is nothing much to reveal with uncensoring
>>
>>108328665
And that's a good thing!
>>
>>108328606
Are you talking about V4 full or V4 lite? I thought lite was the only one they were demoing?
>>
>>108328678
nta but I heard they are demoing it in some select circles, mostly people who work in ml
>>
>>108328665
Damn so it's better to go back to 2025 models like stheno or tiger gemma?



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.