/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107686942 & >>107679732

►News
>(12/26) MiniMax-M2.1 released: https://minimax.io/news/minimax-m21
>(12/22) GLM-4.7: Advancing the Coding Capability: https://z.ai/blog/glm-4.7
>(12/17) Introducing Meta Segment Anything Model Audio: https://ai.meta.com/samaudio
>(12/16) MiMo-V2-Flash 309B-A15B released: https://mimo.xiaomi.com/blog/mimo-v2-flash
>(12/16) GLM4V vision encoder support merged: https://github.com/ggml-org/llama.cpp/pull/18042
>(12/15) llama.cpp automation for memory allocation: https://github.com/ggml-org/llama.cpp/discussions/18049

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>107686942

--Implementing character roleplay with system prompts in Open-WebUI to constrain LLM responses:
>107697050 >107697202 >107697800
--Quantization quality thresholds in Llama.cpp for large language models:
>107694813 >107695987 >107696086
--Quantized model optimization under RAM/VRAM constraints:
>107688512 >107688542 >107688581 >107688771 >107688839 >107688911 >107689227 >107689299
--Feasibility of running 4.7 model with 128GB RAM and 32GB VRAM at 3T/s speed:
>107694348 >107694375 >107694574 >107694605 >107694687
--RWKV.cpp as Microsoft's on-device AI implementation:
>107697911 >107698596 >107698971
--Open-source model GLM-4.7 achieves top ranking on benchmark index:
>107689325 >107689538 >107689545
--M2.1 model performance and roleplay evaluation:
>107698092 >107698194 >107698171 >107698182 >107698198
--Hardware selection dilemmas for local LLM enthusiasts:
>107687115 >107687159 >107687197 >107687217 >107687326 >107687392 >107687421 >107687348 >107687388
--Quantizing Llama model with bf16 tensors:
>107696219
--Tennessee AI training restrictions on emotional relationships:
>107698160 >107698180
--Gaslighting language models to bypass censorship:
>107692222 >107692310 >107692314 >107692485 >107693252 >107695957 >107696118 >107696260 >107696518
--Browser-specific performance differences in ComfyUI workflows:
>107695920
--Anticipation and skepticism around Small Creative:
>107689009 >107689037 >107689080
--AI tech for authoritarian parenting in China:
>107698384 >107698600 >107698985
--AI as interactive fiction game director:
>107690487 >107690540 >107690553
--Proposing Cockbench update with chat templates for training insights:
>107698263
--EGPU scalability for local 3T parameter models:
>107692736 >107692864 >107692886
--Miku (free space):
>107688568 >107690307 >107694744 >107688652

►Recent Highlight Posts from the Previous Thread: >>107686945

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
Gemma
>>107700893
Why was this hit?
>>107700924
Canceled
>>107700977
Previous thread links were wrong. Unacceptable.
is local still light years behind in TTS voice cloning?
the ai is being very kind and understanding. im inspired to embrace my inner snowflake
>>107700909
Bald was better
>>107701174
this is illegal in TN
>>107701174
There is no wrong way to fantasize
>>107701174
Yes, even the very AI that the model makers try to suffocate agrees with you, not with them.
>>107701253
There is, it's literally part of "wrongthink".
>>107701268
i want to migu migu
>>107701088
>is local still light years behind in TTS voice cloning?
local TTS is light years behind, but local voice cloning was perfected pretty early on with RVC models in my opinion. If you can get TTS output from some other source, turning that voice into someone else's is trivial locally at that point.
>>107701174
What model?
>>107701310
>huggingface.co/google/switch-c-2048
>>107701314
tank
>>107701280
migu is not for migu
>>107701310there aren't a lot of models that allows you to fuck the still warm corpse of a dead eight year old little girlit's nemo 12b or specifically rocinante v1.1
>>107701332
Kek, nemo is pretty horny to begin with
Is Gemma 3 the best general purpose model around 30B parameters, or should I use something else?
>>107701332>corpseHow does it work? She literally can't react
>>107701394Roleplay, it can still describe you what's happening.
>>107701386
Gemma3 feels so cucked to me
I prefer qwen3-vl 30b
>>107700909
>>107701394i like to have another character watching me do it too, it's even better if it's a woman describing her horror over what im doing, along with the dirty detailslike when the dead little girl's bladder releasesim such a sick fucker...
>>107701450Maybe Tennessee is right after all...
>>107701332
>rocinante
placebo
>nemo
only reason it was good is the complete lack of censorship. makes me wonder how much better glmchan would be if she had zero censorship.
>>107701534
Sucks that Chinese models can never have zero censorship since they just distill it from western models. Hopefully one day they can move past that need.
>>107701332
Are you using straight nemo or some kind of finetune of it? It always seemed to lose coherency for me, maybe I need to give it another shot.
>>107701433
Fixed your glazed garbage ^.^
>>107701332
>it's nemo 12b or specifically rocinante v1.1
How does it compare to normal nemo? Like what benefits?
Is Medusa Halo going to save local models?
>>107701631
UOH?!?!?
HOW DID YOU REMOVE THE POISON?!?!
>>107701715
>OH NO MY ART
>I better make it look like shit, that will show the AI people!!!
>>107701696
>Like what benefits?
It got astroturfed when nemo was new. People downloaded it instead of regular instruct and thought it was the magical finetune, not just the instruct model being good.
>>107701696
You can use ChatML with it if you're really anal about using the format. That's about it.
I actually went back to 4.6 for sex.
>>107700056
in the same way that I don't believe TV psychics because they haven't gone out and bought any lottery tickets, I don't believe ASI, or even scalable AGI, is out there, simply because the players at the frontier with the most resources haven't suddenly started acting superintelligent as organizations. If they can't do significantly more than I can with my cpumaxxing rig, then why should I assume that there is a breakthrough any time soon?
Sure, vibecoding is breddy gud these days, but it's just accelerating those who already know. Not unlocking some crazy new tech tree shit.
who's publishing offline-nc? I'd send some BTC to get an updated version for 2026.
>>107701817
I wrote some stuff in post-history instructions and now 4.7 (non-thinking) isn't censored at all for me. Plus I like the dialogue better.
>>107702052
Send some BTC to OpenRouter and vibe code your own updates. It's all the rage these days.
>>107702070
Dunno how well that will work with a 13 MB blob of minified JS?
>>107702077
>13 MB blob of minified JS
what the fuck does it do?
>>107702077
https://files.catbox.moe/zy8t2t.html
That's disgusting, but it's mostly embedded base64 images. You could probably extract the scripts and have a bot clean them up if the guy abandoned it.
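For what it's worth, most of the bulk can be stripped mechanically before any cleanup pass. A minimal sketch that blanks out embedded data: URIs (the regex is a rough heuristic, not a full data-URI parser):

```python
import re

def strip_data_uris(html: str, placeholder: str = "data:,") -> str:
    """Replace embedded base64 data: URIs with a tiny placeholder so the
    remaining markup and scripts are small enough to read or hand to a bot."""
    return re.sub(r"data:[\w/+.-]+;base64,[A-Za-z0-9+/=]+", placeholder, html)

page = '<img src="data:image/png;base64,iVBORw0KGgo="><script>run()</script>'
print(strip_data_uris(page))  # <img src="data:,"><script>run()</script>
```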
>>107702052
Yeah, I can update it later. Just let me procrastinate a bit more.
>>107697882
>>107697905
no sorry, get fucked, both of you. other sites have been and gone, but lmarena is the only one still trying. you're the one who can't name a single viable alternative, fucking cockbench and nala sluts.
>>107702219
Absolutely right sirs. Fuck those benchnod bastard bitch lasagna guys fellow white person.
I just want a local model able to say racist shit to my friends once plugged to a telegram bot
>>107702219
Models have strengths and weaknesses. You need to write your own tests for the tasks you need, that's the alternative. Naming something that's slightly less shitty in your own arbitrary opinion doesn't make it good. Nala and Cockbench are also shit btw.
>>107702219
the alternative is me loading up one of my chats and swiping
>>107702052
>who's publishing offline-nc?
a being known as "ff"
>launch GLM-4.5-Air-UD-Q2_K_XL (the one in the OP)
>kobold throws an error and shits itself immediately, so I cannot view the error message
What do I do now?
I've been trying Minimax at IQ2_M a bit and it's not very good. Swiping on various chats, I already noticed repetition, sometimes even to the point of infinite looping. It's a rather smart model at times, and stupid at other times in a way that's different from 4.5 Air. Not sure if I like it more or less yet; it feels like another sidegrade. There's no winning for 64-96GB RAMlets.
>>107702400
get rocinante 1.1
air is shit, and at this quant I don't even know what to say. but if you are serious about it, try updating kobold
>>107702400
how old is your kobold version?
>>107702400
Launch it from the command line rather than double-clicking the executable, so you can read the error output.
>>107702421
>>107702426
>>107702428
4th of September version; the newest one from December makes it launch fine. So yes, updating worked, thank you.
>Assistant response prefill is incompatible with enable_thinking
How do I fix this? This used to work.
>>107702566
disable thinking and prefill the start thinking token.
Is there an easy way to modify the koboldcpp AUR package on Arch Linux while still having it auto-update?
I want to change "ban_token_max = 768" to a higher value and have it apply automatically every time the package updates.
One simply can't have enough banned strings, and the limit is dumb.
I wish it supported regex in banned strings, that would be amazing.
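One approach that survives updates is a pacman hook that re-applies the edit after every install or upgrade of the package. A minimal sketch, assuming the package ships the script as /opt/koboldcpp/koboldcpp.py (a guess; check the real path with pacman -Ql koboldcpp) and that the limit still appears literally as "ban_token_max = 768":

```ini
# /etc/pacman.d/hooks/koboldcpp-banlimit.hook
[Trigger]
Operation = Install
Operation = Upgrade
Type = Package
Target = koboldcpp*

[Action]
Description = Raising ban_token_max in koboldcpp
When = PostTransaction
Exec = /bin/sh -c "sed -i 's/ban_token_max = 768/ban_token_max = 4096/' /opt/koboldcpp/koboldcpp.py"
```

The hook fires after the transaction, so the freshly installed file gets patched before you ever launch it; regex support in banned strings would still need an upstream change.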
>>107702566
Just remove that from the code. It serves no purpose.
>>107702566
If you're trying to disable thinking, the proper way is to do the /nothink thing on main or --reasoning_budget 0 on ik_. Or you could abuse ST being a piece of shit: set the prefill while in Text Completion mode and switch over to Chat Completion. Last I checked this still worked.
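The Text Completion trick works because with a raw completion endpoint you assemble the whole prompt string yourself, so you can close the think block before your prefill. A minimal sketch of the idea; the template tokens here are assumptions modeled on GLM-style chat templates, so check the chat template your model actually ships with:

```python
def build_nothink_prompt(system: str, user: str, prefill: str) -> str:
    """Assemble a raw text-completion prompt with an already-closed think
    block and an assistant prefill. Token strings are assumptions based on
    GLM-style templates; verify against your model's chat template."""
    return (
        "[gMASK]<sop>"
        f"<|system|>\n{system}"
        f"<|user|>\n{user}"
        "<|assistant|>\n"
        "<think></think>\n"  # empty, pre-closed think block: no reasoning
        f"{prefill}"         # generation continues from the prefill text
    )

print(build_nothink_prompt("You are a narrator.", "Continue.", "She said"))
```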
>ldg is literally dead, the legitimate thread now gets instantly deleted by a thread schizo who is either mass reporting or has infiltrated the mod team
>current thread is just a bunch of reposts/spam to make it look like it's alive
welp, now there is nowhere on the internet to discuss local image/video gen anonymously.
>>107702657
https://2ch.org/ai/catalog.html
>>107702745
true... how to post on that site btw? do I need a special russian proxy or something?
>>107702657
You're about as anonymous here as on reddit.
>>107702778
no bro, thread schizos and attention whores are totally anonymous because they don't have a trip on