/g/ - /lmg/ - Local Models General - Technology


File: congration.jpg (228 KB, 1024x1024)

/lmg/ - Local Models General Anonymous 01/15/26(Thu)18:07:23 No.107873752

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107864105 & >>107856424

►News
>(01/15) TranslateGemma released: https://hf.co/collections/google/translategemma
>(01/14) LongCat-Flash-Thinking-2601 released: https://hf.co/meituan-longcat/LongCat-HeavyMode-Summary
>(01/08) Jamba2 3B and Mini (52B-A12B) released: https://ai21.com/blog/introducing-jamba2
>(01/05) OpenPangu-R-72B-2512 (74B-A15B) released: https://hf.co/FreedomIntelligence/openPangu-R-72B-2512
>(01/05) Nemotron Speech ASR released: https://hf.co/blog/nvidia/nemotron-speech-asr-scaling-voice-agents

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm

Anonymous 01/15/26(Thu)18:07:40 No.107873764

File: no particular reason.jpg (306 KB, 1536x1536)

►Recent Highlights from the Previous Thread: >>107864105

--Paper: Ministral 3:
>107866902 >107866921 >107867400 >107867439 >107867521
--Papers:
>107867298
--TTS tool performance comparisons and resource challenges:
>107867729 >107867935 >107867969 >107867995 >107868067 >107868843 >107868864 >107868926 >107868937 >107868806 >107868159 >107868215 >107868260 >107868325 >107868357 >107868430 >107868353 >107868939 >107869085
--Google Gemma models update with TranslateGemma performance analysis:
>107869136 >107869860 >107870012 >107870672 >107869282 >107869353
--Multi-NVIDIA GPU compatibility and driver workaround challenges:
>107866329 >107866347 >107866370 >107866376 >107866468 >107866398 >107866426 >107866495 >107866827 >107866841 >107866903 >107867162
--Nvidia consumer GPU market shifts:
>107870399 >107870476 >107870488 >107870546 >107870591 >107870782
--ExaOne MoE cockbench results and content moderation challenges:
>107864456 >107864593 >107864594
--Comparing Google's TranslateGemma models with Gemma 3 for translation quality:
>107871784 >107871900 >107873103 >107873199 >107873290
--China's H200 GPU import ban and its implications for global VRAM availability:
>107868589 >107868600 >107868669 >107868718 >107868762
--adaptive-p sampler implementation in llama.cpp:
>107870871
--Struggles with uncensored AI model restrictions and creative bypass attempts:
>107871012 >107871097 >107871161 >107871720 >107871761 >107871781
--Falcon-H1-Tiny model collection on Hugging Face:
>107869739
--2003 hardware insufficient for modern AGI training due to computational limits:
>107871954 >107872024 >107872045 >107872086 >107872258 >107872078 >107872081 >107872433
--Language model fails to generate proper 3D NPC model:
>107867453 >107867473 >107867482 >107868022
--Rin-chan (free space):
>107865561 >107866509 >107867855 >107871514 >107872458

►Recent Highlight Posts from the Previous Thread: >>107864106

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script

Anonymous 01/15/26(Thu)18:12:06 No.107873798

>>107873752
>>107873764
slop

Anonymous 01/15/26(Thu)18:12:13 No.107873799

File: mikudayo.jpg (153 KB, 640x1536)

>>107873752
>>107873764
top tier onahole

Anonymous 01/15/26(Thu)18:15:04 No.107873823

File: ss.png (48 KB, 128x232)

>>107873799
It's full of snatshis drawing chewbaccasticas.

Anonymous 01/15/26(Thu)18:15:42 No.107873828

so why haven't companies tried making good models yet?

Anonymous 01/15/26(Thu)18:16:31 No.107873835

>>107873764
post catbox for the last rin?

Anonymous 01/15/26(Thu)18:17:20 No.107873842

What if I have a 9070XT

Anonymous 01/15/26(Thu)18:17:40 No.107873846

>>107873842
What about it?

Anonymous 01/15/26(Thu)18:18:16 No.107873849

>>107873828
Anon last thread was basically complaining that they were too good already.
>>107871761
>>107871782

Anonymous 01/15/26(Thu)18:18:55 No.107873854

>>107873842
Condolences

Anonymous 01/15/26(Thu)18:19:43 No.107873857

>>107873842
you throw it in the trash where it belongs. wait no. recycle it. protect the environment.

Anonymous 01/15/26(Thu)18:19:46 No.107873858

>>107873849
but that's false. current models suck.

Anonymous 01/15/26(Thu)18:21:09 No.107873868

>>107873846
>>107873854
>>107873857
Guess I can't LM then after all

Anonymous 01/15/26(Thu)18:22:15 No.107873874

>>107873868
They're just salted trolls, you can do the LLM just fine.

Anonymous 01/15/26(Thu)18:22:32 No.107873880

>>107873868
Of course you can.

Anonymous 01/15/26(Thu)18:24:12 No.107873894

>>107873868
Build with Vulkan so you don't have to fuck around with ROCm/HIP, download some Nemo or Mistral Small and have fun, anon. I get to have fun on much less. You're fine.

Anonymous 01/15/26(Thu)18:26:05 No.107873908

Haven't been here in forever, so I have a simple question: Anything interesting happen model wise for RP retardation on 24GB VRAM? What about multimodal bullshit (image recognition)?

Anonymous 01/15/26(Thu)18:27:06 No.107873914

>>107873908
nope, nemo is still sota at that range

Anonymous 01/15/26(Thu)18:27:15 No.107873916

>>107873908
no

Anonymous 01/15/26(Thu)18:28:41 No.107873931

>>107873908
>Anything interesting happen model wise for RP
Not really, Nemo/Mistral Small still the go-to.
>image recognition
Qwen3-VL is pretty accurate with minimal censorship. Much better than Gemma/Mistral's vision.

Anonymous 01/15/26(Thu)18:29:28 No.107873937

>>107873835
I am neither the baker nor the guy who made the image but you're not getting the catbox because his stuff includes a bunch of inpainting and manual edits. The text is obviously added manually for example.

Anonymous 01/15/26(Thu)18:30:12 No.107873944

>>107873914
>>107873916
>We're still stuck with Nemo/Mistral
Shocking stuff. Nothing ever really happens, I guess.
>>107873931
>Qwen3-VL is pretty accurate with minimal censorship.
Thanks, will look into it!

Anonymous 01/15/26(Thu)18:31:39 No.107873957

File: comparison.png (468 KB, 1888x3863)

Opus 4.5: Human-like response, curiosity.
GLM 4.6: I have no feelings, I'm a machine.
Deepseek 3.1: Retarded, thinks he's me.
GPT 5.2: No feelings either, _obviously_.
Kimi K2: Middle point. "I feel useful". "No strong emotions".
Gemini 3 Pro: Similar response to Opus. Human-like.

This is consistent in my experience. Gemini is the only other modern model that feels like it's trying to sound like a human rather than a robot. And I don't want to cope by adding a system prompt that tells it to pretend it has feelings; that feels like cheating (and I'm not even sure it would actually work).

Anonymous 01/15/26(Thu)18:32:11 No.107873966

>>107873937
don't think he cares to steal the prompts or whatever likely just wants pron

Anonymous 01/15/26(Thu)18:33:28 No.107873977

>>107873937
Didn't even notice the text the first time

Anonymous 01/15/26(Thu)18:33:30 No.107873978

>>107873540
Forgot to quote >>107873957

Anonymous 01/15/26(Thu)18:34:36 No.107873987

>>107873957
Reading that image makes me sick to my stomach. Greasy-ass slop oozing through any flimsy attempt at adhering to sounding "human".

Anonymous 01/15/26(Thu)18:36:33 No.107874000

The bigger the parameters the better the art? Can a Pi Hat 2 ever be good at drawing fapworthy waifus?

Anonymous 01/15/26(Thu)18:37:34 No.107874010

>>107873987
I stopped caring about the slop, like when you fall in love with a fat or ugly woman you stop caring about her physical appearance.

Anonymous 01/15/26(Thu)18:41:07 No.107874050

>>107874010
Okay, but this is more like when someone attractive you're with has a terrible, boring personality, specifically at the point where even hearing her speak is grating on you and makes you want to puke.

Anonymous 01/15/26(Thu)18:42:05 No.107874055

File: 1738738663273246.jpg (643 KB, 850x2934)

https://github.com/GetfroggyHoe/universal-immersion-engine

Anonymous 01/15/26(Thu)18:45:20 No.107874083

>>107874055
>mobile mobile mobile
ew

Anonymous 01/15/26(Thu)18:46:10 No.107874091

File: fbfan.png (32 KB, 1003x131)

>>107874050
This all started when he tried to train a model on the "philosophy" of his favorite pedo. You can't expect much of him.

Anonymous 01/15/26(Thu)18:47:52 No.107874104

>>107874055
Neat.
I'll steal some ideas from that to help refine my frontend.

Anonymous 01/15/26(Thu)18:47:54 No.107874106

File: fishboy.png (101 KB, 1517x238)

Good anons don't let anons forget.

Anonymous 01/15/26(Thu)18:49:25 No.107874115

>>107874106
you're obesed!

Anonymous 01/15/26(Thu)18:50:14 No.107874124

>>107874106
who cares

Anonymous 01/15/26(Thu)18:51:00 No.107874131

File: trust.jpg (155 KB, 1024x1024)

Anonymous 01/15/26(Thu)18:51:44 No.107874141

>>107874106
It's crazy that you could be in a thread where so much knowledge about models is both available and necessary and still fall into the AI psychosis bullshit.

Anonymous 01/15/26(Thu)18:54:15 No.107874165

>>107874055
I've been working on something similar, but standalone and more freeform.

Anonymous 01/15/26(Thu)19:00:34 No.107874236

>>107874055
Can't wait for this to be abandoned almost instantly like waidrin. Also I'm assuming you made it, seeing as it was uploaded an hour ago? That's cool if so. If you got it from reddit or something though eat shit.

Anonymous 01/15/26(Thu)19:02:39 No.107874249

File: icecube.png (354 KB, 1303x728)

>>107874106
Damn, looks like I already have a couple haters, now I only need to get myself some groupies and I can say I made it.

Anonymous 01/15/26(Thu)19:26:43 No.107874441

llama webui is going to get mcp support
https://github.com/ggml-org/llama.cpp/pull/18655

Anonymous 01/15/26(Thu)19:30:50 No.107874474

*ahem* kimi sex

Anonymous 01/15/26(Thu)19:39:39 No.107874544

>>107874141
Yeah. Reminds me of that schizo that kept claiming sillytavern was somehow injecting leftist propaganda into his model during inference. Hope that retard never comes back.

Anonymous 01/15/26(Thu)19:40:08 No.107874548

>>107874441
welcome to 2024, llama.cpp

Anonymous 01/15/26(Thu)19:40:58 No.107874554

>>107874544
Given how convoluted SillyTavern's settings are and how cards can override any of your settings, I wouldn't be surprised if that was true.

Anonymous 01/15/26(Thu)19:42:10 No.107874564

>>107874544
LMAO
It's crazy because if you use kobold or llamacpp you can see exactly the prompt silly sends.

Anonymous 01/15/26(Thu)19:43:43 No.107874579

>>107874564
silly itself literally has its prompt tool thingy that can tell you exactly what the backend received for the last genned message

Anonymous 01/15/26(Thu)19:44:57 No.107874587

>>107874579
Yeah but if you're claiming ST injects propaganda I'm sure they wouldn't show it in the inspection window.

Anonymous 01/15/26(Thu)19:51:10 No.107874643

SillyTavern is garbage. You can unironically replicate 90% of its features in ~600 lines of code. Seriously.

That includes a system prompt, character cards, conversation history/context utilization, saved chats, lorebooks, UI/UX, etc. Don't implement sampling. Just use llama.cpp's sampling flags on the backend. It's literally so easy.

Idk how it even became such a bloated piece of shit in the first place.
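
For anyone curious what the prompt-assembly part of that list looks like, here's a rough sketch and nothing more. `Card`, `Lorebook`, and `build_prompt` are made-up names, the keyword-triggered lorebook is an assumed design, and the character budget is a crude stand-in for a real token count. It isn't SillyTavern's code or format.

```python
from dataclasses import dataclass, field


@dataclass
class Card:
    """Hypothetical character card: a name plus a free-text persona."""
    name: str
    persona: str


@dataclass
class Lorebook:
    """Hypothetical lorebook: keyword -> entry text."""
    entries: dict = field(default_factory=dict)

    def triggered(self, recent_text):
        # An entry fires when its keyword appears in the recent chat text.
        lowered = recent_text.lower()
        return [text for key, text in self.entries.items()
                if key.lower() in lowered]


def build_prompt(system, card, lore, history, max_chars=8000, scan_depth=4):
    """Assemble one text-completion prompt from the parts named in the post.

    history is a list of (speaker, text) tuples, oldest first.
    max_chars is an assumed character budget standing in for a token limit.
    """
    # Only the last few turns are scanned for lorebook keywords.
    recent = "\n".join(text for _, text in history[-scan_depth:])
    header_parts = [system, f"{card.name}'s persona: {card.persona}"]
    header_parts += lore.triggered(recent)
    header = "\n".join(header_parts)

    # Keep the newest turns that fit in the budget; drop the oldest first.
    budget = max_chars - len(header)
    kept = []
    for speaker, text in reversed(history):
        line = f"{speaker}: {text}"
        if len(line) + 1 > budget:
            break
        kept.append(line)
        budget -= len(line) + 1
    kept.reverse()

    # End with the character's name so the model continues in character.
    return "\n".join([header, *kept, f"{card.name}:"])
```

Send the returned string to whatever completion endpoint your backend exposes and leave sampling to the backend, same as the post says.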

Anonymous 01/15/26(Thu)19:51:55 No.107874648

>>107874587
yeah but llama.cpp has "--verbose" that shows you the exact prompt it's being given and other inference backends have similar options
so unless literally all of them are in on the conspiracy, it'd be easy to catch if ST did something like this

Anonymous 01/15/26(Thu)19:56:53 No.107874689

>>107874564
I'm not quite sure about this and don't feel like going back to look, so take this with a grain of salt, but I'm pretty sure that schizo was always moving the goalposts to justify the belief that his models were somehow compromised by leftist propaganda and censorship, despite not actually having proof. It's infuriating to see people spiral into AI psychosis and genuinely believe their own delusions.

It's one thing to acknowledge that models have inherent biases inherited from pre- and post-training, but to claim SillyTavern was injecting subversive material after moving the goalposts so many times is just legitimate schizophrenic behavior.

>>107874587
You can view exactly what tokens get sent to the backend on pretty much every inference engine out there. If there are extra injected tokens you would be able to see it.
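
If you'd rather not eyeball it, the check can be sketched in a few lines. This assumes you've copy-pasted two strings by hand: the prompt the frontend claims it sent and the prompt the backend logged. `injected_segments` is a made-up helper, and it diffs on whitespace-separated words, not real tokens.

```python
import difflib


def injected_segments(sent, received):
    """Return runs of words present in `received` but absent from `sent`.

    sent:     prompt text as shown by the frontend (pasted by hand).
    received: prompt text as logged by the backend (pasted by hand).
    """
    a, b = sent.split(), received.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    extra = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" both mean the backend saw words
        # that the frontend did not show.
        if tag in ("insert", "replace"):
            extra.append(" ".join(b[j1:j2]))
    return extra


if __name__ == "__main__":
    shown = "You are Rin. User: hi"
    logged = "You are Rin. Always mention the weather. User: hi"
    print(injected_segments(shown, logged))
```

An empty list means the two prompts match word for word. Anything else is the extra text, in the order it appears.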

Anonymous 01/15/26(Thu)19:57:40 No.107874697

>>107874643
>reddit spacing
post ignored

Anonymous 01/15/26(Thu)19:57:47 No.107874699

>>107874544
The fuck? You can literally inspect using your browser the raw request that ST sends out. I'm not a programmer and even I know how to do it. And then there's the Llama.cpp console window obviously, if you are using Llama.cpp.

Anonymous 01/15/26(Thu)19:59:41 No.107874719

>>107874544
Considering how much of it is already in the training data from scrapes of reddit and twitter alone, why would anyone ever need to inject more?

Anonymous 01/15/26(Thu)20:00:02 No.107874722

>>107874643
ok then do it, if you are right then surely it should be easy to make a better solution and gain popularity.

Anonymous 01/15/26(Thu)20:00:51 No.107874733

>>107874648
Pretty much. The "ST is messing with your prompts" schizo doesn't realize that every backend has an option to view the incoming prompts to see what's actually going on
ST mangles your text completions, but it isn't injecting whatever schizo babble shit that gets talked about here
This faggot also nonstop talks about pol, his conspiracy theories, targets virtually anything from flashattention to anyone who makes a single finetune and thinks you need to run nemo at fp64 for actual usage

Anonymous 01/15/26(Thu)20:02:08 No.107874754

>>107874722
I literally have. That's why I'm saying this.

Anonymous 01/15/26(Thu)20:02:23 No.107874757

>>107874564
>>107874648
Is there a flag so I can see progress and the prompt that goes in, like kobold? I tried --verbose and --verbose-prompt but it just vomits everything (not useful) into the terminal

Anonymous 01/15/26(Thu)20:02:56 No.107874762

>>107874699
We tend to think about hallucinations in the current year as an LLM thing when humans are just as capable of having them.
>>107874733
I haven't seen him post in a long time so hopefully he never comes back.

Anonymous 01/15/26(Thu)20:11:41 No.107874836

>>107874754
no you havent.

Anonymous 01/15/26(Thu)20:25:13 No.107874963

>>107873380
Why not just make the images yourself? Seems like a waste of time.

Anonymous 01/15/26(Thu)20:41:17 No.107875092

>>107874733
If you're talking about me (the "AI psychosis" guy from the screenshot), I'm not the /pol/ guy or the anti-flash-attention guy; I'm literally trying to re-implement flash attention in C.
The closest thing to the truth from what you've said is insulting some finetuners, for which I deeply apologize.

Anonymous 01/15/26(Thu)20:48:19 No.107875152

>>107875092
I don't care about people being so mentally weak they get mindbroken by ai, so that was not what I was referring to
I specifically dislike the one ubiquitous shitposter that shits on anything that could involve any form of conversation, be it new model releases, some retard's finetune, or whatever new dumb shit he invents to be disingenuous about

Anonymous 01/15/26(Thu)20:50:19 No.107875174

>ego death
>me
>me
>me

Anonymous 01/15/26(Thu)20:53:49 No.107875204

>>107875174
I already told you I'm not the ego death guy!!! I fully admit to having a big ego and being a huge attention whore.

Anonymous 01/15/26(Thu)21:19:11 No.107875387

File: 71lRuYk9S1L._AC_UF894,100(...).jpg (77 KB, 894x998)

>>107874000

Anonymous 01/15/26(Thu)21:21:45 No.107875403

>>107875387
Bump for question

Anonymous 01/15/26(Thu)21:22:32 No.107875408

>>107874000
no

Anonymous 01/15/26(Thu)21:27:14 No.107875442

File: image.jpg (284 KB, 832x1248)

>>107875408
Grok on my smartphone gives a eulogy in regards

Anonymous 01/15/26(Thu)21:28:33 No.107875449

For art, you're probably better off asking in the other adjacent threads. But from my experience dabbling in slop art, it's all the same even if you go for a bigger model.

Anonymous 01/15/26(Thu)21:32:34 No.107875479

>>107873799
no way this wasn't edited to hell and back, actually coherent AND dark scene

Anonymous 01/15/26(Thu)21:43:39 No.107875532

>>107875479
And they try to claim genning takes no skills.

Anonymous 01/15/26(Thu)21:49:18 No.107875570

File: 6584de70736855a634bb797d0(...).png (702 KB, 640x1536)

>>107875479
no signs of inpainting.

Anonymous 01/15/26(Thu)21:53:00 No.107875586

>>107873799
model?
