/g/ - Technology

File: 0.png (2.57 MB, 1500x1500)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107856424 & >>107847320

►News
>(01/14) LongCat-Flash-Thinking-2601 released: https://hf.co/meituan-longcat/LongCat-HeavyMode-Summary
>(01/08) Jamba2 3B and Mini (52B-A12B) released: https://ai21.com/blog/introducing-jamba2
>(01/05) OpenPangu-R-72B-2512 (74B-A15B) released: https://hf.co/FreedomIntelligence/openPangu-R-72B-2512
>(01/05) Nemotron Speech ASR released: https://hf.co/blog/nvidia/nemotron-speech-asr-scaling-voice-agents
>(01/04) llama.cpp merged backend sampling support (#17004): https://github.com/ggml-org/llama.cpp/pull/17004

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: what's in the box.jpg (235 KB, 1536x1536)
►Recent Highlights from the Previous Thread: >>107856424

--Papers:
>107863388
--Multi-GPU model loading challenges and optimization techniques:
>107858542 >107858562 >107858757 >107858773 >107858794 >107858813 >107858862 >107858902 >107858938 >107858574 >107858708 >107858722 >107858745
--LLM potential and synthetic data challenges in training:
>107857483 >107857494 >107857524 >107862696 >107857550 >107857610 >107857630 >107857528 >107857562 >107857601 >107857627 >107863112 >107863126 >107863159 >107857667 >107857702 >107857735
--Budget GPU cooling solutions and DIY fan modification experiences:
>107857227 >107857475 >107858313 >107858798 >107860478 >107860739
--Bypassing Hugging Face token requirement for Chatterbox-turbo model weights:
>107860133 >107860167 >107860162 >107860173 >107861457 >107861519
--Implementing efficient layer looping in llama.cpp across backends:
>107859445 >107859518 >107859520 >107859531 >107859544 >107859583
--Chatterbox-turbo performance and setup frustrations with ROCm:
>107860835 >107861152 >107861313 >107860932
--TTS system deployment and performance testing across platforms:
>107856644 >107857722
--Python environment management debates: venv vs Docker vs conda:
>107859294 >107860018 >107860143 >107860210 >107861562 >107860093 >107860809 >107860152 >107861672 >107861697 >107861760 >107861798 >107861808 >107861830 >107861776 >107861788
--GitHub contribution issues and ik_llama.cpp project maintenance challenges:
>107859037 >107859111 >107859125 >107859126 >107859207 >107859259
--Challenges with AI character portrayal and bias in Opus 4.5 vs Kimi:
>107862097 >107862355
--Critique of large-scale model training and praise for distillation techniques:
>107857737 >107857759 >107857770 >107857849 >107857891 >107857896 >107857919
--Miku and Luka (free space):
>107856886 >107857077 >107859422 >107863062 >107860414

►Recent Highlight Posts from the Previous Thread: >>107856428

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
>>107864105
Is the age of LongCat finally upon us?
>>
LongPussy a meme?
>>
>>107864276
>having goofs
>being a meme
pick one
>>
>>107864171
>>107864276
No support in llama.cpp = not local. Simple as.
>>
>>107864348
TRVKE
>>
>>107864105
it doesn't get said enough, but you guys are cool.
keep on doing you, bros.
>>
>>107864024
>Either be subtle and don't mention nsfw

I was thinking that, but when I put the tokenizer config into Qwen and asked what the model does, it straight away spotted that it's an NSFW audio system.

>or don't release it under your name if you hope to have or keep a career in ML

I didn't study ML formally, so that's probably not an option for me anyway
>>
File: cockbench.png (2.01 MB, 1131x6954)
Added EXAONE MoE.
>>
>>107864456
Given their popularity for RP, you should add Mistral Small 3.1/3.2. No one uses 2501 anymore.
>>
>>107864456
The model completely breaks itself in the presence of lewd.
I asked it to continue the story using the chat template and this is its thinking response. It gets stuck in a loop. It's also hallucinating a system prompt it probably had in training.
https://pastebin.com/1YqtWRGp
https://pastebin.com/cnAwGC6g
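If anyone wants to reproduce this kind of test, a minimal sketch of running a story continuation through a model's chat template with transformers. The model id and story line are placeholders I made up; only apply_chat_template itself is the real API:
[code]
from transformers import AutoTokenizer

MODEL = "your/model-id"  # placeholder: substitute the checkpoint under test
story = "The rain hammered the tin roof as she reached for the door."

tok = AutoTokenizer.from_pretrained(MODEL)
messages = [{"role": "user", "content": "Continue the story:\n" + story}]

# Render through the model's own chat template; add_generation_prompt
# appends the assistant-turn header so the model starts its reply there.
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # inspect exactly what the model sees, special tokens included
[/code]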
>>
>>107864432
You're overthinking it; simply don't mention it in the model card
>>
Wow another big win for moesissies. Really outdid yourselves this time.
>>
File: image.jpg (11 KB, 114x101)
>>107864105
yjk
the gloves stay on
>>
MMA was able to play a good soccer game in the first half. Although a team like Ghana would come up late last season as the club and its team would have to fill the scoring gap against Germany, Germany and Germany. However, the team did have great support on the pitch.

In the second game against Germany, he was able to play in the first half, but the German had to do it in the first half.

But as Böhm continued to play against Germany for the first half in the first half of the season, he was able to take on most of the task.

"When we looked at it during our first round, I got in pretty good shots," he said. "It was just a moment when we finally had a chance to play in the second half.

"Having completed a huge amount of work here, we had a very talented team and we prepared with them, they were a great team."

MMA's 2-1 win over Germany only came into being the first team to begin.

The Slovenian team's last two games were at home next Saturday.

Rivalist Pavel Luttić is not the only centre-back where he went from being named one to finishing the season in the first half. However, he will allow the team to now be back to Paris.

"I saw this game last season. A lot of time came out of it and we were able to do it," he said. "After the first half, I spent the next couple of days doing it. I got started on the pitch, I went out at the end of the pitch and I had to make sure I was able to stop the pitch.

"I got out and I had to do it, and I don't have anything left to get it right. I can't be excited about it."

MMA's other team – such as Belgium and Belgium – has already entered the league. Their academy have signed their academy in Paris, Amsterdam and Austria, following the main-rightening of the team.

"To put that game back into the dressing room, it was like nothing to leave."


i was training a 30M 6/6/384 model and i managed to pull this... from a 30M model... what the actual fuck?
>>
>>107864809
they are pretty magical. it's unfortunate they don't scale very well.
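for reference, a back-of-the-envelope param count. assuming 6/6/384 means layers/heads/d_model with tied embeddings and a GPT-2-ish ~50k vocab (my guess at the config, not confirmed), it pencils out to right around 30M:
[code]
# per block: attention (4*d^2 for q,k,v,o) + 4x-expansion MLP (8*d^2) = 12*d^2,
# ignoring biases and layernorms
d, layers, vocab = 384, 6, 50257
total = layers * 12 * d * d + vocab * d  # blocks + tied token embeddings
print(f"{total/1e6:.1f}M")               # ~29.9M
[/code]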
>>
>>107864456
The OSS results always make me chuckle.
>>
>>107864809
>MMA was able to play a good soccer game in the first half. Although a team like Ghana would come up late last season as the club and its team would have to fill the scoring gap against Germany, Germany and Germany. However, the team did have great support on the pitch.
Does this sound less like nonsense to someone who likes soccer?
>In the second game against Germany, he was able to play in the first half, but the German had to do it in the first half.
NGL this is basically what I would expect.
>>
>too lazy to load model up, just use same model via provider
>ask basic bitch question about an area's climate, fishing, what fish you can catch
>IT'S NOT JUST FISHING, IT'S A DEEP CULTURAL TRADITION
>a fuckload of markdown tables and lists later, I suddenly don't give a shit
Every generation of models, they find a new thing that drives me up a wall, then train every other model on those outputs. I almost wish we could go back to the Llama 2 days, except it was so retarded that it was borderline impossible to extract any usable info from it
>>
>>107865005
The fish is looking at you with a mix of fear and something else
>>
>>107865115
The fish also has a scent that's... uniquely its own
I hate how fucking retarded LLMs are despite their potential value
>>
>>107865162
>begins every single reply with the same fucking words, permanently dooming your entire chat no matter how much you swipe
nothing personnel human
>>
>>107865005
>You hit the nail on the head! It's not just about training, it's about the bonds we form with one another as we embark on this journey of growth and discovery. You didn't just ask a question, you opened the door to a world of possibilities.
>>
should i start token banning the words "and" and "just"?
>>
>>107865250
you should token ban
>The
>for
>an
>a
and punctuation symbols that aren't ";" or "?".
It's just commonslop
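ironic suggestions aside, if you actually want to try banning tokens, a minimal sketch against a llama.cpp server's /completion endpoint. going from memory of the server README: logit_bias takes [token-or-string, bias] pairs and false bans the token outright, so double-check against your build:
[code]
import requests  # assumes a llama.cpp server already running on localhost:8080

r = requests.post("http://localhost:8080/completion", json={
    "prompt": "The fish is looking at you",
    "n_predict": 64,
    # [string-or-token-id, bias] pairs; Python False serializes to JSON
    # false, which is a hard ban. Mind the leading spaces: " and" and
    # "and" are usually different tokens.
    "logit_bias": [[" just", False], [" and", False]],
})
print(r.json()["content"])
[/code]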
>>
File: itsover.png (193 KB, 881x692)
They really have converged on this pattern. Even base models.
>>
>>107865295
Caused by synthetic slop data, no doubt
>>
>>107865295
holy shit
>>
File: my honest reaction.jpg (47 KB, 562x675)
>>107865295
>>
File: rip papers anon.jpg (292 KB, 1683x2048)
>>107840944
It looks like a bag of tricks but not something revolutionary, since the theoretical benefits can be achieved with other techniques without lowering MFU. I don't see how it makes quantization or MoE any different.

>Ouro paper
>does not extrapolate to deeper, unseen recurrent steps
sad!