/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107847320 & >>107838898

►News
>(01/08) Jamba2 3B and Mini (52B-A12B) released: https://ai21.com/blog/introducing-jamba2
>(01/05) OpenPangu-R-72B-2512 (74B-A15B) released: https://hf.co/FreedomIntelligence/openPangu-R-72B-2512
>(01/05) Nemotron Speech ASR released: https://hf.co/blog/nvidia/nemotron-speech-asr-scaling-voice-agents
>(01/04) merged sampling : add support for backend sampling (#17004): https://github.com/ggml-org/llama.cpp/pull/17004
>(12/31) HyperCLOVA X SEED 8B Omni released: https://hf.co/naver-hyperclovax/HyperCLOVAX-SEED-Omni-8B

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>107847320

--AI companion project updates and TTS technology debates:
>107850660 >107850672 >107850928 >107850947 >107853430 >107853443 >107853483 >107853554 >107853511 >107853575 >107853593 >107853616 >107853690 >107854122 >107854331 >107850983 >107851185 >107851216 >107851258 >107853585
--Ouro-2.6B with claimed 12B performance and Scaling Latent Reasoning via Looped Language Models:
>107850350 >107850413 >107850458 >107850486 >107850501 >107850519
--Optimizing client-side beam search with logprobs and batching challenges:
>107848617 >107850562 >107853006 >107853060 >107853196 >107853678
--Tools and methods for converting EPUB books to audiobooks:
>107848869 >107848873 >107848915 >107849013 >107849252
--LLM roleplaying tool UI comparisons: RisuAI vs Talemate tradeoffs:
>107847698 >107847879
--Model-GPU compatibility and quantization tradeoffs for consumer hardware:
>107855431 >107855475 >107855507 >107855527 >107855540 >107855562 >107855603 >107855650 >107855676 >107855692
--Conflicting mmap behaviors in model loading across Windows/Linux platforms:
>107849640 >107849759 >107849779 >107849831 >107849870 >107849925 >107849966 >107849976 >107849992
--Multi-model debate system feasibility in chat environments:
>107850478 >107851241 >107851246 >107851397 >107851424 >107851506
--Llama.cpp integration via three-vrm example code:
>107851325
--GPU options for LLMs: Ampere+ requirements vs budget constraints:
>107851370 >107851398 >107851413 >107854581 >107851423 >107851429 >107851444 >107851450 >107851463 >107851469 >107851817
--P40 limitations for modern LLM frameworks despite llama.cpp support:
>107851776 >107851795 >107851803 >107852059
--GLM-Image release:
>107856290
--Miku and Teto (free space):
>107847978 >107849177 >107849273 >107850396 >107851664 >107852635 >107852843 >107853561

►Recent Highlight Posts from the Previous Thread: >>107847322

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
S-so it's really here?
autoregressive imagegen for local? I'm not dreaming?
>>107856463
Worse than zit despite being newer and bigger, basically garbage
https://huggingface.co/zai-org/GLM-Image/discussions/1
>>107848869
>>107849013
>>107849252
ended up with the kokoro-tts docker image and a chatgpt script to automate the process. werks well, sounds good.
im running the model on runpod since i have some credit on there. rtx 4000 and 5080 both seem to give around 3.5 sec/chunk. gonna try it locally on my 3060.
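For anyone wanting to script the same pipeline: the per-chunk timings above imply the book text gets split into TTS-sized pieces before synthesis. A minimal stdlib-only sketch of that chunking step, assuming sentence-boundary splitting; `chunk_text` and `max_chars` are illustrative names, not part of kokoro-tts's actual API:

```python
import re

def chunk_text(text, max_chars=400):
    """Pack sentences into chunks of at most max_chars characters,
    splitting at sentence boundaries so the TTS model never gets a
    clause cut mid-sentence. A single sentence longer than max_chars
    still becomes its own (oversized) chunk."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    chunks, current = [], ""
    for s in sentences:
        # Flush the current chunk if adding this sentence would overflow it.
        if current and len(current) + len(s) + 1 > max_chars:
            chunks.append(current)
            current = s
        else:
            current = f"{current} {s}".strip()
    if current:
        chunks.append(current)
    return chunks

# Each chunk then goes to the TTS backend one call at a time.
pieces = chunk_text("First sentence. Second one! Third? " * 30, max_chars=120)
```

Extracting the plain text out of the EPUB (it's just zipped XHTML) and concatenating the resulting audio files is the rest of the job.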
>>107856493
lmao at the shills seething in the comments
novelai must be planning to adopt that model in the future with how they're panicking about getting exposed
>>107856717
You're the *** shill, how would they have plans in the past to adopt an unknown model? Or are you suggesting Z.ai bought them out?
Hey guys! Is this the place where I ask about NSFW Roleplay AI or do I need to move to another general? Looking for a place to like, learn the AI goober things really...
>>107856809
Buy an ad.
>>107856809
this place is as good as any other.
>ouro-2.6B
verdict?
>>107856424
Newfag here. How good are local models now? If I had a mountain of GPUs, how close to ChatGPT could I get?
>>107856717
From a practical standpoint it should be smaller or better, there's no point in making it bigger and worse. That is embarrassing. But if you think about it for a second, you'll understand why it's a good thing to do something new and still release it even if it turns out to be shit. That's how progress is made.
>>107856886
>1.4MB
>copy the instructions from the readme
>it doesn't work
>use the docker container
>it doesn't work
Thanks for nothing.
>>107856809
this general is technical discussion and asshats
/aicg/ is casual roleplay discussion and mouthbreathers
take your pick! oh, and I forgot that a general still exists on /vg/. That one is where a lot of this started, but last I checked it was kind of a weird place to hang out.
in this general, actual groundbreaking SOTA methods like CoT have been explored by an anon who was later cited in a paper, I think kalomaze was his name but some anon can correct me. that was the principle behind reasoning, which became a big part of 2025 LLM releases. and we have a resident llama.cpp dev, and he's a pretty cool guy.
but if you're less interested in running models than you are in roleplaying with them, /aicg/ might be more your speed. i hope you have a fun and warm 4chan experience, newfriend :)
>>107857025
Model?
>>107857025
>CoT
That was /aids/ or /aidg/, back when it was GPT-3. Kalomaze made min-p.
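For newfriends who don't know it: min-p is a sampler that keeps only tokens whose probability is at least some fraction of the top token's probability, then renormalizes. A minimal sketch of the idea, assuming a plain token→probability dict; the function name and toy distribution are illustrative, not any engine's actual API:

```python
def min_p_filter(probs, p_base=0.1):
    """Keep tokens with probability >= p_base * P(top token),
    then renormalize the survivors to sum to 1."""
    threshold = p_base * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}

# With a confident top token, the low-probability tail is cut:
dist = {"the": 0.5, "a": 0.3, "zebra": 0.01}
filtered = min_p_filter(dist, p_base=0.1)  # "zebra" < 0.1 * 0.5, so it is dropped
```

The appeal over top-p/top-k is that the cutoff scales with the model's confidence: a flat distribution keeps many candidates, a peaked one keeps few.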
>>107856983