/g/ - Technology

File: 1758010686171721.jpg (1.53 MB, 1298x2116)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107847320 & >>107838898

►News
>(01/08) Jamba2 3B and Mini (52B-A12B) released: https://ai21.com/blog/introducing-jamba2
>(01/05) OpenPangu-R-72B-2512 (74B-A15B) released: https://hf.co/FreedomIntelligence/openPangu-R-72B-2512
>(01/05) Nemotron Speech ASR released: https://hf.co/blog/nvidia/nemotron-speech-asr-scaling-voice-agents
>(01/04) llama.cpp merged "sampling: add support for backend sampling" (#17004): https://github.com/ggml-org/llama.cpp/pull/17004
>(12/31) HyperCLOVA X SEED 8B Omni released: https://hf.co/naver-hyperclovax/HyperCLOVAX-SEED-Omni-8B

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
►Recent Highlights from the Previous Thread: >>107847320

--AI companion project updates and TTS technology debates:
>107850660 >107850672 >107850928 >107850947 >107853430 >107853443 >107853483 >107853554 >107853511 >107853575 >107853593 >107853616 >107853690 >107854122 >107854331 >107850983 >107851185 >107851216 >107851258 >107851258 >107853585
--Ouro-2.6B with claimed 12B performance and Scaling Latent Reasoning via Looped Language Models:
>107850350 >107850413 >107850458 >107850486 >107850501 >107850519
--Optimizing client-side beam search with logprobs and batching challenges:
>107848617 >107850562 >107853006 >107853060 >107853196 >107853678
--Tools and methods for converting EPUB books to audiobooks:
>107848869 >107848873 >107848915 >107849013 >107849252
--LLM roleplaying tool UI comparisons: RisuAI vs Talemate tradeoffs:
>107847698 >107847879
--Model-GPU compatibility and quantization tradeoffs for consumer hardware:
>107855431 >107855475 >107855507 >107855527 >107855540 >107855562 >107855603 >107855650 >107855676 >107855692
--Conflicting mmap behaviors in model loading across Windows/Linux platforms:
>107849640 >107849759 >107849779 >107849831 >107849870 >107849925 >107849966 >107849976 >107849992
--Multi-model debate system feasibility in chat environments:
>107850478 >107851241 >107851246 >107851397 >107851424 >107851506
--Llama.cpp integration via three-vrm example code:
>107851325
--GPU options for LLMs: Ampere+ requirements vs budget constraints:
>107851370 >107851398 >107851413 >107854581 >107851423 >107851429 >107851444 >107851450 >107851463 >107851469 >107851817
--P40 limitations for modern LLM frameworks despite llama.cpp support:
>107851776 >107851795 >107851803 >107852059
--GLM-Image release:
>107856290
--Miku and Teto (free space):
>107847978 >107849177 >107849273 >107850396 >107851664 >107852635 >107852843 >107853561

►Recent Highlight Posts from the Previous Thread: >>107847322

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
S-so it's really here?
autoregressive imagegen for local? I'm not dreaming?
>>
>>107856463
Worse than zit despite being newer and bigger, basically garbage
https://huggingface.co/zai-org/GLM-Image/discussions/1
>>
File: Capture.png (6 KB, 968x64)
>>107848869
>>107849013
>>107849252
ended up with a kokoro-tts docker image and a chatgpt script to automate the process. werks well, sounds good.
im running the model on runpod since i have some credit on there. rtx 4000 and 5080 both seem to give around 3.5 sec/chunk. gonna try it locally on my 3060.
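if anyone wants to replicate, here's a rough sketch of that kind of pipeline, not my actual script. it assumes the kokoro-tts container exposes an OpenAI-compatible /v1/audio/speech endpoint on localhost:8880; the endpoint, model name and voice id are placeholders you'd adjust for your server.
[code]
# EPUB -> audiobook sketch against a kokoro-tts container.
# ASSUMPTIONS: OpenAI-compatible /v1/audio/speech endpoint, "kokoro" model
# name and "af_heart" voice are placeholders, not confirmed by the thread.
import requests
from bs4 import BeautifulSoup
from ebooklib import epub, ITEM_DOCUMENT

TTS_URL = "http://localhost:8880/v1/audio/speech"  # assumed endpoint
CHUNK_CHARS = 2000  # keep each TTS request reasonably small

def epub_to_text(path: str) -> str:
    """Concatenate the plain text of every document item in the EPUB."""
    book = epub.read_epub(path)
    parts = []
    for item in book.get_items_of_type(ITEM_DOCUMENT):
        soup = BeautifulSoup(item.get_content(), "html.parser")
        parts.append(soup.get_text(separator=" ", strip=True))
    return "\n".join(parts)

def chunks(text: str, size: int):
    """Split on sentence-ish boundaries so chunks stay under `size` chars."""
    buf = ""
    for sentence in text.replace("\n", " ").split(". "):
        if buf and len(buf) + len(sentence) > size:
            yield buf
            buf = ""
        buf += sentence + ". "
    if buf.strip():
        yield buf

def synthesize(path: str, out_prefix: str = "audiobook"):
    for i, piece in enumerate(chunks(epub_to_text(path), CHUNK_CHARS)):
        # payload mirrors the OpenAI speech API shape; adjust for your server
        resp = requests.post(TTS_URL, json={
            "model": "kokoro",       # assumed model name
            "voice": "af_heart",     # assumed voice id
            "input": piece,
            "response_format": "mp3",
        }, timeout=300)
        resp.raise_for_status()
        with open(f"{out_prefix}_{i:04d}.mp3", "wb") as f:
            f.write(resp.content)

if __name__ == "__main__":
    synthesize("book.epub")
[/code]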
>>
>>107856493
lmao at the shills seething in the comments
novelai must be planning to adopt that model in the future with how they're panicking about getting exposed
>>
>>107856717
You're the *** shill, how would they have plans in the past to adopt an unknown model? Or are you suggesting Z.ai bought them out?
>>
Hey guys! Is this the place where I ask about NSFW Roleplay AI or do I need to move to another general? Looking for a place to like, learn the AI goober things really...
>>
>>107856809
Buy an ad.
>>
>>107856809
this place is as good as any other.
>>
>ouro-2.6B
verdict?
>>
>>107856424
Newfag here. How good are local models now? If I had a mountain of GPUs, how close to ChatGPT could I get?
>>
File: 1767551307991286.png (1.37 MB, 2175x1234)
>>107856717
From a practical standpoint it should be smaller or better; there's no point in making it bigger and worse. That is embarrassing. But if you think about it for a second, you'll understand why it's a good thing to do something new and still release it even if it turns out to be shit. That's how progress is made.
>>
>>107856886
>1.4MB
>>
File: file.png (365 KB, 2053x1322)
>copy the instructions from the readme
>it doesn't work
>use the docker container
>it doesn't work
Thanks for nothing.
>>
>>107856809
this general is technical discussion and asshats
/aicg/ is casual roleplay discussion and mouthbreathers
take your pick! oh, and I forgot that a general still exists on /vg/. That one is where a lot of this started, but last I checked it was kind of a weird place to hang out. in this general, actual groundbreaking SOTA methods like CoT have been explored by an anon who was later cited in a paper, I think kalomaze was his name but some anon can correct me. that was the principle behind reasoning, which became a big part of 2025 LLM releases. and we have a resident llama.cpp dev, and he's a pretty cool guy. but if you're less interested in running models than you are in roleplaying with them, /aicg/ might be more your speed. i hope you have a fun and warm 4chan experience, newfriend :)
>>
>>107857025
Model?
>>
>>107857025
>CoT
That was /aids/ or /aidg/, back when it was GPT-3. Kalomaze made min p.
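for the newfriend: min p just keeps tokens whose probability is at least min_p times the top token's probability, then renormalizes. rough sketch, not kalomaze's actual implementation:
[code]
# min p sampling sketch (illustrative, not the llama.cpp implementation)
import numpy as np

def min_p_sample(logits: np.ndarray, min_p: float = 0.05, rng=np.random) -> int:
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    threshold = min_p * probs.max()   # cutoff scales with the top probability
    probs[probs < threshold] = 0.0    # drop the unlikely tail
    probs /= probs.sum()              # renormalize the survivors
    return int(rng.choice(len(probs), p=probs))
[/code]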
>>
>>107856983



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.