/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107847320 & >>107838898

►News
>(01/08) Jamba2 3B and Mini (52B-A12B) released: https://ai21.com/blog/introducing-jamba2
>(01/05) OpenPangu-R-72B-2512 (74B-A15B) released: https://hf.co/FreedomIntelligence/openPangu-R-72B-2512
>(01/05) Nemotron Speech ASR released: https://hf.co/blog/nvidia/nemotron-speech-asr-scaling-voice-agents
>(01/04) merged sampling : add support for backend sampling (#17004): https://github.com/ggml-org/llama.cpp/pull/17004
>(12/31) HyperCLOVA X SEED 8B Omni released: https://hf.co/naver-hyperclovax/HyperCLOVAX-SEED-Omni-8B

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>107847320

--AI companion project updates and TTS technology debates:
>107850660 >107850672 >107850928 >107850947 >107853430 >107853443 >107853483 >107853554 >107853511 >107853575 >107853593 >107853616 >107853690 >107854122 >107854331 >107850983 >107851185 >107851216 >107851258 >107853585
--Ouro-2.6B with claimed 12B performance and Scaling Latent Reasoning via Looped Language Models:
>107850350 >107850413 >107850458 >107850486 >107850501 >107850519
--Optimizing client-side beam search with logprobs and batching challenges:
>107848617 >107850562 >107853006 >107853060 >107853196 >107853678
--Tools and methods for converting EPUB books to audiobooks:
>107848869 >107848873 >107848915 >107849013 >107849252
--LLM roleplaying tool UI comparisons: RisuAI vs Talemate tradeoffs:
>107847698 >107847879
--Model-GPU compatibility and quantization tradeoffs for consumer hardware:
>107855431 >107855475 >107855507 >107855527 >107855540 >107855562 >107855603 >107855650 >107855676 >107855692
--Conflicting mmap behaviors in model loading across Windows/Linux platforms:
>107849640 >107849759 >107849779 >107849831 >107849870 >107849925 >107849966 >107849976 >107849992
--Multi-model debate system feasibility in chat environments:
>107850478 >107851241 >107851246 >107851397 >107851424 >107851506
--Llama.cpp integration via three-vrm example code:
>107851325
--GPU options for LLMs: Ampere+ requirements vs budget constraints:
>107851370 >107851398 >107851413 >107854581 >107851423 >107851429 >107851444 >107851450 >107851463 >107851469 >107851817
--P40 limitations for modern LLM frameworks despite llama.cpp support:
>107851776 >107851795 >107851803 >107852059
--GLM-Image release:
>107856290
--Miku and Teto (free space):
>107847978 >107849177 >107849273 >107850396 >107851664 >107852635 >107852843 >107853561

►Recent Highlight Posts from the Previous Thread: >>107847322

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
S-so it's really here?
autoregressive imagegen for local? I'm not dreaming?
>>107856463
Worse than zit despite being newer and bigger, basically garbage
https://huggingface.co/zai-org/GLM-Image/discussions/1
>>107848869
>>107849013
>>107849252
ended up with the kokoro-tts docker image and a chatgpt script to automate the process. werks well, sounds good.
im running the model on runpod since i have some credit on there. rtx 4000 and 5080 both seem to give around 3.5 sec/chunk. gonna try it locally on my 3060.
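For anyone wanting to script the same pipeline: the per-chunk timings above imply the book text gets split into TTS-sized pieces before synthesis. A minimal stdlib-only sketch of that chunking step, assuming sentence-boundary splitting; `chunk_text` and `max_chars` are illustrative names, not part of kokoro-tts's actual API:

```python
import re

def chunk_text(text, max_chars=400):
    """Pack sentences into chunks of at most max_chars characters,
    splitting at sentence boundaries so the TTS model never gets a
    clause cut mid-sentence. A single sentence longer than max_chars
    still becomes its own (oversized) chunk."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    chunks, current = [], ""
    for s in sentences:
        # Flush the current chunk if adding this sentence would overflow it.
        if current and len(current) + len(s) + 1 > max_chars:
            chunks.append(current)
            current = s
        else:
            current = f"{current} {s}".strip()
    if current:
        chunks.append(current)
    return chunks

# Each chunk then goes to the TTS backend one call at a time.
pieces = chunk_text("First sentence. Second one! Third? " * 30, max_chars=120)
```

Extracting the plain text out of the EPUB (it's just zipped XHTML) and concatenating the resulting audio files is the rest of the job.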
>>107856493
lmao at the shills seething in the comments
novelai must be planning to adopt that model in the future with how they're panicking about getting exposed
>>107856717
You're the *** shill, how would they have plans in the past to adopt an unknown model? Or are you suggesting Z.ai bought them out?
Hey guys! Is this the place where I ask about NSFW Roleplay AI or do I need to move to another general? Looking for a place to like, learn the AI goober things really...
>>107856809
Buy an ad.
>>107856809
this place is as good as any other.
>ouro-2.6B
verdict?
>>107856424
Newfag here. How good are local models now? If I had a mountain of GPUs, how close to ChatGPT could I get?
>>107856717
From a practical standpoint it should be smaller or better, there's no point in making it bigger and worse. That is embarrassing. But if you think about it for a second, you'll understand why it's a good thing to do something new and still release it even if it turns out to be shit. That's how progress is made.
>>107856886
>1.4MB
>copy the instructions from the readme
>it doesn't work
>use the docker container
>it doesn't work
Thanks for nothing.
>>107856809
this general is technical discussion and asshats
/aicg/ is casual roleplay discussion and mouthbreathers
take your pick! oh, and I forgot that a general still exists on /vg/. That one is where a lot of this started, but last I checked it was kind of a weird place to hang out.
in this general, actual groundbreaking SOTA methods like CoT have been explored by an anon who was later cited in a paper, I think kalomaze was his name but some anon can correct me. that was the principle behind reasoning, which became a big part of 2025 LLM releases. and we have a resident llama.cpp dev, and he's a pretty cool guy.
but if you're less interested in running models than you are in roleplaying with them, /aicg/ might be more your speed. i hope you have a fun and warm 4chan experience, newfriend :)
>>107857025
Model?
>>107857025
>CoT
That was /aids/ or /aidg/, back when it was GPT-3. Kalomaze made min-p.
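For newfriends who don't know it: min-p is a sampler that keeps only tokens whose probability is at least some fraction of the top token's probability, then renormalizes. A minimal sketch of the idea, assuming a plain token→probability dict; the function name and toy distribution are illustrative, not any engine's actual API:

```python
def min_p_filter(probs, p_base=0.1):
    """Keep tokens with probability >= p_base * P(top token),
    then renormalize the survivors to sum to 1."""
    threshold = p_base * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}

# With a confident top token, the low-probability tail is cut:
dist = {"the": 0.5, "a": 0.3, "zebra": 0.01}
filtered = min_p_filter(dist, p_base=0.1)  # "zebra" < 0.1 * 0.5, so it is dropped
```

The appeal over top-p/top-k is that the cutoff scales with the model's confidence: a flat distribution keeps many candidates, a peaked one keeps few.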
>>107856983