/lmg/ - a general dedicated to the discussion and development of local language models.
Previous threads: >>109101986 & >>109098000
►News
>(06/16) GLM 5.2 released with IndexCache and 1M context: https://z.ai/blog/glm-5.2
>(06/16) VibeThinker-3B released: https://hf.co/WeiboAI/VibeThinker-3B
>(06/12) MiniMax-M3 released, multimodal 428B-A23B with 1M context: https://hf.co/MiniMaxAI/MiniMax-M3
>(06/12) Kimi K2.7 Code released: https://hf.co/moonshotai/Kimi-K2.7-Code
>(06/12) EAGLE3 speculative decoding support merged: https://github.com/ggml-org/llama.cpp/pull/18039
►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png
►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide
►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers
►Benchmarks
LiveBench: https://livebench.ai
Programming: https://swe-rebench.com
Agentic Coding: https://deepswe.datacurve.ai
Context Length: https://github.com/RecapAnon/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference
►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second
►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>109101986
--Recommending models for RTX 4050 and discussing Gemma 4 depurpling:
>109104682 >109104688 >109104707 >109104756 >109104803 >109104834 >109104929 >109104838 >109104849 >109104871 >109104809 >109104856 >109104867
--Feasibility and bottlenecks of pooling VRAM via RPC over gigabit networks:
>109106828 >109106858 >109106872 >109106891
--Using 2.5 mproj to give K2-Thinking vision capabilities:
>109103511 >109104588 >109105603
--DeepSeek-V4-Flash-Base GGUF reports and architecture naming issues:
>109104143 >109104818 >109104965
--MTP speed benchmarks for Gemma-4 using Vulkan on RX6700XT:
>109102307 >109103451 >109107003 >109108056
--Optimizing settings for Gemma-4 models on low-VRAM hardware:
>109102301 >109102361 >109102385 >109102398 >109102405 >109102402 >109102429 >109102434
--Gemma 4's tendency toward robotic prose with long system prompts:
>109103211 >109103223 >109103241 >109103258 >109103266 >109103382 >109103689
--Desire for smaller zai models and Gemma-4-12B performance:
>109106452 >109106505 >109106547 >109106569 >109106628 >109106654 >109106674 >109106723 >109107585
--Comparing Fable 5 to OSS and discussing Anthropic's ID verification:
>109103890 >109103940 >109104006 >109103944 >109104032 >109104082 >109104391 >109106627 >109105684
--Discussing high-quality MoE models and MoE vs dense architectures:
>109104895 >109104901 >109104918 >109104962 >109105038 >109105053 >109105093 >109104992 >109105095 >109105118 >109105388 >109105428 >109106706 >109106766
--Viability of running llama.cpp across mixed Metal and ROCm devices:
>109106205 >109106300 >109106340 >109106329 >109106347
--Discussing a neural network that converts images into playable games:
>109107514 >109107570
--Logs:
>109103511 >109104424 >109104803 >109104809 >109106627
--Miku (free space):
>109103689
►Recent Highlight Posts from the Previous Thread: >>109101988
Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
gemmaballz
>no DSv4 yet
>no M3 sparse attention yet
>no one looking at PRs
more like LAMEa.cpp amirite
so what comes after Mythos?
a new super model capable of what exactly?
and after that? if this is an eternal race, what are the future capabilities? programming languages are limited and all the security holes should get filled at some point.
then what does the model do? and how does it improve? to do WHAT? invent new programming languages so it can hack it and then it can create the shields for whatever it invented?
i don't fully get it
>>109108395
>so what comes after Mythos?
Thread
>>109108395
>so what comes after Mythos?
legends
anyone using gemma31 for translations, especially long ones (5000-10000+)
is it good?
anons, when some of you say you're using multiple agents, do you mean:
- sequentially, basically every iteration checking the one before for anything wrong
- at the same time?
Relative noob here, just perfected my SillyTavern frontend.
What CLI do you guys use for your Gemmy? Gemini is telling me to use Aider.
>>109108422
pi.dev, then whatever plugins you like. only one I've been using is pi-fff with the override for better find and grep
>>109108388
good thing forks exist and you can literally use them right now before waiting months for those fags to implement it
>>109108422
opencode is good
70b dense
>>109108472
>2 t/s is slow
>waits 5 hours for a (you)
>>109108388
Using the PR, I'm liking how DS v4 lite writes its in-character thinking and story completion, but I can't stand waiting 10 seconds for each story continuation to begin in mikupad, even without changing any tokens in the prompt after the previous generation. Sucks to be poor running GPU+CPU.
>>109108346
>currybook
cringe
>>109108531
So far I'm liking it at high temp for rp/stories, and I like how much more efficient and nicer the thinking is compared to some other models.
loli feet
405b dense
>>109108449
But anon, I use AMD.
>Gemma-4-125B-IT
>Still can't do tool calls successfully
Who with a big rig is using a q4+ quant of glm5.2? Is it worthwhile vs k2.7-code for code, planning and logic work?
I'm running low on disk space to be quanting yet another model if it isn't a pretty significant jump.
I am requesting the Ace song guy to train on Tupac, we are long overdue for a Hit Em Up part 2 and general Tupac revival.