/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107668478 & >>107660171

►News
>(12/26) MiniMax-M2.1 released: https://www.minimax.io/news/minimax-m21
>(12/22) GLM-4.7: Advancing the Coding Capability: https://z.ai/blog/glm-4.7
>(12/17) Introducing Meta Segment Anything Model Audio: https://ai.meta.com/samaudio
>(12/16) MiMo-V2-Flash 309B-A15B released: https://mimo.xiaomi.com/blog/mimo-v2-flash
>(12/16) GLM4V vision encoder support merged: https://github.com/ggml-org/llama.cpp/pull/18042
>(12/15) llama.cpp automation for memory allocation: https://github.com/ggml-org/llama.cpp/discussions/18049

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>107668478

--MiniMax M2.1 open-source release and performance comparisons:
>107670819 >107670883 >107670920 >107670942 >107670949 >107672201 >107672262 >107672284 >107672312 >107672331 >107672351 >107672350 >107671849 >107671941 >107672070 >107672185
--Critique of AI-generated code in open-source projects:
>107677585 >107677617 >107677642 >107677665 >107677632 >107677647 >107678712 >107679099 >107677898 >107678528 >107678559 >107678656 >107677639
--LLM coding assistance and GPU performance debates in software development:
>107672336 >107672434 >107672518 >107675192 >107675246 >107675253 >107675587 >107676351 >107679607 >107674915 >107675005
--Miku SVG design challenges and feedback loop experiments:
>107672556 >107672566 >107672755 >107672790 >107672831 >107672847 >107672860 >107672914 >107673073 >107672996 >107674108 >107674478 >107675763
--Quantized model performance comparison debate: GLM4.7 vs 4.5 Air under standardized metrics:
>107677093 >107677281 >107677372 >107677412 >107677455 >107677473 >107677524 >107677649 >107677685
--Quantization and precision conversion challenges:
>107668841 >107677216 >107677285 >107678095 >107678235 >107678390
--MTP feature underperforms in speeding up token generation despite theoretical benefits:
>107670614 >107671516 >107671534 >107671544 >107671557 >107671707 >107671749
--Benchmarking llama.cpp vs ik_llama performance:
>107674698 >107675709 >107675728 >107676008
--TTS model recommendations and usability challenges:
>107669227 >107669313
--A/B test comparison showing improved response quality:
>107668821
--m2.1 model improvements and roleplay experience:
>107677496
--GreedyNalaTests update:
>107675251
--Miku (free space):
>107668993 >107669827 >107670524 >107671006 >107671016 >107672556 >107672566 >107672755 >107672831 >107672914 >107678060 >107678129

►Recent Highlight Posts from the Previous Thread: >>107668479

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>107679741
I do not like this miqu, its bald patina mocks me.
when will there be resources for local models on mobile?
>le mobile shitter
why yes, i'd rather keep my digital sovereignty while on the go.
>le connect to your home setup with tailscale
i could be off grid or be somewhere with shitty/nonexistent internet access
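For the tailscale route, for what it's worth, this is roughly all it takes — a minimal sketch assuming llama-server is already running on the home box and "homebox" is a hypothetical MagicDNS hostname for it. llama.cpp's server exposes an OpenAI-compatible /v1/chat/completions endpoint, so any HTTP client on the phone (python under Termux, for example) can talk to it:

[code]
# Minimal sketch: query a home llama.cpp server over tailscale from a phone.
# Assumes something like `llama-server -m model.gguf --host 0.0.0.0 --port 8080`
# is running at home; "homebox" is a hypothetical tailnet hostname.
import json
import urllib.request

URL = "http://homebox:8080/v1/chat/completions"  # hypothetical MagicDNS name

payload = {
    "messages": [{"role": "user", "content": "hello from the road"}],
    "max_tokens": 128,
}
req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    reply = json.load(resp)
    print(reply["choices"][0]["message"]["content"])
[/code]

Which is exactly the anon's complaint: the whole thing dies the moment the connection does, so for genuinely off-grid use only on-device inference (llama.cpp does build under Termux) actually answers the question.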
>>107679803
>i could be off grid or be somewhere with shitty/nonexistent internet access
imagine being such a limp-wristed pencil-necked cuck you can't lug a ~10 kilo server in your backpack desu, you don't deserve ai then
>>107679761
>>107679803
If I was off grid then my main concern wouldn't be not having access to LLMs
So now that ZAI has fallen, what is there even left to look forward to in this hobby? The latest assistantslopped qwen...?
So guys, after using 4.7 some more, I am sad to say I think I understand what happened. 4.7 is an affront to the coomers. GLM-sex is probably worse now because the model basically repeats the same phrases a lot; in comparison, I could use 4.6 for basically 2 months without a hint of boredom. And yet when I used 4.7 for truly degenerate shit that is still SFW roleplay... it is much better. They really improved the model for roleplay, just not for sex.
Not cool, zai.
>>107679870
>10 kilo
All you need is a raspberry pi, an RTX 6000, and a PSU
Also, with those crazy 300W power banks, we can eventually go full mobile with our servers and stop relying on 3kW outlets
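Back-of-envelope math on that, with every number below an assumption rather than a measurement (vendors rate these banks by output wattage, and a "300W" unit typically carries somewhere around 300Wh of capacity):

[code]
# Rough runtime estimate for a battery-powered inference box.
# All figures are assumptions, not measurements.
capacity_wh = 288          # assumed capacity of a ~300W-class power bank
draw_w = 10 + 200 + 15     # assumed Pi + power-limited GPU + conversion losses
hours = capacity_wh / draw_w
print(f"~{hours * 60:.0f} minutes per charge")  # ~77 minutes
[/code]

So per charge you get an hour and change of generation before you're hunting for an outlet again, and the GPU has to be power-limited (e.g. via nvidia-smi's power limit) or it alone can eat the bank's entire rated output.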
>>107679870
>his top priority in the middle of the woods isn't jerking it to miku erp slop
smells like faggot in here...