/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107668478 & >>107660171

►News
>(12/26) MiniMax-M2.1 released: https://www.minimax.io/news/minimax-m21
>(12/22) GLM-4.7: Advancing the Coding Capability: https://z.ai/blog/glm-4.7
>(12/17) Introducing Meta Segment Anything Model Audio: https://ai.meta.com/samaudio
>(12/16) MiMo-V2-Flash 309B-A15B released: https://mimo.xiaomi.com/blog/mimo-v2-flash
>(12/16) GLM4V vision encoder support merged: https://github.com/ggml-org/llama.cpp/pull/18042
>(12/15) llama.cpp automation for memory allocation: https://github.com/ggml-org/llama.cpp/discussions/18049

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>107668478

--MiniMax M2.1 open-source release and performance comparisons:
>107670819 >107670883 >107670920 >107670942 >107670949 >107672201 >107672262 >107672284 >107672312 >107672331 >107672351 >107672350 >107671849 >107671941 >107672070 >107672185
--Critique of AI-generated code in open-source projects:
>107677585 >107677617 >107677642 >107677665 >107677632 >107677647 >107678712 >107679099 >107677898 >107678528 >107678559 >107678656 >107677639
--LLM coding assistance and GPU performance debates in software development:
>107672336 >107672434 >107672518 >107675192 >107675246 >107675253 >107675587 >107676351 >107679607 >107674915 >107675005
--Miku SVG design challenges and feedback loop experiments:
>107672556 >107672566 >107672755 >107672790 >107672831 >107672847 >107672860 >107672914 >107673073 >107672996 >107674108 >107674478 >107675763
--Quantized model performance comparison debate: GLM4.7 vs 4.5 Air under standardized metrics:
>107677093 >107677281 >107677372 >107677412 >107677455 >107677473 >107677524 >107677649 >107677685
--Quantization and precision conversion challenges:
>107668841 >107677216 >107677285 >107678095 >107678235 >107678390
--MTP feature underperforms in speeding up token generation despite theoretical benefits:
>107670614 >107671516 >107671534 >107671544 >107671557 >107671707 >107671749
--Benchmarking llama.cpp vs ik_llama performance:
>107674698 >107675709 >107675728 >107676008
--TTS model recommendations and usability challenges:
>107669227 >107669313
--A/B test comparison showing improved response quality:
>107668821
--m2.1 model improvements and roleplay experience:
>107677496
--GreedyNalaTests update:
>107675251
--Miku (free space):
>107668993 >107669827 >107670524 >107671006 >107671016 >107672556 >107672566 >107672755 >107672831 >107672914 >107678060 >107678129

►Recent Highlight Posts from the Previous Thread: >>107668479

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>107679741
I do not like this miqu, its bald patina mocks me.
when will there be resources for local models on mobile?
>le mobile shitter
why yes, i'd rather keep my digital sovereignty while on the go.
>le connect to your home setup with tailscale
i could be off grid or be somewhere with shitty/nonexistent internet access
>>107679803
>i could be off grid or be somewhere with shitty/nonexistent internet access
imagine being such a limp wristed pencil necked cuck you cant lug a ~10 kilo server in your backpack desu. you dont deserve ai then
>>107679761
>>107679803
If I was off grid then my main concern wouldn't be not having access to LLMs
So now that ZAI has fallen, what is there even left to look forward to in this hobby? The latest assistantslopped qwen...?
So guys, after using 4.7 some more, I am sad to say I think I understand what happened. 4.7 is an affront to the coomers. GLM-sex is probably worse because the model basically repeats the same phrases a lot. In comparison, I could use 4.6 for basically 2 months without a hint of boredom. But when I used 4.7 for truly degenerate shit that is still SFW roleplay... it is much better. They really improved the model for roleplay, just not for sex.
Not cool, zai.
>>107679821
>10 kilo
All you need is a raspberry pi, an RTX 6000 and a PSU
Also, with those crazy 300W power banks, we can eventually go full mobile with our servers and stop relying on 3kW outlets
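Napkin math on how long that would actually run; a minimal sketch, and every number in it (bank capacity, GPU draw, efficiency) is an assumption for illustration, not a measurement:
[code]
# Rough runtime estimate for a battery-powered inference rig.
# All wattage/capacity figures are assumptions for illustration.

BANK_WH = 300          # assumed power bank capacity in watt-hours
GPU_DRAW_W = 180       # assumed power-limited GPU draw during inference
HOST_DRAW_W = 25       # assumed SBC/host + SSD draw
PSU_EFFICIENCY = 0.90  # assumed conversion efficiency

total_draw_w = (GPU_DRAW_W + HOST_DRAW_W) / PSU_EFFICIENCY
runtime_h = BANK_WH / total_draw_w
print(f"total draw: {total_draw_w:.0f} W, runtime: {runtime_h:.2f} h")
# ~228 W draw -> roughly 1.3 hours per 300 Wh bank
[/code]
so under those assumptions a single power-limited GPU plus host drains a 300Wh bank in well under two hours. fine for a quick session, not for a weekend off grid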
>>107679870
>his top priority in the middle of the woods isn't jerking it to miku erp slop
smells like faggot in here...
Forgive the retarded question, but I've been out of the game for a while and want to make some decisions.
I need to replace some parts that died, and given the cost of RAM, all I could source was a 32GB kit of CL30 DDR5.
Historically my "ai box" has had a 2x48GB kit with slightly worse timings, and the main hardware in there is 2 3090s. My question: if I took the 96GB kit for my main build and gave the ai box the 32GB kit, how much would that hurt things? I know most stuff goes in VRAM, but as far as recent models go, are there better quants or anything such that I won't notice the missing RAM?
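My own napkin math so far; a minimal sketch where the model file sizes are hypothetical examples (the GGUF VRAM calculator in the OP does this properly):
[code]
# Crude check: does a given GGUF fully fit in VRAM?
# If it does, system RAM mostly just holds the OS and mmap'd pages,
# so dropping from 96GB to 32GB shouldn't hurt much.
# File sizes below are hypothetical examples, not measured.

VRAM_GB = 2 * 24          # two 3090s
KV_AND_OVERHEAD_GB = 6    # assumed allowance for KV cache, buffers, etc.

candidates = {
    "70B @ Q4-ish (hypothetical ~42 GB file)": 42,
    "32B @ Q8-ish (hypothetical ~34 GB file)": 34,
    "123B @ Q4-ish (hypothetical ~73 GB file)": 73,
}

for name, file_gb in candidates.items():
    spill_gb = max(0, file_gb + KV_AND_OVERHEAD_GB - VRAM_GB)
    verdict = "fits in VRAM" if spill_gb == 0 else f"spills ~{spill_gb} GB to system RAM"
    print(f"{name}: {verdict}")
[/code]
If that reasoning holds, anything that fits in the 48GB of VRAM won't care about the downgrade, and it's only partial-offload / MoE-on-RAM setups where losing the 96GB kit would bite.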
>>107680048
r1/kimi2 or bust. also, thats what, like 7 RTXs at least? that takes too much electricity, which would force you to carry 80+ kg of solar panels, batteries, etc., which actually becomes a problem
>>107680494
We don't know. The current meta is stacking hundreds of gigabytes of RAM and running MoE models on it, but with current prices that's not sustainable. Let's wait for a miracle like zit to happen in the llm space
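For reference, the napkin math behind that meta; a minimal sketch where the bandwidth and bits-per-weight figures are assumptions, not benchmarks:
[code]
# Why MoE-on-RAM is viable: per-token weight reads scale with *active*
# params, not total params. Numbers below are illustrative assumptions.

RAM_BANDWIDTH_GBPS = 80   # assumed effective system RAM bandwidth, GB/s
BYTES_PER_WEIGHT = 0.55   # assumed ~4.4 bits/weight for a Q4-ish quant

def rough_tps(active_params_b: float) -> float:
    """Upper-bound tokens/s if generation is memory-bandwidth bound."""
    bytes_per_token = active_params_b * 1e9 * BYTES_PER_WEIGHT
    return RAM_BANDWIDTH_GBPS * 1e9 / bytes_per_token

# A dense 300B model reads all 300B weights per token; a 300B-A15B MoE
# only reads ~15B, so it's ~20x faster on the same memory.
print(f"dense 300B: ~{rough_tps(300):.2f} t/s")
print(f"MoE 300B-A15B: ~{rough_tps(15):.1f} t/s")
[/code]
Same RAM, roughly 20x the tokens/s, which is the only reason anyone tolerates stacking it at these prices.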
>>107680557
You only need mistral nemo though