/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103286673 & >>103278810

►News
>(11/22) LTX-Video: Real-time video generation on a single 4090: https://github.com/Lightricks/LTX-Video
>(11/21) Tülu3: Instruct finetunes on top of Llama 3.1 base: https://hf.co/collections/allenai/tulu-3-models-673b8e0dc3512e30e7dc54f5
>(11/20) LLaMA-Mesh weights released: https://hf.co/Zhengyi/LLaMA-Mesh
>(11/18) Mistral and Pixtral Large Instruct 2411 released: https://mistral.ai/news/pixtral-large
>(11/12) Qwen2.5-Coder series released: https://qwenlm.github.io/blog/qwen2.5-coder-family

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/hsiehjackson/RULER
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>103286673

--Paper: Simplifying CLIP: Unleashing the Power of Large-Scale Models on Consumer-level Computers:
>103297845 >103297860 >103297890
--Papers:
>103296533 >103297733 >103297788 >103297918
--Testing and comparing LLMs, abliteration, and imatrix:
>103293087 >103293117 >103293133 >103293276 >103293314 >103293388 >103294589 >103294729 >103294759 >103294852
--Anon seeks to replace Claude 2 with a local model:
>103297270 >103297321 >103297441 >103297490 >103297520 >103297615
--Anon releases unofficial SMT implementation with PEFT version:
>103296930 >103297151 >103297268
--Merging safetensors files and quantization discussion:
>103288602 >103288654 >103291928 >103292005 >103292180
--LTX-Video model discussion and testing:
>103288336 >103288358 >103293709 >103293808 >103293832 >103293833 >103293979 >103294017 >103294054 >103294101
--Best practices for creating character definitions in koboldAI:
>103296813 >103297047 >103297081 >103297216
--Anons share non-coom uses for the model, including art and programming:
>103286774 >103286788 >103288316 >103286822 >103286831 >103286978 >103286998 >103287002 >103287586
--Card formatting debate and character writing discussion:
>103295022 >103295128 >103295138 >103295179 >103295271 >103295277 >103295250 >103295290 >103295338 >103295472
--Kernel update has no effect on CPU inference performance:
>103287570
--Athene-V2-Chat-72B open model performance and implications:
>103293224 >103293513 >103294469 >103294670
--Anon shares comic-translate app for automatic comic translations:
>103290835
--Anon shares AmoralQA-v2 dataset and discusses its usage in models:
>103287899 >103288215 >103288314
--Miku (free space):
>103286754 >103287503 >103289721 >103290110 >103292155 >103292194 >103292482 >103292570 >103294256 >103294336 >103295577 >103296695

►Recent Highlight Posts from the Previous Thread: >>103286678

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>103298523
nigga what
>>103298447
Careful, you're gonna set off the MagnumV4 72B schizo. He won't countenance any praise for a larger model.
So with the new Behemoth, what the fuck am I supposed to do with the instruct format? I didn't quite understand the prompt-format instructions Drummer left.
Zzzz
:-)
ahhhh
:o
>>103298520
>>103298523
>>103298712
>>103298713
>>103298717
>>103298723
Large language models?
>>103298738
You're in the wrong hood
Which LLM is good if I want a buddy for programming? And which LLM is good if I need someone to review and improve my report writing? I tend to write a lot in the passive voice, which is annoying.
>>103298742
these losers just masturbate.
>>103298770
Anon, none of these questions are genuine.
>>103298742
The latest Qwen 32B
Meow! :3
>>103298742
Qwen2.5 Coder 32B Instruct
>24GB VRAM
>Q4_K_L
>16000 context length
It's surprisingly good. Not as good as the SOTA models, but better than literally anything else local. Starts to derp out a bit at longer contexts, like all other models. I just use it in ST for now but plan on hooking it up to aider once I get ollama running.
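A rough sanity check on why those numbers fit in 24GB. Caveat: the architecture figures below (parameter count, layer count, GQA head config for Qwen2.5-32B) and the effective bits-per-weight for Q4_K_L are my assumptions, not from the post:

```python
# Back-of-envelope VRAM estimate for a ~32B model at a Q4 K-quant with 16k context.
# All constants below are assumed values for illustration, not measured.

params = 32.8e9      # assumed total parameter count for Qwen2.5-32B
bpw = 4.9            # assumed effective bits per weight for Q4_K_L
weights_gb = params * bpw / 8 / 1e9

layers, kv_heads, head_dim = 64, 8, 128   # assumed Qwen2.5-32B GQA config
ctx = 16000
# KV cache per token: 2 tensors (K and V) * layers * kv_heads * head_dim * 2 bytes (fp16)
kv_bytes_per_token = 2 * layers * kv_heads * head_dim * 2
kv_gb = kv_bytes_per_token * ctx / 1e9

total_gb = weights_gb + kv_gb
print(f"weights ~{weights_gb:.1f} GB + KV cache ~{kv_gb:.1f} GB = ~{total_gb:.1f} GB")
```

Under these assumptions the total lands right around 24GB, which is why the fit is tight and why a longer context (or the compute buffer) can push you into offloading a few layers.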
>>103299050
>>103299126
For both programming and writing feedback? Ideally it'll be well-trained enough that I can keep everything isolated on my machine.
>>103299126
>aider whenever I get ollama running
the cringe... it hurts...