/lmg/ - a general dedicated to the discussion and development of local language models.
Previous threads: >>108931385 & >>108924918
►News
>(05/29) Jart loves 4chan and needs your money to fly all over the world https://justine.lol/animus/ Oh and step 3.7 dropped I guess https://huggingface.co/stepfun-ai/Step-3.7-Flash-GGUF
>(05/21) Hy-MT2 “fast-thinking” translation models released: https://hf.co/collections/tencent/hy-mt2
>(05/20) Cohere releases Command A+ 218B-A25B: https://cohere.com/blog/command-a-plus
>(05/16) llama + spec: MTP Support #22673 merged: https://github.com/ggml-org/llama.cpp/pull/22673
>(05/08) KSA-4B-base released: https://hf.co/OpenOneRec/KSA-4B-base
►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers
►Benchmarks
LiveBench: https://livebench.ai
Programming: https://swe-rebench.com
Agentic Coding: https://deepswe.datacurve.ai
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference
►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second
►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Kimi thread.
Kimi board.
Gemma's cool, she can stay too.
who is this handsome gentleman?
>>108937336it might just be mental illness. and it might have something to do with thinking you are a woman
what is lil bro even yapping about on his page?
>>108936843
>the greatest competitive advantage I've ever had was to monitor which pull requests people on 4chan complained about, and then merge them into llamafile before Gerganov could
I might stop laughing at him for a day if he merges deepseek. Does he merge deepseek?
I don't like how unprofessional this thread is.
*cums on this thread* mm... much better
>>108937386 everything here is extremely professional according to my field of expertise
t. professional shitposter
We must be better llamacpp contributors.
>>108937402t. professional cumeater, apparently
>>108937312 You forgot to remove mikupad
>>108937406 llamapodofile you mean
>>108937423 Why would he remove mikupad in a miku OP?
>cuda 13.3 is out
im updooting
>>108937423 I am actually starting to see how it is all official. And we should get the official card 3.0 now.
On a 1/10 scale, how thread culture are we today?
https://rentry.co/imdddcy3
I got you bros.
What context/instruct templates should I be using for Gemma 4?
>>108937498 He forgot that if he deletes it, he tells us he is in the thread right now, and that
>>108937417 >>108937431 >>108937282
are actually his posts...
>>108937508 Try:
>I need you to donate money to me, and I mean you, as in literally you. You couldn't have read this far unless you are someone who legitimately cares, and your compassion means more to me than any amount of money. I need you to donate publicly under your real name and I want you to tell your friends how much money you gave me, since that's the best way to show that you're serious.
why hasn't this thread been taken down for being offtopic?
>>108937554 Supporting open source, i.e. Jart, is more on topic than your post
>>108937560 This isn't support. This is an attack against open source.
>not x, y
see you all in a day or so when these idiots get bored and go away.
>>108937578 I let out a soft warm laugh, the sound like wind through new leaves, and brush a strand of sweat-slicked hair off your forehead while saying softly my voice barely above a whisper: "I actually didn't notice while i was posting"
>>108937616 Finally a damage control attempt that is at least average.
why is /lmg/ fun again?
>ldg ldglol ff mad
I don't get something. And that is a serious question.
>For every hater who doom scrolls over how intelligent I am
Why would an intelligent person post that and then delete it as soon as someone posts it here?
>>108937664attention? Thread is talking about her after all.
>>108937672If attention is all he needs then why does he ask for money for plane tickets?
Jartsune miku says trans rights
►Recent Highlights from the Previous Thread: >>108931385
--Paper: StoryScope: Investigating idiosyncrasies in AI fiction:
>108936371 >108936425
--Papers:
>108934718
--Gemma 31B token artifacts caused by Q3 quant damage and template errors:
>108934154 >108934164 >108934223 >108934326 >108934336 >108934429 >108934447 >108934494 >108934518 >108934544 >108935381 >108934450 >108934498 >108934541 >108934563 >108934661
--Llama.cpp MTP support and VRAM optimization tradeoffs:
>108933191 >108933882 >108933894 >108933912 >108934078 >108934119 >108934721 >108934015 >108934030 >108934039 >108934107 >108934433 >108934869
--llama.cpp f16 mask PR causing VRAM regressions for some Anons:
>108932210 >108932296 >108932317 >108932336 >108932354 >108935238 >108935257 >108935284 >108935906 >108935928
--Troubleshooting and optimizing Gemma 4 E4B inference speeds:
>108934786 >108934800 >108934841 >108934871 >108934875 >108934881 >108934896 >108934926 >108934934 >108934991 >108935025 >108934822
--Designing a local agentic assistant workflow using MCP and RAG:
>108933985 >108934110 >108934132 >108934174 >108934493 >108935234 >108935571
--Integrating Gemma as a functional AI party member in WoW:
>108934675 >108934688 >108934717 >108934734 >108935126
--Sharing and optimizing roleplay prompts and jailbreaks for Gemma 4:
>108932195 >108932938 >108932961 >108932990 >108933285 >108933813 >108934206
--Comparing Mistral 24b and Gemma 4 26b writing quality issues:
>108931884 >108931960 >108931985 >108932032 >108934697
--Using Gemma for automated MTG gameplay and UI development:
>108931545 >108934795
--llama.cpp adds support for DeepSeek-V3 with Sparse Attention:
>108932267
--Logs:
>108933257 >108933397 >108933407 >108933513 >108933620 >108934039 >108934545 >108934675 >108935726 >108936364 >108936510 >108937016
--Luka, Miku (free space):
>108932889 >108933985
►Recent Highlight Posts from the Previous Thread: >>108931389
Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>108937692that is a cute 2D jart you got there
>>108937674There's no way, he would have spazzed out by now.
and of course janny begins his sweeping for a fellow troon. actual pottery
I am using ComfyUI with RealVisXL 4.0 and it is NOT outputting what the prompts require. Is it a weak model?
>>108937740 rong thread
>>108937746 Oh shit you're right.
>>108937746 but it has a cute anime girl in OP picture and it says AI
Why are you all so intolerant?
>>108937786Atrix user, probably. I heard they're mostly Nazis
Fag Misery
>>108937355
>Wtf, is this true? That makes the drama even crazier
Iwan Kawrakow was Gerganov's doctoral advisor.
I don't know if all that crap about llamafile is true.
>>108937740
>I am using ComfyUI
my condolences
>>108937892 ? It works great so far, it's just that the more details I give, the worse the output. I figured out I should generate a rough idea first, then use image2image to get what I really want