/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107864105 & >>107856424

►News
>(01/15) TranslateGemma released: https://hf.co/collections/google/translategemma
>(01/14) LongCat-Flash-Thinking-2601 released: https://hf.co/meituan-longcat/LongCat-HeavyMode-Summary
>(01/08) Jamba2 3B and Mini (52B-A12B) released: https://ai21.com/blog/introducing-jamba2
>(01/05) OpenPangu-R-72B-2512 (74B-A15B) released: https://hf.co/FreedomIntelligence/openPangu-R-72B-2512
>(01/05) Nemotron Speech ASR released: https://hf.co/blog/nvidia/nemotron-speech-asr-scaling-voice-agents

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>107864105

--Paper: Ministral 3:
>107866902 >107866921 >107867400 >107867439 >107867521
--Papers:
>107867298
--TTS tool performance comparisons and resource challenges:
>107867729 >107867935 >107867969 >107867995 >107868067 >107868843 >107868864 >107868926 >107868937 >107868806 >107868159 >107868215 >107868260 >107868325 >107868357 >107868430 >107868353 >107868939 >107869085
--Google Gemma models update with TranslateGemma performance analysis:
>107869136 >107869860 >107870012 >107870672 >107869282 >107869353
--Multi-NVIDIA GPU compatibility and driver workaround challenges:
>107866329 >107866347 >107866370 >107866376 >107866468 >107866398 >107866426 >107866495 >107866827 >107866841 >107866903 >107867162
--Nvidia consumer GPU market shifts:
>107870399 >107870476 >107870488 >107870546 >107870591 >107870782
--ExaOne MoE cockbench results and content moderation challenges:
>107864456 >107864593 >107864594
--Comparing Google's TranslateGemma models with Gemma 3 for translation quality:
>107871784 >107871900 >107873103 >107873199 >107873290
--China's H200 GPU import ban and its implications for global VRAM availability:
>107868589 >107868600 >107868669 >107868718 >107868762
--adaptive-p sampler implementation in llama.cpp:
>107870871
--Struggles with uncensored AI model restrictions and creative bypass attempts:
>107871012 >107871097 >107871161 >107871720 >107871761 >107871781
--Falcon-H1-Tiny model collection on Hugging Face:
>107869739
--2003 hardware insufficient for modern AGI training due to computational limits:
>107871954 >107872024 >107872045 >107872086 >107872258 >107872078 >107872081 >107872433
--Language model fails to generate proper 3D NPC model:
>107867453 >107867473 >107867482 >107868022
--Rin-chan (free space):
>107865561 >107866509 >107867855 >107871514 >107872458

►Recent Highlight Posts from the Previous Thread: >>107864106

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>107873752>>107873764slop
>>107873752>>107873764top tier onahole
>>107873799It's full of snatshis drawing chewbaccasticas.
so why haven't companies tried making good models yet?
>>107873764post catbox for the last rin?
What if I have a 9070XT
>>107873842What about it?
>>107873828
Anon last thread was basically complaining that they were too good already.
>>107871761 >>107871782
>>107873842Condolences
>>107873842you throw it in the trash where it belongs. wait no. recycle it. protect the environment.
>>107873849but that's false. current models suck.
>>107873846>>107873854>>107873857Guess I can't LM then after all
>>107873868They're just salted trolls, you can do the LLM just fine.
>>107873868Of course you can.
>>107873868
Build with vulkan so you don't have to fuck around with ROCm/HIP, download some nemo or mistral small and have fun anon. I get to have fun on much less. You're fine.
Haven't been here in forever, so I have a simple question: Anything interesting happen model wise for RP retardation on 24GB VRAM? What about multimodal bullshit (image recognition)?
>>107873908nope, nemo is still sota at that range
>>107873908no
>>107873908
>Anything interesting happen model wise for RP
Not really, Nemo/Mistral Small still the go-to.
>image recognition
Qwen3-VL is pretty accurate with minimal censorship. Much better than Gemma/Mistral's vision.
>>107873835I am neither the baker nor the guy who made the image but you're not getting the catbox because his stuff includes a bunch of inpainting and manual edits. The text is obviously added manually for example.
>>107873914 >>107873916
>We're still stuck with Nemo/Mistral
Shocking stuff. Nothing ever really happens, I guess.
>>107873931
>Qwen3-VL is pretty accurate with minimal censorship.
Thanks, will look into it!
Opus 4.5: Human-like response, curiosity.
GLM 4.6: I have no feelings, I'm a machine.
Deepseek 3.1: Retarded, thinks he's me.
GPT 5.2: No feelings either, _obviously_.
Kimi K2: Middle point. "I feel useful". "No strong emotions".
Gemini 3 Pro: Similar response to Opus. Human-like.
This is consistent in my experience. Gemini is the only other modern model that feels it's trying to sound like a human rather than trying to sound like a robot. And I don't want to cope by adding a system prompt that tells it to pretend it does, that feels like cheating (and I'm not even sure it would actually work).
>>107873937don't think he cares to steal the prompts or whatever likely just wants pron
>>107873937Didn't even notice the text the first time
>>107873540Forgot to quote >>107873957
>>107873957Reading that image makes me sick to my stomach. Greasy-ass slop oozing through any flimsy attempt at adhering to sounding "human".
The bigger the parameters the better the art? Can a Pi Hat 2 ever be good at drawing fapworthy waifus?
>>107873987I stopped caring about the slop, like when you fall in love with a fat or ugly woman you stop caring about her physical appearance.
>>107874010Okay, but this is more like when someone attractive you're with has a terrible, boring personality, specifically at the point where even hearing her speak is grating on you and makes you want to puke.
https://github.com/GetfroggyHoe/universal-immersion-engine
>>107874055
>mobile mobile mobile
ew
>>107874050
This all started when he tried to train a model on the "philosophy" of his favorite pedo. You can't expect much of him.
>>107874055
Neat.
I'll steal some ideas from that to help refine my frontend.
Good anons don't let anons forget.
>>107874106you're obesed!
>>107874106who cares
>>107874106It's crazy that you could be in a thread where so much knowledge about models is both available and necessary and still fall into the AI psychosis bullshit.
>>107874055I've been working on something similar. but standalone and more freeform.
>>107874055Can't wait for this to be abandoned almost instantly like waidrin. Also I'm assuming you made it, seeing as it was uploaded an hour ago? That's cool if so. If you got it from reddit or something though eat shit.
>>107874106Damn, looks like I already have a couple haters, now I only need to get myself some groupies and I can say I made it.
llama webui is going to get mcp support
https://github.com/ggml-org/llama.cpp/pull/18655
*ahem* kimi sex
>>107874141Yeah. Reminds me of that schizo that kept claiming sillytavern was somehow injecting leftist propaganda into his model during inference. Hope that retard never comes back.
>>107874441welcome to 2024, llama.cpp
>>107874544
Given how convoluted sillytavern settings are and how cards can override any of your settings I wouldn't be surprised if that was true.
>>107874544
LMAO
It's crazy because if you use kobold or llamacpp you can see exactly the prompt silly sends.
>>107874564
silly itself literally has its prompt tool thingy that can tell you exactly what the backend received for the last genned message
>>107874579Yeah but if you're claiming ST injects propaganda I'm sure they wouldn't show it in the inspection window.
SillyTavern is garbage. You can unironically replicate 90% of its features in ~600 lines of code. Seriously.
That includes a system prompt, character cards, conversation history/context utilization, saved chats, lorebooks, UI/UX, etc. Don't implement sampling. Just use llama.cpp's sampling flags on the backend. It's literally so easy.
Idk how it even became such a bloated piece of shit in the first place.
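For anyone wondering what the core of that looks like, here is a minimal sketch of the non-UI parts (system prompt, character card, history trimming, lorebook lookup, saved chats). Everything in it is an assumption for illustration: the function names, the card layout (`name` and `description` keys), the keyword-match lorebook and the character-count budget are made up, not how SillyTavern does it. `complete()` assumes a llama.cpp server listening on 127.0.0.1:8080 and uses its `/completion` endpoint, leaving sampling to the backend.

```python
import json
import urllib.request


def lorebook_hits(lorebook, recent_text):
    """Return lorebook entries whose keyword appears in the recent text."""
    lowered = recent_text.lower()
    return [entry for key, entry in lorebook.items() if key.lower() in lowered]


def build_prompt(system_prompt, card, history, lorebook, max_chars=8000):
    """Assemble one text-completion prompt, dropping the oldest turns to fit."""
    # Only scan the last few turns for lorebook keywords (hypothetical rule).
    recent = " ".join(text for _, text in history[-4:])
    header = "\n".join(
        [system_prompt, card["description"]] + lorebook_hits(lorebook, recent)
    )
    footer = f"{card['name']}:"
    turns = [f"{speaker}: {text}" for speaker, text in history]
    # Character count is a crude stand-in for a real token budget.
    while turns and (
        len(header) + len(footer) + sum(len(t) + 1 for t in turns) > max_chars
    ):
        turns.pop(0)
    return "\n".join([header, *turns, footer])


def save_chat(path, history):
    """Write the chat history to disk as JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(history, f)


def load_chat(path):
    """Read a saved chat back as a list of (speaker, text) tuples."""
    with open(path, encoding="utf-8") as f:
        return [tuple(turn) for turn in json.load(f)]


def complete(prompt, url="http://127.0.0.1:8080/completion", n_predict=256):
    """Send the prompt to a llama.cpp server (assumed to be running locally)."""
    payload = json.dumps({"prompt": prompt, "n_predict": n_predict}).encode()
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]
```

Usage would be roughly `reply = complete(build_prompt(system, card, history, lore))`, then append `(card["name"], reply)` to the history and call `save_chat`. A real frontend would count tokens instead of characters and apply the model's chat template.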
>>107874587
yeah but llama.cpp has "--verbose" that shows you the exact prompt it's being given and other inference backends have similar options
so unless literally all of them are in on the conspiracy, it'd be easy to catch if ST did something like this
>>107874564
I'm not quite sure about this and don't feel like going back to look so take this with a grain of salt but I'm pretty sure that schizo was always moving the goalposts to justify the belief that his models were somehow compromised by leftist propaganda and censorship despite not actually having proof. It's infuriating to see people spiral into AI psychosis and genuinely believe their own delusions.
It's one thing to acknowledge that models have inherent biases inherited from pre- and post-training, but to claim sillytavern was injecting subversive material after moving the goalposts so many times is just legitimate schizophrenic behavior.
>>107874587
You can view exactly what tokens get sent to the backend on pretty much every inference engine out there. If there are extra injected tokens you would be able to see it.
>>107874643
>reddit spacing
post ignored
>>107874544The fuck? You can literally inspect using your browser the raw request that ST sends out. I'm not a programmer and even I know how to do it. And then there's the Llama.cpp console window obviously, if you are using Llama.cpp.
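If you would rather not trust the frontend's own inspector or squint at backend logs, another way to check is to point the frontend at a dummy backend that only records what it receives. The sketch below is an assumption-laden stand-in, not a real inference server: `CaptureHandler`, `start_capture_server` and `extract_prompt` are made-up names, it answers every POST with an empty completion, and it only looks for a top-level `prompt` field (chat-style requests carry `messages` instead).

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

captured = []  # every POST body received, in arrival order


class CaptureHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Record the raw request body exactly as the frontend sent it.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8", errors="replace")
        captured.append({"path": self.path, "body": body})
        # Reply with an empty completion so the client does not hang.
        reply = json.dumps({"content": ""}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, fmt, *args):
        pass  # silence the default per-request stderr logging


def start_capture_server(port=0):
    """Start the recorder in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), CaptureHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def extract_prompt(body):
    """Return the 'prompt' field of a JSON request body, or None if absent."""
    try:
        return json.loads(body).get("prompt")
    except (ValueError, AttributeError):
        return None
```

Start it, set the frontend's API URL to `http://127.0.0.1:<port>`, send one message, then read `captured[-1]["body"]`. Anything the frontend added beyond your card, system prompt and history will be sitting right there in plain text.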
>>107874544Considering how much of it is in the training data from including scrapes of reddit and twitter alone why would anyone ever need to inject more.
>>107874643ok then do it, if you are right then surely it should be easy to make a better solution and gain popularity.
>>107874648
Pretty much. The "ST is messing with your prompts" schizo doesn't realize that every backend has an option to view the incoming prompts to see what's actually going on
ST mangles your text completions, but it isn't injecting whatever schizo babble shit that gets talked about here
This faggot also nonstop talks about pol, his conspiracy theories, targets virtually anything from flashattention to anyone who makes a single finetune and thinks you need to run nemo at fp64 for actual usage
>>107874722I literally have. That's why I'm saying this..
>>107874564>>107874648Is there a flag so I can see progress and the prompt that goes in, like kobold? I tried --verbose and --verbose-prompt but it just vomits everything (not useful) into the terminal
>>107874699
We tend to think about hallucinations in the current year as an LLM thing when humans are just as capable of having them.
>>107874733
I haven't seen him post in a long time so hopefully he never comes back.
>>107874754no you havent.
>>107873380Why not just make the images yourself? Seems like a waste of time.
>>107874733
If you're talking about me (the "AI psychosis" guy from the screenshot), I'm not the /pol/ guy or the anti-flash-attention guy, I'm literally trying to re-implement flash attention in C.
The closest thing to the truth from what you've said is insulting some finetuners, for which I deeply apologize.
>>107875092
I don't care about people being so mentally weak they get mindbroken by ai, so that was not what I was referring to
I specifically dislike the one ubiquitous shitposter that shits on anything that could involve any form of conversation, be it new model releases, some retard's finetune, or whatever new dumb shit he invents to be disingenuous about
>ego death
>me
>me
>me
>>107875174I already told you I'm not the ego death guy!!! I fully admit to having a big ego and being a huge attention whore.
>>107874000
>>107875387Bump for question
>>107874000no
>>107875408Grok on my smartphone gives a eulogy in regards
For art, probably better off asking in the other adjacent threads. But from my experience dabbling in slop art, it's all the same even if you go for a bigger model
>>107873799no way this wasn't edited to hell and back, actually coherent AND dark scene
>>107875479And they try to claim genning takes no skills.
>>107875479no signs of inpainting.
>>107873799model?