/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108368195

►News
>(03/11) Nemotron 3 Super released: https://hf.co/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>108378991
Weird. I lost my dad two weeks ago. I feel you bro.
This could be the last thread before v4 drops
There's some sick irony about how Jews have pressured Christian societies into cremation over proper burial, wiping all physical trace that we ever lived.
>>108379019
I wanna be cremated or fed to some wild animals.
>>108379019
Nope, it was Christian boomers buying all the land for peanuts and now you have to pay 100k/y to have a mildly decent grave.
There's some sick irony about how ablublus have pressured agagas societies into ice cream over brownies, wiping all cream trace from their stomach.
>>108379060
prompt?
>>108379013
>slopinator v4
>>108378821
Okay I've been doing some more testing and Qwen is still definitely more intelligent and better at writing, but it's extremely safetyslopped. Gonna try to download an abliterated or heretic version and report back.
>>108378991
where are the previous threads again? whatever man
>>108379089
Enough about Gemma
Can't believe chinks got to have fun with Seedance 2.0 while global release is forever shelved
>>108379089
Even in v3, DeepSeek is awesome. Maybe the prose is sloppy, but for most useful tasks it is powerful enough. Sure, one of its advantages is the API that makes it cheap as hell, but fuck, it's cheaper than most hosted 8b models while being on a completely different level of intelligence. They also pioneered releasing open weights at that scale for those who can run them locally. I'm not sure what DeepSeek's endgame is (I know the backstory, the High-Flyer hedge fund, etc., I still don't understand why) but they're doing a lot of good to democratize all of this.
>>108379191
local video gen wasn't far behind proprietary last I checked, if you factor in all the fancy tech we have like loras
i'm sure we're just a few months away from having this local
>>108379196
Shit's moving fast. I remember having my mind blown when text models started being able to close open quotes in a sentence where the quote actually ended, and the whole thing about the avocado chair.
are we going to see a chinese open source version of genie 3 or do we have to wait for lecun?
>>108379221
This mikufag is an unironic shill doing prep work for NAI's upcoming fine-tune. Trying to populate /aids/ before that happens.
You don't hate mikufags enough.
>>108379019
In five years we'll have this quality in real time and hooked up to vr headsets.
>>108379127
the one shared by an anon, HauHau, never gave me any refusal so far
>>108379019
i remember overhearing my grandfather telling my mum he wants to be cremated once he dies and his ashes spread in a specific forest that was close to his childhood where he owned a cottage and some land
what did my family do once he died? completely ignored his request and buried him in some cemetery
i am still malding over it to this day
I was checking this (shockingly terrible) website https://swallow-llm.github.io/leaderboard/index-post.en.html
and the results are so odd, in 90% of the cases it seems like their models perform worse on the benchmarks after the training, wtf is going on with this?
>>108379274
yeh, it's not refusing anymore but it's struggling to follow a conversation. Actually that's bad phrasing... it just kind of dilly-dallies and won't push the story/RP forward. Is this fixable?
>>108379279
Exhume him and complete his wish
Actual previous thread because OP is a fucking retard
>>108373481
>>108379060
I want to know more about ablublus and agagas
►Recent Highlights from the Previous Thread: >>108373481

--AI Fishtank project with Qwen 3.5 9B autonomously exploring tasks:
>108375617 >108375631 >108375688 >108375700 >108375729 >108375728 >108375735 >108376142 >108376168 >108376193 >108376203
--Decensoring local models with limited hardware:
>108373570 >108373581 >108373597 >108373606 >108374177 >108374238 >108374245 >108374271 >108374273 >108374295 >108374304 >108374215
--Qwen 3.5's self-correcting gendered language artifacts:
>108374865 >108374879 >108374898 >108374921 >108374925 >108374935
--Token generation speed depends on hardware, not task complexity:
>108375142 >108375153 >108375154 >108375167 >108375169 >108375162 >108375192
--Testing LLM knowledge of US satellite imagery restrictions on Israel:
>108375913 >108375942 >108378276
--Prefill limitations and alternatives for model restriction bypass:
>108374446 >108374466 >108374481 >108374489 >108374498 >108374554 >108374573 >108374471
--AI blocks draft question while another explains Selective Service rules:
>108377817 >108378035
--Frustration with AI censorship flagging technical questions:
>108374529 >108374545 >108374639 >108374962 >108375047 >108375074
--Niche use cases for single-digit parameter models:
>108377262 >108377267 >108377299 >108377318 >108377340 >108377278 >108377284 >108377287
--GLM-5-Turbo release and performance speculation:
>108378714 >108378726 >108378766 >108378808 >108378868
--Qwen3.5-4B model performance and benchmark validity debate:
>108374699 >108374858 >108375112
--Step-3.5-Flash-SFT dataset flagged for unsafe file:
>108374561 >108374564
--Miku (free space):
>108374756 >108374873 >108375601 >108377176 >108377312 >108377944 >108378323 >108378451 >108378480 >108378546 >108378794

►Recent Highlight Posts from the Previous Thread: >>108373888

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>108379191
why did we let this happen?
>>108379250
hooked up to my brain*
how much money can i make using ai?
>>108379328
depends how many gullible people are around you
>>108379330
qrd
>>108379328
if your name is yann lecun you can make a billion dollars just by saying you're making a new ai company (while having no product to show)
>>108379328
about tree fiddy
>>108379328
how much money can i make using the internet?
>>108378991
can't believe you faggots are getting reamed by a reddit post
https://www.reddit.com/r/mildlyinfuriating/comments/1ru97y3/family_friend_sent_me_ai_generated_response_to/
>GLM5-Turbo
Weights please, thank you?? I may forgive the great zai betrayal of making the goddess 2 times fatter and unrunnable.
>>108379378
Yes we have a redditor baker who hates anything Japanese. The /lmg/ situation is crazy :Skull:
>>108379378
it's better than the alternative
>>108379402
bro the 700b one is the "lite" version
turbo is the "pro"
>aws outage caused by retarded agent
The future is bright.
>>108379378
we already knew where the low effort screenshots came from, don't feel like commenting on that.
>>108379406
yes, let's blame ai and ignore the burning desert datacenter
>>108379426
You read it wrong
Pro and Lite are the names of their API plans
>>108379426
The previous outage, not the current one.
>>108379406
I don't think that Trump qualifies as a Mossad agent just yet.
I had an idea: what if we get a 700b model and split it among our computers, so we all power it and all have access to it? Who is with me?
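For what it's worth, llama.cpp already ships an RPC backend that does roughly this: layers of one model get served by several machines over the network. A minimal sketch, assuming the hostnames, port, and model path are all placeholders you replace with your own:

```shell
# Build with the RPC backend enabled (repeat on every machine).
cmake -B build -DGGML_RPC=ON
cmake --build build --config Release -j

# On each worker machine, expose its compute over the network.
# 0.0.0.0 and 50052 are example values; pick your own.
./build/bin/rpc-server -H 0.0.0.0 -p 50052

# On the machine driving generation, point at the workers; the model's
# layers are split across them (worker1/worker2 and the .gguf path are
# placeholders):
./build/bin/llama-cli -m big-model.gguf --rpc worker1:50052,worker2:50052 -ngl 99
```

Bandwidth and latency between machines become the bottleneck, so don't expect API-grade speeds, but it is the closest existing tool to the idea above.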
>>108379304
He is doing the most important job - keeping the OP thread on topic. Kill yourself mikutroon faggot.
>>108379443
I'll make the logo
>>108379443
What's the point when the 4B model is only 15% worse?
>a year ago everyone and their mom was running deepseek v3-0324 671b at home
>suddenly running a model that size is apparently impossible with glm5
What happened? When did this general get run over by poorfags?
>>108379482
When we found out that you don't even need a model with hundreds of billions of parameters
where is the new deepseek
i want to have deepsex
>>108379488
>0.8B "half as good" as the 397B.
The benchmark looks so fucking dubious. Or the percentages are meaningless.
>>108379492
What if the new deepseek has a lower number of parameters than the old one?
>>108379498
Those are relative. If the 397b is getting 64 points, for example, then the 0.8b is getting 34 points.
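Under that reading the chart is just a ratio against the top model's absolute score. A quick sanity check with the example numbers from the post (64 and 34 are hypothetical scores, not taken from any real leaderboard):

```python
# Relative-score reading: the chart shows each model's score as a
# percentage of the top model's score, not as an absolute value.
top_score = 64        # example absolute score for the 397B model
small_score = 34      # example absolute score for the 0.8B model

relative = small_score / top_score
print(f"{relative:.0%}")  # 53%, i.e. roughly "half as good"
```

So "half as good" on the chart can coexist with a large gap in absolute capability, which is exactly why the percentages feel dubious.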
>>108379482
With 192GB's I go from IQ4XS 4.7 to 1IQ 5.0
>>108379498
Could be correct. When I used the 397B it was legit retarded.
>>108379498
>how much 1+1?
397b
>2
0.8b
>1
>hmm yes it got half of the answer right
kek
>>108379482
You need at least 200 T/s for agentic coding/openclaw stuff.
qwenshills have been working overtime for 3.5
>>108379488
qwen was such a gamechanger for us
there's no need to run anything else, it's so awesome
>>108379482
GLM 5 is retarded on llama.cpp because it's not implemented properly.
edit: Just saw that the guy working on it pushed a branch https://github.com/ggml-org/llama.cpp/issues/20363
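If you want to test the in-progress fix yourself, the usual route is building that branch locally. A sketch only: the `pull/20363` ref and the `glm5-fix` branch name are guesses based on the issue link, so check the actual PR number before running this.

```shell
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
# Fetch the work-in-progress changes. GitHub exposes every PR under
# refs/pull/<number>/head; 20363 here is a guess from the issue link,
# substitute the real PR number.
git fetch origin pull/20363/head:glm5-fix
git checkout glm5-fix
# Standard CMake build; add -DGGML_CUDA=ON if you have an NVIDIA GPU.
cmake -B build
cmake --build build --config Release -j
# Model path is a placeholder for your local quant.
./build/bin/llama-server -m /models/glm-5.gguf
```

Comparing outputs between master and the branch on the same prompt and seed is the quickest way to see whether the implementation actually changed anything.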
>>108379467
does that mean if you run the 4b x3 it is 100% accurate?
>>108379562
True, I'm running GLM5 at Q8 right now but it's dumber than Qwen3.5 300b and arguably the 112b as well. It's really not worth it right now if you're using llama.cpp.
The API is much better, though it's not that much smarter than Qwen despite being much slower, so you might want to run Qwen anyway.
Qwen... so tasty. So hard to stop.... sucking it... to post. I love Qwen.
>>108379591
You run an agent swarm made of 4b agents.
Why the fuck is (Qwen3.5-35B-A3B-heretic-Q4_K_M.gguf) so much hornier than (Qwen3.5-9B-heretic.Q4_K_M.gguf)???? This is making me want to try 122B-A10B lmaooo
>>108379606
yes or no benchod
>>108379623
You do understand that since these are MoE models, the smaller ones tend to have entire sections removed, right? Obviously a prime target is their knowledge of porn/smut/erotica
>>108379648
I can't even tell if you're agreeing with me or calling me a liar.
>>108379449
Take your meds. You're so obsessed with "mikutroons" you can't even bake a thread properly.
>>108379659
I'm agreeing and explaining why.
Is trolling a local model like beating your wife?
>>108379689
It was already proven that you faggots don't care about (((quality))). You just want to force your special interest on normal people. A behavior typical of troons. Vocaloids have nothing to do with AI and they never will.
>>108379720
Leave the mikutroons alone bro. I don't even like tranime but you're overreacting.
>>108379741
>Leave the mikutroons alone
>>108379752
you are literally trolling yourself since the model and its outputs are your own making
>>108379768
that has always been the point bro. Everything anyone does is always for selfish egoist reasons.
>>108379768
e-ego?
did someone say ego de—ACK!
glm save me
>>108379562
I'm getting the exact same distribution on cockbench with and without the branch so I feel like I did something wrong.
>>108379782
bros I summoned him.
What's the best coding model I can run on my garmin smart watch?
hi frens, i happen to have this website in japanese full of porn stories that i absolutely need to translate in order to read. nothing wrong with that, right?all AI, even grok, are refusing because it involves minorsi am using deepL but not only is it not that good, i also have a feeling i may be raising some flagshow do i get a local model to do this? i have a gaming pc btw
>>108379544
Sonnet 4.6 told me I only need 20 tokens per second to run openclaw.
>>108379863>i have a gaming pc btwthis could be a 970 or a 5090 lol, or ayymd, either way if you cant follow a youtube tutorial you can ask a chatbot itself how to install this locally, then you ask it how to download the abliterated/heretic version of a model from huggingface
>>108379801
Qwen 0.8b probably
>>108379801
functiongemma