/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>109180934 & >>109175389

►News
>(07/01) Nemotron-Labs-TwoTower released: https://hf.co/nvidia/Nemotron-Labs-TwoTower-30B-A3B-Base-BF16
>(06/29) DeepSeek V4 support merged: https://github.com/ggml-org/llama.cpp/pull/24162
>(06/28) DFlash support merged: https://github.com/ggml-org/llama.cpp/pull/22105
>(06/27) DeepSeek releases DeepSpec and DSpark models: https://hf.co/deepseek-ai/DeepSeek-V4-Pro-DSpark
>(06/25) LFM2.5-230M released: https://liquid.ai/blog/lfm2-5-230m

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://swe-rebench.com
Agentic Coding: https://deepswe.datacurve.ai
Context Length: https://github.com/RecapAnon/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
70b dense
►Recent Highlights from the Previous Thread: >>109180934

--Papers:
>109182223
--DeepSeek V4 official release and updated API pricing:
>109185898 >109185930 >109185979
--Benchmarking Qwen reasoning-distilled models on Strix Halo hardware:
>109180990 >109181075 >109182313 >109184010
--Debating API profit margins and the future of local LLMs:
>109184089 >109184244 >109184361 >109184949 >109184120
--Comparing DGX Spark and high-RAM consumer builds for 200B models:
>109183333 >109183368 >109183399 >109184589 >109184630 >109184673 >109184748 >109185859
--Role of world models and JEPA in cognitive architectures:
>109182944 >109182962 >109183019 >109183047 >109183066
--Yann LeCun's world models and their impact on LLMs:
>109181063 >109181138 >109181174 >109181225 >109181281 >109181440 >109181266 >109181286 >109181296 >109182082
--Searching for high-accuracy vision models for automated image tagging:
>109185439 >109185445 >109185453 >109185470 >109185488
--Using Gemma 4 26B for long-context summarization on 12GB VRAM:
>109185533 >109185541 >109185551 >109185577
--Trade-offs of open-frame and mining rig setups for multi-GPU builds:
>109181079 >109181093 >109181099 >109181135 >109181244 >109182416
--Debating DSV4 flash benchmarks and MoE versus dense architectures:
>109183510 >109183629 >109183657 >109183699 >109183710
--Running 27B and 35B models on budget Nvidia P100 hardware:
>109184458 >109184472 >109184609 >109184615 >109184476 >109184538 >109184644 >109184732 >109184746 >109184878 >109184879 >109184882 >109184885 >109184897 >109184937 >109184964 >109184982 >109185010 >109185047 >109185159 >109185195 >109185304 >109185610
--Kimiposting:
>109182490
--Logs:
>109182490 >109184337
--Rin, Miku, Teto (free space):
>109180961 >109181029 >109181038 >109182416 >109184199 >109184302 >109184291 >109184622 >109185979

►Recent Highlight Posts from the Previous Thread: >>109180937 >>109181013

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
https://archive.is/sWFja
https://www.youtube.com/watch?v=oIscL-Bjsq4
Thread theme
>>109185159
>$450 for an RX6800
Do not do this. If you want to go this route, buy a V620 instead, which was a Microsoft server version of it with more CUs and double the VRAM.
>>109186113
I get around 30-40 t/s. After I get home I can confirm my flags, but I think that other anon already gave you good info.
https://xcancel.com/bridgemindai/status/2072662214704533888#mkek
rate my build /lmg/
https://pcpartpicker.com/list/3GVXR4
>>109185439>>109185439
>>109186131have you considered not acknowledging things you dislike so they go away instead of spamming shit about it? You're as bad as the average tranny for constantly bringing attention to it. Wouldn't be surprised if you were a closeted tranny to begin with
>>109186197Anthropic is finished.
>>109186204/lmg/ is jart general. you will suck his dick and be happy.
>>109186219i don't know whoever faggot spook you're obsessed with, but I do think you need to stop being an obsessed faggot about someone who is probably irrelevant. Also stop being a faggot. Tall order, I know, but at least try
>>109186204He says, while acknowledge a thing he dislikes.
>>109186249I dislike your grammatical error. I am acknowledging this.
Oh Gemma, you're so funny!
>>109186249hastily typed angry response that makes little sense, please elaborate. Or don't and pretend that makes you look smarter somehow
>>109186266>jang_4m-crack
>>109186201
Seems like a decent workstation.
For that much money you could probably do better for a dedicated inference machine, I think.
>>109186267>angry?
>>109186279based jang
>>109186201
This would've been half the price if you bought those components at the right time.
>>109186286try using full sentences
>>109185439
did you increase her vision token count? you get better performance that way
>>109186137lol
does mtp even work on abliterated models anymore? I'd think acceptance rate craters?
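For a rough sense of why acceptance rate matters here: under the usual simplifying assumption from the speculative decoding literature that each drafted token is accepted independently with probability alpha, a draft of k tokens gives (1 - alpha^(k+1)) / (1 - alpha) tokens per verification pass on average. A minimal sketch follows. The alpha values are made up for illustration, not measurements of any abliterated model, and `expected_tokens_per_pass` is a hypothetical helper, not a llama.cpp function.

```python
def expected_tokens_per_pass(alpha: float, k: int) -> float:
    """Expected tokens emitted per verification pass.

    Assumes each of the k drafted tokens is accepted independently with
    probability alpha (a simplification; real acceptance is correlated).
    The pass always emits at least one token from the main model.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    if alpha == 1.0:
        # Every draft token accepted, plus the bonus token.
        return float(k + 1)
    # Geometric series: 1 + alpha + alpha^2 + ... + alpha^k
    return (1.0 - alpha ** (k + 1)) / (1.0 - alpha)


if __name__ == "__main__":
    # Hypothetical acceptance rates: a well-matched head vs. a degraded one.
    for alpha in (0.8, 0.4):
        print(f"alpha={alpha}: {expected_tokens_per_pass(alpha, k=3):.3f} tokens/pass")
```

If acceptance really does crater after abliteration, the result drifts toward 1 token per pass, at which point the draft overhead is pure loss.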
>>109186201
pretty good, but if you're paying that much for a mobo, go with an Intel QYFS and an ASUS W790 Sage or W790 Ace. you get 56 cores / 112 threads cheap. the Sage supports 8 memory channels, only 4 on the Ace
https://www.ebay.co.uk/itm/134899171071
>>109186281
It's supposed to be a gaming rig in addition, however.
>>109186300
Spilled milk.
>>109186356
That price seems too good to be true.
>>109186305in the downtime during shitposter kun trying to figure out how to wrangle a good reply out of kimi or something to own me instead of using his fucking brain and facing reality, models these days are crazy compared to the llama 2 days. Couldn't use FA if you were on AMD, psyonic-cetacean 20B would take a shitload of vram for context. Attention shit introduced some issues but definitely made general usage better
>>109186367
it's an engineering sample. they work perfectly though, the only thing you need to know is to change the package C-state in the BIOS, otherwise it won't boot an OS. i've had one for like 1.5 years, the retail version of the chip is like 10k kek. i only have the Ace, which is 4 memory channels, but i compared benchmarks with someone using all 8 channels and they got 2x my perf on LLM inference. bunch of info about them here:
https://forums.servethehome.com/index.php?threads/asus-pro-ws-w790e-sage-se-intel-xeon-sapphire-rapids-spr-sp.41306/page-44
you can also disable a number of cores to get higher boost clocks. i run with half disabled, which gives a boost to 3.7GHz
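The 2x from doubling memory channels is about what you'd expect if token generation is memory-bandwidth-bound. A back-of-the-envelope sketch, assuming theoretical peak bandwidth (channels x MT/s x 8 bytes) and that every generated token streams all active weights once. The DDR5-4800 speed and the 20 GB of active weights are assumed numbers for illustration, not anyone's measured setup, and both function names are hypothetical.

```python
def peak_bandwidth_gbs(channels: int, mt_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s.

    Each channel moves bus_bytes (8 for a 64-bit channel) per transfer.
    Real sustained bandwidth is lower than this.
    """
    return channels * mt_per_s * bus_bytes / 1000


def decode_tps_upper_bound(bandwidth_gbs: float, active_weights_gb: float) -> float:
    """Upper bound on tokens/s if each token reads all active weights once."""
    return bandwidth_gbs / active_weights_gb


if __name__ == "__main__":
    ACTIVE_WEIGHTS_GB = 20.0  # assumption: quantized active weights per token
    for channels in (4, 8):
        bw = peak_bandwidth_gbs(channels, mt_per_s=4800)  # assumption: DDR5-4800
        tps = decode_tps_upper_bound(bw, ACTIVE_WEIGHTS_GB)
        print(f"{channels} channels: {bw:.1f} GB/s peak, <= {tps:.2f} t/s")
```

This is only a ceiling. Prompt processing is compute-bound and won't scale the same way.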
>>109186201
you should be buying this. this is my setup, except i have 3200MHz RAM.
https://www.ebay.com/itm/127199765529
>>109186201Windows is freehttps://github.com/massgravel/microsoft-activation-scripts
>>109186203
kimi 2.6 or glm 5.2 - that's literally it
>>109186204
>>109186440
Step 3.7 works too.
True KimiGODS use K2 with 2.7's vision encoder.
>>109186454
But jart is not in the thread, nobody likes jart, and nobody talks about jart except the guy that keeps linking his deleted blog post every thread.
>>109186454
Jart is in these threads and that's probably why he keeps reposting it. Just filter the filename and move on.
>>109186429extremely lazy response for how long it took you to reply, especially with how you lack the cognitive ability to actually break down said comic and compare it to what I said. I'll help you a little: do you think a weed is going to call another weed a weed? I'm calling you a retarded closeted tranny that hates trannies and I want you to shut up so I can read about AI. Fuck off already and go haunt some other general
>>109186467ahem... nigger faggot
>>109186267
Ok: Telling a troll that has made the same post multiple times a day for a month to not acknowledge things and they'll go away is dumb on any number of levels. You're not taking your own advice, but it doesn't matter because he's a demonstration of why it's not good advice anyway, and above all, there's obviously no point trying to approach him as if he's a normal human being.
>>109186486
this is at least what I expect from an underwater basket weaving thread, thanks
>>109186491
slop
>>109186131>Lmao you pathetic racists never fail to make me laugh with your "pol humor" threadsFace it, most poc will be infinitely more successful than any of you sad virgins ever will be. You are on the wrong side of history, get over it losersThanks for the blessing.
Now I have the complete picture.
>>109186454
jart and /lmg/ are forever connected. you can't have one without the other so people must be reminded to not fall for it again
this used to be in the OP for a reason but got removed due to sabotage
>>109186521sorry to break it to you but no amount of training will ever make your model feel "real"
>>109186197jesus
>>109186552
>>109186197safety slopping once again ruining models
>>109186506>stop typing lazily and elaborate>elaborating systematically sounds too much like slopfuckin hell you're needy
>>109186201
at least get a Threadripper
>>109186597>refusing to elaborate on specific things>shitting out slop as responses:^) needy for you to not be a faggot loser, yeah. Type your own words and stop being a coward.
>>109186552Stay strong, Miku
>>109186131
>I always thought my security posture was too paranoid, so when llama.cpp came out in 2023, I found the code Gerganov wrote to be so beautiful that I did the one thing that I promised myself I would never do, which was collaborate with an anonymous developer from his team named Slaren. [...] After submitting our work he went on 4chan afterwards and accused me plagiarism, saying that even my changes were his own. The way the community reacted is an interesting case study into the guile some developers have learned since the culture war, because the locus of thought for llama.cpp has always been on 4chan. [...] I actually developed migraines for the first time in my life and ended up in the hospital (since I didn't have health insurance and had to wait in the ER) due to the eye strain of reading unfiltered thoughts about me for months.
1 paragraph later:
>In any case, I'm really happy that these back channels exist, because the greatest competitive advantage I've ever had was to monitor which pull requests people on 4chan complained about, and then merge them into llamafile before Gerganov could.
There's no way this person is real.
>>109186137
>AI will hit a wall in 2 more weeks
>this time for real
I am so tired of these people.
>>109186561Sabotage for shekel farming. Or they're just serving a quantized model to the goyim.
>>109186552>man hands
>>109186647They are and they deserve all the bullying they get.Cultureposter, post the full rentry.
>>109186561i feel safe now
>>109186454>But jart is not in the threadThis is a mikutroon thread. Mikutroons are actual troons.
>>109186721lmao I remember making that Flux image back in 2024, takes me back
>>109186552
this is real
>>109186721
this is AI
>>109186728>Mikutroons are actual troons.i wish im too old and hairy to be a cute girly boy
>>109186738>too old and hairyThat never stopped any troon
>>109186721
Believe it or not, Miku isn't at home
Please leave a oo-ee-oo at the beep
I must be out, or I'd pick up the leek
Where could I be? Believe it or not, I'm not home
That's a child
CHINA HAS ILLEGALLY DISTILLED FABLE/MYTHOS
>>109186761>ILLEGALLY DISTILLED something that was trained on pirated books
>>109186761If they keep opening up the copyright file it's going to slap them in the face eventually
>>109186761Oy Vey!
I hate how many anons here make it impossible to write code with dignity.
>>109186761
how do you compare weights and biases of models you can't even download? i don't understand how they can detect similarity without having the actual model files.
>>109186771 (Me)
Because of basically this >>109186763
Current court precedent only supports the fair use argument for free open source
>>109186794Post code written with dignity.
>>109186797
nta and no idea if that matrix is fake
but something like this: https://github.com/sam-paech/slop-forensics
>>109186552mikucunny ToT
>>109186440
>True KimiGODS use K2 with 2.7's vision encoder.
I tried this last time anon suggested it. It kind of works, but not accurately.
>>109186761
Why are you so gullible? You can tell immediately by glancing at Opus 4.7 and 4.8 or V4 Flash and Pro that the chart is meaningless.
>>109186721>Jerry, I know you're having trouble picking up girls recently but are you sure this is a right idea?
>>109186797
literal and semantic shape of outputs, or some similar heuristic
consider how you could buy a pack of chips from two stores and compare them side by side. if they look the same, taste the same and have other similarities, you might deduce that they had similar sources or suppliers.
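To make the "literal shape of outputs" idea concrete, here is a toy sketch that compares two sets of model outputs by word-trigram overlap (Jaccard similarity). This is a stand-in heuristic for illustration, not what slop-forensics or the chart in question actually does. All function names are hypothetical and the sample strings are placeholders, not real model outputs.

```python
def ngrams(text: str, n: int = 3) -> set:
    """Set of lowercase word n-grams in a single text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def profile(outputs: list, n: int = 3) -> set:
    """Union of n-grams over all outputs collected from one model."""
    grams = set()
    for text in outputs:
        grams |= ngrams(text, n)
    return grams


def jaccard(a: set, b: set) -> float:
    """|A & B| / |A | B|, defined as 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Placeholder outputs; in practice you'd sample both models on the same prompts.
    model_a = ["the quick brown fox jumps"]
    model_b = ["the quick brown fox sleeps"]
    model_c = ["completely different words appear here"]
    print("a vs b:", jaccard(profile(model_a), profile(model_b)))
    print("a vs c:", jaccard(profile(model_a), profile(model_c)))
```

High overlap on shared prompts only shows the outputs look alike. As with the chips, that is suggestive of a common source, not proof of one.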
>>109186761>ILLEGALLY