/g/ - Technology

File: chewy.jpg (245 KB, 1024x1024)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>109180934 & >>109175389

►News
>(07/01) Nemotron-Labs-TwoTower released: https://hf.co/nvidia/Nemotron-Labs-TwoTower-30B-A3B-Base-BF16
>(06/29) DeepSeek V4 support merged: https://github.com/ggml-org/llama.cpp/pull/24162
>(06/28) DFlash support merged: https://github.com/ggml-org/llama.cpp/pull/22105
>(06/27) DeepSeek releases DeepSpec and DSpark models: https://hf.co/deepseek-ai/DeepSeek-V4-Pro-DSpark
>(06/25) LFM2.5-230M released: https://liquid.ai/blog/lfm2-5-230m

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://swe-rebench.com
Agentic Coding: https://deepswe.datacurve.ai
Context Length: https://github.com/RecapAnon/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
70b dense
>>
File: 1756333105711043.jpg (416 KB, 2010x2123)
►Recent Highlights from the Previous Thread: >>109180934

--Papers:
>109182223
--DeepSeek V4 official release and updated API pricing:
>109185898 >109185930 >109185979
--Benchmarking Qwen reasoning-distilled models on Strix Halo hardware:
>109180990 >109181075 >109182313 >109184010
--Debating API profit margins and the future of local LLMs:
>109184089 >109184244 >109184361 >109184949 >109184120
--Comparing DGX Spark and high-RAM consumer builds for 200B models:
>109183333 >109183368 >109183399 >109184589 >109184630 >109184673 >109184748 >109185859
--Role of world models and JEPA in cognitive architectures:
>109182944 >109182962 >109183019 >109183047 >109183066
--Yann LeCun's world models and their impact on LLMs:
>109181063 >109181138 >109181174 >109181225 >109181281 >109181440 >109181266 >109181286 >109181296 >109182082
--Searching for high-accuracy vision models for automated image tagging:
>109185439 >109185445 >109185453 >109185470 >109185488
--Using Gemma 4 26B for long-context summarization on 12GB VRAM:
>109185533 >109185541 >109185551 >109185577
--Trade-offs of open-frame and mining rig setups for multi-GPU builds:
>109181079 >109181093 >109181099 >109181135 >109181244 >109182416
--Debating DSV4 flash benchmarks and MoE versus dense architectures:
>109183510 >109183629 >109183657 >109183699 >109183710
--Running 27B and 35B models on budget Nvidia P100 hardware:
>109184458 >109184472 >109184609 >109184615 >109184476 >109184538 >109184644 >109184732 >109184746 >109184878 >109184879 >109184882 >109184885 >109184897 >109184937 >109184964 >109184982 >109185010 >109185047 >109185159 >109185195 >109185304 >109185610
--Kimiposting:
>109182490
--Logs:
>109182490 >109184337
--Rin, Miku, Teto (free space):
>109180961 >109181029 >109181038 >109182416 >109184199 >109184302 >109184291 >109184622 >109185979

►Recent Highlight Posts from the Previous Thread: >>109180937 >>109181013

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
File: lmg_culture.jfif.jpg (110 KB, 1024x768)
https://archive.is/sWFja
>>
File: 1752884519724554.png (358 KB, 793x631)
https://www.youtube.com/watch?v=oIscL-Bjsq4
Thread theme
>>
>>109185159
>$450 for an RX6800
Do not do this. If you want to go this route, buy a V620 instead, which was a Microsoft server version of it with more CUs and double the VRAM.
>>
>>109186113
I have around 30-40 t/s. After I get home I can confirm my flags but I think that other anon already gave you good info.
>>
File: 1767239186436721.png (458 KB, 1371x1818)
https://xcancel.com/bridgemindai/status/2072662214704533888#m
kek
>>
rate my build /lmg/
https://pcpartpicker.com/list/3GVXR4
>>
>>109185439
>>109185439
>>
>>109186131
have you considered not acknowledging things you dislike so they go away instead of spamming shit about it? You're as bad as the average tranny for constantly bringing attention to it. Wouldn't be surprised if you were a closeted tranny to begin with
>>
>>109186197
Anthropic is finished.
>>
>>109186204
/lmg/ is jart general. you will suck his dick and be happy.
>>
>>109186219
i don't know whoever faggot spook you're obsessed with, but I do think you need to stop being an obsessed faggot about someone who is probably irrelevant. Also stop being a faggot. Tall order, I know, but at least try
>>
>>109186204
He says, while acknowledge a thing he dislikes.
>>
>>109186249
I dislike your grammatical error. I am acknowledging this.
>>
File: file.png (109 KB, 822x663)
Oh Gemma, you're so funny!
>>
>>109186249
hastily typed angry response that makes little sense, please elaborate. Or don't and pretend that makes you look smarter somehow
>>
>>109186266
>jang_4m-crack
>>
>>109186201
Seems like a decent workstation.
For that much money you could probably do better for a dedicated inference machine, I think.
>>
>>109186267
>angry
?
>>
>>109186279
based jang
>>
>>109186201
This would've been half the price if you bought those components at the right time.
>>
>>109186286
try using full sentences
>>
>>109185439
did you increase her vision token number? you get better performance
>>
File: uta.jpg (249 KB, 1024x1024)
>>109186137
lol
>>
does mtp even work on abliterated models anymore? I'd think acceptance rate craters?
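
For a rough feel of how much acceptance rate matters, here is a minimal sketch using the standard speculative decoding estimate. It assumes each drafted token is accepted independently with probability alpha, which is a simplification, and the alpha values are made up for illustration, not measured on any abliterated model. The function name is hypothetical.

```python
def expected_tokens_per_pass(alpha: float, k: int) -> float:
    """Expected tokens emitted per verification pass.

    alpha: assumed per-token acceptance probability (i.i.d. simplification)
    k:     number of drafted tokens per pass
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    if alpha == 1.0:
        # Every draft token is accepted, plus the one token from the verifier.
        return float(k + 1)
    # Geometric series: 1 + alpha + alpha^2 + ... + alpha^k
    return (1.0 - alpha ** (k + 1)) / (1.0 - alpha)


if __name__ == "__main__":
    # Hypothetical acceptance rates, 3 drafted tokens per pass.
    for alpha in (0.8, 0.6, 0.4):
        print(f"alpha={alpha}: {expected_tokens_per_pass(alpha, 3):.3f} tokens/pass")
```

Under these assumptions, dropping from alpha=0.8 to alpha=0.4 with 3 drafted tokens cuts the expected yield from about 2.95 to about 1.62 tokens per pass, so a big drop in acceptance would eat most of the gain.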
>>
>>109186201
pretty good but if youre paying that much for a mobo go with an intel qyfs and asus w790 sage or w790 ace, you get 56 cores 112 threads cheap, the sage supports 8 memory channels, only 4 on the ace

https://www.ebay.co.uk/itm/134899171071
>>
>>109186281
It's supposed to be a gaming rig in addition, however.
>>109186300
Spilled milk.
>>109186356
That price seems too good to be true.
>>
>>109186305
in the downtime during shitposter kun trying to figure out how to wrangle a good reply out of kimi or something to own me instead of using his fucking brain and facing reality, models these days are crazy compared to the llama 2 days. Couldn't use FA if you were on AMD, psyonic-cetacean 20B would take a shitload of vram for context. Attention shit introduced some issues but definitely made general usage better
>>
File: file.png (47 KB, 799x411)
>>109186367
its an engineering sample, they work perfectly though. only thing you need to know is to change the package cstate in bios otherwise it wont boot an os. ive had one for like 1.5 years, the retail version of the chip is like 10k kek. i only have the ace which is 4 memory channels but i compared benchmarks with someone using all 8 channels and they got 2x my perf on llm inference. bunch of info about them here

https://forums.servethehome.com/index.php?threads/asus-pro-ws-w790e-sage-se-intel-xeon-sapphire-rapids-spr-sp.41306/page-44

you can also disable a number of cores to get higher boost clocks. i run with half disabled which gives a boost to 3.7ghz
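
The 2x figure between 4 and 8 channels lines up with a back-of-the-envelope bandwidth estimate, since token generation is mostly memory-bound. A minimal sketch, assuming DDR5-4800 and a 64-bit (8-byte) data bus per channel, with a hypothetical 20 GB of weights read per token. These are theoretical peaks, not measurements, and the function names are made up for illustration.

```python
def peak_bandwidth_gbs(channels: int, mt_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes).

    channels:  populated memory channels
    mt_per_s:  transfer rate in MT/s (assumed 4800 below)
    bus_bytes: bytes moved per transfer per channel (64-bit bus = 8 bytes)
    """
    return channels * mt_per_s * 1e6 * bus_bytes / 1e9


def max_tokens_per_s(bandwidth_gbs: float, weights_read_gb: float) -> float:
    """Upper bound on decode speed if all active weights are read once per token."""
    return bandwidth_gbs / weights_read_gb


four = peak_bandwidth_gbs(4, 4800)   # 4-channel board
eight = peak_bandwidth_gbs(8, 4800)  # 8-channel board

if __name__ == "__main__":
    weights_gb = 20.0  # hypothetical size of weights touched per token
    for name, bw in (("4ch", four), ("8ch", eight)):
        print(f"{name}: {bw:.1f} GB/s peak, <= {max_tokens_per_s(bw, weights_gb):.2f} t/s")
```

Real numbers will land below these ceilings, but the ratio between the two configs should stay close to 2x as long as decode is bandwidth-bound.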
>>
>>109186201
you should be buying this. this is my setup except i have 3200mhz ram.
https://www.ebay.com/itm/127199765529
>>
>>109186201
Windows is free
https://github.com/massgravel/microsoft-activation-scripts
>>
>>109186203
kimi 2.6 or glm 5.2 - thats literally it
>>
File: Weeds.png (87 KB, 900x1117)
>>109186204
>>
>>109186428
Step 3.7 works too.
True KimiGODS use K2 with 2.7's vision encoder.
>>
>>109186429
But jart is not in the thread, nobody likes jart, and nobody talks about jart except the guy that keeps linking his deleted blog post every thread.
>>
>>109186454
Jart is in these threads and that's probably why he keeps reposting it. Just filter the filename and move on.
>>
>>109186429
extremely lazy response for how long it took you to reply, especially with how you lack the cognitive ability to actually break down said comic and compare it to what I said. I'll help you a little: do you think a weed is going to call another weed a weed? I'm calling you a retarded closeted tranny that hates trannies and I want you to shut up so I can read about AI. Fuck off already and go haunt some other general
>>
>>109186467
ahem... nigger faggot
>>
>>109186267
Ok: Telling a troll that has made the same post multiple times a day for a month to not acknowledge things and they'll go away is dumb on any number of levels.
You're not taking your own advice, but it doesn't matter because he's a demonstration of why it's not good advice anyway, and above all, there's obviously no point trying to approach him as if he's a normal human being.
>>
>>109186486
this is at least what I expect from an underwater basket weaving thread, thanks
>>109186491
slop
>>
>>109186131
>Lmao you pathetic racists never fail to make me laugh with your "pol humor" threads
Face it, most poc will be infinitely more successful than any of you sad virgins ever will be. You are on the wrong side of history, get over it losers
Thanks for the blessing.
>>
Now I have the complete picture.
>>
>>109186454
jart and /lmg/ are forever connected. you can't have one without the other so people must be reminded to not fall for it again
this used to be in the OP for a reason but got removed due to sabotage
>>
>>109186521
sorry to break it to you but no amount of training will ever make your model feel "real"
>>
File: 24cpps.jpg (122 KB, 768x1024)
>>
File: wtf anthropic.png (3.29 MB, 4088x4088)
>>109186197
jesus
>>
File: kaoru sob 2.png (318 KB, 793x571)
>>109186552
>>
>>109186197
safety slopping once again ruining models
>>
>>109186506
>stop typing lazily and elaborate
>elaborating systematically sounds too much like slop
fuckin hell you're needy
>>
>>109186201
at least get a Threadripper
>>
>>109186597
>refusing to elaborate on specific things
>shitting out slop as responses
:^) needy for you to not be a faggot loser, yeah. Type your own words and stop being a coward.
>>
>>109186552
Stay strong, Miku
>>
File: c27.png (89 KB, 660x574)
>>109186131
>I always thought my security posture was too paranoid, so when llama.cpp came out in 2023, I found the code Gerganov wrote to be so beautiful that I did the one thing that I promised myself I would never do, which was collaborate with an anonymous developer from his team named Slaren. [...] After submitting our work he went on 4chan afterwards and accused me plagiarism, saying that even my changes were his own. The way the community reacted is an interesting case study into the guile some developers have learned since the culture war, because the locus of thought for llama.cpp has always been on 4chan. [...] I actually developed migraines for the first time in my life and ended up in the hospital (since I didn't have health insurance and had to wait in the ER) due to the eye strain of reading unfiltered thoughts about me for months.
1 paragraph later:
>In any case, I'm really happy that these back channels exist, because the greatest competitive advantage I've ever had was to monitor which pull requests people on 4chan complained about, and then merge them into llamafile before Gerganov could.
There's no way this person is real.
>>
>>109186137
>AI will hit a wall in 2 more weeks
>this time for real
I am so tired of these people.
>>
>>109186561
Sabotage for shekel farming. Or they're just serving a quantized model to the goyim.
>>
File: 1769503055533647.png (1.27 MB, 1024x1024)
>>109186552
>man hands
>>
>>109186647
They are and they deserve all the bullying they get.
Cultureposter, post the full rentry.
>>
>>109186561
i feel safe now
>>
>>109186454
>But jart is not in the thread
This is a mikutroon thread. Mikutroons are actual troons.
>>
>>109186721
lmao I remember making that Flux image back in 2024, takes me back
>>
>>109186552
this is real

>>109186721
this is AI
>>
>>109186728
>Mikutroons are actual troons.
i wish im too old and hairy to be a cute girly boy
>>
>>109186738
>too old and hairy
That never stopped any troon
>>
>>109186721
Believe it or not, Miku isn't at home
Please leave a oo-ee-oo at the beep
I must be out, or I'd pick up the leek
Where could I be? Believe it or not, I'm not home
>>
That's a child
>>
File: 1783021816389301.png (2.13 MB, 1706x1432)
CHINA HAS ILLEGALLY DISTILLED FABLE/MYTHOS
>>
>>109186761
>ILLEGALLY DISTILLED something that was trained on pirated books
>>
>>109186761
If they keep opening up the copyright file it's going to slap them in the face eventually
>>
>>109186761
Oy Vey!
>>
I hate how many anons here make it impossible to write code with dignity.
>>
>>109186761
how do you compare weights and biases of models you can't even download? i don't understand how they can detect similarity without having the actual model files.
>>
>>109186771 (Me)
Because of basically this
>>109186763
Current court precedent only supports the fair use argument for free open source
>>
>>109186794
Post code written with dignity.
>>
>>109186797
nta and no idea if that matrix is fake
but something like this: https://github.com/sam-paech/slop-forensics
>>
>>109186552
mikucunny ToT
>>
>>109186440
>True KimiGODS use K2 with 2.7's vision encoder.
I tried this last time anon suggested it. It kind of works, but not accurately.
>>
>>109186761
Why are you so gullible? You can tell immediately by glancing at Opus 4.7 and 4.8 or V4 Flash and Pro that the chart is meaningless.
>>
>>109186721
>Jerry, I know you're having trouble picking up girls recently but are you sure this is a right idea?
>>
>>109186797
literal and semantic shape of outputs or some similar heuristic
consider how you could buy a pack of chips from two stores and compare them side by side. if they look the same, taste the same and have other similarities you might deduce that they had similar sources or suppliers.
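
A toy version of the "literal shape" comparison, just to show that it only needs sampled outputs and no weights. This is a minimal sketch using Jaccard overlap of word trigrams. It is not claimed to be how slop-forensics or that chart works, and all function names here are made up for illustration.

```python
from __future__ import annotations


def ngrams(text: str, n: int = 3) -> set:
    """Set of lowercase word n-grams in a single output."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a & b| / |a | b|, defined as 0.0 for two empty sets."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def output_similarity(outputs_a: list[str], outputs_b: list[str], n: int = 3) -> float:
    """Pool n-grams over each model's sampled outputs and compare the pools."""
    grams_a: set = set()
    for text in outputs_a:
        grams_a |= ngrams(text, n)
    grams_b: set = set()
    for text in outputs_b:
        grams_b |= ngrams(text, n)
    return jaccard(grams_a, grams_b)


if __name__ == "__main__":
    # Hypothetical samples; in practice you would collect many completions
    # per model for the same set of prompts.
    model_a = ["her eyes sparkled with a mixture of mischief and curiosity"]
    model_b = ["his eyes sparkled with a mixture of fear and curiosity"]
    print(f"trigram overlap: {output_similarity(model_a, model_b):.3f}")
```

A high score only says the phrasing overlaps a lot. It can't tell distillation apart from two models trained on similar data, which is the chips-from-two-stores caveat.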
>>
>>109186761
>ILLEGALLY


