/g/ - Technology






File: 1757063045674968.png (1.12 MB, 1080x1024)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108368195


►News
>(03/11) Nemotron 3 Super released: https://hf.co/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
>>108378991

Weird. I lost my dad two weeks ago. I feel you bro.
>>
This could be the last thread before v4 drops
>>
There's some sick irony about how Jews have pressured Christian societies into cremation over proper burial, wiping all physical trace that we ever lived.
>>
>>108379019
I wanna be cremated or fed to some wild animals.
>>
>>108379019
Nope, it was Christian boomers buying all the land for peanuts and now you have to pay 100k/y to have a mildly decent grave.
>>
There's some sick irony about how ablublus have pressured agagas societies into ice cream over brownies, wiping all cream trace from their stomach.
>>
>>108379060
prompt?
>>
>>108379013
>slopinator v4
>>
>>108378821
Okay I've been doing some more testing and Qwen is still definitely more intelligent and better at writing, but it's extremely safetyslopped. Gonna try to download an abliterated or heretic version and report back.
>>
>>108378991
where are the previous threads again? whatever man
>>
>>108379089
Enough about Gemma
>>
File: 1772824825605954.mp4 (3.91 MB, 834x1112)
Can't believe chinks got to have fun with Seedance 2.0 while global release is forever shelved
>>
>>108379089
Even in v3, DeepSeek is awesome. Maybe the prose is sloppy, but for most useful tasks it is powerful enough. Sure, one of its advantages is the API that makes it cheap as hell, but fuck, it's cheaper than most hosted 8b models while being on a completely different level of intelligence. They also pioneered releasing weights at that scale for those who can run them locally. I'm not sure what DeepSeek's endgame is (I know the backstory, Highflyer hedge fund, etc., still don't understand why) but they're doing a lot of good to democratize all of this.
>>
>>108379191
local video gen wasn't far behind proprietary last I checked if you factor in all the fancy tech we have like loras
i'm sure we're just a few months away from having this local
>>
>>108379196
Shit's moving fast. I remember having my mind blown when text models started being able to close open quotes in a sentence where the quote actually ended, and the whole thing about the avocado chair.
>>
are we going to see a chinese open source version of genie 3 or do we have to wait for lecun?
>>
>>108379221
This mikufag is an unironic shill doing prep work for NAI's upcoming fine-tune. Trying to populate /aids/ before that happens.
You don't hate mikufags enough.
>>
>>108379191
In five years we'll have this quality in real time and hooked up to vr headsets.
>>
>>108379127
the one shared by an anon, HauHau, never gave me any refusal so far
>>
>>108379019
i remember overhearing my grandfather telling my mum he wants to be cremated once he dies and his ashes spread in a specific forest that was close to his childhood where he owned a cottage and some land
what did my family do once he died? completely ignored his request and buried him in some cemetery
i am still malding over it to this day
>>
I was checking this (shockingly terrible) website https://swallow-llm.github.io/leaderboard/index-post.en.html
and the results are so odd: in 90% of the cases it seems like their models perform worse on the benchmarks after training. wtf is going on with this?
>>
>>108379274
yeh, it's not refusing anymore but it's struggling to follow a conversation. Actually that's bad phrasing... it just kind of dilly-dallies and won't push the story/RP forward. Is this fixable?
>>
>>108379279
Exhume him and complete his wish
>>
Actual previous thread because OP is a fucking retard
>>108373481
>>
>>108379060
I want to know more about ablublus and agagas
>>
File: ffff.png (515 KB, 832x1050)
►Recent Highlights from the Previous Thread: >>108373481

--AI Fishtank project with Qwen 3.5 9B autonomously exploring tasks:
>108375617 >108375631 >108375688 >108375700 >108375729 >108375728 >108375735 >108376142 >108376168 >108376193 >108376203
--Decensoring local models with limited hardware:
>108373570 >108373581 >108373597 >108373606 >108374177 >108374238 >108374245 >108374271 >108374273 >108374295 >108374304 >108374215
--Qwen 3.5's self-correcting gendered language artifacts:
>108374865 >108374879 >108374898 >108374921 >108374925 >108374935
--Token generation speed depends on hardware, not task complexity:
>108375142 >108375153 >108375154 >108375167 >108375169 >108375162 >108375192
--Testing LLM knowledge of US satellite imagery restrictions on Israel:
>108375913 >108375942 >108378276
--Prefill limitations and alternatives for model restriction bypass:
>108374446 >108374466 >108374481 >108374489 >108374498 >108374554 >108374573 >108374471
--AI blocks draft question while another explains Selective Service rules:
>108377817 >108378035
--Frustration with AI censorship flagging technical questions:
>108374529 >108374545 >108374639 >108374962 >108375047 >108375074
--Niche use cases for single-digit parameter models:
>108377262 >108377267 >108377299 >108377318 >108377340 >108377278 >108377284 >108377287
--GLM-5-Turbo release and performance speculation:
>108378714 >108378726 >108378766 >108378808 >108378868
--Qwen3.5-4B model performance and benchmark validity debate:
>108374699 >108374858 >108375112
--Step-3.5-Flash-SFT dataset flagged for unsafe file:
>108374561 >108374564
--Miku (free space):
>108374756 >108374873 >108375601 >108377176 >108377312 >108377944 >108378323 >108378451 >108378480 >108378546 >108378794

►Recent Highlight Posts from the Previous Thread: >>108373888

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
>>108379191
why did we let this happen?
>>
>>108379250
hooked up to my brain*
>>
how much money can i make using ai?
>>
>>108379328
depends how many gullible people are around you
>>
>>108379330
qrd
>>
>>108379328
if your name is yann lecun you can make a billion dollars just by saying you're making a new ai company (while having no product to show)
>>
>>108379328
about tree fiddy
>>
>>108379328
how much money can i make using the internet?
>>
>>108378991
can't believe you faggots are getting reamed by a reddit post
https://www.reddit.com/r/mildlyinfuriating/comments/1ru97y3/family_friend_sent_me_ai_generated_response_to/
>>
>GLM5-Turbo
Weights please, thank you?? I may forgive the great zai betrayal of making the goddess 2 times fatter and unrunnable.
>>
>>108379378
Yes we have a redditor baker who hates anything Japanese. The /lmg/ situation is crazy :Skull:
>>
>>108379378
it's better than the alternative
>>
File: 1764267506556291.png (30 KB, 470x218)
>>108379390
bro the 700b one is the "lite" version
turbo is the "pro"
>>
>aws outage caused by retarded agent
The future is bright.
>>
>>108379378
we already knew where the low effort screenshots came from, don't feel like commenting on that.
>>
>>108379406
yes, let's blame ai and ignore the burning desert datacenter
>>
>>108379402
You read it wrong
Pro and Lite are the names of their API plans
>>
>>108379426
The previous outage, not the current.
>>
>>108379406
I don't think that Trump qualifies as a Mossad agent just yet.
>>
I had an idea: what if we get a 700b model and split it among our computers, so we all power it and all have access to it? Who's with me?
>>
>>108379304
He is doing the most important job - keeping the OP thread on topic. Kill yourself mikutroon faggot.
>>
>>108379443
I'll make the logo
>>
>>108379443
What's the point when the 4B model is only 15% worse?
>>
>a year ago everyone and their mom was running deepseek v3-0324 671b at home
>suddenly running a model that size is apparently impossible with glm5
What happened? When did this general get run over by poorfags?
>>
>>108379482
When we found out that you don't even need a model with hundreds of billions of parameters
>>
where is the new deepseek
i want to have deepsex
>>
>>108379488
>0.8B "half as good" as the 397B.
The benchmark looks so fucking dubious. Or the percentages are meaningless.
>>
>>108379492
What if the new deepseek has a lower number of parameters than the old one?
>>
>>108379498
Those are relative.
If the 397b is getting 64 points for example, then the 0.8b is getting 34 points.
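To put the arithmetic in code, a minimal sketch of the conversion. The 64 and 34 are just the example numbers above; the ~53% relative score is implied by them, not a published number:

```python
# Relative benchmark scores: the small model's number is a percentage
# of the big model's absolute points, not an absolute score itself.
def absolute_points(baseline_points: float, relative_pct: float) -> float:
    return baseline_points * relative_pct / 100.0

# Example numbers from this post: if the 397b scores 64 points and the
# 0.8b is reported at ~53% relative, that works out to ~34 points.
print(absolute_points(64, 53.125))  # 34.0
```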
>>
>>108379482
With 192GB I go from IQ4_XS on 4.7 to IQ1 on 5.0
>>
>>108379498
Could be correct. When I used 397B it was legit retarded.
>>
>>108379498
>how much 1+1?
379b
>2
0.8b
>1
>hmm yes it got half of the answer right
kek
>>
>>108379482
You need at least 200 T/s for agentic coding/openclaw stuff.
>>
File: 1549732494900.png (58 KB, 396x462)
qwenshills have been working overtime for 3.5
>>
>>108379488
qwen was such a gamechanger for us
there's no need to run anything else, it's so awesome
>>
>>108379482
GLM 5 is retarded on llama.cpp because it's not implemented properly.

edit: Just saw that the guy working on it pushed a branch https://github.com/ggml-org/llama.cpp/issues/20363
>>
>>108379467
does that mean if you run the 4b x3 it is 100% accurate?
>>
>>108379562
True, I'm running GLM5 at Q8 right now but it's dumber than Qwen3.5 300b, and arguably the 112b as well. It's really not worth it right now at all if you're using llama.cpp.
The API is much better, but it's not that much smarter than Qwen despite being so much slower, so you might want to run Qwen anyway.
>>
Qwen... so tasty. So hard to stop.... sucking it... to post. I love Qwen.
>>
>>108379591
You run agent swarm made of 4b agents.
>>
File: 1756893143304264.png (28 KB, 243x271)
Why the fuck is (Qwen3.5-35B-A3B-heretic-Q4_K_M.gguf) so much hornier than (Qwen3.5-9B-heretic.Q4_K_M.gguf)

???? This is making me want to try 122B-A10B lmaooo
>>
>>108379606
yes or no benchod
>>
>>108379623
You do understand that since these are MoE models, the smaller ones tend to have entire sections removed, right? Obviously a prime target is their knowledge of porn/smut/erotica
>>
>>108379648
I can't even tell if you're agreeing with me or calling me a liar.
>>
>>108379449
Take your meds. You're so obsessed with "mikutroons" you can't even bake a thread properly.
>>
>>108379659
I'm agreeing and explaining why.
>>
File: 1772309399871437.png (105 KB, 608x938)
Is trolling a local model like beating your wife?
>>
>>108379675
It was already proven that you faggots don't care about (((quality))). You just want to force your special interest on normal people. A behavior typical of troons. Vocaloids have nothing to do with AI and they never will.
>>
>>108379720
Leave the mikutroons alone bro. I don't even like tranime but you're overreacting.
>>
File: file.png (1.13 MB, 1800x1200)
>>108379741
>Leave the mikutroons alone
>>
>>108379689
you are literally trolling yourself since the model and its outputs are your own making
>>
File: Beavis-Stirner.png (31 KB, 991x1753)
>>108379752
that has always been the point bro. Everything anyone does is always for selfish egoist reasons.
>>
>>108379768
e-ego?
did someone say ego de—ACK!
glm save me
>>
>>108379562
I'm getting the exact same distribution on cockbench with and without the branch so I feel like I did something wrong.
>>
>>108379782
bros I summoned him.
>>
What's the best coding model I can run on my garmin smart watch?
>>
hi frens, i happen to have this website in japanese full of porn stories that i absolutely need to translate in order to read. nothing wrong with that, right?

all AI, even grok, are refusing because it involves minors

i am using deepL but not only is it not that good, i also have a feeling i may be raising some flags

how do i get a local model to do this? i have a gaming pc btw
>>
>>108379544
Sonnet 4.6 told me I only need 20 tokens per second to run openclaw.
>>
>>108379863
>i have a gaming pc btw
this could be a 970 or a 5090 lol, or ayymd. either way, if you can't follow a youtube tutorial you can ask a chatbot itself how to install this locally, then ask it how to download the abliterated/heretic version of a model from huggingface
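once a server is actually running, the translation call itself is trivial. a minimal sketch, assuming a llama.cpp llama-server (or koboldcpp) listening on localhost:8080 with its OpenAI-compatible endpoint; the port, system prompt, and helper name here are my own assumptions, not anything from the thread:

```python
# Hedged sketch: build a chat-completion request for a locally hosted
# server's OpenAI-compatible endpoint. Send it afterwards with
# urllib.request.urlopen(req) and read the JSON response.
import json
import urllib.request

def build_translation_request(
    text: str,
    url: str = "http://127.0.0.1:8080/v1/chat/completions",  # assumed local port
) -> urllib.request.Request:
    """Build the POST request asking the local model to translate."""
    payload = {
        "model": "local",  # llama-server typically serves whatever model it loaded
        "messages": [
            {"role": "system", "content": "Translate the user's Japanese text into English."},
            {"role": "user", "content": text},
        ],
        "temperature": 0.3,  # low temperature: translation, not creative writing
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
```

no flags raised, nothing leaves your machine; swap in whatever abliterated/heretic model you downloaded.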
>>
>>108379801
Qwen 0.8b probably
>>
>>108379801
functiongemma


