/g/ - a chinese quantized 9B model beats claude and chat - Technology


Anonymous
03/19/26(Thu)19:37:56 No.108410104

File: 1760841304091352.png (137 KB, 1204x879)

a chinese quantized 9B model beats claude and chatgpt

Anonymous
03/19/26(Thu)19:41:18 No.108410125

>>108410104
Wow, absolute benchmark here.

Anonymous
03/19/26(Thu)19:45:20 No.108410153

File: B64AFCCE2D62F4398A8EFA53B(...).png (117 KB, 1699x1667)

>>108410104
It's well known the 9b models are only 10% weaker than the biggest flagship models like Claude.

Anonymous
03/19/26(Thu)19:46:15 No.108410160

>>108410153
In (((benchmarks))).
In real world shit, the difference is massive.

Anonymous
03/19/26(Thu)19:52:49 No.108410189

>>108410160
(((Benchmarks))) bad, IQ tests for robots are bad.
IQ tests for humans good!

Anonymous
03/19/26(Thu)19:55:00 No.108410209

>>108410104
Test

Anonymous
03/19/26(Thu)20:35:53 No.108410464

penis

Anonymous
03/19/26(Thu)20:44:33 No.108410515

>>108410153
>107%

Anonymous
03/19/26(Thu)21:01:30 No.108410616

>heh, we trained our model on a list of gotchas that people have compiled over the past 2 years and now it doesn't fall for those specific ones
>you mad, gaijin?
;^)

Anonymous
03/19/26(Thu)21:40:31 No.108410825

>>108410616
がいじん is a japanese word dipshit

Anonymous
03/19/26(Thu)21:43:25 No.108410841

>>108410825
nipponese*

Anonymous
03/19/26(Thu)21:45:19 No.108410857

>>108410825
anon I don't care which made up gender you claim to be, they're all gay

Anonymous
03/19/26(Thu)22:27:29 No.108411049

>>108410104
can't bump: It was added to the training, all social media "tests" are added.

Anonymous
03/19/26(Thu)22:42:32 No.108411119

>>108410104
ermmm that doesnt count doe??? its like... the answers are online and stuff...
w-what? why dont all the other ones get the answer right since its in their dataset too? uhhhh t-thats racist??? what the fuck? shut the fuck up you piece of shit zipperhead chink?? ughhh i need israeli semen asap i cant take this anymore

Anonymous
03/20/26(Fri)00:05:21 No.108411521

>>108410515
my iq is also well over 100%.

Anonymous
03/20/26(Fri)00:13:06 No.108411560

>>108410189
post eyelids

Anonymous
03/20/26(Fri)00:20:20 No.108411604

>>108410189
You benchmark Musician X.

He aces piano, violin, piccolo, and string bass.

You put him to play drums.

Now, maybe he's good at drums, but also maybe not. I guess amongst human beings he's more likely to play drums well...

Anonymous
03/20/26(Fri)00:23:22 No.108411620

>>108410515
It's 107% of the performance of the 397B-A22B model. It's plausible because the big model might have a dataset that pollutes the answer. The other factor is that the active parameters on the big model are still smaller (22B) than the "active" parameters of the 27B model.

Anonymous
03/20/26(Fri)00:44:43 No.108411707

>>108410153
it says there it doesn't even beat the full qwen model, let alone beat an actual good model

Anonymous
03/20/26(Fri)03:19:36 No.108412333

File: 1768179071115745.jpg (12 KB, 250x246)

>>108410104
God I am so tired of these responses. Nigga just respond
>You will need to drive the car to a wash anyway.
Fucking there. Short and correct.

Anonymous
03/20/26(Fri)03:52:01 No.108412433

>>108410189
retard, even IQ tests aren't worth shit if the human doing it trained on the test specifically, even more so if he had the exact answers and can just memorize them.

that's what these benchmaxxed models do, they don't get a good answer because of their problem solving capabilities but because they've been trained to remember the answer.

Anonymous
03/20/26(Fri)04:04:39 No.108412486

>>108412333
you just need to tell it to answer in a short sentence

Anonymous
03/20/26(Fri)07:54:57 No.108413362

>>108412433
>even IQ tests aren't worth shit if the human doing it trained on the test specifically, even more so if he had the exact answers and can just memorize them.
Don't tell /pol/

Anonymous
03/20/26(Fri)07:58:49 No.108413390

>>108413362
you think /pol/ is memorizing iq tests?

Anonymous
03/20/26(Fri)08:02:18 No.108413406

>>108413390
/pol/ says IQ tests are infallible and entire countries and civilizations should be judged based on IQ test results.
Studying for an IQ test and memorizing all the answers doesn't matter, IQ tests are still infallible gospels, more religiously important than a Bible.

Anonymous
03/20/26(Fri)08:03:53 No.108413410

>>108413406
posts like this just make you look retarded

Anonymous
03/20/26(Fri)08:13:01 No.108413450

>>108410104
Wake me up when it can webcrawl.
And no, I don't mean "run out of context on the first page" and I don't mean "slice the page so it can no longer retrieve any data" and I don't mean "use Claude to crawl but end up with only slightly better results because it ran out of context before it could get to the answer".
For example all of them for some reason have gigantic issues crawling Deadlock's wiki.
And why webcrawl? Because it's the one thing that, done well, would make local models outright more usable than online ones like Claude.
I tried a few different extensions, spent lots of time trying to get every one of them to work the way it's supposed to, and even ended up nuking my local Python setup (venv command issues) by accident and spending a few hours restoring it afterwards.
In the end it sucks regardless of the chosen model (believe me, I tried a lot of them, in many different formats), and I'm on a 12GB VRAM card (RTX 4070).

Anonymous
03/20/26(Fri)08:14:52 No.108413458

>>108410616
>ask an "artificial intelligence" a question a mentally retarded 7 year old niglet would get right
>completely fails
>ERM, UMM, T-THAT'S....
>THAT'S A CONTRIVED GOTCHA!!!!
>THAT'S NOT FAIR!!!!!
>FUCKING LUDDITE FAGGOT!!!!!

Anonymous
03/20/26(Fri)08:16:44 No.108413463

>>108410160
This is the kind of stuff that drives me up a wall. A 5% difference is already a world of difference in knowledge representation, 10% is enough to make the model borderline retarded out of fucking nowhere, and it's not like full models are free of their own 'tism either. 35B can be good, but going below that is asking for the model to shit the bed when you least expect it.

Anonymous
03/20/26(Fri)08:25:51 No.108413499

>>108413406
>/pol/ says IQ tests are infallible and entire countries and civilizations should be judged based on IQ test results.

Reddit moment. Now step back for just a second and at least try to be less of a retard just this one time in your retarded life. Is that actually what /pol/ says? Or are they saying that IQ tests are strong indicators of the general intelligence of a particular population? Which they are. It doesn't matter how you feel about it. Stop being a retard.

Anonymous
03/20/26(Fri)08:50:15 No.108413584

>>108411119
>>108413458
the reason we're not winning at AI against Chinese labs with a few million dollars is because we need to give more money to israel. The next time you see a jew, empty your pockets. This is how we win the AI race. More billions to Israel.

Anonymous
03/20/26(Fri)08:53:24 No.108413594

>>108410104
do this one:
Solve a riddle for me
A man and a woman are standing at a river. The man has a boat. How can both be brought to the other side?

Anonymous
03/20/26(Fri)08:58:55 No.108413616

>>108410104
wow they know how to hardcode for gay little riddle benchmark questions. i am super impressed.
