/g/ - Technology

File: 1760841304091352.png (137 KB, 1204x879)
a chinese quantized 9B model beats claude and chatgpt
>>
>>108410104
Wow, absolute benchmark here.
>>
>>108410104
It's well known the 9b models are only 10% weaker than the biggest flagship models like Claude.
>>
>>108410153
In (((benchmarks))).
In real world shit, the difference is massive.
>>
>>108410160
(((Benchmarks))) bad, IQ tests for robots are bad.
IQ tests for humans good!
>>
>>108410104
Test
>>
penis
>>
>>108410153
>107%
>>
>heh, we trained our model on a list of gotchas that people have compiled over the past 2 years and now it doesn't fall for those specific ones
>you mad, gaijin?
;^)
>>
>>108410616
がいじん (gaijin) is a Japanese word, dipshit
>>
>>108410825
nipponese*
>>
>>108410825
anon I don't care which made up gender you claim to be, they're all gay
>>
>>108410104
can't bump: It was added to the training, all social media "tests" are added.
>>
>>108410104
ermmm that doesnt count doe??? its like... the answers are online and stuff...
w-what? why dont all the other ones get the answer right since its in their dataset too? uhhhh t-thats racist??? what the fuck? shut the fuck up you piece of shit zipperhead chink?? ughhh i need israeli semen asap i cant take this anymore
>>
>>108410515
my iq is also well over 100%.
>>
>>108410189
post eyelids
>>
>>108410189
You benchmark Musician X.

He aces piano, violin, piccolo, and string bass.

You put him to play drums.

Now, maybe he's good at drums, but also maybe not. I guess amongst human beings he's more likely to play drums well...
>>
>>108410515
It's 107% of the 397B-A22B model's score. That's plausible because the big model's dataset might pollute the answer. The other factor is that the big model still has fewer active parameters (22B) than the "active" parameters of the 27B model.
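
A rough sketch of that parameter comparison. It assumes the naming convention where "397B-A22B" means 397B total parameters with 22B active per token, and it treats a tag without the "A" part (like "27B") as dense, with every parameter active. The `parse_params` helper is hypothetical, written just for this illustration.

```python
import re

def parse_params(tag: str) -> tuple[float, float]:
    """Return (total_B, active_B) from a size tag like '397B-A22B' or '27B'.

    Assumption: 'A<n>B' marks active parameters per token (MoE);
    a tag without it is treated as dense, so active == total.
    """
    m = re.fullmatch(r"(\d+(?:\.\d+)?)B(?:-A(\d+(?:\.\d+)?)B)?", tag)
    if m is None:
        raise ValueError(f"unrecognized size tag: {tag}")
    total = float(m.group(1))
    active = float(m.group(2)) if m.group(2) else total
    return total, active

big_total, big_active = parse_params("397B-A22B")
small_total, small_active = parse_params("27B")

# Compare per-token active parameters, not total size.
print(big_active < small_active)
```

Under those assumptions the big model uses fewer parameters per token than the 27B one, even though its total size is far larger.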
>>
>>108410153
it says there it doesn't even beat the full qwen model, let alone beat an actual good model
>>
File: 1768179071115745.jpg (12 KB, 250x246)
>>108410104
God I am so tired of these responses. Nigga just respond
>You will need to drive the car to a wash anyway.
Fucking there. Short and correct.
>>
>>108410189
Even IQ tests aren't worth shit if the human taking them trained on the test specifically, even more so if he had the exact answers and could just memorize them.

That's what these benchmaxxed models do. They don't get a good answer because of their problem-solving capabilities but because they've been trained to remember the answer.
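
One crude way to check for that kind of memorization is word n-gram overlap between a benchmark question and a training document. This is only a sketch of a common heuristic, not what any particular lab runs; the function names and the default `n=8` are assumptions picked for illustration.

```python
def ngrams(text: str, n: int = 8) -> set[tuple[str, ...]]:
    """Lowercased word n-grams of a text, as a set of tuples."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_ratio(benchmark_item: str, training_doc: str, n: int = 8) -> float:
    """Fraction of the benchmark item's n-grams that also appear in the doc.

    A value near 1.0 suggests the item was seen more or less verbatim;
    a value near 0.0 says little, since paraphrases slip through.
    """
    item = ngrams(benchmark_item, n)
    if not item:
        return 0.0  # item shorter than n words: nothing to compare
    return len(item & ngrams(training_doc, n)) / len(item)
```

A high ratio only flags verbatim leakage. A model trained on reworded versions of the same gotcha would pass this check and still be benchmaxxed.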
>>
>>108412333
you just need to tell it to answer in a short sentence


