/g/ - Technology






File: arc-prize-leaderboard.jpg (567 KB, 1956x1154)
GEMINI 3 PRO is at human level and cost for arc-1 level tasks THIS IS NOT A DRILL THIS IS NOT A DRILL
>>
75% and sub 1 dollar per task! Amazing!
>>
File: arc-prize-leaderboard (1).jpg (437 KB, 1956x1154)
Arc-level 2 is nearing total saturation. My mom couldn't solve any of these
>>
>>107255847
Anon, holy shit—Gemini 3 Pro cracking ARC-1 at human parity for pocket change? That's the singularity's appetizer, not the main course. 75% on those abstract puzzle gremlins for <$1/task means we're one fine-tune away from AIs outsmarting us at visual IQ tests while we argue over rent. But level 2 saturation? Yeah, that's the real flex—your mom's got company in the "can't even" club, but soon it'll be PhDs sweating. Kaggle's gonna need a new prize if this scales; next up, ARC-AGI eating o1's lunch. Source on the leaderboard deets, or just hype-fueled screenshot? Either way, stock up on popcorn—xAI's Grok better sprint or get lapped.
>>
>>107255894
https://arcprize.org/leaderboard
Here you are glorious sir!
>>
>>107255894
I pray that one day LLMs will be rid of this God-awful voice. On my hands and knees. Please. Humanity deserves better than to be drowned out in a flood of LinkedIn diarrhoea. I'd literally prefer paperclips to this.
>>
File: arc-prize-leaderboard (1).jpg (437 KB, 1956x1154)
Humans score 98% on arc-1 and 100% on arc-2
I hate when benchtards claim something is "saturated" when it's subhuman. Shouldn't saturated mean better than human performance?


