/g/ - Technology

File: 41627836128736128.png (131 KB, 873x865)
everything is fucking AIfags in the ass lmao.
LLMs can't improve anymore and only way labs cope is by over-fitting on benchmarks till the new one drops.
it's over, bubble is bursting.
>>
Cargo cult won't like this one
>>
>>108488118
>trillions of dollars
>0.26%
lmao, level of scamming is unreal.
>>
>>108488118
You can't claim achieving an AGI without first passing this benchmark: https://mppbench.com
>>
>>108488118
Cope. Programmers are being replaced as I write this. AI won, codetrans lost
>>
>>108488118
They hit the wall of data available in the world to train models. Everything they could license, everything they could pirate. Used up already. It does not seem like they have a shortage of compute. And they had a lot of time to play with architecture and such.
So yeah, this time it's really over.

With the Interwebz being poisoned like that, LLM progress is stalled for an unknown number of years.
>>
Brutal stuff bro
>>
>>108488185
>>
>>108488118
Now watch how, after 3 months, all of those models suddenly learn how to do well on that benchmark.
All they do is overfit now.
>>
>>108488185
>>108488196
is this claude?
>>
>>108488185
>20% of our users use 20x of what they pay
>sustainable business model, please redeem
>>
lol statsfags btfo
>>
Actually the open source models and software ecosystem still appears to be improving at a rapid pace. I upgraded one of the pieces of software I use about a week ago and there's already been 3 new major releases since then.
>>
>>108488118
Once you understand how transformers work you really understand how much of a grift the term "AI" is, and Jensen Huang is just another corpo peddler trying to get people to buy his GPUs.
>>
>>108488118
so, model is doing super expert phd math stuff but can't do easy problems humans can solve 100% of time?
How long are they planning to continue with this bs?
>>
>>108488118
Why are they using visual tasks to evaluate language models?
>>
>>108488251
It's always just been a marketing term. If anything this just shows how technologically illiterate 99% of the population is, regardless of age or status. They saw autocorrect regurgitate a reddit post and it blew their minds. There was an article from years back before Microsoft/OpenAI was official about how the managers present were "blown away" by a text summary. It really is just stupid, sycophantic people at every step of the totem pole.
>>
Google is making breakthroughs on their new AI algorithm that can actually create new information. It can question and retry what it did and it can apparently just blatantly invent new stuff that it hasn't seen.
>>
>>108488260
The only actual money generating use case for LLMs is replacing and depressing the wages of software cucks all while destroying the GPL, which is all ironic and very funny.
>>
>>108488308
They claimed the word guesser was an AGI. guess not lol.
>>
>>108488118
>Humans scored 100%
Where is it? Can I try?
>>
>>108488118
This is a systems engineering problem. AI is good at anything it can one-shot. The engineer needs to build a system that can effectively turn a task like this into bite sized one shottable pieces.
Same reason why Claude can’t beat Pokémon even though it could write you a guide on it.
>>
>>108488553
You can on the website. 100% score is the second best player they had in the trial, median person gets something like 40-60%. Average is like 20%, because some people fail levels.
>>
>>108488553
I failed the third one where you need to change the color of the exit
>>
>>108488178
If Ed Zitron is correct they have 2x more GPUs than they have data centers to put them in, let alone power for those data centers
>>
>>108488553
https://arcprize.org/tasks/ls20
>>
>>108488591
>>108488601
>>108488616
Thanks!
>>
So since this is out now, GPT-5.5, Gemini 3.2 and Opus 4.7 will all score 100% on it because they can just put it in the training data? What’s the point
>>
>>108488553
https://arcprize.org/tasks/ls20
>>
>>108488118
I got 0% on arc-agi-3.
>>
>>108488682
The whole point of AGI is that it doesn’t need training. It should be able to just look at the game and infer from context what it needs to do. Right now the AI is at the point where it uh, can maybe figure out how to make the square move, and even that isn’t a guarantee.
>>
>>108488251
Yes, that's the underlying tech, but it isn't that simple anymore and the research used changes every year.

That's like reducing computers to just transistors.
>>
>>108488198
At some point the tests are going to get so hard and multi layered the term overfitting won't mean anything
>>
>>108488682
keep moving the goalposts
it's what humans do best
>>
>>108488591
>100%
>second best
Was there extra credit?
>>
>>108488177
Lmao
Lash out, faggot
I know you only backed it out of resentment
>>
>>108488710
>it isn't that simple anymore
except it is: the same garbage from years ago with more bells and whistles and a few tricks to improve things, but it's the same garbage that takes tokens in and puts tokens out and has no idea what these tokens are.
>>
>>108488602
>If Ed Zitron is correct
lol
>>
>>108488756
nice counterargument
>>
>>108488763
how about we make a deal
if the bubble hasn't popped by the end of the year, you stop listening to anything ed zitron says
if it does and openai explodes i'll start listening to ed's podcast
>>
>>108488773
What did the dotcom bubble popping look like?
>>
>>108488773
or you could just find something that contradicts the evidence he provided about the numbers of gpus purchased and the progress on data center construction and power generation
>>
>>108488785
i was 15 and didn't even realise there was a bubble let alone one that popped
i just kept browsin

>>108488788
nah.
let's just wait 9 months. that's long enough right? i'm not worried, are you worried?
>>
Cute game i liked it
>>
>>108488616
>>108488694
i just beat it and it doesn't even say how many turns i used. what a piece of shit
>>
File: 1760810600513929.gif (3.62 MB, 166x296)
>>108488557
>0 shot soon brother
>>
>>108488803
So it won’t be some obvious crash, likely. It will just be a huge reassessment of the usage of LLMs by corporations and the product will change. There won’t be some smoking gun showing it has “popped” there will just be a lot more gay little startups crashing and linkedin fags will take AI out of their titles.
>>
but yudkowsky said there will be a fast takeoff and agi will kill everyone
>>
File: 1774760802980124.jpg (52 KB, 726x726)
>>108488557
Claude doesn’t know how pokemon works dumbass, its able to give you a guide because its just googling it and giving you the results. Fuck, it can’t even really do that because it’ll just start hallucinating halfway through. The only reason Gemini scored anything at all on this game is because it thought it was playing fucking river raid and accidentally moved over the plus sign.
>>
>>108488118
I guess that explains why the shilling got even more extreme in the last few weeks
They are getting desperate
>>
>>108488215
>>108488185
I'm part of the 20% and my corpo pays for infinite tokens so I will keep being part of it sorry bros.
>>
File: average user.png (10 KB, 400x300)
>>108488185
>>108488196
They STILL haven't learnt kek after 7 years of this shit
>>
>>108488905
>my employer will pay for the thing no matter what it costs
Are you retarded?
>>
>>108488905
You genuinely better be prepared to have to scale back your usage a LOT or you are in for a bad time.
>>
>>108488849
okay. i will say:
1. corporate use of llms in harnesses will continue to increase, and the moment coders experienced with cc and codex will be experienced by many more professions over the year. knowledge work in many areas will fundamentally change. full automation this year? no. but a lot of sitting around watching over agents? yes.
2. small startups will indeed die. in fact, larger players like perplexity will die. maybe even cursor. they'll probably get acquired. some of the breakoff labs like thinky will also die/get acquired.
3. model progress will continue apace - in big jumps as models are trained on new hardware (i.e. the new blackwell claude and gpts we're about to see) and then steadily between hardware generations
>>
>>108488730
Definition of AGI hasn't changed.
>>
>>108488942
Nah, I don't think so.
>>
>>108488602
When can I get an H100 for 500 bucks?
>>
>>108488946
K. Remember this post when you’re completely wrong. (You will pretend you don’t, lol)
>>
File: HEjVWvYaoAEP1mR.jpg (87 KB, 1200x589)
>>108488955
did you know h100s are more expensive to rent now than they've ever been?

>>108488956
u 2, friend
>>
>>108488946
>okay. i will say
no
no
bad retard faggot
not okay
>>
File: HEjVWvVa8AAPVtD.jpg (72 KB, 1200x589)
wow these b200 prices too.
sure seems like people are using these things a lot!
>>
>>108488745
>codetrans seethe
Lmao keep crying. You will never get a tech job. You lost
>>
>>108489069
What do you do?
>>
>>108488946
You are correct. These points are obvious to anyone employed in tech.
>>
>>108489258
I have a job not related to codetrans stuff
>>
>>108488118
>Ai is fake and gay
I agree. China can stop investing time and energy into this parlor trick.
If they did it would be funny to see the west waste all its resources on a total nuthinburger.
>>
>>108488602
guess who they're going to sell that compute time to :^)
>>
>>108488955
>When can I get an H100 for 500 bucks?
They will be trashed or sunk to the ocean before they'd ever consider selling them.
>>
>>108488946
does this happen before or after societal collapse when the oil and food runs out
>>
>>108488694
>>108488616
Are the controls this shit on purpose?
>>
>>108488118
What exactly is the benchmark?
>picking up a stick
Ah shit, looks like LLMs can't do that LOL, gg no AGI.
>>
>>108488946
This is absolutely true.
>>
>>108488530
Gemini is the one that will win because it's the only one attached to a business that actually makes money.
>>
>>108488885
Turn off internet search and ask any major llm for a basic Pokémon Red walk through and they can handle it
>>
>>108488118
promptstitutes BTFO
>>
>>108489581
>codetrans
Where did this gay term originate?
Is it a third-worldism?
You mean tranny?
We (white Americans) typically wouldn’t refer to an individual as “a trans”
>>
>>108489655
Codetrans means codetranny. Im white tho.
Your seethe won’t help you.
>>
>>108489706
>I meant to sound retarded
>no mention of nationality
Lmao
>>
>>108489717
Why are you seething, monkey? It’s not my fault that your codetranny skills aren’t required any longer kek
>>
>>108489733
I am jubilant as can be, faggot.
Nobody with a modicum of self respect or skill talks the way you do.
>>
>>108489755
Nice update of the language, I’m glad you can learn.
The point is that you just spam this shit because you’re miserable and you think it is a good attack vector for people you resent.
You are gay!
>>
>>108489779
See.
Resentment.
I don’t resent skilled tradesmen. I think it is respectable work. But I put my effort into this other field and now general contractors want to gloat because they fundamentally don’t understand technology.
>>
>>108489794
Not my problem, faggot. Get with the times Luddite. We don’t require you, we both know that learning to code is useless now
>>
>>108489800
Nigger, what kind of tech job isn’t on the chopping block from the perspective of the type of homo using the term “luddite” like you just did?
You’re poking holes in your own larp.
You are a resentful faggot.
You work as a roofer or a cashier.
>>
>>108489816
being a roofer is based and better than being a codetrans. You wouldn’t get it, your hrt has fucked your brain.
>>
>>108489824
Thanks moron.
Maybe you could recall a few posts ago where I said I respect tradesmen?
You aren’t respectable, due to your homosexual behavior, but that is neither here nor there.
You realize that you habitually use software, right?
You should work on being less resentful. It’s bad for you.
>>
>>108489800
You don't even know what a luddite is
>>
>>108489894
You hate technology, therefore, you are a luddite
>>
>>108489918
Well it's not like you were ever going to get a job
>>
>>108489568
Bob from India would like to introduce you to Microsoft copilot
>>
>>108489981
Cope
>>
>>108489861
You are a seething queer.
You have done a good thing, however.
You have demonstrated that the “AI-Chad” sentiment is being fomented by retards who resent all tech work, and isn’t based in reality. Anyone who actually uses AI and isn’t delusional has met hard limitations very quickly. The downturn in the market has an easy scapegoat in AI, but this will not pan out the way linkedin retards hope. And you will have to watch while the programmers you resent are once again employed and can feed their families.
Why even resent it so much? Is software development even something you think you would enjoy? Being on rooftops sounds pretty cool sometimes.
>>
>>108490003
>anyone who actually uses Ai has met its limitations
Just like him?
https://nullprogram.com/blog/2026/03/29/
Or like Linus Torvalds, Donald Knuth etc?
Those people code much better than you and they have admitted that AI is superior to them. NGMI, you lost and wasted your life studying that shit. Cope and seethe, dilate even. You will die jobless.
>>
>if we just keep throwing more gpus at it then our llm will magically become self aware
investors actually believe this
>>
>>108490071
These are old men, idiot. They are out of touch because that is the way of the world. They will happily stand aside and let things go to shit because they already lived. Maybe you’re an old gay faggot. That would make sense.
>>
>>108490071
All of the people who can use AI well are competent in programming. So you’re not really making the argument you think you are.
>>
>>108490100
>t-they are old therefore their opinion is trash
Fucking moron, you are shitting your pants as I write this, keep seething and coping codetranny. Those of your kind will be exterminated.
>>108490110
Holy cope Batman, now a kid using Claude can write code with the same quality as theirs.
>>
>>108490128
It is funny that you keep trying to mirror my language. You are out of your depth.
>>
>>108490141
You are mentally ill though
>>
>>108488177
That may be true but coding is not "AGI" or intelligence at all. In fact it's probably one of the easiest things for an LLM to do since programming is logic based and follows programmable rules. Trained on a dataset of every program and piece of code ever made gives it the "general intelligence" of the combined coding world. That is significant for producing good software but has nothing to do with actual intelligence.

tl:dr: AI retards are retarded and unqualified to judge whether something is generally intelligent or not.
>>
>>108488785
in fact it looked very much like what you are seeing with AI right now. The total memory-holing of the original dotcom era is the main reason people are so confused about a lot of tech shit in the last 15 years.

The bubble was built on the hype of infinite potential. When it burst was when people realized that most of that potential wasn't actionable and what was actionable wasn't incredible new innovations of things no one imagined before but instead just improvements on real world things we already had.

Nearly all of the really innovative applications from the late 90's were smothered by corporate/IP interests. Sound familiar?
>>
>>108489582
>80gb training data
It's an encyclopedia and it can read it.
>>
>>108490226
I don’t even think it is true. The autocomplete can’t do anything novel. It isn’t creative. It isn’t thinking. All of technical debt racking up in organizations that thought they could fire everyone is going to rape many an asshole.
>>
>>108490252
Cope more. We don’t need you
>>
>>108490243
This is exactly what I suspected. I can see the overpromising and the people who have pie in the sky dreams about AI being their ticket to make it even though they aren’t true entrepreneurs. The large corporations that have spent incomprehensible amounts of money don’t want it to have been in vain and are shouting down actually level headed people explaining the real cost and limitations. All of this coupled with the fact that every single LLM is operating at a loss in hopes that they become so ingrained that people can’t back out when the prices hike 400%. But I have yet to see a really critical application. It saves time generating code but new bottlenecks are created and if you trust it with autonomy you are a complete retard.
>>
>>108490252
If you've worked around programming shops you understand that a huge number of them are just code monkeys pecking out stupid functions that have been done dozens of times by someone else. That is the job being replaced effectively by Claude right now.

The person who architected a software application and assigned that task to the monkey? LLM's aren't anywhere close unless it's in their training dataset already.

The important question is: "which tasks can an LLM consistently and reliably complete?" That turns out to be a shorter list than the jews hyping this garbage thought.

IMO the biggest value of LLM's and "AI" is in enabling robotics, which is the real industry we should have dumped hundreds of billions of dollars into.
>>
>>108490270
Who is we? You still haven’t said.
>>
>>108490279
The hype of dotcom era was also built on the exponential market size fantasy. Every overpriced company valuation was because they said "hey, we're taking this thing your company can sell to 20,000 people today and increasing the potential market to 2 billion people!" That's how they got the crazy numbers and where the original concept of stacking users first and worrying about profit second came from. Few of these technerds or the finance retards backing them understood what was actually valuable to users and the general public. As a result even decent ideas were implemented by the wrong people and failed.

My favorite example is CueCat. It is probably the perfect case study for starting out learning about the dotcom boom.
>>
Another dotcom example I remember is a guy I knew who worked for Desktop.com around 2000.

The domain has been sold and shuffled so whatever they are now isn't the same as what they were back then. I remember it because he showed me.

They had created a full Windows desktop *in browser* that you could sign into from your computer. It was impressive programming and neat tech especially for the time. The problem is obvious looking back but at the time it SEEMED potentially huge. That sense of huge possibility, new technology, and a lack of vision of the future in the funding class was what created these behemoths.

In practice they hadn't really understood what problem they were trying to solve for users and thus how best to do so. The internet enabled global, fast-access remote terminals for the first time and THAT is a good product that still doesn't really exist in a good way for consumers. Why? Too many licenses and corporate interests and control mechanisms getting in the way.

Every person could have a remote computer via ssh tunnel or whatever that runs full-screen on their local desktop but can be loaded from any pc they are at. You could have a physical hardware key of some kind for credentials making it easy to walk up to a machine, insert your "key," and quickly access "your computer" with all of your files and stuff.

We have a hundred bad implementations of this today, none of which meet the consumer need. But someone was raising tens of millions promising it in the dotcom era 25 years ago.
>>
>>108490296
>>108490327
I appreciate these posts
>>
>>108488118
>LLMs can't improve anymore
Don't need it to. I can already use it as an immense productivity-boosting tool.

Sorry you're a luddite tribal user with no actual understanding of what's happening.
>>
>>108488118
It's always retarded seeing losers like you seething impotently over "BUT BUT SOME GUY SAID THEY HAD AGI" like they can't recognize typical businessspeak.
like who cares

the point is that it's a useable tool that has saved me hundreds of hours on a vast wide array of uses in many different domains

there's no world in which crying "but but the bubble is going to burst" affects what I can do now
>>
>>108490003
>You have demonstrated that the “AI-Chad” sentiment is being fomented by retards
Meaningless nonsense tribal post.
"hurr durr my group > ur group" ignorant of any practical applications of AI.

>Anyone who actually uses AI and isn’t delusional has met hard limitations
I use it and it works very well for many different things. It's a huge productivity booster, because I know what I'm doing and what I want done. Your only argument is "n-noo i can read your mind and you're NOT actually getting what you want"
Why do you get so delusional and tribal over this?
>>
>>108490296
>>108490327
chatgpt has 900 million weekly active users
>>
>>108488185
oh no the trillion dollar company has a high compute bill for an extra 500k!
>>
>>108488215
>it's impossible to change a business model so we should all commit suicide
weird retard takes on this hobbyist site
>>
>>108490280
Yes, robotics and the military are winning big from AI. Everyday life can cut it out seamlessly and just google things like before. That's the reason why they still have these free tiers despite losing a lot of money. They know it is true and still hope that you will become dependent enough that they can fuck you a few years from now. Like, you know, a lot of corporations did in the last 5 years
>>
>>108490443
>chatgpt has 900 million weekly active users

First of all you don't actually know that. The head of OpenAI is a famous liar so these could be entirely made up numbers. Secondly, what is a "user?" If there is a ChatGPT plugin on a chatbot on your utility companies website and you load it the thing pops up and now you're an "active user."

Oh, is it "accounts?" Sure, how many are bots made by an indian.

Regular paid users is the only stat that matters. How much they are paying is also pretty important because each user has a cost to service.

I am 100% sure that the cost to provide the service for ChatGPT is a lot higher than the fees they charge most users. That's what is known as a "bad business model" and is why they'll go under. Many such cases in 2001.

The AI companies are all hoping to survive by replacing Google search and selling ads but what is inevitable is someone taking advantage of that decline in Google and the shittiness of AI and releasing an old school search engine.
>>
>>108489445
More like, they're going to be selling GPUs at liquidation to /g/ chuds for cheap.
>>
>>108490472
nice reddit spacing
>>
File: 1771396148977.png (183 KB, 501x663)
>>108488616
uuu i did it im a real human bean
i never clicked help either, i just learned from inference

i would appreciate a thing that said "your a very special boy and much smarter iq than other people who played with this toy"
>>
>>108489552
It's a puzzle, a game that requires problem solving, something that a (supposedly intelligent) AI should be able to solve.
>>
>>108488118
>humans scored 100%
bullshit, some of those games are hard af
>>
>>108488553
https://arcprize.org/tasks
Each game has a human actions to complete game thing. Games around 100 are easy and games around 1000 are like wtf is going on.
>>
File: 1759275033057.png (184 KB, 518x661)
damn this one was really easy
solving the captcha to post this was harder, i misclicked cause i was so excited lol
>>
>>108490412
>>108490423
>>108490430
Triple-post faggot
I never said it is useless. You are very defensive.
>>
File: 1769158017776.png (183 KB, 517x665)
>>108490544
which one's the hard one?

these just keep getting easier
>>
>>108490565
wronmg retard
keep seething, you can't use tools and you're unintelligent, all you can do is pitch a fit and cry
>>
>>108488616
idgi, how do you get your score? also >>108489481

>>108488185
>>108488196
why didn't they make it standalone?
>>
>>108488185
>AugmentCodeAI
Who?
>>
File: 1772931003579.png (185 KB, 517x668)
what's next
where's the hard one
>>
>>108489984
Aren't they just renting other models?
>>
>>108490571
>>108490612

Completing it isn't enough, you need to perform as well or better than the second best person who's played. Anything less is sub 100%.
For some stupid reason they don't give the real score.
>>
>>108490631
>t. iqlet
sorry you couldn't beat it
>>
Reminder to reevaluate AGI after you try to get an LLM to help you with a sudoku puzzle.
>>
>>108490575
You have GRIDS
>>
File: 1772543797477.png (184 KB, 523x668)
>>108490671
sorry you're mad about your entire life and you can't do a single thing about it but stamp your feet and be mad that technology exists

where's the hard one
>>
File: 1757170886039.png (183 KB, 513x666)
dang that was the easiest one yet
>>
File: 1755717916605.png (185 KB, 525x661)
damn this was both the easier one AND fun, you're like a little wizard cube casting spells haha
>>
File: 1755519964445.png (184 KB, 520x667)
hehe shish kebab maker

how many of these things are there tho, i'm getting kind of bored
>>
>>108490612
>where's the hard one
The one where you literally have to go off screen and it's not visually implied anywhere and you can only go off screen in a single part of the map, not even the entire edge.
>>
File: 1753258665373.png (185 KB, 523x659)
hot lava comin through

>>108490791
the one where you line up the +s/ squares on the dots and it lets you move them offscreen?
>>
>>108488118
Unfortunately it doesn't have to be AGI to take jobs
>>
File: 1744968730682.png (184 KB, 520x663)
flyin saucers
when does it end
>>
File: 1761082136292.png (185 KB, 512x665)
this one was cool, kind of a little robot programming thing

seriously though i'm getting hungry
>>
File: 1771147002114.png (183 KB, 508x667)
this one was dumb until the last one
multi tier glyph translation puzzle, pretty neat
probably a lot of <120IQs failed this one
>>
File: 1766399032420.png (186 KB, 529x664)
damnit fisher this is a stealth mission!
>>
File: 1765443742781.png (185 KB, 527x670)
water level puzzles

ughhh ok thats enough i need some fucking food
>>
File: file.png (173 KB, 1034x1063)
>>108490664
agi confirmed
>>
>>108488118
Since when is Gemini good? Have I slept on this? Is it better than Claude Opus?
Also, Claude Opus at least is going to shit. It's dumber than it used to be
>>
I only managed to complete the 2nd one before I ragequit due to this shit game swallowing half my inputs.
>>
>>108488118
So, take your bet ... who will be the first one to cheat their way to 50+% (closed models can just search API requests for the "secret" test set questions). I think Dario will cheat hardest.
>>
File: file.png (243 KB, 964x967)
>>108491708
gemini's a decent model but the post-training for code is terrible so for everyday use it's terrible. it can actually be helpful if you run into a difficult problem.

>>108492333
arc-agi 3 scoring is a bit strange and biases against the models pretty heavily. second best human scores are used as baseline. even if a model completes a task in the same number of steps as that baseline it doesn't score 100%. critically (and i think this is actually fine), if the human baseline is 10 steps and the model takes 100, but solves the problem - it scores 1%.
previous arc benchmarks saturated in a year. this probably takes 2, unless we get a big advance in token use efficiency.
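
rough sketch of that arithmetic, for anyone who wants to plug in their own numbers. big assumption: the score is the squared ratio of baseline steps to model steps. i picked that only because it reproduces the 10-vs-100 example (0.1 squared = 1%). the function name and the cap at 100% are made up for illustration, and this does not model the part where matching the baseline still lands under 100%.

```python
def efficiency_score(baseline_steps: int, model_steps: int, solved: bool = True) -> float:
    """Hypothetical step-efficiency score in [0, 1].

    ASSUMPTION: score = (baseline_steps / model_steps) ** 2, capped at 1.0.
    This is not the official ARC-AGI-3 formula, just a form that matches
    the single example in the post (10 baseline steps, 100 model steps -> 1%).
    """
    if not solved:
        # An unsolved task earns nothing regardless of step count.
        return 0.0
    if baseline_steps <= 0 or model_steps <= 0:
        raise ValueError("step counts must be positive")
    # Cap the ratio so that beating the baseline cannot exceed 100%.
    ratio = min(1.0, baseline_steps / model_steps)
    return ratio ** 2


if __name__ == "__main__":
    # The example from the post: 10 human steps vs 100 model steps.
    print(f"{efficiency_score(10, 100):.2%}")
```

the squaring is what makes it brutal: taking 10x the steps costs you 100x the score, not 10x.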
>>
>>108492238
oof
low iq, very unsentient
>>
>>108488118
They really thought it would magically gain human understanding when all it was is nothing but a fancy auto complete from the fucking start
Also, Google's engineers knew from the beginning that LLMs weren't AI which is why they never thought about using them that way through all the decade they had them because they know it would be a false claim and anyone with a single brain cell would call them out on that, not until a jew decided to market them as such and trick retards into looking at them that way for profits and lo and behold every other jews jumped at the chance to ride the profit train creating the current hell.
>>
File: wargames_simulations.gif (1.38 MB, 480x356)
>>108493090
>magically gain human understanding
maybe they should try asking it to play tic-tac-toe for a bit
>>
>>108493143
>maybe they should try asking it to play tic-tac-toe for a bit
This is how I was trying to figure out what models were worth running on my local server about a year ago. They all failed in different ways - mostly failed to recognize the win conditions or fucked up the game state in one way or another. I thought it was due to them being smol (all under 70B). But then I tried the same with the big boy cloud-based models and they also failed.
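
For reference, the win-condition check they kept getting wrong is tiny. A minimal sketch, assuming a 3x3 board stored as a flat list of "X", "O", or None (the board representation and the function name are hypothetical, this is not the harness I actually used):

```python
from typing import Optional, Sequence

# The eight winning lines on a 3x3 board, as indices into a flat list:
# three rows, three columns, two diagonals.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),
    (0, 3, 6), (1, 4, 7), (2, 5, 8),
    (0, 4, 8), (2, 4, 6),
]


def winner(board: Sequence[Optional[str]]) -> Optional[str]:
    """Return "X" or "O" if that player owns a full line, else None."""
    if len(board) != 9:
        raise ValueError("board must have exactly 9 cells")
    for a, b, c in WIN_LINES:
        # A line wins only if it is non-empty and all three cells match.
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None
```

Running a model's claimed game state through something like this after every move is an easy way to catch both failure modes: a missed win and a board that no longer matches the move history.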
>>
>>108488118
And where is this stinky benchmark? The idea humans get 100% but SOTA gets .xx% makes me doubt it very much.
>>
>>108488616
>AI doesn't know how to play bideo game therefore it is useless
:|
>>
>>108493235
You keep trying to conflate usefulness with intelligence. It is niggardry.
>>
>>108493395
in fact less intelligent people are often more useful
>>
>>108493408
That has nothing to do with what either you or I said. Why are you such an obfuscating homo?
>>
>>108493417
i'm a third poster
>>
>>108493426
No you are not.
>>
>>108493437
why would i post that if i were >>108493235? it contradicts his point even more than yours does. you're saying intelligence != usefulness, i'm saying intelligence isn't even correlated with usefulness
>>
It's all Gotchas all the way down. Humans with their mental illnesses can think in unique ways that are not a single neurotypical model anyway so will always be able to defeat AI just by thinking stupid things.
>>
>>108493472
So now you are pretending to be me?
>>
>>108493235
The idea is for general intelligence it should understand patterns in a new sandbox like humans do without providing much assistance. This basically proves LLMs do not possess any general intelligence. They can be trained at certain tasks as long as relevant data or pattern was made available during training
>>
>>108493481
human adults also have a lot of experience with cognitive tasks since they were infants
>>
>>108488955
you won't be able to run a used h100
>>
>>108493481
yeah i tend to agree with this. that said, because we're goats we can also reverse engineer ourselves
>>
>>108490505
nice newfagging
>>
File: 1756232554756634.png (552 KB, 1312x560)
Now let's see if "AGI" can solve this
https://en.wikipedia.org/wiki/List_of_unsolved_problems_in_mathematics
>>
>>108491103
Lol that is definitely an "easy" puzzle.
>>
>>108495148
brother, we have established that agi is impossible
>>
>>108490081
worked for bacteria
>>
>>108494889
They are just autistic.
>>
>>108495328
did it though?
>>
>>108495433
Yeah. We're here.
>>
T-the 2 weeks for the bubble burst already ended??? Bros it's happening finally...
>>
>>108495148
AGI is impossible but it’s not required to replace coders and artists keeek
>>
File: 1000020336.png (737 KB, 1080x1543)
The bubble is poppi-ACK
>>
>>108488519
There is the effect that when you start talking to a modern LLM, you want to believe it is smarter or understands more than it actually does. This isn't new:

https://en.wikipedia.org/wiki/ELIZA_effect
>>
>>108489861
All these companies are operating at a huge loss; when they begin trying to recoup their losses by increasing prices, it will be infinitely cheaper to hire real people.
Why can't you have an argument without shouting "you'll be replaced" or calling everyone "tranny", have you delegated your brain functions to the AI as well?
I don't mind AI and use it to code sometimes but faggots like you who have no clue what they're talking about shouldn't even be near a computer, let alone use AI.
>>
I'm convinced that scoring is backwards. Claude seems to be the most intelligent to me while Gemini is retarded.
>>
>>108495670
He is a big crybaby faggot filled to the brim with resentment.
He wants total collapse in order to not feel inadequate.
>>
>>108495570
Damn, all that money invested into a dying company, insane.
>>
>>108488118
>>Grok-4.20: 0.00%
KWABOTY
>>
>>108488819
You have to scroll down. You are the "humans" score.
I have no idea how people can fail levels. There's a reset button...
https://arcprize.org/replay/8aed7120-f7a9-45a1-837a-68bc7dc37a4f
>>
>>108496119
Correction - watch the replay. Your # of actions is shown. Gay, where is my score
>>
>>108488729
Behold /g/ the words of someone who doesn't understand how LLMs work.
An LLM can never have generalized functionality.
It can only output the correct response to the specific correct input it is trained for.

Please tell me how many k's are in Albuquerque.
:^)
>>
>>108495570
where do they go now, in this post-AGI world
>>
>>108489552
A toddler can pick up a stick.
>>
>>108490252
>I don’t even think
agreed, not reading the rest, now gtfo
>>
To any anti-AI person here, I sincerely implore you to answer this simple question for me. Every time I ask it, you fags just short circuit and start seething and pissing your diaper and sputtering nonsense. The question is this:
>Is the human brain the penultimate computational system that is physically possible in this universe?
Is that what you are constantly trying to assert?
>>
>>108497344
is an LLM? lol
>>
>>108497344
second only to G-d...
>>
>>108497356
No, definitely not, but either you believe LLMs are the last and final "AI" architecture that will ever be tried for the rest of eternity, or you believe the human brain happens to be the absolute physical limit of intelligence in the entire universe, or something else. So what's your cope/delusion of choice?
>>
>>108497344
FYI, "penultimate" doesn't mean "extra Scrabble score version of ultimate".
>>
>>108495239
This. Also any sudoku is just a huge system of equations.
>>
this one is the most fun one i have played so far
https://arcprize.org/tasks/tr87
>>
>>108497344
Do you perhaps not know what penultimate means?
>>
>>108488808
>We operate at a loss but hope to be stable
Hm
>We will melt like snow in the sun but until then please keep purchasing our services
>>
>>108488198
>that benchmark
yes that specific one, that will have been added to the training data, by humans
>>
>>108497344
Irrelevant because AGI is impossible. This is factually the case.
>>
>>108488118
It's been bursting for a year, can I get my cheap SSDs or no???
>>
>>108497344
As far as we know, yes.
>>
>>108497344
What the fuck are you talking about, I've never seen anyone claim this. Also penultimate means second to last, what would you or "anti-AI" fags consider the last, AGI?
Even if AGI is the last possible brain it would still need to go through tons of iterations to reach perfection and those iterations should be considered different.
There's also a ton of possible brains between ours and AGI.
>>
>>108488118
what language is that? i can't read this twitter shit anymore


