/g/ - Technology


Thread archived.
You cannot reply anymore.




File: 1777190204293792.png (975 KB, 2400x1700)
975 KB PNG
Anthropic to reach 100% global GDP in 21 months
>>
>>108790352
some chance of that when it's not even a stock with an IPO
>>
every few weeks i ask claude and codex to rebuild half life and they refuse to do it
if they can’t build a silly little game from the ’90s, how will they do all this other stuff you guys talk about?
>>
>>108790375
you're using it wrong
>>
>>108790375
How well does it do?
>>
>>108790352
>log scale
lol
>>
>>108790362
>196 results
Cringe af
>>
>>108790352
why stop there? 420% of GDP in 24 months
>>
>>108790352
>be me
>use claude opus to help me resort some isometric cubes so they render in the correct order
>it adds a memory leak
Bravo Dario!
>>
>>108790362
Snailcat a cute.
>>
>>108790352
Fucking kek. You know they made this graph in complete seriousness too. God should nuke SF
>>
HOLY FUCK AI WON.
100% IS AI.
I KNEEL
>>
File: file.png (91 KB, 696x497)
91 KB PNG
>>108790352
Why would people assume Anthropic would be ahead in several years? I still remember when they hit a slump after Sonnet 3.5 while trying to get Opus/Sonnet 4 out, and they were behind everyone in reasoning. They released right after DeepSeek's R1, even. You can't say anything about who is ahead vs. behind in the race without knowing what's going on and what all the labs have. I still remember Google coming in from behind and beating everyone with Gemini 2.5, people having to accept the reality that there was a third Western frontier lab, and Meta falling behind. Come back in a year or two and we'll see who's still there and on top.
>>
>>108790362
snailoids btfo
what will snailtards say when anthropic at 200%? they will lie down and just go to sleep like gay snails lol fucking idiots
>>
>>108791095
its so over for small peepee luddy fuddy squishy wishy snailz
>>
>>108790785
>slopus
should had used gpt
>>
>>108791211
should've'd*
>>
/g/ is so shit with poos now trying to force their memes
>>
File: SamAltman.png (2.1 MB, 1086x1448)
2.1 MB PNG
>>108790352
OpenAI is sweating
>>
>>108790484
Nigga, you're talking to the $current_year equivalent of pic related. Just filter it.
>>
>>108790469
they just need to 1000x+ their company bro, couldn't be easier for a company that gets 90% of its income from the video card / big tech / investment fund / AI hugbox funding
>>
File: 1773830640842121.gif (3.54 MB, 498x498)
3.54 MB GIF
>>108790352
>dat diagram
mega kek
>>
>>108791079
I am, unfortunately, an AI engineer. That means I orchestrate API requests to sloppa endpoints from anthropic, openai, and google (though often through intermediaries like azure or aws bedrock).
We use them for automated document review workflows and similar document-oriented automation.
We use them in various capacities for these tasks, for example generating code snippets that run in sandboxes to execute functions, or controlling a system they operate in as part of an add-in.
Basically, in non-thinking mode, universally (i.e. any modality, any task), we found that 4o > 4.1 > literal garbage > opus 4.6 > opus 4.5 > haiku 4.6 > haiku 4.5 > random guessing > gemini-2.0-flash > gemini 3.0-flash (pre-cancellation) > gemini-2.5-flash > gpt-5 series (in roughly whatever order you like, they all suck balls) > gemini-3.0-flash (post-cancellation) > gemini-3.1-flash (pre-GA).
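To illustrate the sandbox part: a minimal sketch only (nothing like our actual infra, the helper name is made up) of running a model-generated snippet in a fresh interpreter with a hard timeout:

```python
import os
import subprocess
import sys
import tempfile

def run_snippet(code: str, timeout: float = 5.0) -> str:
    """Write a model-generated snippet to a temp file, execute it in
    an isolated interpreter with a hard timeout, and return stdout."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        result = subprocess.run(
            [sys.executable, "-I", path],  # -I: isolated mode, no site/user paths
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        return result.stdout
    finally:
        os.unlink(path)

print(run_snippet("print(2 + 2)").strip())  # → 4
```

A real sandbox obviously needs more than this (no network, memory/CPU limits, a separate user or container), but the shape is the same: untrusted code goes into a disposable process and only its stdout comes back.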

For document extraction (i.e. "read this document, extract data according to this json schema"), gemini-2.0-flash > gemini-3.0-flash (pre) > gemini-2.5-flash > gemini-3.0-flash (post) > gemini-3.1 > *. I'm not even listing the others because they literally can't do it even on small documents. They fuck up so terribly it's insane and I never thought there could be such a gap.
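The extraction task itself is trivial to sketch (made-up schema and helper names; the actual model call is stubbed out, and real code would validate against a proper JSON schema library instead of this toy check):

```python
import json

# Hypothetical flat schema: field name -> expected python type.
SCHEMA = {
    "invoice_number": str,
    "total": float,
    "vendor": str,
}

def build_prompt(document_text: str) -> str:
    """Ask the model for a JSON object with exactly these keys."""
    keys = ", ".join(SCHEMA)
    return (
        f"Read this document and return a JSON object "
        f"with keys: {keys}.\nDocument:\n{document_text}"
    )

def validate_extraction(raw: str) -> dict:
    """Parse the model's reply and check keys/types against SCHEMA.
    Raises ValueError when the model fucked it up."""
    data = json.loads(raw)
    for key, typ in SCHEMA.items():
        if key not in data:
            raise ValueError(f"missing key: {key}")
        if not isinstance(data[key], typ):
            raise ValueError(f"wrong type for {key}: {type(data[key]).__name__}")
    return data

# Stubbed reply standing in for an actual API response:
reply = '{"invoice_number": "INV-42", "total": 199.5, "vendor": "Acme"}'
print(validate_extraction(reply)["vendor"])  # → Acme
```

The gap between models shows up exactly here: the bad ones drop keys, invent keys, or return the wrong types, and you end up retrying or bashing the output into shape.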

We never test opus because the costs are prohibitive, even though we're b2b and pass costs on to our clients. You're looking at like $10-20 to process one document when it's more like $0.10-$1 for other models.

For thinking, though, anthropic models trump everything modern, but o1 > o3 > *.

Outside our internal evaluation suites, I found that sonnet 4.7 works significantly worse than 4.6, and that 4.6 in chat mode (i.e. with all the orchestration that anthropic built, not the model quality itself) works head and shoulders above anything else right now.

Note that those evaluations are not merely a reflection of our internal ability to optimize for one model or another, either: cs teams at the providers concurred and other clients confirmed.
>>
>>108792895
Slight correction: whenever I say opus it should say sonnet. We don't use opus, as mentioned, because of costs. In fact we avoid sonnet too, because outside of anthropic's built-in orchestration the models themselves are actually pretty shitty. Not as bad as gemini, but at these prices we'd rather bash gemini into submission. Alternatively, it would have to be clearly better than the other models, and then we'd use it at any price, because investors would foot the bill instead of clients. But that's just not the reality of it.
>>
>>108790783
Beyond cringe. And look, the bosnian that keeps replying to himself nooticed one of his spam posts got deleted


