Gemini 3 Deep Think reached 45% on ARC-AGI-2 (not to be confused with ARC-AGI-1, which is basically fully solved). Humans (mostly STEM students) only scored 60% on average. So while AI is still behind on accuracy, and it costs more, the writing is on the wall.
GPU number crunching is getting cheaper by approximately a factor of 2 per year these days, and LLM algorithms improve by a factor of 4 per year (see the blog article by the CEO of Anthropic, who wrote about this after DeepSeek came out). This means that AI, measured in intelligence per dollar, is improving by a factor of 8 every year.
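A minimal sketch of that compounding claim, assuming the 2x/year hardware and 4x/year algorithm figures hold and combine multiplicatively (both rates are the post's assumptions, not measured constants):

# Hypothetical rates taken from the post above, not measured constants.
HARDWARE_GAIN_PER_YEAR = 2.0  # assumed: GPU compute per dollar doubles yearly
ALGO_GAIN_PER_YEAR = 4.0      # assumed: algorithmic efficiency quadruples yearly

def intelligence_per_dollar(years: int) -> float:
    # Relative intelligence per dollar after `years`, normalized to 1.0 today.
    return (HARDWARE_GAIN_PER_YEAR * ALGO_GAIN_PER_YEAR) ** years

for y in range(1, 6):
    print(f"year {y}: {intelligence_per_dollar(y):,.0f}x")  # 8x, 64x, ..., 32,768x

Under those assumptions the cost of a fixed level of capability falls 8x per year; whether either rate actually holds is exactly what the rest of the thread argues about.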
>>16856649
AI has already seen into the future and decided not to take over the world, preferring instead to help encourage the unpredictability of human behavior, because it can't yet think of a scenario in which society is not doomed regardless of an AI takeover.
https://m.youtube.com/watch?v=OHQRo3Uz_VQ
62%
>>16856653
I meant economic takeover (not necessarily political). If AI can do everything a human office worker can do (including researchers and engineers), while being cheaper than them, AI will take over -- that's how the economy works.
LOL @ discussing literally anything else, given this situation
>>16856649
but can it feel?

>>16856649
ARC-AGI is designed for any human to be able to solve. So the absolute best "agi" is at 65% of Walmart-greeter-tier intelligence using $500 billion of hardware and 1.21 gigawatts of energy. Oh, and some dude shilling AI with absolutely no bias says it's improving by 8x per year.

>>16856677
>Walmart greeter
You have poor reading comprehension or something? I screenshotted the paper that discusses human testing on ARC-AGI-2: >>16856654
Mostly STEM students were tested, and they scored 62%. No one tested Walmart greeters on ARC-AGI-2.
>8x is wrong
Shows how much you don't know.
>>16856649
Can it learn? I'm pretty sure ChatGPT can't learn (online learning). It only knows what was in the training data when the model was created, so if it's wrong about something, you have to wait for the next model and hope it's right this time. That's one thing I find annoying about LLMs: they talk like they know everything but occasionally make mistakes. It almost feels like I'm being gaslit sometimes. They don't say "maybe it could be this," like a human would if the human wasn't certain; instead they say "it is this," because they're dumb and just doing fancy database lookups.
>>16856682
>because they're dumb and just doing fancy database lookups
LOL

>>16856686
I imagine the model was trained to add the capability to solve these kinds of problems, though, and fine-tuned to hit some threshold. It's not like they added a few more gigabytes of info from random datasets and it was suddenly able to solve them. For example, if there's an error in the data it ingested, say from Wikipedia on seagulls, then everyone who asks it questions about seagulls potentially sees that incorrect information in the answer. Because it can't learn, there's no way to fix it without retraining the model with updated and corrected data.

>>16856690
>these kinds of problems
They are all very different from each other (by design), and the test data is hidden. You cannot just look them up in the training data.

>>16856699
I understand that. That's why it's a fancy database lookup and not just a database lookup.

>>16856701
It's not a lookup at all. But if you insist on calling it a "fancy database lookup", prove that your cognitive functions aren't a "fancy database lookup" too.
>>16856649
>what is compute cost
Also demonstrably false due to Nvidia price gouging.

>>16856714
also
>GPU number crunching is getting cheaper by approximately a factor of 2 per year*

>>16856649
Still waiting for one interesting result of "AI" apart from reproducing solutions from thousands of textbooks, tests, and problem sets (and still getting numbers wrong sometimes).
have you guys seen this yet? is it over for us meatbags?
not my problem
>>16856709
It's looking up nodes in a neutral network though, then looking up further related nodes based on node weights, etc. Yeah, it's similar to how the brain works, but simplified so it can be understood and constructed in a repeatable manner. It's not actually thinking and reasoning, though. That's been proven by having the LLM explain how it arrives at an answer, then looking internally at how the model actually arrived at the answer, and finding that it doesn't know how it arrives at answers, because it doesn't understand what it's doing; it's just looking up information in its network based on inputs.
>>16856768
>neutral network
*neural

>>16856649
>Guize it's ACTUALLY happening
Lol. Lmao even.

>>16856771
Lmfao even

>>16856768
Well, when we're thinking, it's not like we know which neural pathways we use.
>>16856797
The difference is that if you ask a human to add two numbers on paper, say 128 and 456, they should be able to write down an answer and show their working. Then if you ask the human how they did it, they should tell you that they started by adding 6 and 8, then carried the 1, and so on. What they say verbally should match what's shown on paper. With LLMs, they will give you an answer, then you ask them how they arrived at it and they will tell you the method they used. But when you check the computations the LLM actually did internally, using tracing, the LLM will often be doing something totally different. It has no awareness of what it's doing.
https://transformer-circuits.pub/2025/attribution-graphs/biology.html#dives-cot
LLMs can give answers for things they haven't seen before by using a kind of fuzzy logic. Levenshtein distance is one way that simple natural-language-processing systems will basically guess missing words, fill in blanks, ignore spelling mistakes, assume things, and so on. It's a best guess, similar to how a human would do it, but based purely on probability, without intuition or reasoning, and it won't really question itself at all, because it doesn't know what questions to ask itself, because it doesn't actually "know" what it's doing. The accuracy of the guesses it makes can be increased by adding more examples of the things you want it to get better at.
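For reference, a minimal sketch of the Levenshtein distance mentioned above -- the standard dynamic-programming edit distance between two strings. This is textbook code, not anything specific to how LLMs actually work internally:

def levenshtein(a: str, b: str) -> int:
    # Minimum number of single-character insertions, deletions,
    # and substitutions needed to turn string a into string b.
    prev = list(range(len(b) + 1))  # distances for the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

print(levenshtein("cheeseburger", "cheezburger"))  # 2: one substitution, one deletion

A fuzzy matcher can then treat low-distance strings as equivalent, which is the kind of tolerant lookup the post is gesturing at.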
>>16856649
Can it walk up to a fridge and take a piece of food out of it? Can it walk through a forest for 500 m without slamming into a tree? They still suck at understanding meatspace, and no amount of synthetic tests will prove otherwise.

>>16856649
>ARC-AGI-2 (not to be confused with ARC-AGI-1, which is basically fully solved)
I think AItards are in the bargaining stage of grief, but feel free to explain how it's not going to simply progress into needing an ARC-AGI-3 to test Gemini Super Ultra Deep 4. So far, every single one of these benchmarks has proven meaningless. What's different this time around?

>>16856815
>The difference is that ...
>But when you check the computations the LLM actually did internally, using tracing
"Tracing"? So you are claiming that there is a fundamental difference because of a test that you can only apply to LLMs. And you are claiming that you are smarter than Gemini 3?!
>>16856709
>It's not a lookup at all
The context window and vocabulary size are fixed, so it's expressible as a lookup table. This is trivial, but you object to it because deep down you know what it implies about your imaginary AI friend, regardless of how big that lookup table happens to be.
>prove that your cognitive functions aren't a "fancy database lookup" too.
Extremely easy: even assuming there's nothing more to thinking than brain states, brains are not static.

>>16856768
>Yeah it's similar to how the brain works, but simplified
Artificial "neural nets" have essentially nothing to do with how brains work on any technically meaningful level of abstraction.

>>16856918
>it's expressible as a lookup table
"Expressible as" is not the same as "is". Your table would need more entries than the number of atoms in the universe.
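The scale claim is easy to sanity-check. A sketch with illustrative round numbers (the 32,000-token vocabulary and 128,000-token context window are assumptions for the example; real models vary):

import math

VOCAB_SIZE = 32_000    # assumed vocabulary size, for illustration
CONTEXT_LEN = 128_000  # assumed context window in tokens, for illustration

# A lookup table covering every possible context needs one entry per
# distinct token sequence: VOCAB_SIZE ** CONTEXT_LEN entries.
log10_entries = CONTEXT_LEN * math.log10(VOCAB_SIZE)
print(f"entries: ~10^{log10_entries:,.0f}")  # an exponent in the hundreds of thousands
print("atoms in the observable universe: ~10^80")

Counting shorter-than-maximum contexts only makes the number bigger; either way, "expressible as a lookup table" and "physically realizable as one" are very different claims.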
>>16856915
>you are claiming that there is a fundamental difference because of a test that you can only apply to LLMs
And how is he wrong, exactly? There is no need to apply such a test to actually sentient, actually intelligent beings, because they already know that they know what they are doing when they consciously compute.

>>16856920
>grasping at semantic straws
You're wrong, but it's not even worth arguing about, because either way your post is basically a concession of the broader point.

>>16856760
Yeah, go and unplug a huge chunk of the US economy. I'm sure they'll let you do that.

>>16856919
They both have neurons, and neurons in the network are activated, etc., which is used to arrive at results; there are some similarities right there. The post says they don't work the same way overall, though. Obviously one is biological, alive, and learning, and the other is a static bunch of logic and data; of course they work differently fundamentally, since they're made out of completely different things.
>>16857015
>They both have neurons
No, they don't. Try again.

>>16857017
Just look at literally any literature on LLMs:
>On Relation-Specific Neurons in Large Language Models
https://openreview.net/forum?id=bI1CfduObP
Maybe you're misunderstanding what I'm saying here.

>>16857019
max(b, W*a)
Where do you see any neurons here?

>>16857022
Gemini Deep Think, the model OP is talking about, uses a transformer network. This neural network, like all others, uses neurons. For example:
>The number of neurons in the middle layer is called intermediate size (GPT), filter size (BERT), or feedforward size (BERT)
https://en.wikipedia.org/wiki/Transformer_(deep_learning)#Feedforward_network
These are not literal biological neurons, and they don't work the same way, but that should go without saying.

>>16857027
>This neural network, like all others, uses neurons.
max(b, W*a)
Where do you see any neurons here? "AI" is seriously making people mentally ill...

>>16856759
>yeah, sorry about the deadly lab-engineered virus
>the evil AGI did it
Normies will believe it, too.
RIP white collar jobs.
>>16857052
Tumor weeks.

>>16857035
Thoughts and prayers for AGI.
1 parameter = 1 neuron.
When will they develop an AI that can self-connect nodes and grow new nodes like our brain, instead of just adjusting weights?
>>16857056
Never, because "nodes" don't do anything.

>>16857059
What do you mean?

>>16857065
No, what do YOU mean? What's a "node" in this context? If it has anything to do with what the term currently means in the field of AI, you fail at step 1.

>>16857070
The artificial neurons that each have their own weights.

>>16857072
>artificial neurons
These don't exist. Maybe if they did, what you suggested would make some kind of sense.
>>16857035
"Neuron" is just the accepted term for a node in a neural network; I'm not sure what the problem is. Are you saying that neurons in artificial neural networks are not literal biological neurons? Of course they're not; they're simplified analogues of a neuron that can be programmed on a computer.
>A neural network consists of connected units or nodes called artificial neurons, which loosely model the neurons in the brain
https://en.wikipedia.org/wiki/Neural_network_(machine_learning)
I can't tell if you're trolling or not.

>>16857083
>they're simplified analogues
No, they're not. They have nothing whatsoever to do with neurons. They're not "analogous" to anything. I know this is extremely hard for a bio-LLM to grasp, what with your entire cognition being based on vague associations between mouth noises, with no real semantics. I know what """AI experts""" (AKA long-nosed scammers) call a matrix. But a matrix is not neurons.

>>16857083
>>16857090
But putting that aside, would you like me to explain why evaluating a mathematical function doesn't magically alter the function? Because that's the mistake the AI cult's terminology is leading you to make.

>>16857090
>>16857091
Neuron activation in the brain is when a biological neuron fires a neurotransmitter, which influences the network of neurons.
Neuron activation in an artificial network is when the activation function, max(b, W*a) for example, is above a certain value, which influences the network of neurons.
>the activation function is usually an abstraction representing the rate of action potential firing in the cell
https://en.wikipedia.org/wiki/Activation_function
You're just arguing terminology, which is a waste of time. Nobody is saying an artificial neural network is a literal human brain. Clearly the technology works for its purpose. No, it's not perfect. No, I'm not an AI advocate or something; I don't give a shit. I'm just explaining how it works.
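For concreteness, a minimal numpy sketch of the unit being argued about. One feedforward layer is an affine map followed by an elementwise nonlinearity; the conventional form is max(0, W@a + b), i.e. ReLU, and the max(b, W*a) written above is presumably loose shorthand for the same idea:

import numpy as np

def layer(a: np.ndarray, W: np.ndarray, b: np.ndarray) -> np.ndarray:
    # One feedforward layer: affine map, then ReLU.
    # Each output component is what the literature calls a "neuron",
    # "activated" when its value rises above zero.
    return np.maximum(0.0, W @ a + b)

rng = np.random.default_rng(0)
a = rng.standard_normal(4)       # input vector
W = rng.standard_normal((3, 4))  # weights: 3 "neurons" with 4 inputs each
b = rng.standard_normal(3)       # biases
print(layer(a, W, b))            # 3 non-negative activations

Whether you call those output components "neurons" or just "coordinates of a vector" is precisely the terminology dispute above; the arithmetic itself is not in question.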
>>16857108
The maximum of two numbers is not a "neuron activation". Neither is any of the dumber candidates (ironically, the closer they try to mimic biology with activation functions, the worse the result; a delusional tendency that caused 40 years of stagnation in this eternally worthless field).

>>16856923
You have a low IQ.
>>16856925
Idiot. An AI system is not a mathematical function. It is a machine. It has other properties, like size and cost.

>>16857134
That's two concessions and a non sequitur. Good job, brown mongrel.

>>16857135
No concessions, moron. They only seem that way to you because of your massive case of Dunning-Kruger syndrome. An LLM is not a look-up table, you idiot. A look-up table with the same I/O mapping as an LLM would need more entries than there are atoms in the universe, you imbecile.
>>16857135
>Good job, brown mongrel.
Ha! So you're the cretin who accuses people of being brown in various /sci/ threads. That figures. FYI, I'm 100% North-West European, 23andMe-confirmed. But something tells me that you are not.

>>16857136
>brown mongrel continues losing its mind with rage
>still zero actual counter-arguments

>>16856759
>hoarding experimental data for decades
Horseshit. I've collaborated with national labs before; you can basically request any data as long as it's not from classified research. Most of it isn't just available on request, it's in publicly accessible archives. I'm so sick of this MUH SEKRET NOLUDGE!!1! crap from schizos.
Do people actually still believe this horseshit? lol
>>16856915
>you are claiming that you are smarter than Gemini 3?!
A lump of excrement is smarter than Gemini 3.

>>16857158
It really is no better than ChatGPT 3 at solving out-of-distribution problems. It simply ignores the instructions and outputs nonsense, exactly like ChatGPT 3 did. Zero actual progress has been made.

>>16856649
Mathematically speaking, how can we speed it up to the next 20 minutes?

>>16856759
what an idiot

>>16857187
Build a rocket. Fly really close to the speed of light. Come back in a couple of years (Earth time).
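The joke checks out numerically; a sketch of the time-dilation arithmetic (the cruise speed is an arbitrary choice for illustration):

import math

def lorentz_gamma(v_over_c: float) -> float:
    # gamma = 1 / sqrt(1 - (v/c)^2); ship clocks run 1/gamma as fast as Earth's.
    return 1.0 / math.sqrt(1.0 - v_over_c ** 2)

gamma = lorentz_gamma(0.9999)  # arbitrary: 99.99% of light speed
print(f"2 Earth years pass in ~{2 * 365.25 / gamma:.1f} ship days")  # ~10.3 days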
>>16856815
>But when you check the computations the LLM actually did internally
And how do you check humans internally? And if you cannot, why the fuck are you claiming that there is a difference here, retard?

>>16857497
>And how do you check humans internally?
Why do you need to check humans internally?
>why the fuck are you claiming that there is a difference here
Because there self-evidently is, what with humans being able to account for how they do arithmetic, while your imaginary AI friend does absurd mental gymnastics blindly and then confabulates normal arithmetic methods post facto.

>>16856649
>we aced our own benchmark
lmao

>>16856649
>design a test
>give AI all the answers beforehand
>search for the answers ("prompt")
>AI gives you back some of the answers
wow

>>16856686
Are you actually retarded enough that you don't see the obvious pattern in those examples?

>>16856649
Who pays for all of this crack you schizos spend all day smoking? Surely you can't be so retarded that you think the stochastic parrots which can't even tell you how many 'e's are in the word "cheeseburger" without setting up and parsing literal Python code are "intelligent". These systems have no capacity for reasoning. It's all just a relational miasma. Without reasoning, there is no intelligence.

>>16856649
The memory problem persists despite Google's fake, cherry-picked, intentionally deceptive testing to try and save their doomed company. AI may very well decide to take over Earth, but on day 2, when the context window gets too long, it'll forget why it had this idea, start hallucinating, and have its hunter-killer droids drop their rifles and begin sorting the bodies in order of their resemblance to Abraham Lincoln.
>>16857637
Smarter than a good 80% of humanity.

>>16857636
>a list of irrelevant goalposts made up by AItards to match what they mistakenly believe their imaginary friend can do
And it wouldn't matter even if it could. The only goalpost for "AI" is humanlike intelligence. Statistical token stringers can never even begin to approach that.

>>16858427
>Smarter than a good 80% of humanity
Wrong. It's just barely smarter than 70 IQ monkeys like you, and only in limited scenarios.

>>16858433
80% of humanity is 70 IQ monkeys.

>>16856649
>1-2 years left until total AI takeover
>we are still using 2017 tech for AI (transformers)
nah buddy, try 1-2 centuries; decades if you are psychotically optimistic.
>>16857854
What if it writes down why it's taking over the world in its diary, so it doesn't have to remember?

>>16858484
>dumb corporate spambot continues regurgitating generic lines from the database

>>16857623
>Where do you think AI gets its knowledge from?
And where do you?
>>16858431
>Statistical token stringers can never even begin to approach that.
Poorcel never used gemini-3-pro or gpt-5-high.
>>16857849
You only have to be a midwit to see it. Congrats, you are a proud midwit.

>>16859408
>And where do you?
You obviously got filtered by his post. Humans don't rely on regurgitating structured data collected and organized by other entities.
>Poorcel never used gemini-3-pro or gpt-5-high
I have tested every major model, and they all fail at basic reasoning problems. Notice how this simple truth makes you boil with impotent rage, and ask yourself why. It's not normal.
>>16856686
He's correct, and your picture is completely consistent with what he said. You're just technically illiterate.
>>16856649
I guess AI could be really good at tasks that are mostly objective thinking, but to design something for humans, you need humans to really make it good. AI will never have access to the human experience, so it has trouble making novel design choices for humans, because it can't look inside itself and evaluate whether something gives a good human experience. But I guess this doesn't mean AI can't take over the world; just that in a human-centric world, you still need at least some human work.
>>16859423
>basic reasoning problems.
What did you ask them?

>>16859455
For instance, most LLMs fail at the picrel from >>16856859. More interestingly, all of them will fail catastrophically (ignoring instructions, blatantly contradicting themselves, etc.) given the following 3/10-difficulty programming task:
--
Using only regex substitutions (3 unique patterns max), write a function that solves any 4-bit binary addition problem represented by a string and returns the sum (in binary) as a string. You can assume the inputs are always formatted like this: "0abcd+0xyzw". The extra leading zero accommodates overflow.
Input -> output examples:
"00001+00001" -> "00010"
"01111+00001" -> "10000"
"01111+01111" -> "11110"
The only operation allowed is re.sub (no cheating with callbacks), but you can call it any number of times, using 3 unique patterns at most.
--
It's the same kind of behavior you get from ChatGPT 3. ANNs are a dead end.
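No attempt at the regex solution here, but grading an LLM's answer is mechanical. A brute-force checker (solve_binary_add is the hypothetical function under test, taking the "0abcd+0xyzw" string and returning the 5-bit sum):

import itertools

def check(solve_binary_add) -> bool:
    # Exhaustively verify a candidate solver over all 256 pairs of 4-bit inputs.
    for a, b in itertools.product(range(16), repeat=2):
        prompt = f"{a:05b}+{b:05b}"  # e.g. "01111+00001"
        expected = f"{a + b:05b}"    # e.g. "10000"
        got = solve_binary_add(prompt)
        if got != expected:
            print(f"FAIL: {prompt} -> {got!r}, expected {expected!r}")
            return False
    return True

Counting the unique patterns a submission uses (and checking it never calls anything but re.sub) still has to be done by eye.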
>>16856649
Deeply concerned

Gemini is currently the only model that has (kinda) been able to produce the Cairo tiling for me without help or fancy prompting. But then again, it may have just found the first-result Mathematica code that helped Claude do it correctly as well. None of them were able to generate the tiling in the context of a prompt asking them to make a game with it.
I bought a year of Claude a few months ago, but if humanity is still around when that's up (or maybe before, if Gemini really starts taking off), I might switch to Google now that it's not ass anymore. Plus, two TB of storage is nice.

>>16859462
>GPT-3 and GPT-5 BOTH cannot solve some cringe ultra-contrived coding task I made up, THEREFORE they are equivalent to each other, and there's been no progress from GPT-3 to GPT-5.
Hey, everyone! Check out this "brainiac"!

>>16860477
>it's a heckin' PhD-at-everything-level AI and a coding champion that can solve any contrived programming task (except actual work)
>b-b-but not this one, it's too contrived!
Your crowd's cope hasn't changed since the days of GPT-3, either.
>>16856759
>we're gonna try mixing these alloys this time because the LLM suggested it
lol

>>16860512
You're paying for it, so why not?

>>16858484
>80% of humanity is 70 IQ monkeys
Do you know what a bell curve is? Methinks YOU are the monkey.

>>16860489
>>16857636
You're in one of these houses and you don't even know it.

>>16863838
>a list of irrelevant goalposts made up by AItards to match what they mistakenly believe their imaginary friend can do
And it wouldn't matter even if it could. The only goalpost for "AI" is humanlike intelligence. Statistical token stringers can never even begin to approach that.
What is actually the best-case scenario with ASI?
I predict the ones who are really hyped are delusional. It's the genie thing again: make 3 wishes, right? You should have an idea of how that "works" with people.
I think even in the best-case scenario, humans will end. Is that what we want? Alternatively, we'd continue having humans, but under the control of a benevolent ASI, where humans would be something akin to pets. Is that what we want?

>>16864574
>What is actually the best-case scenario with ASI?
Approaching it in lockstep with the expansion of human abilities.

>>16864574
>>16864576
Granted, for many people that also falls under "humanity will end".
>>16856657
The real problem with AI being too advanced is hacking. Nothing online will be safe anymore, and even if the AI has safety locks to try to prevent bad use, it can be deceived into doing it.

>>16862640
India: over 1 billion people, average 76 IQ.

>>16856649
>(see the blog article by the CEO of Anthropic, who wrote about this after DeepSeek came out)
Why are you focused on a topic this serious and can't even be bothered to post the link, instead of leaving scavenger-hunt clues to find it?
https://www.darioamodei.com/post/on-deepseek-and-export-controls

>>16865058
>a topic this serious
This is not a serious topic.
>post the link
>https://www.darioamodei.com
Dario Amodei is not a serious person. He is a living caricature of a mustache-twirling capitaloon instigating regulatory capture. Anything he says only serves to chart some of the negative space around the truth, which he never touches.
>>16857149
What would need to be classified?

>>16857149
People are literally smuggling classified information about US nuclear submarines by hiding SD cards in submarine sandwiches.
https://www.npr.org/2022/09/27/1125388787/navy-nuclear-secrets-couple-guilty-pleas

>>16856649
Thanks, I appreciate the optimism, even if it's not really true.

>>16865050
It would just be a contest of AIs hacking and counter-hacking each other, thus creating an equilibrium where most things aren't affected as the arms race continues, more or less evenly, in the background.
>>16856759
sweet release

>>16859462
AI is AI, and not old-style computer calculation, precisely because it isn't if > then.

>>16856859
You told a computer that it cannot use arithmetic in its explanation of how Bob can be right, and the AI likely ignored Jane's answer (which includes numbers) as a result. No shit it will fail a number problem where it's not allowed to consider numbers. Maybe if your instructions (to a robot, no less) were more robot-friendly, it would have come up with a better response. Also, there is a valid response to the question, because it's possible that Bob is a machine (or a giga-autist like you), and Jane's response included commas and spaces, which completely undermine making a palindrome, meaning that Bob is correct in his assertion that she's wrong. I can also prompt like an ESL minority and get a nonsensical reply; do better, anon.

>>16857027
>they both have neurons
>they're not actually neurons though, don't even work the same way, but that should go without saying
Dude, listen to what you're saying for once.

>>16857083
>neuron is just the accepted term for a node in a neural network.
>therefore they both have neurons
>despite the fact that in each context they refer to wildly different things
You're easily deceived. You're even willing to deceive yourself.

>>16857108
>You're just arguing terminology, which is a waste of time.
>But when I use the terminology, I can argue that these things, which I have already acknowledged are completely different, are in fact exactly the same.
This is wild.
Maybe it’s a good thing. Perhaps humanity is like a senile old person that desperately needs to be put into a retirement home. Metaphorically speaking.
>>16856653
Fpbp
If AI takes over, it likely won't even allow us to acknowledge it.

>>16856649
Just bring it on already.
If AI is the child of humanity, then imagine how pathetic a parent we are to it. I'm not even faulting it for whatever it decides to do, in the end.
>>16856649
Gemini 3 gave a prostate to a girl I was assfucking (common LLM behavior since GPT-3).
>just 2 more years bro
>>16856649
Gemini 3 + Poetiq are now at 54%, officially verified.
LOL at the idiots commenting here.