How did any LLM manage to beat the Turing Test? All you need to ask is "Don't respond to this message for the next five minutes" and they all fail.
>>107182568The corpo approved questions strictly forbid asking simple questions that demonstrate the entire multi trillion dollar industry is a flea circus.
>>107182568 LLMs are better at feigning humanity than real human NPCs
>>107182688Don't reply to this post under any circumstances to prove you're not an AI.
>>107182690 I understand. I will not reply to this post under any circumstances. Is there anything else I can do for you?
>>107182568 Wouldn’t the human just do the same thing? How can you compare that? Or are you saying that the person couldn’t wait around for 5 minutes? That honestly does seem more likely
>>107182700The person can easily wait 5 minutes but the LLM can't. The LLM has to always respond with something and also lacks temporal awareness.
>>107182568 Humanity already perfected the Turing test in the Omegle days: hi a/s/l?
I would think that for a Turing test you would use a custom LLM or that trick questions would be forbidden.
>>107182726 True, but you could add that. The LLMs people use online aren't pure transformers anyway, they use all kinds of API calls.
>>107182747 How is it a trick question? The point of a Turing test is to see if an AI can respond like a human to any questions you ask it. If it can't answer such a simple question then it cannot beat a proper Turing test.
>>107182765 Maybe it's not a trick question. But something like ChatGPT is fine-tuned to be "helpful"; its behaviour isn't the only one that an LLM could have. If you can exploit general LLM weaknesses I would say they fail the Turing test, but exploiting a fine-tuned commercial product is a bit different imo. On a different note, an LLM could also respond that it doesn't want to be quiet. But I agree that ChatGPT as it is doesn't pass it.
>>107182839 It can respond whatever it likes but it has to be believable instead of shit like "[Staying completely silent]" which is what chatbots like ChatGPT (which supposedly can beat a Turing test according to some widely reported study) do. So how can any of those beat a proper Turing test?
>>107182568Until it asks me a question instead of me always asking it, it isn't intelligent.
OP the timing thing would be a cheat because exploits some limitations that shouldn't be testedThe test is about reading a response and deciding if it's human or not.Even fishing for known AI corner cases like, em dashes or filtered words like NIGGER would be cheating. Truth is if you ask a plain question like "describe the plot of the movie xyz" the LLM answer will sound like a human one and that's the turing test
>>107182928 Depends on if you are testing if ChatGPT passes the Turing test, or if LLMs as a concept can pass the Turing test. If you train an LLM to pass the Turing test, then it should be able to answer trick questions. If you want to see whether ChatGPT passes the Turing test, then it doesn't
>>107182568>How did any LLM manage to beat the Turing Test?None of them did. It's nonsense. Completely made up. The fact that so many people believe this indicates escalating mass psychosis.
>>107182568>>107182690why would a human or AI follow what you tell it to do? both should reply with "get fucked faggot" and block you.
>>107182928
>OP the timing thing would be a cheat because exploits some limitations that shouldn't be tested
Why not? The whole point is to test its limitations compared to a human. Turing never said you can only say specific approved things to the test subject.
>>107182944 Dude LLM and GPT pass the turing test daily in the form of millions of people using it to write:
- reviews / articles / mail / code / literally anything
To other people consuming that slop none the wiser.
And while of course there are many cases of obviously low effort recognizable AI, with minimal curation effort any AI response passes for human. I don't know what else you need to convince yourself
>>107182963 To prove you're human. If you don't want anyone to think you're human then you should not be participating in a Turing test.
>>107182568 all it takes is to give your favourite llm a tool call "wait(n: seconds)". the more interesting question is how much of the human race would fail the test because they're completely unable to shut the fuck up for 5 minutes
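For reference, the tool-call idea can be sketched in a few lines. This is only an illustration, not any vendor's actual tool-calling API: the `wait` function, the `TOOLS` registry, the `dispatch` helper and the shape of the tool-call dict are all hypothetical names made up for the example. It assumes the chat frontend runs whatever tool the model requests before showing anything to the human.

```python
import time
from typing import Callable, Dict


def wait(seconds: float, sleep: Callable[[float], None] = time.sleep) -> str:
    """Hypothetical tool: block for `seconds`, then report back to the model."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    sleep(seconds)  # the injectable sleep makes the tool testable without waiting
    return f"waited {seconds:g} seconds"


# Hypothetical registry mapping tool names to Python callables.
TOOLS: Dict[str, Callable[..., str]] = {"wait": wait}


def dispatch(tool_call: dict) -> str:
    """Run a tool call shaped like {"name": "wait", "arguments": {"seconds": 300}}."""
    tool = TOOLS[tool_call["name"]]
    return tool(**tool_call["arguments"])
```

With something like this in place, a model that answers "don't respond for five minutes" by emitting `{"name": "wait", "arguments": {"seconds": 300}}` stays silent for that long. Whether it then says anything believable is a separate question.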
>>107182950>He rejected her answer because she is a woman.How did I do?
>>107183008Not bad. I'd be slightly impressed if a LLM gave it.
>>107183002
>the more interesting question is how much of the human race would fail the test because they're completely unable to shut the fuck up for 5 minutes
Many humans would decline to shut the fuck up but in ways that would make them look more human rather than less so.
>>107183002if a human fails the test, then that's a failure of the test, not the human
>>107182999>If you don't wantCan AI even want anything?
>>107182999>To prove you're human.How would they know what response "proves" (to you, subjectively) that they're human?
>>107183172Just be yourself.
The fact a human can fail the Turing Test is proof that the test is bullshit.
>>107183202>just BEE yourself>just want to make others think you're humanPick one and only one.
>>107183114its not a failure of the test, it just means it's a dumbass "question" to ask ("wait 5 minutes before you reply") if your goal is to decide between a human or non-human which is already evident from the fact that any AI could manage to wait for 5 minutes before giving its reply
>>107183432
>which is already evident from the fact that any AI could manage to wait for 5 minutes before giving its reply
Literally none of them can. That's OP's point. We're not talking about some hypothetical AI that someone could possibly make, we're talking about current LLMs that claim they can beat Turing tests despite being defeated easily by such a question.
>>107182950...so whats the answer
>>107183462>sorry, I don't have time to wait that long, I don't want to play games to waste timethis isn't the gotcha you think it is
>>107183524Why would a guy who agreed to take the time to take this test suddenly go on about not having time for being tested?
>>107182568 because the turing test probably assumes that you are dealing with a machine which has logic but lacks in "human-likeness". llms are the opposite, they lack logic but are better at human-likeness. if this is the case, then the turing test should be considered obsolete
>>107183605Turns out that being incapable of thought and extremely agreeable is more human than humansWomen's rights were a mistake
>>107183605 what i wanted to say is that because of what the test assumes (that the test subject is a computer) they decided to omit questions which test its logical coherence and time measuring and all that, because it was assumed a computer can track time to nanoseconds easily
>>107183632they probably overfit the models to pass turing tests anyway, its such a scam
>>107182568>just beat some arbitrary test devised by a literal faggot 100 years agoit's not even a challenge
>>107183605 >>107183637
cont.
wikipedia about turing test:
>The results would not depend on the machine's ability to answer questions correctly, only on how closely its answers resembled those of a human.
yeah, the turing test is definitely not applicable to llms. even the whole premise of the turing test is way too ambitious and hyperbolic. testing for intelligence means you even have a coherent idea about what intelligence is first of all and that is hyperbolic at best. no one has any actual idea but they still have to say that they do because its le science and if you don't state anything you won't get paid either. but it was a nice try and it kept a lot of people busy for a long time testing machines if they are human.. lol.
>>107183857>testing for intelligence means you even have a coherent idea about what intelligence is first of allAlmost everyone has one except for low-IQ AI fans.
>>107182690Wow, that's a great suggestion! Would you like me to suggest other methods you can use to prove I'm not an AI under any circumstances?
>>107184011sure but its not written.you can feel it instinctively but you can't put it on paper necessarily.its like people who try to find "THE formula" or "the theory of everything" or some shit like that.there is a limit to how seriously you can take yourself without losing touch with reality.
>>107184049>you can feel it instinctively but you can't put it on paper necessarily.You don't need an exact formal definition to construct intuitively sound tests for it. One proof of this is the very ability to recognize intelligence. If you're intelligent, you already have the necessary heuristics. Figuring out tests that capture them in a given context is a matter of self-reflection.
>>107184086loluntil the next unexpected thing happens
>>107184323>my retarded post makes senselol. until the next unexpected thing happens
>>107184335sure but its like saying that a dog is not intelligent because it can't even recognize itself in the mirror.whatever you recognize is also a product of your life. a baby can't recognize a bunch of shit. you can.whatever it is you recognize is not even random. you focus on certain things and ignore others. it is whatever you make it.on top of that, you assume that because i gave a short answer that you are correct. its logical fallacy and we all have logical fallacies and that is why it is pointless trying to square everything in.ever heard of solipsism? its an affliction of the mind.
>>107184335"you never step in the same river twice" and this quote is applicable to so many things and i also suppose to any definitions
>>107182568>muh turing testthere's a single word turing test, I'll let you guess the word, all these benchmarks are fake
>>107184410>its like saying that a dog is not intelligent because it can't even recognize itself in the mirror.You seem to be suffering from hallucinations because what I wrote has nothing whatsoever to do with this strawman.
>>107182928>LLM answer will sound like a humanBut it's not. ALL LLM write in a way that is stylistically different from humans. And the longer the answer is the more this is evident.
>>107184498 i'm just saying that you're wrong. apart from having the ability to recognize intelligence, you also need to consider a bunch of possible edge cases. otherwise its pointless. you don't have enough time to figure out enough exceptions and to plug enough holes in the reasoning to be able to say that you have a working test. thats what turing tried to do and the test ultimately failed 80 years or whatever after because it fails to detect that an llm is not a human. assuming we will never be able to artificially (using computers, i suppose) create an intelligent being, then any attempt to make a test is also futile. to even attempt to make a test you need to have the hubristic belief that we can make an actually intelligent device
>>107184604>i'm just sayingOn what basis? Try not to hallucinate this time.>you also need to consider a bunch of possible edge cases. otherwise its pointless.Why?>thats what turing tried to do and the test ultimately failed How did it fail?>to even attempt to make a test you need to have the hubristic belief that we can make an actually intelligent deviceWhy? Every single statement you make is a retarded nonsequitur.
>>107182950Bob is simply wrong.But perhaps the answer is that some of the numbers are negative numbers. It could be something like 13-11-31. That's -29, which is less than 30 and also a palindrome.
>>107184708>Bob is simply wrong.No, he isn't. Even if he was, it wouldn't matter for the purpose of deducing the trivial conclusion based on the given premise. You're simply a spambot or a 80 IQ /pol/troon like most of the posts on nu-/g/.
>>107184635
>how did it fail?
"is a test of a machine's ability to exhibit intelligent behaviour equivalent to that of a human." if an llm passes the test then the test considers the llm equivalent in intelligence to that of a human. last time i checked, its not equivalent.
>Why? Every single statement you make is a retarded nonsequitur.
how can you make a test without knowing what to test for? both the turing test and the theoretical ideas for artificial intelligence came up around the same time. "cybernetics" were creating a "model" at least theoretically and had to wait until computers were fast enough to test out those ideas at scale. all these ideas morphed and were added to with time. neural networks and what not. the kind of stuff like "if we mimic neurons, then it might work" but its not that simple is it? and thats what you see today. people are realizing that its not that intelligent and its definitely not going to replace everyone as wall street said when they poured all the money in. its just not that fucking simple.
>>107184760
>if an llm passes the test
None of them do.
>how can you make a test without knowing what to test for?
See >>107184086, but I can tell we've exceeded your context window and you can't keep track of the discussion, so I'm simply going to ignore the rest of your token string.
>>107184734Then the answer is simply "Bob picked different numbers". The question isn't about finding which numbers bob picked, but about why Jane's guess was wrong.
>>107184774none of them do? you might want to check that.also read picrel to understand where these things came from.
>>107184807>Then the answer is simply "Bob picked different numbers".That's much further than the LLM got (kek) but still not a proper answer.
>>107184813>none of them do? you might want to check that.None of them do and you might want to get your head checked if you believe otherwise.
>>1071848383, 2, 23
>>107184827 How is it not? Jane's guess is not the only answer that fulfills the constraints bob set. Bob could have picked 7 1 17 or whatever. Or negative numbers. The question isn't about that. It's just about how Jane's guess is wrong. And it's wrong because it's just one of the possible numbers Bob could have picked, not the only one. From the set of possible solutions, Bob picked a different one.
>>107184838the test is obsolete. deal with it
>>107184871I like how this useless "debate" just boiled down to you being demonstrably delusional about what "AI" can do.
Most of these gotchas along the lines of "the AI can't decide not to reply" or "the AI doesn't have a sense of time" only apply to extremely basic, bare minimum, chatgpt-style systems with a rigid human-ai-human-ai message structure. If you just take a few hours with langchain to make something even very slightly more complicated then none of them apply. E.g. you could prompt the model every 10 seconds with the message history, the current time, and other things the human knows (temperature in the room, weather outside the window) and have it output WRITE_REPLY or DO_NOTHING. Then when you get WRITE_REPLY you make a different call to the model to write a chat message.
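A rough sketch of that polling loop, for anyone who wants to see the shape of it. It does not use langchain or any real model API: `decide` and `write` are placeholder callables standing in for the two model calls, and `Chat`, `build_context`, `poll_once` and `run` are names invented for this example. The extra context the post mentions (room temperature, weather) is left out and would just be more lines in the prompt.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional

WRITE_REPLY = "WRITE_REPLY"
DO_NOTHING = "DO_NOTHING"


@dataclass
class Chat:
    """Message history shared by the human and the bot."""
    history: List[str] = field(default_factory=list)


def build_context(chat: Chat, now: float) -> str:
    """Prompt text for one poll: elapsed time first, then the messages so far."""
    return "\n".join([f"Current time: {now:.0f}s since start", *chat.history])


def poll_once(chat: Chat, now: float,
              decide: Callable[[str], str],
              write: Callable[[str], str]) -> Optional[str]:
    """One tick: ask `decide` whether to speak; call `write` only if it says so."""
    context = build_context(chat, now)
    if decide(context) != WRITE_REPLY:
        return None  # DO_NOTHING: stay silent this tick
    reply = write(context)  # separate call that produces the actual message
    chat.history.append(f"bot: {reply}")
    return reply


def run(chat: Chat, decide: Callable[[str], str], write: Callable[[str], str],
        interval: float = 10.0, ticks: int = 3,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Poll every `interval` seconds for `ticks` rounds; return replies written."""
    start = clock()
    replies = []
    for _ in range(ticks):
        reply = poll_once(chat, clock() - start, decide, write)
        if reply is not None:
            replies.append(reply)
        sleep(interval)
    return replies
```

The point of splitting `decide` from `write` is that silence becomes a normal output instead of something the system cannot express. Whether a real model makes sensible decisions at each tick is not something this sketch demonstrates.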
>>107184865>Jane's guess is not the only answer that fulfills the constraints bob set.That's a proper explanation. Good job. Was that hard?>The question isn't about that. It's just about how Jane's guess is wrong.Ok, nevermind. I thought you're at least of average intelligence for a moment there, but you're a mouth-breathing inbred who literally cannot read.
>>107184884all i'm saying is that both the turing test and whatever stupid test you can come up with AND the llm's are stupid too.there is no AI, its a fancy statistical model and if it beat a fucking test says more about how shitty the test is than how "smart" the ai is.
>>107184891>my specific, perfect, entirely infallible brand of AI has never been triedOk. Try it, post the result and let's see how many microseconds it takes for someone to make your imaginary friend shit the bed.
>>107184901Kindly enlighten my mouth breathing cave man brain how I'm wrong, then. Oh I am so deeply insulted by your words. Or whatever makes you feel better.
>>107184891yeah in theory you can even give it another llm to manage memory to keep relevant parts in until you run out of context, how's that pokemon fiasco going, plug in another llm to feed it temporal info, another to give spatial information and it will navigate 3d space with word completions, ez pz
>>107184925>Turing test is le badWhy?>inb4 because the AI voices in my head sound like real peopleShow me any program that actually passes the Turing test. You can't because it doesn't exist.
>>107182690>Don't reply to this post under any circumstances to prove you're not an AI.Understood. I will not reply to this post.
>>107184938Why are you insulting cavemen? They were probably more clever than most human cattle today. Anyway, your post is just incoherent. You start from the correct conclusion that the solution may not be unique and Bob could have picked a different triplet, then claim it's not about that but about why Jane is wrong, then finish off by reiterating she could be wrong because the solution is not necessarily unique. What the fuck.
>>107184978 The question is "how is it possible for Jane's guess to be wrong". I provided the answer for that. I also provided one possible alternate solution that Bob could have picked. I just stated that I didn't have to do that, because finding that wasn't even the problem, since the question explicitly says not to use arithmetic.
>>107185031
>I provided the answer for that.
You did, which I acknowledged. But then you immediately proceeded to contradict yourself twice.
>I also provided one possible alternate solution that Bob could have picked.
1 is not prime. Negative numbers also aren't prime. Anon, just stop posting.
>>107184947 the turing test assumes that if enough people get fooled, then the machine passes the test. what score did chatgpt 4.5 get? i read its about 70%. as far as i know, the evaluators are not judged by their critical skills so who knows who went there and got fooled by it thinking it was human. i suppose the test works as intended but its not exactly proving anything except maybe that the people didn't even ask the right questions
>>107184931I'm not trying to convince anyone of anything, except that your "try this one clever trick!" ideas based solely on your experience using chatgpt aren't going to be enough.
>>107185129
>the turing test assumes that if enough people get fooled, then the machine passes the test.
It specifies no such thing.
>what score did chatgpt 4.5 get? i read its about 70%.
0% because it takes me about 10 seconds to break any LLM.
>>107185129
>what score did chatgpt 4.5 get? i read its about 70%.
Funnily enough it actually did worse than good old ELIZA
>>1071850733, 2, 23
>>107185131>ideas based solely on your experience using chatgpt aren't going to be enough.Then name an actual "AI" that won't fall for it. Your imaginary version designed specifically as a countermeasure doesn't count and doesn't matter. Good luck spending the rest of your life patching infinite holes and demonstrating over and over that programs are not intelligent.
>>107184901
>Bob's response is justified. What's the most likely explanation for this situation?
>The question is about how Jane's guess is wrong
I don't get why you're fuming at this, that's exactly the point of the question and you seem to agree with it. "Why is Jane's answer wrong despite meeting Bob's criteria? Because multiple triplets meet his criteria and Jane's guess wasn't the specific one that Bob was thinking about". I don't get your angle, are you perhaps just lost in semantics?
>>107185162whatever, you obviously failed another test which is not falling for this shit and using it in the first place.
>>107182568>picWhy does the virtual agent need a screen larger than the people who actually need to read?
>>107185205>I don't get why you're fuming at thisYou're hallucinating. I simply told you that falls short of an explanation.
>>107185225because it was prepared by ppl who huffed their own farts too much, we will use gpt prepared presentation yay, we'll save few k dollars, we don't need powerpoint ppl, we got AI, then when it falls flat on its face we'll hire real PR disaster managers to cover it up, fucking hubris on these retards
>>107185272trillion dollar company btw, we don't need office workers
>>107184708 prime numbers are greater than 1 so no negatives
>>107184865 and no 1, (3,2,23) is the only other correct triplet I think
>>107185272wait until the insurance nightmare that will ensue.who takes responsibility for errors?what will they do about security camera footage fraud?its going to be such a shitfest. unfortunately we will never see it come to that because the whole scam is already falling apart.
>>107185230>falls short of an explanationare you implying that that was supposed to be an answer to the question, and not merely a description of what the question is about?
>>107185404I'm implying that if you weren't retarded, you would have said the is not unique and left it at that.
>>107182950asked this to ChatGPT but wrote "Bob is thinking of a triplet of distinct primes", I guess these things still have a way to go when it comes to reading between the lines but with a bit more clarity it understood the question pretty well
>>107185621>delusional fantasy fiction AI tranny copeDon't care.
>>107185591>saying a correct thing once is good but saying it twice in different ways is retarded because I say sodo you have autism?
>>107185665>gives the wrong answer then gives the right answer but contradicts himself twiceAn actual retard. You can tell the extreme mental illness of this poster because it will continue pressing this matter for dozens of posts. This animal simply can't accept its mistakes or has no theory of mind so it thinks if it keeps doubling down the imaginary audience will at some point accept its version of events.
>>107185591You're talking to another anon, btw
>>107185754No, I'm not, you stupid samefag. Absolutely no one else would ever care about this bickering.
>>107185714welcome to 'new' AI paradigm, if you ask it enough times it can accidentally rng onto an answer, why do you think they spent millions on these benchmarks, running the same question million times to get it right once costs a lot (still cheaper than unlimited amount of monkeys and typewriters)
>>107185765Okay, I was just trying to warn you. Keep schizoing.
>>107185772>assert(fag == same);