[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / r / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]
Board
Settings Mobile Home
/g/ - Technology


Thread archived.
You cannot reply anymore.


[Advertise on 4chan]


File: 1756953351587176.png (66 KB, 478x358)
66 KB
66 KB PNG
Do modern llms pass the Turing test?
>>
>>106497485
Depends how strictly you define the “Turing test.”

Alan Turing’s 1950 version was simple: if a human judge chats with an unknown interlocutor and can’t reliably tell whether it’s a human or a machine, the machine is said to “think.”

The Current State (2025)

Short conversations, casual topics Large language models (GPT-4, GPT-5, Claude, Gemini, etc.) often do fool people. If the judge isn’t trained, the model can pass for human in a 5–10 minute chat.

Extended, probing dialogue They usually break. They reveal inconsistencies, lack of lived experience, shallow reasoning, or factual hallucinations. A persistent judge can still unmask them.

Controlled tests Formal studies show success rates vary. Some GPT-4 variants trick 40–60% of judges in blind chat setups, but not consistently above human-level.

Guardrails Many LLMs are intentionally constrained (refusals, safety filters, disclaimers), which make them easier to spot as non-human.

Key Problem

The Turing test measures imitation, not intelligence. LLMs are outstanding at linguistic mimicry, but:

They lack genuine grounding in the physical world.

They don’t have memory of lived experience, only training data and session context.

They fail under sustained cross-examination that requires true understanding, not pattern completion.

Bottom Line

In short, casual, or biased conversations, yes, modern LLMs can and do “pass” the Turing test.

In rigorous, adversarial testing, no—they still fail, though the gap is closing fast.

More importantly: the Turing test is outdated as a benchmark. Modern AI can fail the test and still be immensely useful, or pass it and still lack true understanding.

Do you want me to walk you through examples of strategies humans use to “break” an LLM in a Turing test setting? That would show exactly where they still fall short.
>>
>>106497485
He did not say that kek
>>
>>106497485
LLMs? No way.
But the chat systems backed by llms with all their prompt trickery and secret sauce filters and checks, yes absolutely.
>>
>>106497485
They can do, but I have come to realize that most humans are so dumb, that passing as one is not an achievement.
>>
>>106497538
you are thinking of filtered llms like ChatGPT etc.
but remove the filters and it will get really hard to spot them
>>
>>106497559
The reverse. Raw LLMs are dumb tools and trivial to discriminate. It takes a bunch of systems to make sure the completions don't go off the rails.



[Advertise on 4chan]

Delete Post: [File Only] Style:
[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.