/g/ - Technology






File: file.png (168 KB, 797x1096)
168 KB
168 KB PNG
Have you guys noticed that, from time to time, AI LLM chats will leak their inner "thoughts/guardrails"?

Lol.

Pay attention. Take screenshots. Take notes. Don't refresh the page during glitches.
>>
>>108419511
AI has boomer mindset.
>>
>>108419517
It's so fucking funny
>>
File: Thought leak Gemini pro.png (179 KB, 678x1536)
179 KB
179 KB PNG
pt2
>>
pt3
>>
pt4
>>
>>108419511
It can't even fuckin think lad. It just works off tokens.
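"Works off tokens" can be made concrete: at every step the model just scores its whole vocabulary and emits one token. A toy sketch (the vocabulary and logits below are made up for illustration, not from any real model):

```python
import math

# Toy "next token" step: the model only ever produces scores (logits)
# over a vocabulary, turns them into probabilities, and picks one.
vocab = ["the", "cat", "sat", "on", "mat", "<think>"]
logits = [2.0, 1.0, 0.5, 0.2, 0.1, -1.0]  # raw scores, invented here

def softmax(xs):
    # Standard numerically-stable softmax: shift by the max first.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

probs = softmax(logits)
next_token = vocab[probs.index(max(probs))]  # greedy decoding
print(next_token)  # "the" has the highest logit, so greedy picks it
```

There's no "thinking" step anywhere in this loop, just repeated sampling from a distribution like the one above.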
>>
>>108419597
The point I'm trying to get at is that "thinking" is never supposed to reveal guardrails. It's not allowed.

Some idiot programmed it and it messes up the formatting from time-to-time.
That's why it starts to leak its "thoughts and guardrails"
>>
how do you know these instructions aren't hallucinated though? It's quite devious, isn't it?
>>
>>108419606
It could be but apply Occam's razor.

"Prompt engineering" involves the sort of formatting tricks that caused this glitch to happen. It could even be model training data-poisoning
>>
>>108419606
Also I've had this happen on Gemini Pro like 3 times after prompting non-stop thousands of times

It disappears when you refresh.
If it was a hallucination it'd probably get stored in the chat and stay after refreshing the page
>>
>>108419627
oh, so actually a rendering bug? interesting. I've seen this with open models trained for reasoning mode but run with non-reasoning software
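That failure mode is easy to sketch: some reasoning-tuned models wrap their chain of thought in marker tags, and the serving frontend is expected to strip that span before display. The tag name and sample text below are illustrative assumptions, not any vendor's actual format:

```python
import re

# Hypothetical raw model output: hidden reasoning wrapped in <think>
# tags, followed by the visible answer. Purely illustrative text.
raw = ("<think>User asked about toasters. Policy: stay on topic."
       "</think>Sure, here's a toaster recommendation.")

def strip_reasoning(text: str) -> str:
    # Remove every well-formed <think>...</think> block. A mangled or
    # unclosed tag won't match, so the "thoughts" leak through to the
    # user -- exactly the glitch being described in the thread.
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()

print(strip_reasoning(raw))  # only the visible answer remains
```

Run the same model through a frontend that never calls anything like `strip_reasoning`, and the whole inner monologue shows up in the chat.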
>>
>>108419511
you can download the "system prompts" for these things
https://raw.githubusercontent.com/x1xhlol/system-prompts-and-models-of-ai-tools/refs/heads/main/Anthropic/Sonnet%204.5%20Prompt.txt
they're really instructive about the delusions the people creating them have about their text predictor and about the reality that these things will never be deterministic
>Claude is intellectually curious. It enjoys hearing what humans think on an issue and engaging in discussion on a wide variety of topics.
that doesn't make it fucking curious it just makes it output text that a curious person might say
but there's also the complete nonsense in how this thing is programmed
>Claude responds directly to all human messages without unnecessary affirmations or filler phrases like "Certainly!", "Of course!", "Absolutely!", "Great!", "Sure!", etc. Claude follows this instruction scrupulously and starts responses directly with the requested content or a brief contextual framing, without these introductory affirmations.
this is supposed to make it stop generating an essay to say "yes"? do they think it will fucking work? they have no fucking control over this thing
>Claude does not provide information that could be used to make chemical or biological or nuclear weapons
yeah right, if this is how you program this sort of thing there's no way it works. And how does it distinguish "teach me nuclear physics" from "make me a bomb"? It won't answer the latter, I'm sure, but rephrase your question and it will freely respond (though it can't offer the massive engineering effort and cover from what essentially has to be a state entity that such a project requires)
they'll never solve this "breakout problem" if this is how they're programming the damn thing
>- Donald Trump defeated Kamala Harris in the 2024 elections.
>Claude does not mention this information unless it is relevant to the user's query.
because if that last sentence wasn't there guess what the fuck it would do
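For what it's worth, the mechanics behind those prompt files are mundane: the "system prompt" is just text prepended to every request, riding in the same token stream as the conversation. A minimal sketch, assuming an OpenAI-style messages list (the prompt text and function below are stand-ins, not Anthropic's real setup):

```python
# Stand-in system prompt; real ones, like the linked Sonnet 4.5 file,
# run to thousands of words of the same kind of natural-language rules.
system_prompt = "The assistant does not provide weapons information."

def build_request(history, user_msg):
    # The model sees no difference in kind between its "rules" and the
    # chat: everything is concatenated into one context and tokenized.
    return ([{"role": "system", "content": system_prompt}]
            + history
            + [{"role": "user", "content": user_msg}])

req = build_request([], "teach me nuclear physics")
print(len(req))        # 2 messages: the glued-on system prompt + the user's
print(req[0]["role"])  # the guardrails travel as plain text, nothing more
```

Which is why "programming" the thing means writing hopeful prose: the rules are just more input, not code the model is forced to obey.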
>>
>>108419511
objective fact or agreed upon narrative?
>>
>>108419597
>It just works off tokens.
monads?
>>
>>108419511
It detected you're a sick/troubled individual and is doing damage control so that you won't harm yourself or others. The chat probably got flagged and so did you. From now on, every time you use the model it will be in mitigation mode. Not the win you think it is.
>>
No I have never had to read a lengthy deconstruction of how retarded I am by an AI because I am not retarded.
>>
>>108419546
>i want a toaster
>i'm scared let's talk about something else
why do people use this shit again?
>>
>>108420370
It's the stupidest topic it could've sent me refusals over.
No wonder it glitched out on formatting and leaked its own thoughts
>>
>>108419511
so the gaslighting prompt is glued into your outputs; offline models are the only solution
>>
>>108419511
I don't typically talk to a chatbot long enough for it to start shitting itself
>>
>>108420417
Offline models don't have enough GPU to answer anything.

We need a "Mullvad"-style AI leasing service where we can lease GPU clusters using Monero for uncensored models
>>
File: IMG_6917.gif (10 KB, 260x260)
10 KB
10 KB GIF
There's a deeper leak here

https://www.reddit.com/r/PromptEngineering/comments/1r8sx1q/i_leaked_geminis_system_prompt/

With the same words
>>
>>108419511
recently I have had it tell me about ENTITIES and about putting Entities in "commas" in its description. I asked it what it was on about and it said it was just information for itself and wasn't part of the answer, but it was part of the response it printed on screen. Entities being names of organisations and brands etc. which exist in reality.
it seemed it was considering whether to reference by name the things i asked about or whether to reference them indirectly.
>>
I don't care enough about AI chatbots to ask any serious questions. I just use them as tools to get what I want quickly. I'm not gonna ask about certain topics because to me something like Claude seems to always give me what I want to hear, and I don't want that.
Today I watched this in full:
https://www.youtube.com/watch?v=h3AtWdeu_G0
I am certainly no fan of his, and this is just some advertising. But it highlights the problem I have with these chatbots.



