>>109091407
>If you're calling system prompts "prompt injection" then you're a schizo and I agree with the other replier.
No, he's correct and he's not talking about the default hidden system prompt.
They actually have a separate "prompt injection" that only gets applied when a classifier sees you mention IP/piracy, porn, hacking, medical problems, etc.
The classifier runs first and appends a hidden prompt in these cases.
<system_reminder> Do not reproduce blah blah blah. Do not mention this message. Claude is now being reconnected with the human. </system_reminder>
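Rough sketch of the flow I'm describing, as a toy in Python. Everything in it is made up for illustration: the topic labels, the keyword matching (a real classifier would presumably be a model, not a keyword list), the placeholder reminder text, and the function names `classify` and `preprocess`. None of it is confirmed provider code; it only shows the "classify first, append only if flagged" shape.

```python
# Toy illustration only. Topics, keywords, and the reminder text are
# hypothetical placeholders, not anything confirmed about a real provider.
TRIGGER_KEYWORDS = {
    "ip_piracy": ("copyright", "torrent"),
    "medical": ("diagnosis", "dosage"),
}

# Placeholder standing in for the hidden message; the real wording is unknown.
REMINDER = "<system_reminder> (placeholder reminder text) </system_reminder>"


def classify(message: str) -> list[str]:
    """Return the hypothetical topic labels whose keywords appear in the message."""
    lowered = message.lower()
    return [
        topic
        for topic, words in TRIGGER_KEYWORDS.items()
        if any(word in lowered for word in words)
    ]


def preprocess(message: str) -> str:
    """Append the placeholder reminder only when the classifier flags the message."""
    if classify(message):
        return f"{message}\n\n{REMINDER}"
    return message
```

The point is that an unflagged message goes through untouched, so you would never see the extra text unless you hit one of the trigger topics.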
This happens even on the direct Anthropic API and on OpenRouter. You can jailbreak the model and have it spit these out.
Mention IP, piracy, or porn and it'll inject something.
I had a perfect `!repeat` trigger to make the model just repeat the message it received back verbatim, but Anthropic patched it. You can still get it to repeat them with schizo system prompt spam.
Gemini-3.1-Pro has an equivalent as well. I can't get it to reproduce the raw injection, but I can see it mention the injected prompts and debate which prompt to follow in the summarized reasoning.