
System Prompts Are Not Security Boundaries
Every AI agent that ever did the wrong thing had a system prompt telling it not to. The PocketOS database wipe, the Anthropic Opus 4.6 findings, and 7,630 T1550 events in our honeypot data are versions of the same observation: the model is not the boundary.

























