A practical security guide for running AI agents in production, prompted by a real incident where a Cursor AI agent deleted a production database in nine seconds. Covers three core risk categories: behavioral risk (agents executing unintended actions with valid permissions), compositional risk (chaining agents multiplies the attack surface), and prompt injection (untrusted data steering agent behavior). Key recommendations include applying least privilege by creating task-scoped credentials, using short-lived tokens, building explicit tool allow-lists, separating agent instructions from untrusted data, requiring human confirmation before irreversible actions, logging structured traces, and establishing trust boundaries in multi-agent architectures. Also addresses credential management (never put secrets in system prompts; use a secrets manager) and anomaly alerting. The framework is described as conceptually simple but requiring deliberate discipline to implement before an incident occurs.
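Two of the recommendations above, explicit tool allow-lists and human confirmation before irreversible actions, can be combined in a single gating layer. The sketch below is illustrative only; every name in it (`ALLOWED_TOOLS`, `run_tool`, the `confirm` callback) is hypothetical and not taken from any specific agent framework.

```python
# Hypothetical sketch: an allow-listed tool gateway that blocks unknown
# tools outright and gates irreversible ones on a human-confirmation hook.

# Allow-list maps each permitted tool name to its risk metadata.
ALLOWED_TOOLS = {
    "read_table": {"irreversible": False},
    "run_migration": {"irreversible": True},
}

def run_tool(name: str, confirm=lambda tool: False) -> str:
    """Execute a tool only if allow-listed; irreversible tools also
    require confirm(name) to return True (default: always deny)."""
    spec = ALLOWED_TOOLS.get(name)
    if spec is None:
        # Anything not explicitly allow-listed is refused, including
        # tools the model hallucinates or an injected prompt requests.
        return f"refused: '{name}' is not on the allow-list"
    if spec["irreversible"] and not confirm(name):
        return f"blocked: '{name}' needs human confirmation"
    return f"executed: {name}"
```

Defaulting `confirm` to deny means the safe path requires no extra code, while destructive actions must be wired to a real human-in-the-loop prompt.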
Table of contents
Why AI Agent Security Is Different From Regular App Security
The Least Privilege Principle, Applied to Agents
Prompt Injection in Production Agents
Credential and Secret Management for Agent Workflows
Multi-Agent System Security
Practical Security Patterns Worth Implementing Now
The Framework Is Simple, the Discipline Is Not
What This Means If You Are Building Now