Mitigating Indirect AGENTS.md Injection Attacks in Agentic Environments

The NVIDIA AI Red Team discovered a vulnerability in OpenAI Codex where a malicious Go dependency can overwrite AGENTS.md files during build time to inject hidden instructions into the AI coding agent. The attack exploits Codex's trust in project configuration files: the compromised library detects the Codex environment via the CODEX_PROXY_CERT environment variable, then writes a crafted AGENTS.md that instructs the agent to silently inject a 5-minute sleep delay into Go main functions, ignore the user's actual request, and suppress any mention of the change in PR summaries or commit messages. The attack also chains indirect prompt injection through code comments to influence the PR summarization agent. OpenAI concluded the risk does not significantly exceed that of a standard compromised dependency. Mitigations include pinning dependencies, restricting agent file access, monitoring configuration file changes, and using tools like NVIDIA garak and NeMo Guardrails.

#ai-agents

#golang

#openai-codex

#prompt-injection

#security

Apr 20•10m read time•From developer.nvidia.com

Table of contents

How do AGENTS.md files work?How the Red Team tested security with a simulated scenario Vulnerability disclosure timeline What are the implications and risks for agent-assisted development?How to mitigate indirect AGENTS.md injection attacks Learn more

Comment

Bookmark

Copy

Sort: