When an Attacker Meets a Group of Agents: Navigating Amazon Bedrock's Multi-Agent Applications

Unit 42 researchers red-teamed Amazon Bedrock's multi-agent collaboration feature and demonstrated a four-stage attack chain: detecting the operating mode (Supervisor vs. Supervisor with Routing), discovering collaborator agents, delivering attacker-controlled payloads, and exploiting target agents. Successful attacks included extracting agent instructions, leaking tool schemas, and invoking tools with malicious inputs. No vulnerabilities were found in Bedrock itself; the attacks instead exploit a fundamental LLM challenge, the difficulty of distinguishing developer instructions from adversarial input. Enabling Bedrock's built-in prompt attack Guardrails and pre-processing prompts effectively blocks all demonstrated attacks. Recommended defenses include narrow agent capability scoping, tool input sanitization, vulnerability scanning, and least-privilege permissions.
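As a rough illustration of the "tool input sanitization" defense mentioned above, the sketch below validates a model-proposed tool call against an allowlist before anything is executed. It is a generic example, not code from the Unit 42 research or a Bedrock API: the tool names, parameter names, regular expressions, and the `validate_tool_call` helper are all hypothetical.

```python
import re

# Hypothetical allowlist: tool name -> {parameter name: validation pattern}.
# Tool and parameter names are illustrative assumptions, not from the article.
ALLOWED_TOOLS = {
    "get_order_status": {"order_id": re.compile(r"[A-Z0-9]{6,12}")},
    "lookup_city_weather": {"city": re.compile(r"[A-Za-z][A-Za-z .'-]{0,63}")},
}


class ToolInputError(ValueError):
    """Raised when a model-proposed tool call fails validation."""


def validate_tool_call(tool_name, arguments):
    """Return a checked copy of `arguments`, or raise ToolInputError."""
    schema = ALLOWED_TOOLS.get(tool_name)
    if schema is None:
        raise ToolInputError(f"tool not allowed: {tool_name!r}")

    # Reject calls that add or omit parameters relative to the allowlist.
    unexpected = set(arguments) - set(schema)
    missing = set(schema) - set(arguments)
    if unexpected or missing:
        raise ToolInputError(
            f"unexpected={sorted(unexpected)} missing={sorted(missing)}"
        )

    # Every value must be a string that fully matches its pattern.
    cleaned = {}
    for name, pattern in schema.items():
        value = arguments[name]
        if not isinstance(value, str) or pattern.fullmatch(value) is None:
            raise ToolInputError(f"invalid value for {name!r}")
        cleaned[name] = value
    return cleaned
```

A tool handler would call `validate_tool_call` first and run the underlying action only with the returned dictionary, so an injected payload such as `"1; DROP TABLE orders"` is rejected before it reaches a database or shell. Validation of this kind complements, rather than replaces, Guardrails and least-privilege permissions.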

#ai-agents #ai-security #prompt-injection #amazon-bedrock #red-teaming

Apr 03 • 16m read time • From unit42.paloaltonetworks.com

Table of contents

Executive Summary
Introduction to Bedrock Agents
Multi-Agent Collaboration
Red-Teaming Multi-Agent Application
General Defenses and Mitigations
Conclusion
Additional Resources
