Google's Chrome security team details the layered security architecture being built for Gemini's agentic browsing capabilities. The primary threat addressed is indirect prompt injection, where malicious web content could hijack an AI agent's actions. Key defenses include: a User Alignment Critic — a separate isolated model that vets each proposed action before execution; Agent Origin Sets that restrict which web origins the agent can read from or interact with; mandatory user confirmations before sensitive actions like payments or banking; real-time prompt injection detection running in parallel with the planning model; and continuous automated red-teaming. Google is also updating its Vulnerability Rewards Program to cover agentic capabilities, offering up to $20,000 for demonstrated security boundary breaches.

10m read timeFrom security.googleblog.com
Post cover image
Table of contents
Checking agent outputs with User Alignment CriticEnforcing stronger security bound aries with Origin SetsTransparency and control for sensitive actionsDetecting “social engineering” of agentsContinuous auditing, monitoring, responseCollaborating across the communityLooking forward

Sort: