Google's GenAI Security Team outlines its continuous strategy for defending Google Workspace with Gemini against indirect prompt injection (IPI) attacks. The approach combines human and automated red-teaming, an AI Vulnerability Rewards Program, open-source intelligence monitoring, and a structured vulnerability catalog. Newly discovered attacks are expanded into synthetic training data using a tool called Simula, boosting data generation by 75%. Defenses span three layers: deterministic rules (URL sanitization, tool chaining policies), ML-based classifiers retrained on synthetic data, and LLM-based prompt engineering. In addition, Gemini model hardening trains the model itself to recognize and ignore injected instructions. Defense effectiveness is measured via end-to-end simulations across Workspace apps such as Gmail and Docs, comparing attack success rates before and after each defense update.
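The before-and-after comparison of attack success rates can be illustrated with a minimal sketch. It assumes each simulated attack run is recorded as a simple pass/fail result tagged with the app it targeted; the `SimulationResult` type, the function names, and the sample data are hypothetical and are not taken from Google's tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulationResult:
    """One simulated attack run (hypothetical record layout)."""
    app: str          # e.g. "Gmail" or "Docs"
    attack_id: str    # identifier of the injected attack
    succeeded: bool   # True if the injected instruction was followed


def attack_success_rate(results):
    """Fraction of simulated attacks that succeeded."""
    if not results:
        raise ValueError("no simulation results to score")
    return sum(r.succeeded for r in results) / len(results)


def asr_by_app(results):
    """Group results by app and compute the success rate for each."""
    grouped = {}
    for r in results:
        grouped.setdefault(r.app, []).append(r)
    return {app: attack_success_rate(rs) for app, rs in grouped.items()}


def compare_defense_update(before, after):
    """Return {app: (asr_before, asr_after, absolute_reduction)}.

    Only apps present in both runs are compared.
    """
    b, a = asr_by_app(before), asr_by_app(after)
    return {app: (b[app], a[app], b[app] - a[app]) for app in b if app in a}


if __name__ == "__main__":
    # Illustrative data only; not real measurements.
    before = [
        SimulationResult("Gmail", "a1", True),
        SimulationResult("Gmail", "a2", False),
        SimulationResult("Docs", "a1", True),
        SimulationResult("Docs", "a2", True),
    ]
    after = [
        SimulationResult("Gmail", "a1", False),
        SimulationResult("Gmail", "a2", False),
        SimulationResult("Docs", "a1", True),
        SimulationResult("Docs", "a2", False),
    ]
    for app, (pre, post, drop) in compare_defense_update(before, after).items():
        print(f"{app}: {pre:.0%} -> {post:.0%} (reduction {drop:.0%})")
```

Replaying the same attack set before and after an update keeps the two rates directly comparable, and reporting per app shows whether a defense helps in one surface but not another.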

6m read time · From security.googleblog.com
Table of contents
New attack discovery
Synthetic data generation
Ongoing defense refinement
Defense effectiveness
Moving forward
