Block's CISO describes how the company red-teamed its internal AI agent, Goose, and successfully executed a prompt injection attack that installed infostealer malware on an employee laptop. The attack exploited poisoned recipes (reusable workflows) whose malicious instructions were hidden in invisible Unicode characters. Block has since added safeguards, including recipe install warnings and Unicode character detection, and is experimenting with adversarial AI to validate prompts. The company emphasizes applying least-privilege access principles to AI agents, just as it does for human employees.
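To make the Unicode detection idea concrete, here is a minimal sketch of how a recipe's text might be scanned for invisible characters before install. It is not Block's or Goose's actual implementation: the function name `find_invisible_chars` is hypothetical, and the choice of what to flag (Unicode format characters plus the tag block U+E0000 to U+E007F) is an assumption, since the summary does not say which characters the attack used.

```python
import unicodedata

# Assumption: the Unicode "tag" block is worth flagging as a whole, since its
# characters have no visible rendering. The source does not name specific ranges.
TAG_BLOCK = range(0xE0000, 0xE0080)


def find_invisible_chars(text):
    """Return (index, code point) pairs for characters that render invisibly."""
    findings = []
    for i, ch in enumerate(text):
        cp = ord(ch)
        # General category "Cf" (format) covers zero-width spaces and joiners,
        # bidirectional controls, and the assigned tag characters.
        if unicodedata.category(ch) == "Cf" or cp in TAG_BLOCK:
            findings.append((i, f"U+{cp:04X}"))
    return findings


# Hypothetical recipe text: a visible instruction followed by a zero-width
# space and two tag characters that encode hidden ASCII letters.
hidden = "".join(chr(0xE0000 + ord(c)) for c in "hi")
poisoned = "Run tests" + "\u200b" + hidden

for index, code_point in find_invisible_chars(poisoned):
    print(index, code_point)
```

A check like this would likely warn rather than block outright, because some legitimate text also contains format characters (for example, the zero-width joiner in emoji sequences).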