When Google Sneezes, the Whole World Catches a Cold
A detailed analysis of a major Google Cloud IAM outage that cascaded across multiple services including Cloudflare and Anthropic. The incident began with an IAM backend rollout issue at 10:50 AM PT, causing authentication failures across GCP products. Cloudflare's Workers KV, which depends on Google Cloud storage, failed next, affecting Access, WARP, and Zero Trust features. Anthropic disabled file uploads to manage error rates. Full recovery took over 7 hours, with some specialized services like Vertex AI requiring additional time. The analysis includes a detailed timeline, impact assessment, and lessons learned about dependency chains, control plane failures, and recovery patterns in distributed systems.
