On November 12, 2024, Canva experienced a critical outage when its API Gateway failed under a combination of deployment issues with Canva's editor, network issues with Cloudflare, and a performance bug in Canva's telemetry library. As a result, canva.com was unavailable for nearly an hour. Canva applied several mitigations during the incident, including scaling API Gateway tasks and blocking traffic at the CDN level. In response, Canva has outlined action items covering its incident response process, API Gateway resilience, a fix for the telemetry bug, and collaboration with Cloudflare to prevent a recurrence.

9m read time, from canva.dev
Table of contents
High-level summary
Background
The incident
Timeline of events
How we mitigated the incident
Action items
Next steps
