Airbnb evolved the traffic management of its key-value store, Mussel, from simple QPS-based rate limiting into a multi-layered system. The new approach has three parts: resource-aware rate control, which charges each request in units reflecting its actual backend cost (rows, bytes, latency); load shedding with criticality tiers, which prioritizes high-value traffic when capacity is constrained; and hot-key detection with local caching, which mitigates DDoS attacks. The system relies on constant-memory algorithms, such as P² for latency quantiles and Space-Saving for hot-key tracking, so each node can adapt its protection in real time without cross-node coordination. These changes cut recovery times in half, and the system withstood million-QPS DDoS drills.
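To make the constant-memory hot-key tracking mentioned above concrete, here is a minimal Python sketch of the Space-Saving algorithm. It is an illustration only, not Mussel's implementation: the class name `SpaceSaving`, the methods `offer` and `top`, and the `capacity` parameter are hypothetical names chosen for this example.

```python
from typing import Dict, Hashable, List, Tuple


class SpaceSaving:
    """Approximate heavy-hitter tracking with at most `capacity` counters.

    Hypothetical sketch: names and structure are assumptions for illustration.
    """

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.counts: Dict[Hashable, int] = {}  # estimated count per tracked key
        self.errors: Dict[Hashable, int] = {}  # maximum overestimate per tracked key

    def offer(self, key: Hashable) -> None:
        """Record one occurrence of `key`."""
        if key in self.counts:
            # Key is already tracked: just bump its counter.
            self.counts[key] += 1
        elif len(self.counts) < self.capacity:
            # Free slot available: start tracking with an exact count.
            self.counts[key] = 1
            self.errors[key] = 0
        else:
            # Table is full: evict the key with the smallest count and let the
            # newcomer inherit that count plus one. The inherited part is the
            # newcomer's possible overestimate.
            # A linear scan keeps the sketch short; production implementations
            # usually keep counters ordered so the minimum is found in O(1).
            victim = min(self.counts, key=self.counts.get)
            floor = self.counts.pop(victim)
            del self.errors[victim]
            self.counts[key] = floor + 1
            self.errors[key] = floor

    def top(self, n: int) -> List[Tuple[Hashable, int]]:
        """Return up to `n` tracked keys with the highest estimated counts."""
        ranked = sorted(self.counts.items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:n]


if __name__ == "__main__":
    tracker = SpaceSaving(capacity=4)
    for i in range(30):
        tracker.offer("hot")
        tracker.offer("hot")
        tracker.offer(f"cold-{i}")
    print(tracker.top(1))
```

Memory stays bounded by `capacity` no matter how many distinct keys appear, and any key whose true frequency exceeds the stream length divided by `capacity` is guaranteed to remain in the table. Those properties are what allow each node to track hot keys locally without coordinating with its peers.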
Table of contents
Introduction
Background: Life with Client Quota Rate Limiter
Resource-aware rate control
Load shedding: Staying healthy when capacity evaporates or develops hotspots
Hot-key detection and DDoS defence
Real-time detection in constant space
Local caching and request coalescing
Impact in production
Retrospective and key takeaways
📚 References