Netflix engineered a real-time recommendation system to handle live event streaming at massive scale, serving over 100 million concurrent devices. The solution uses a two-phase approach: prefetching recommendations and metadata during natural browsing patterns before events, then broadcasting low-cardinality state updates via WebSocket when events start. This architecture solves the thundering herd problem by distributing load over time and minimizing real-time compute requirements. The system leverages GraphQL schemas, Apache Kafka, and a two-tier pub/sub architecture to deliver updates in under a minute during peak load, while adaptive traffic prioritization and cache jitter prevent unexpected traffic spikes.
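
The two ideas in this summary, prefetching ahead of time and then broadcasting only a small state change, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `Device` class, the `jittered_ttl` helper, the state names, and the 20% jitter fraction are hypothetical and are not taken from Netflix's implementation. The sketch only shows why a broadcast carrying a tiny state token needs no follow-up fetch when the payloads are already on the device, and how randomized cache lifetimes keep devices from refreshing in lockstep.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, Optional


def jittered_ttl(base_seconds: float, jitter_fraction: float, rng: random.Random) -> float:
    """Return a cache lifetime spread uniformly around base_seconds.

    Hypothetical helper: randomizing expiry means devices that cached at the
    same moment do not all come back to refresh at the same moment.
    """
    spread = base_seconds * jitter_fraction
    return base_seconds + rng.uniform(-spread, spread)


@dataclass
class Device:
    """Hypothetical client holding recommendations prefetched before the event."""
    device_id: int
    # event_id -> {state name -> payload to render for that state}
    prefetched: Dict[str, Dict[str, str]] = field(default_factory=dict)
    shown: Optional[str] = None

    def prefetch(self, event_id: str, payloads: Dict[str, str]) -> None:
        # Phase 1: done during ordinary browsing, long before the event starts.
        self.prefetched[event_id] = payloads

    def on_broadcast(self, event_id: str, state: str) -> bool:
        # Phase 2: the broadcast carries only a small state token.
        # The payload is looked up locally, so no request goes to the backend.
        payloads = self.prefetched.get(event_id)
        if payloads is None or state not in payloads:
            return False  # nothing prefetched; a real client would need a fallback
        self.shown = payloads[state]
        return True


if __name__ == "__main__":
    rng = random.Random(0)
    device = Device(device_id=1)
    device.prefetch("event-1", {"pre": "Starts soon", "live": "Watch live"})
    print(jittered_ttl(300, 0.2, rng))          # somewhere between 240 and 360
    print(device.on_broadcast("event-1", "live"), device.shown)
```

The design point is that the expensive work (computing and delivering per-member recommendations) happens gradually before the event, while the time-critical message is identical for every device and therefore cheap to fan out.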

9 min read time. From netflixtechblog.com
Table of contents
Why are Live Events Different?
Orchestrating the moment: Real-time Recommendations
Under the Hood: How It Works
Balancing the Moment: Throughput Management
Looking Ahead
Join Us for What’s Next
