Uber's real-time data infrastructure processes petabytes of data daily, supporting features like customer incentives and fraud detection. The system relies on Apache Kafka for streaming data, Apache Flink for stream processing, and Apache Pinot for real-time OLAP. Key requirements include consistency, availability, data freshness, scalability, and cost efficiency. Customizations and tools like FlinkSQL and uReplicator enhance reliability and performance. Together, these components enable real-time decisions such as dynamic pricing, as well as operational insights. Scaling strategies, including Active-Active and Active-Passive Kafka setups, provide high availability and fault tolerance.
Table of contents
Critical Requirements of Uber’s Real-Time Data
Key Technologies Used By Uber
Use Cases
Scaling Strategies
Key Lessons
Conclusion