Uber runs an exabyte-scale ETL system, scaling its data processing with Apache Spark and a custom framework called Sparkle. The revamped architecture emphasizes modularity, reliability, and observability to handle the large volumes of data its services generate. Key tools in its ETL processes include Apache Spark, dbt, Apache Airflow, AWS Glue, and Google Cloud Dataflow.