Airbnb developed a key-value store named Mussel to serve petabytes of derived data with high reliability, high availability, and low latency. Mussel's architecture combines sharding, Kafka for replication, and HRegion for unified storage of real-time and batch data. The system supports efficient bulk loading and reports over 99.9% availability and sub-8 millisecond read latency. Mussel overcame the limitations of the earlier HFileService and Nebula systems by automating shard management with Apache Helix and by using Spark for incremental bulk data loads.
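To make the sharding idea concrete, here is a minimal sketch of how a key might be routed to a shard and then to a storage node. It is an illustration under stated assumptions, not Mussel's actual code: the shard count, the choice of MD5 as the hash, the node names, and the functions `shard_for_key` and `node_for_key` are all hypothetical. In Mussel the shard-to-node assignment is managed by Apache Helix; the static dictionary below is only a stand-in for that mapping.

```python
import hashlib

# Hypothetical shard count, chosen only to keep the example small.
NUM_SHARDS = 8


def shard_for_key(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard by hashing it (assumed scheme, for illustration)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical static shard-to-node assignment. A cluster manager such as
# Apache Helix would maintain and rebalance this mapping in a real system.
SHARD_TO_NODE = {shard: f"node-{shard % 3}" for shard in range(NUM_SHARDS)}


def node_for_key(key: str) -> str:
    """Return the (hypothetical) storage node that owns the key's shard."""
    return SHARD_TO_NODE[shard_for_key(key)]


if __name__ == "__main__":
    for example_key in ("listing:42", "user:7", "booking:1001"):
        print(example_key, shard_for_key(example_key), node_for_key(example_key))
```

Because the key is first mapped to a shard rather than directly to a node, shards can be moved between nodes by changing only the assignment table, without rehashing any keys.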
Table of contents
Evolution of Derived Data Storage at Airbnb
Mussel Architecture
Adoption and Performance of Mussel
Conclusion