Datadog built Monocle, a custom time-series database in Rust, to handle billions of metrics per second. The system uses Kafka for data distribution and replication, separates metadata storage from time-series data, and employs a thread-per-core architecture with LSM-tree storage. Key optimizations include arena allocators, time-based file pruning, and cost-based query scheduling. The platform splits storage into real-time (24 hours) and long-term systems, with the real-time database handling 99% of queries. Future plans include dynamic load balancing and merging separate databases into a unified columnar format.
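
As a rough illustration of the time-based file pruning mentioned above, the sketch below skips any file whose time range cannot overlap the query window. It is not Monocle's actual code: `FileMeta`, its fields, and `prune_by_time` are hypothetical names, and it assumes each immutable file in the LSM-tree records the minimum and maximum timestamp of the samples it holds.

```rust
/// Hypothetical metadata for one immutable on-disk file in an LSM-style store.
/// Assumption: every file records the time range of the samples it contains.
#[derive(Debug, Clone, PartialEq)]
struct FileMeta {
    name: String,
    min_ts: u64, // earliest sample timestamp in the file (seconds)
    max_ts: u64, // latest sample timestamp in the file (seconds)
}

/// Return only the files whose time range overlaps the inclusive query
/// window [query_start, query_end]. Files entirely before or after the
/// window are skipped without being opened.
fn prune_by_time(files: &[FileMeta], query_start: u64, query_end: u64) -> Vec<&FileMeta> {
    files
        .iter()
        .filter(|f| f.min_ts <= query_end && f.max_ts >= query_start)
        .collect()
}
```

With four files covering seconds 0-99, 100-199, 200-299, and 300-399, a query over 150 to 250 would read only the second and third. The check touches metadata only, so its cost grows with the number of files rather than the volume of stored data.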