Netflix processes 5 petabytes of logs daily using ClickHouse, handling 10.6 million events per second with sub-second query performance. Three key optimizations enabled this scale: replacing regex-based log fingerprinting with generated lexers (8-10x faster), implementing custom native protocol serialization for efficient data ingestion, and sharding tag maps to reduce query times from 3 seconds to 700ms. The system combines ClickHouse for hot data with Apache Iceberg for long-term storage, making logs searchable within 20 seconds while serving 500-1,000 queries per second across 40,000+ microservices.

8 min read time. From clickhouse.com
Table of contents
Inside Netflix’s logging architecture
Optimization #1: Ingestion - Fingerprinting
Optimization #2: Hub - Serialization
Optimization #3: Queries - Custom tags
The beauty of simplicity