ClickHouse has built an internal data warehouse that handles 50 TB of data daily and draws on multiple internal sources such as AWS, GCP, and Salesforce. The team uses Airflow for scheduling, AWS S3 as the intermediate data layer, and Superset as the BI tool. Key properties include data consistency, idempotency, and real-time analytics. The team has also adopted dbt to streamline data transformations and introduced new tools that improve user access to data.