How we scaled raw GROUP BY to 100 B+ rows in under a second
ClickHouse Cloud introduces parallel replicas, enabling GROUP BY queries to scale across thousands of cores automatically. The feature processes 100 billion rows in under half a second without pre-aggregation or data reshuffling. By distributing work across multiple nodes using partial aggregation states, queries achieve
Table of contents
Infinite horizontal query scaling
Why GROUP BY is the beating heart of analytics
GROUP BY at the speed of ClickHouse
How ClickHouse makes GROUP BY fast
How we measured
Scaling GROUP BY vertically (with more cores per node)
Scaling GROUP BY horizontally (with parallel replicas)
When GROUP BY gets heavy
Scaling limits and safeguards
How parallel replicas distribute work
GROUP BY at cloud scale
From one node to the cloud