chDB 4 launches with a new DataStore API that lets data scientists write familiar Pandas-style code while the ClickHouse engine executes it. Its four-layer architecture records operations lazily, compiles them into optimized ClickHouse SQL with filter pushdown and column pruning, and automatically splits a pipeline between the ClickHouse and Pandas engines when it encounters Pandas-only operations. Key features include zero-copy data exchange via Python's buffer protocol (memoryview), automatic caching of intermediate results, support for multiple data sources (local files, S3, PostgreSQL), and a row-ordering compatibility mode. chDB 4 is also natively integrated into Hex notebooks, where it requires no local installation. Migrating existing Pandas code requires only a single import change.
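
To make the lazy-recording idea concrete, here is a minimal sketch of how recorded operations might be compiled into a single SQL statement with the filter and the column list pushed into the query. The `LazyTable` class, its methods, and the `events` table are hypothetical names invented for illustration; this is not the chDB DataStore API, only a toy model of the technique described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyTable:
    """Toy lazy table (hypothetical; NOT the chDB DataStore API)."""
    source: str
    columns: tuple = ()  # empty tuple means "all columns"
    filters: tuple = ()  # SQL predicate strings, recorded but not yet run

    def select(self, *cols):
        # Record the projection; nothing is read from the source yet.
        return LazyTable(self.source, tuple(cols), self.filters)

    def where(self, predicate):
        # Record the filter; nothing is executed yet.
        return LazyTable(self.source, self.columns, self.filters + (predicate,))

    def to_sql(self):
        # Compile every recorded operation into one query, so the engine
        # reads only the needed columns (pruning) and rows (pushdown).
        cols = ", ".join(self.columns) if self.columns else "*"
        sql = f"SELECT {cols} FROM {self.source}"
        if self.filters:
            sql += " WHERE " + " AND ".join(f"({p})" for p in self.filters)
        return sql


# Operations are chained Pandas-style, then compiled once at the end.
plan = LazyTable("events").where("amount > 100").select("user_id", "amount")
sql = plan.to_sql()
print(sql)
```

Because each call only returns a new plan object, the order in which `where` and `select` are written does not force an intermediate result to be materialized; the work is deferred until the compiled query is handed to an engine.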

12m read time. From clickhouse.com
Table of contents
The problem with eager execution
How Data Store works
Low-overhead DataFrame exchange
Unified data sources
Smart caching for interactive workflows
chDB 4 on Hex
