DuckDB is an in-process analytical database, optimized for OLAP workloads, that can query local Parquet and Delta Lake files directly without loading them fully into memory. Unlike pandas, DuckDB uses columnar vectorized execution and predicate pushdown to read only the necessary row groups and columns, making it significantly

5m read time. From bartwullems.blogspot.com
Table of contents
What is DuckDB?
Installation
Querying local Parquet files
Only read what you need
Querying Delta Lake tables
Using a persistent connection for multiple queries
Mixing DuckDB with Pandas
When to reach for DuckDB?
Wrapping up
More information
