DuckDB now supports reading and writing Parquet Bloom filters, compact index structures that let a query skip data that cannot contain the values it is looking for. The feature is transparent to users and significantly improves query performance, especially with large Parquet files or slow network connections. Bloom filters are supported for various data types, including integers, floating-point numbers, and strings, but not yet for nested types.
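
To make the idea concrete, here is a minimal Python sketch of how a per-row-group Bloom filter lets a reader skip data. It is a toy illustration, not DuckDB's or Parquet's implementation: the Parquet format specifies a split-block filter based on xxHash, whereas this sketch uses salted SHA-256 for simplicity. The names `ToyBloomFilter` and `row_groups_to_read` are hypothetical and exist only for this example.

```python
import hashlib


class ToyBloomFilter:
    """Toy Bloom filter (assumption: not the Parquet split-block layout)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int serves as the bit array

    def _positions(self, value):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value!r}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def row_groups_to_read(filters, needle):
    """Return indexes of row groups whose filter cannot rule out the needle."""
    return [i for i, f in enumerate(filters) if f.might_contain(needle)]


# Three hypothetical row groups, each with its own filter.
row_groups = [[1, 5, 9], [100, 200, 300], [1000, 2000, 3000]]
filters = []
for values in row_groups:
    bloom = ToyBloomFilter()
    for v in values:
        bloom.add(v)
    filters.append(bloom)

candidates = row_groups_to_read(filters, 200)
print(candidates)  # always includes 1; other groups appear only on false positives
```

A Bloom filter never produces false negatives, so the row group that really holds the value is always read. It can produce false positives, so a reader may occasionally fetch a row group that turns out not to contain the value.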

9 min read. From duckdb.org
Table of contents: Parquet Bloom Filters, DuckDB Bloom Filters, Example Use Case, Conclusion
