This blog post covers several advanced DuckDB features and performance optimization techniques, focusing on convenient ways to handle table operations and on speeding up the processing of Parquet and CSV files. Using the Dutch railway services dataset, it demonstrates excluding and renaming columns with pattern matching, loading data with globbing, reordering Parquet files, and using Hive partitioning to significantly speed up queries.

8 min read · From duckdb.org
Table of contents

- Overview
- Dataset
- Exclude Columns with Pattern Matching
- Rename Columns with Pattern Matching
- Loading with Globbing
- Reordering Parquet Files
- Hive Partitioning
- Closing Thoughts
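As a rough preview of the techniques listed above, the following SQL sketches show the general shape of each feature in DuckDB. The table name `services` and the column names are hypothetical stand-ins for the post's Dutch railway services dataset, not the article's actual schema.

```sql
-- Exclude columns by name (table/column names are illustrative):
SELECT * EXCLUDE (station_code) FROM services;

-- Select all columns whose names match a regular expression:
SELECT COLUMNS('station.*') FROM services;

-- Rename matched columns using a regex capture group in the alias:
SELECT COLUMNS('station_(.*)') AS 'st_\1' FROM services;

-- Load many CSV files at once with globbing:
SELECT * FROM read_csv_auto('services-*.csv');

-- Read a Hive-partitioned Parquet directory; filters on partition
-- columns can then prune entire directories:
SELECT *
FROM read_parquet('services/*/*.parquet', hive_partitioning = true);
```

See the full post for the concrete schema, the reordering workflow, and the measured speedups.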
