Vortex is a new open-source columnar file format designed to address Parquet's limitations through lightweight compression, late decompression, and compute expressions evaluated directly on compressed data. DuckDB now supports Vortex as a core extension through a partnership between SpiralDB and DuckDB Labs. In the reported TPC-H benchmarks, Vortex ran 18% faster than Parquet v2 and 35% faster than Parquet v1, with lower standard deviation across runs. The format supports heterogeneous compute patterns, multiple data modalities (vectors, text, images, audio), and layouts optimized to saturate CPUs and GPUs using encodings like ALP, FSST, and FastLanes.
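To make "compute on compressed data" concrete, here is a minimal Python sketch using frame-of-reference encoding, a generic lightweight scheme chosen only for illustration. It is not Vortex's implementation or API: the names `ForEncoded`, `for_encode`, `for_decode`, and `filter_greater_than` are hypothetical, and the sketch assumes a non-empty integer column. The idea is that a comparison can be rewritten into the encoded domain, so the filter runs over the small offsets and the full values are never materialized.

```python
from dataclasses import dataclass


@dataclass
class ForEncoded:
    """Frame-of-reference encoding: each value is stored as an offset from a base."""
    base: int
    offsets: list[int]


def for_encode(values: list[int]) -> ForEncoded:
    # Assumption: `values` is non-empty. The base is the column minimum,
    # so every stored offset is a small non-negative integer.
    base = min(values)
    return ForEncoded(base, [v - base for v in values])


def for_decode(enc: ForEncoded) -> list[int]:
    # Full decompression, shown only so the two paths can be compared.
    return [enc.base + o for o in enc.offsets]


def filter_greater_than(enc: ForEncoded, threshold: int) -> list[int]:
    # value > threshold  <=>  offset > threshold - base,
    # so the predicate is evaluated on the offsets without decoding.
    shifted = threshold - enc.base
    return [i for i, o in enumerate(enc.offsets) if o > shifted]


values = [1000, 1003, 1001, 1007, 1002]
enc = for_encode(values)
matches = filter_greater_than(enc, 1002)  # row indices whose value exceeds 1002
```

Only the rows selected by `matches` would then need decoding, which is the intuition behind late decompression. Real encodings such as ALP, FSST, and FastLanes are considerably more involved than this sketch.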