Lance is a modern columnar data format built in Rust and optimized for machine learning workflows. It offers 100x faster random access than Parquet while maintaining scan performance, includes built-in vector search and automatic versioning, and integrates with popular data tools such as Pandas, DuckDB, and Polars. The format targets the multi-stage ML development cycle: a single format serves data collection, analytics, feature engineering, and training, so data does not have to be converted between formats at each stage.

7m read time. From github.com
Table of contents
Quick Start
Directory structure
What makes Lance different
Benchmarks
Why are you building yet another data format?!
Community Highlights
Presentations, Blogs and Talks
