Text embeddings, which represent words and documents as numeric vectors, are useful across many applications. Vector databases such as faiss or Pinecone are the typical choice for storing and querying embeddings, but for smaller projects, simpler approaches built on numpy and file formats like Parquet can be more efficient. Parquet files, in combination with
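To make the numpy-only approach concrete, here is a minimal sketch of brute-force similarity search over an in-memory embedding matrix. It is not taken from the post. The random data, the dimensions, and the helper name `top_k_similar` are assumptions chosen for illustration, and real embeddings would come from an embedding model.

```python
import numpy as np

# Hypothetical data: 1,000 random 128-dimensional vectors standing in
# for real embeddings (an assumption for this sketch).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 128)).astype(np.float32)

# Normalize each row to unit length so that a dot product between two
# rows equals their cosine similarity.
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)


def top_k_similar(query, matrix, k=5):
    """Return (indices, scores) of the k rows of `matrix` most similar to `query`.

    Hypothetical helper; assumes `query` and the rows of `matrix` are unit vectors.
    """
    # One matrix-vector product scores the query against every row.
    scores = matrix @ query
    # argpartition selects the k largest scores without sorting everything;
    # only those k are then sorted in descending order.
    idx = np.argpartition(scores, -k)[-k:]
    idx = idx[np.argsort(scores[idx])[::-1]]
    return idx, scores[idx]


# Use one stored embedding as the query; it should match itself best.
query = embeddings[42]
indices, scores = top_k_similar(query, embeddings, k=5)
```

For a matrix that fits in memory, the whole search is a single matrix-vector product, which is why this kind of approach can be sufficient for smaller projects before a dedicated vector database is needed.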

14m read time. From minimaxir.com
Table of contents
The Worst Ways to Store Embeddings
The Intended-But-Not-Great Way to Store Embeddings
What are Parquet files?
How do you use Parquet files in Python for embeddings?
The Power of polars
Scaling to Vector Databases
