Towards Data Science is a community-powered publication that showcases work in data science, machine learning and artificial intelligence. Every day newcomers, seasoned researchers and industry practitioners publish tutorials, research notes and real-world case studies that help the field move forward.

A detailed empirical comparison of two vector database storage optimization techniques: quantization (scalar int8, binary 1-bit, product) and Matryoshka Representation Learning (MRL). Using FAISS HNSW with the HotpotQA dataset and a 384-dimensional MRL-capable embedding model, the author measures storage savings and retrieval quality (Recall@10, MRR@10) across all combinations. Key findings: scalar int8 quantization alone cuts storage 63.7% with only ~1.5% recall loss; combining 256-dimensional MRL with scalar quantization achieves 70.8% savings with ~4.6% recall loss; binary quantization delivers extreme compression but causes severe accuracy degradation. The recommended sweet spot for most production RAG systems is MRL (256d) + scalar quantization, while binary quantization should only be used with re-ranking to compensate for quality loss.
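To make the two techniques in the summary concrete, here is a minimal NumPy-only sketch of MRL truncation (384 to 256 dimensions), 8-bit scalar quantization, and 1-bit binary quantization. It is an illustration under stated assumptions, not the author's pipeline: the embeddings are random stand-ins rather than HotpotQA vectors, the helper names (`truncate_mrl`, `quantize_8bit`, `dequantize`, `quantize_binary`) are hypothetical, and the per-dimension min/max scheme is one common way to do scalar quantization, not necessarily what FAISS does internally. It counts raw vector bytes only, so its ratios will not match the index-level savings reported above.

```python
import numpy as np


def truncate_mrl(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and re-normalize each row to unit length."""
    truncated = embeddings[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / np.maximum(norms, 1e-12)


def quantize_8bit(embeddings: np.ndarray):
    """Per-dimension min/max scalar quantization to 8-bit codes (assumed scheme)."""
    lo = embeddings.min(axis=0)
    hi = embeddings.max(axis=0)
    scale = np.maximum(hi - lo, 1e-12) / 255.0
    codes = np.round((embeddings - lo) / scale).astype(np.uint8)
    return codes, lo, scale


def dequantize(codes: np.ndarray, lo: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Map 8-bit codes back to approximate float32 values."""
    return codes.astype(np.float32) * scale + lo


def quantize_binary(embeddings: np.ndarray) -> np.ndarray:
    """Keep one sign bit per dimension, packed eight dimensions per byte."""
    return np.packbits(embeddings > 0, axis=1)


# Random unit vectors standing in for 384-dimensional model embeddings.
rng = np.random.default_rng(0)
full = rng.standard_normal((1000, 384)).astype(np.float32)
full /= np.linalg.norm(full, axis=1, keepdims=True)

mrl = truncate_mrl(full, 256)            # MRL: 384d -> 256d
codes, lo, scale = quantize_8bit(mrl)    # MRL + scalar quantization
bits = quantize_binary(full)             # binary quantization of the full vector

# Raw payload per vector in bytes (excludes any index or graph overhead).
for name, arr in [("float32 384d", full), ("float32 256d", mrl),
                  ("8-bit 256d", codes), ("binary 384d", bits)]:
    print(f"{name}: {arr.nbytes // len(arr)} bytes per vector")

# Worst-case reconstruction error of the 8-bit codes.
print("max abs error:", float(np.abs(dequantize(codes, lo, scale) - mrl).max()))
```

In this sketch the raw payload per vector goes from 1536 bytes (float32, 384d) to 256 bytes for 256d with 8-bit codes, and to 48 bytes for binary codes. Measured savings on a real HNSW index will differ from these raw ratios, and the sketch says nothing about Recall@10 or MRR@10, which have to be evaluated on real queries.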

Scaling Vector Search: Comparing Quantization and Matryoshka Embeddings for 80% Cost Reduction