Manticore Search 13.2.3 introduces vector quantization, which compresses vectors from 32-bit floats to 8-bit or 1-bit representations and cuts RAM usage by 4x to 32x while maintaining search performance. The feature includes asymmetric quantization for better accuracy, plus oversampling and rescoring options that recover full-precision results. The benchmarks show 2.2x faster indexing and up to 2x higher search throughput at high concurrency, with a 90% memory reduction.
Table of contents
What vector quantization is
Enabling VQ
Why oversampling + rescoring matters
Benchmarks
Conclusions
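
The 4x and 32x figures in the summary follow directly from the bit widths: a 32-bit float shrinks to 8 bits (4x) or to 1 bit (32x). The sketch below is a generic illustration of that arithmetic and of the oversample-then-rescore idea, not Manticore's implementation. All function names, the min/max scaling scheme, the sign-based 1-bit encoding, and the `oversampling` parameter are assumptions made for the example.

```python
import random

def quantize_int8(vec):
    """Scalar quantization: map each float onto one of 256 levels (1 byte each)."""
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for a constant vector
    codes = bytes(min(255, round((x - lo) / scale)) for x in vec)
    return codes, lo, scale

def dequantize_int8(codes, lo, scale):
    """Approximate reconstruction; the error per component is at most scale / 2."""
    return [lo + c * scale for c in codes]

def quantize_1bit(vec):
    """Binary quantization: keep only the sign of each component, 8 per byte."""
    out = bytearray((len(vec) + 7) // 8)
    for i, x in enumerate(vec):
        if x > 0:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)

def hamming(a, b):
    """Number of differing bits between two packed codes."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def search(query, vectors, codes, k, oversampling=3):
    """Fetch k * oversampling candidates using the cheap 1-bit codes,
    then rescore only those candidates with the full-precision vectors."""
    q_code = quantize_1bit(query)
    by_code = sorted(range(len(codes)), key=lambda i: hamming(q_code, codes[i]))
    candidates = by_code[: k * oversampling]
    return sorted(candidates, key=lambda i: squared_l2(query, vectors[i]))[:k]

if __name__ == "__main__":
    random.seed(0)
    dim = 128
    vectors = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(1000)]
    codes = [quantize_1bit(v) for v in vectors]

    float32_bytes = dim * 4
    print("float32:", float32_bytes, "bytes per vector")
    print("8-bit ratio:", float32_bytes // len(quantize_int8(vectors[0])[0]))
    print("1-bit ratio:", float32_bytes // len(codes[0]))
    print("top-5 for vector 7:", search(vectors[7], vectors, codes, k=5))
```

In this sketch the query is quantized the same way as the stored vectors. An asymmetric scheme would instead compare a higher-precision query against the compressed vectors, which is one reason it can be more accurate.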