Erik Bernhardsson, author of Spotify's Annoy library, presents updated benchmarks for approximate nearest neighbor (ANN) search algorithms. The ANN-benchmarks project now features Dockerized algorithms and pre-computed datasets for reproducible comparisons. Across multiple datasets (GloVe, SIFT, Fashion-MNIST, GIST), HNSW from NMSLIB consistently ranks first — over 10x faster than Annoy — followed by KGraph, SW-graph, FAISS-IVF, and Annoy. Graph-based algorithms dominate, while LSH-based approaches like FALCONN have regressed. The author calls for all future ANN research papers to benchmark against these standard libraries.

4m read time. From erikbern.com