Annoy's index-building algorithm now splits each point set by sampling two points and using the hyperplane equidistant from them, rather than picking a random hyperplane. The change makes index building 4x faster for Euclidean distance and substantially improves search precision, especially for high-dimensional data that lies on a lower-dimensional manifold. The new algorithm is also simpler, so the change is a net reduction in code. Version 1.3.1 is available on PyPI and GitHub.
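
The two-point split described above can be illustrated with a minimal pure-Python sketch. This is not Annoy's actual implementation (which is written in C++). The function names `two_point_split`, `margin`, and `partition` are hypothetical, and the sketch assumes Euclidean distance and points given as equal-length lists of floats.

```python
import random


def two_point_split(points, rng=random):
    """Sample two entries of `points` and return (normal, offset) for the
    hyperplane equidistant from them: dot(normal, x) + offset == 0."""
    a, b = rng.sample(points, 2)
    # The normal points from b towards a.
    normal = [ai - bi for ai, bi in zip(a, b)]
    # The plane passes through the midpoint of the two samples.
    midpoint = [(ai + bi) / 2 for ai, bi in zip(a, b)]
    offset = -sum(n * m for n, m in zip(normal, midpoint))
    return normal, offset


def margin(point, normal, offset):
    """Signed value that is positive on a's side and negative on b's side."""
    return sum(n * p for n, p in zip(normal, point)) + offset


def partition(points, normal, offset):
    """Split `points` into the two sides of the hyperplane."""
    left, right = [], []
    for p in points:
        (left if margin(p, normal, offset) > 0 else right).append(p)
    return left, right


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data: 100 random points in 8 dimensions.
    points = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(100)]
    normal, offset = two_point_split(points, rng)
    left, right = partition(points, normal, offset)
    print(len(left), len(right))
```

Because the hyperplane is derived from two actual data points, it always passes through the region the data occupies, which is one plausible reading of why the split helps when the data lies on a lower-dimensional manifold. If the two sampled points happen to be identical, the normal is zero and every point lands on one side, so a real implementation would need to handle that case.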

2m read time. From erikbern.com