An introduction to nearest neighbor methods and vector models, covering how objects can be represented as vectors in high-dimensional spaces and why similarity search matters. Uses MNIST handwritten digits and a food image classifier (built with deep convolutional neural networks) as concrete examples. Introduces Annoy, a Spotify-built approximate nearest neighbor library, demonstrating it runs ~300x faster than brute-force exhaustive search on word2vec embeddings while returning nearly identical results. Also touches on collaborative filtering applications at Spotify powering Discover Weekly recommendations.
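
As a rough illustration of the brute-force baseline mentioned above, here is a minimal pure-Python sketch of exhaustive nearest neighbor search using cosine distance. The function names, the three-dimensional toy vectors, and the labels are all hypothetical and chosen for illustration; they are not taken from the post, and real word2vec embeddings have hundreds of dimensions.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def brute_force_nearest(query, vectors, k=2):
    """Score every stored vector against the query and return the k closest labels."""
    ranked = sorted(vectors, key=lambda label: cosine_distance(query, vectors[label]))
    return ranked[:k]


# Hypothetical toy "embeddings", made up for this example.
toy_vectors = {
    "pizza": [0.9, 0.1, 0.0],
    "pasta": [0.8, 0.2, 0.1],
    "guitar": [0.0, 0.1, 0.9],
}

# The query is the "pizza" vector itself, so it is its own closest match.
nearest = brute_force_nearest(toy_vectors["pizza"], toy_vectors, k=2)
print(nearest)
```

Every query here touches every stored vector, so the cost grows linearly with the size of the collection. An approximate library such as Annoy builds an index ahead of time to avoid that full scan, accepting that it may occasionally miss a true nearest neighbor.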

9m read time. From erikbern.com