An introduction to nearest neighbor methods and vector models, covering how objects can be represented as vectors in high-dimensional spaces and why similarity search matters. It uses MNIST handwritten digits and a food image classifier (built with deep convolutional neural networks) as concrete examples. It introduces Annoy, an approximate nearest neighbor library built at Spotify, and shows it running roughly 300x faster than exhaustive search on word2vec embeddings while returning nearly identical results. It also touches on collaborative filtering applications at Spotify that power Discover Weekly recommendations.