A developer built two music discovery tools, Music Galaxy (an interactive 3D visualization of 70,000+ artists) and Artist Averager, using graph embeddings generated with node2vec from Spotify's related artists API. The post covers the full pipeline: scraping Spotify's artist relationship graph, tuning the node2vec hyperparameters p and q, experimenting with embedding dimensionality (generating 4D embeddings, then projecting them to 3D with PCA), and validating the results. Key findings: a high p (~2.5) and a low q (~0.5) produced the best cluster structure, and reducing 4D embeddings to 3D with PCA outperformed generating 3D embeddings directly. The tools work without genre labels, relying purely on listener behavior data.
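
To make the roles of p and q concrete, here is a minimal sketch of the biased second-order random walk that node2vec uses to sample a graph before training embeddings. It is not the post's code: the function names, the toy graph, and the artist labels "a" through "e" are all hypothetical, and a real pipeline would use a node2vec library on the scraped artist graph. The weighting rule follows the standard node2vec definition: stepping back to the previous node is weighted 1/p, stepping to a node that is also a neighbor of the previous node is weighted 1, and stepping farther away is weighted 1/q.

```python
import random


def transition_weights(graph, prev, cur, p, q):
    """Unnormalized node2vec weights for leaving `cur`, having arrived from `prev`."""
    weights = {}
    for nxt in sorted(graph[cur]):
        if nxt == prev:
            weights[nxt] = 1.0 / p  # return to the node we just came from
        elif nxt in graph[prev]:
            weights[nxt] = 1.0      # stay in the shared neighborhood of prev
        else:
            weights[nxt] = 1.0 / q  # move outward, away from prev
    return weights


def biased_walk(graph, start, length, p, q, rng):
    """Sample one walk of `length` nodes; the first step is uniform."""
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbors = sorted(graph[cur])
        if not neighbors:
            break  # dead end: stop the walk early
        if len(walk) == 1:
            walk.append(rng.choice(neighbors))
            continue
        w = transition_weights(graph, walk[-2], cur, p, q)
        walk.append(rng.choices(neighbors, weights=[w[n] for n in neighbors])[0])
    return walk


# Hypothetical toy graph (undirected adjacency sets), NOT Spotify data.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b", "e"},
    "e": {"d"},
}

# Values close to the ones the summary reports as working best.
example_weights = transition_weights(toy_graph, prev="a", cur="b", p=2.5, q=0.5)
example_walk = biased_walk(toy_graph, "a", length=8, p=2.5, q=0.5, rng=random.Random(0))
```

With p = 2.5 and q = 0.5, a walk that arrived at "b" from "a" weights the outward step to "d" five times as heavily as the step back to "a" (2.0 versus 0.4), so walks tend to travel away from where they started rather than backtrack.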

17m read time. From cprimozic.net
Table of contents
What I Built
About Embeddings
Spotify Artist Relationship Data
Building the Artist Relationship Embedding
Results + Observations
Conclusion
