Spectral clustering is a graph-based clustering method that outperforms K-means on non-linear, complex cluster structures like the two-moon dataset. It works by building a similarity matrix using a Gaussian kernel, constructing the graph Laplacian matrix, then performing eigendecomposition to find eigenvectors that embed data into a lower-dimensional space where clusters become linearly separable. K-means is then applied in this new spectral embedding space. The post walks through each step from scratch in Python using NumPy and scikit-learn, covering the similarity matrix, degree matrix, Laplacian construction, eigengap heuristic for choosing the number of clusters, and gamma hyperparameter tuning.

9m read timeFrom towardsdatascience.com
Post cover image
Table of contents
Motivation for Spectral ClusteringWhat is Spectral Clustering?Implementing Spectral Clustering — Step by StepChoosing the Right Value for GammaClosing Thoughts

Sort: