8.12 Manifold learning: Isomap, LLE, Laplacian eigenmaps
The manifold hypothesis holds that high-dimensional data of interest lies on or near a low-dimensional manifold embedded in the ambient space. Faces, handwritten digits, and gene-expression profiles all show this structure: they live in a vast ambient space yet have low intrinsic dimensionality.
Manifold learning algorithms estimate this structure non-parametrically.
8.12.1 Isomap
Tenenbaum et al.'s Isomap (2000) extends classical multidimensional scaling (MDS) to manifolds.
- Build a k-NN graph on the data, with edge weights equal to Euclidean distances.
- Compute shortest-path (geodesic) distances $d_G(i,j)$ via Floyd-Warshall or Dijkstra.
- Apply classical MDS to $\{d_G(i,j)\}$ to embed the data in $\mathbb{R}^d$, where $d$ is the target dimension.
Isomap recovers global geometry well when the manifold is smooth and densely sampled, but is brittle to noise (a single short-cut edge in the graph distorts every geodesic that uses it).
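A minimal sketch of these three steps in NumPy/SciPy/scikit-learn is given below; the function name `isomap` and the hyperparameters `n_neighbors` and `n_components` are illustrative choices, and the neighbourhood graph is assumed to be connected so that all geodesic distances are finite.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

def isomap(X, n_neighbors=10, n_components=2):
    # Step 1: k-NN graph with Euclidean edge weights, symmetrised to be undirected.
    knn = kneighbors_graph(X, n_neighbors, mode="distance")
    knn = knn.maximum(knn.T)
    # Step 2: geodesic distances as shortest paths through the graph (Dijkstra).
    # Assumes the graph is connected; otherwise some entries are infinite.
    d_g = shortest_path(knn, method="D", directed=False)
    # Step 3: classical MDS on the geodesic distance matrix.
    n = d_g.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centring matrix
    B = -0.5 * J @ (d_g ** 2) @ J                # double-centred squared distances
    eigvals, eigvecs = np.linalg.eigh(B)         # ascending eigenvalues
    top = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))
```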
8.12.2 Locally linear embedding (LLE)
Roweis and Saul's LLE (2000) preserves local linear reconstruction weights.
- For each $\mathbf{x}^{(i)}$, find its $k$ nearest neighbours and compute reconstruction weights $w_{ij}$ such that $\sum_j w_{ij}\mathbf{x}^{(j)}\approx\mathbf{x}^{(i)}$, $\sum_j w_{ij}=1$. Closed form via constrained least squares.
- Find low-dimensional embedding $\mathbf{y}^{(i)}$ minimising $\sum_i\lVert\mathbf{y}^{(i)}-\sum_j w_{ij}\mathbf{y}^{(j)}\rVert^2$ subject to $\frac{1}{n}\sum_i\mathbf{y}^{(i)}\mathbf{y}^{(i)\top}=I$. Closed form: bottom $d+1$ eigenvectors of $(I-W)^{\top}(I-W)$, excluding the trivial constant eigenvector.
LLE preserves local neighbourhood geometry, since the reconstruction weights are invariant to rotations, translations, and rescalings of each neighbourhood, but it is sensitive to the choice of $k$.
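The two LLE steps can be sketched in the same style. The small regularisation `reg` added to the local Gram matrix is a standard numerical safeguard rather than part of the algorithm as stated above, and the final scaling by $\sqrt{n}$ matches the covariance constraint on the $\mathbf{y}^{(i)}$.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.neighbors import NearestNeighbors

def lle(X, n_neighbors=10, n_components=2, reg=1e-3):
    n = X.shape[0]
    # Step 1: reconstruction weights from each point's k nearest neighbours.
    nbrs = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(X)
    idx = nbrs.kneighbors(X, return_distance=False)[:, 1:]  # drop the point itself
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[idx[i]] - X[i]                                 # neighbours centred on x^(i)
        G = Z @ Z.T                                          # local Gram matrix
        G += reg * np.trace(G) * np.eye(n_neighbors)         # regularise for stability
        w = np.linalg.solve(G, np.ones(n_neighbors))         # constrained least squares
        W[i, idx[i]] = w / w.sum()                           # enforce the sum-to-one constraint
    # Step 2: bottom eigenvectors of (I - W)^T (I - W), skipping the constant one.
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    _, eigvecs = eigh(M)                                     # ascending eigenvalues
    return eigvecs[:, 1:n_components + 1] * np.sqrt(n)       # (1/n) * Y^T Y = I
```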
8.12.3 Laplacian eigenmaps
Belkin and Niyogi's Laplacian eigenmaps (2003) treat embedding as a graph-Laplacian eigenproblem.
- Build a weighted k-NN graph with Gaussian edge weights $W_{ij}=\exp(-\lVert\mathbf{x}^{(i)}-\mathbf{x}^{(j)}\rVert^2/t)$.
- Solve the generalised eigenproblem $L\mathbf{f}=\lambda D\mathbf{f}$ for the bottom eigenvectors, where $D_{ii}=\sum_j W_{ij}$ is the degree matrix and $L=D-W$ is the graph Laplacian.
- Embed point $i$ as $(f_2(i),f_3(i),\dots,f_{d+1}(i))$, skipping the trivial constant eigenvector.
Laplacian eigenmaps minimise $\sum_{ij}W_{ij}\lVert\mathbf{y}^{(i)}-\mathbf{y}^{(j)}\rVert^2$ subject to the normalisation constraint $Y^{\top}DY=I$ (where the rows of $Y$ are the $\mathbf{y}^{(i)}$), encouraging connected points to remain close. They are closely related to spectral clustering and to UMAP's initialisation step.
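A corresponding sketch, using the unnormalised Laplacian $L=D-W$ and a dense weight matrix for clarity; the bandwidth `t` follows the heat-kernel weights above, and the graph is again assumed to be connected so that $D$ is invertible.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.neighbors import kneighbors_graph

def laplacian_eigenmaps(X, n_neighbors=10, n_components=2, t=1.0):
    # Weighted k-NN graph with Gaussian (heat-kernel) edge weights.
    dists = kneighbors_graph(X, n_neighbors, mode="distance").toarray()
    dists = np.maximum(dists, dists.T)                       # symmetrise
    W = np.where(dists > 0, np.exp(-dists ** 2 / t), 0.0)
    D = np.diag(W.sum(axis=1))                               # degree matrix
    L = D - W                                                # unnormalised graph Laplacian
    # Generalised eigenproblem L f = lambda D f; eigh returns ascending eigenvalues.
    _, eigvecs = eigh(L, D)
    # Skip the trivial constant eigenvector f_1 and keep f_2, ..., f_{d+1}.
    return eigvecs[:, 1:n_components + 1]
```

In practice, scikit-learn provides reference implementations of all three methods as `sklearn.manifold.Isomap`, `sklearn.manifold.LocallyLinearEmbedding`, and `sklearn.manifold.SpectralEmbedding`.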
These three methods (Isomap, LLE, and Laplacian eigenmaps) defined the early-2000s landscape of non-linear dimensionality reduction. They have been largely superseded by t-SNE and UMAP for visualisation and by autoencoders for representation learning, but they remain useful when interpretability of the embedding map is required.