8.9 Kernel PCA

Linear PCA cannot capture non-linear structure, for example, points on a circle in $\mathbb{R}^2$. Kernel PCA performs PCA in a feature space $\mathcal{H}$ implicitly defined by a kernel.

8.9.1 The kernel trick for PCA

Suppose we map $\mathbf{x}\mapsto\phi(\mathbf{x})\in\mathcal{H}$ and run PCA on $\phi(\mathbf{x}^{(i)})$. We never compute $\phi$ explicitly; the inner products $K_{ij} = \phi(\mathbf{x}^{(i)})^{\top}\phi(\mathbf{x}^{(j)}) = k(\mathbf{x}^{(i)},\mathbf{x}^{(j)})$ are given by a kernel.

Centre the kernel: $\tilde K = (I - \tfrac{1}{n}\mathbf{1}\mathbf{1}^{\top}) K (I - \tfrac{1}{n}\mathbf{1}\mathbf{1}^{\top})$. Eigendecompose $\tilde K = U\Lambda U^{\top}$. The principal components in feature space are

$$ \mathbf{v}_\ell = \sum_{i=1}^{n}\alpha_{i\ell}\,\phi(\mathbf{x}^{(i)}),\quad \alpha_{i\ell} = \frac{u_{i\ell}}{\sqrt{\lambda_\ell}}. $$

To project a new $\mathbf{x}$:

$$ z_\ell(\mathbf{x}) = \mathbf{v}_\ell^{\top}\phi(\mathbf{x}) = \sum_i\alpha_{i\ell}\,k(\mathbf{x}^{(i)},\mathbf{x}). $$

Common kernels: polynomial $k(\mathbf{x},\mathbf{y})=(\mathbf{x}^{\top}\mathbf{y}+c)^p$, RBF $k(\mathbf{x},\mathbf{y})=\exp(-\lVert\mathbf{x}-\mathbf{y}\rVert^2/(2\sigma^2))$, Laplacian, Matérn, neural-network kernels.

The cost is $O(n^3)$ from the eigendecomposition of $K\in\mathbb{R}^{n\times n}$, which limits kernel PCA to a few tens of thousands of points unless we use Nyström approximation or random features.

from sklearn.decomposition import KernelPCA
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=0.5).fit_transform(X)
Example

Unrolling the swiss roll. The "swiss roll" data lies on a 2D manifold rolled into 3D. Linear PCA picks the two longest axes of the bounding ellipsoid and badly entangles points from opposite ends of the roll. Kernel PCA with an RBF kernel and $\gamma\approx 1/(\text{median}^2 \text{distance})$ separates them cleanly because nearby points on the manifold have high kernel similarity while geometrically close points across roll layers do not.

This site is currently in Beta. Contact: Chris Paton

Textbook of Usability · Textbook of Digital Health

Auckland Maths and Science Tutoring

AI tools used: Claude (research, coding, text), ChatGPT (diagrams, images), Grammarly (editing).