Also known as: PCA
Principal Component Analysis (PCA) is one of the most widely used dimensionality reduction techniques in statistics and machine learning. It finds orthogonal directions—the principal components—along which the data's variance is maximised. The first principal component captures the direction of greatest variance; the second captures the most variance orthogonal to the first; and so on. Projecting the data onto the first $k$ principal components yields a lower-dimensional representation that preserves as much variance as possible.
Mathematically, PCA computes the eigendecomposition of the data's covariance matrix $C = \frac{1}{n} X^T X$, where $X$ is the centred data matrix. The eigenvectors are the principal components; the eigenvalues quantify the variance captured by each. In practice, the covariance matrix is rarely formed explicitly—instead, the singular value decomposition of the centred data matrix yields the same principal components more stably and efficiently: the right singular vectors are the components, and the squared singular values divided by $n$ are the corresponding variances. Randomised SVD extends this to very large matrices.
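The equivalence between the two routes can be checked numerically. Below is a minimal NumPy sketch (synthetic data; the $\frac{1}{n}$ normalisation follows the covariance formula above) that computes the principal components both ways and projects onto the first $k$:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))  # correlated features
Xc = X - X.mean(axis=0)                                   # centre the data
n = len(Xc)

# Route 1: eigendecomposition of the covariance matrix C = (1/n) X^T X
C = Xc.T @ Xc / n
eigvals, eigvecs = np.linalg.eigh(C)                      # ascending order
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]        # sort descending

# Route 2: SVD of the centred data matrix
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Squared singular values over n equal the eigenvalues of C
assert np.allclose(s**2 / n, eigvals)

# Project onto the first k principal components
k = 2
Z = Xc @ Vt[:k].T                                         # shape (200, 2)
```

Note that eigenvectors are only defined up to sign, so `eigvecs` and the rows of `Vt` may differ by a factor of $-1$; the variances (eigenvalues) agree exactly.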
PCA is used for data visualisation (project to 2 or 3 dimensions and plot), noise reduction (truncate low-variance components), feature extraction, and compression. The "eigenfaces" method uses PCA to represent faces as linear combinations of principal components. In genomics, PCA reveals population structure. PCA is a linear method and cannot discover nonlinear structure; Kernel PCA, t-SNE, UMAP, and autoencoders are nonlinear alternatives. PCA is sensitive to feature scaling, so standardising features first is standard practice.
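The sensitivity to feature scaling is easy to demonstrate: if one feature has a much larger numeric scale, it dominates the first component regardless of how informative it is. A small illustrative sketch (synthetic data; `first_pc` is a hypothetical helper defined here for the example):

```python
import numpy as np

rng = np.random.default_rng(1)
# Two independent features; the second is on a scale ~1000x larger
X = np.column_stack([rng.normal(size=300),
                     rng.normal(size=300) * 1000])

def first_pc(X):
    """Return the first principal component (unit vector) of X."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[0]

# Without scaling, the first PC points almost entirely along
# the large-scale feature.
print(first_pc(X))

# Standardise (zero mean, unit variance) so both features contribute.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
print(first_pc(Xs))
```

After standardisation the two features have equal variance, so neither dominates purely through its units of measurement.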
Discussed in:
- Chapter 8: Unsupervised Learning — Principal Component Analysis
Also defined in: Textbook of AI, Textbook of Medical AI, Textbook of Medical Statistics