Dimensionality Reduction projects high-dimensional data onto lower-dimensional representations, facilitating visualisation, compression, noise removal, and downstream learning. The motivation is both practical (reducing memory and compute) and theoretical (the curse of dimensionality makes high-dimensional learning difficult). The manifold hypothesis posits that natural data, though represented in high-dimensional spaces, actually lies on or near lower-dimensional manifolds—dimensionality reduction attempts to uncover these.
Linear methods include PCA (Principal Component Analysis), which projects onto directions of maximum variance; LDA (Linear Discriminant Analysis), which maximises class separation; and Factor Analysis, which models observations as linear combinations of latent factors plus noise. PCA is the workhorse: it is simple and fast, and among all linear projections of a given dimension it is optimal at preserving variance (equivalently, minimising squared reconstruction error).
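As a minimal sketch of the linear case, the following uses scikit-learn's `PCA` (assuming scikit-learn and NumPy are available) on synthetic data that is nearly rank-2, so two components recover almost all of the variance:

```python
import numpy as np
from sklearn.decomposition import PCA

# Toy data: 100 samples in 5 dimensions that lie near a 2-D subspace
# (a rank-2 signal plus a small amount of isotropic noise).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2)) @ rng.normal(size=(2, 5)) \
    + 0.01 * rng.normal(size=(100, 5))

# Project onto the two directions of maximum variance.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                      # (100, 2)
# Fraction of total variance captured by the two components;
# close to 1.0 here because the data is nearly planar.
print(pca.explained_variance_ratio_.sum())
```

Inspecting `explained_variance_ratio_` is the usual way to choose the number of components in practice: keep enough components to reach a target such as 95% of the variance.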
Nonlinear methods capture more complex structure. t-SNE (t-distributed Stochastic Neighbour Embedding) preserves local neighbourhoods and excels at visualising clusters, but distorts global distances. UMAP (Uniform Manifold Approximation and Projection) offers similar visualisation quality with better global structure preservation and faster computation. Autoencoders learn nonlinear encoder-decoder networks whose bottleneck representations serve as reduced-dimensional codes. Kernel PCA applies the kernel trick to linear PCA, performing PCA in an implicit high-dimensional feature space without ever computing the feature map explicitly. The choice depends on the goal: PCA for speed and linearity, t-SNE/UMAP for visualisation, autoencoders for learned representations that generalise to new data.
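To illustrate why a nonlinear method can succeed where a linear one cannot, the sketch below (again assuming scikit-learn; the `gamma` value is an illustrative choice, not a tuned one) compares linear PCA and RBF-kernel PCA on two concentric circles, a classic dataset that no linear projection can disentangle:

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA

# Two concentric circles: the classes are not linearly separable in 2-D,
# and linear PCA merely rotates the data.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

lin = PCA(n_components=2).fit_transform(X)
ker = KernelPCA(n_components=2, kernel="rbf", gamma=10).fit_transform(X)

# In the RBF kernel PCA embedding, the inner and outer circles are
# typically pulled apart along the leading components, whereas the
# linear PCA embedding still has one class surrounding the other.
print(lin.shape, ker.shape)
```

The same pattern motivates the other nonlinear methods: each replaces the "straight-line projection" assumption of PCA with a more flexible notion of similarity, whether a kernel, a neighbourhood graph (t-SNE, UMAP), or a learned network (autoencoders).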
Related terms: Principal Component Analysis, Autoencoder, Curse of Dimensionality
Discussed in:
- Chapter 8: Unsupervised Learning — Principal Component Analysis
- Chapter 6: ML Fundamentals — Features & Representations
Also defined in: Textbook of AI