An Autoencoder is a neural network trained to reconstruct its input through a bottleneck. It consists of an encoder $f_\theta$ mapping an input $\mathbf{x}$ to a lower-dimensional latent representation $\mathbf{z} = f_\theta(\mathbf{x})$, and a decoder $g_\phi$ mapping $\mathbf{z}$ back to a reconstruction $\hat{\mathbf{x}} = g_\phi(\mathbf{z})$. Training minimises a reconstruction loss, typically mean squared error or cross-entropy. Because the latent is smaller than the input, the autoencoder must learn to compress the data into its most salient features—a nonlinear generalisation of PCA.
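The encoder/decoder structure and the reconstruction objective can be sketched with a minimal linear autoencoder trained by gradient descent; all names, dimensions, and the toy data below are illustrative, not from the original text. With a linear encoder and decoder this recovers the PCA-like behaviour mentioned above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 points in 3-D lying near a 1-D line, so a 1-dimensional
# bottleneck can reconstruct them almost perfectly.
t = rng.normal(size=(200, 1))
X = t @ np.array([[1.0, 2.0, -1.0]]) + 0.01 * rng.normal(size=(200, 3))

d_in, d_latent = 3, 1
W_enc = rng.normal(scale=0.1, size=(d_in, d_latent))   # encoder f_theta
W_dec = rng.normal(scale=0.1, size=(d_latent, d_in))   # decoder g_phi

lr = 0.01
for _ in range(2000):
    Z = X @ W_enc                      # latent code z = f_theta(x)
    X_hat = Z @ W_dec                  # reconstruction x_hat = g_phi(z)
    err = X_hat - X                    # gradient of 0.5 * MSE w.r.t. X_hat
    W_dec -= lr * (Z.T @ err) / len(X)
    W_enc -= lr * (X.T @ (err @ W_dec.T)) / len(X)

mse = np.mean((X @ W_enc @ W_dec - X) ** 2)
print(f"final reconstruction MSE: {mse:.5f}")
```

Because the bottleneck has only one dimension, the network is forced to find the single direction that explains most of the data, which is exactly the compression behaviour described above.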
Standard autoencoders learn useful representations but are poor generative models: their latent space is irregular, so random samples may decode to nonsense. The Variational Autoencoder (VAE) addresses this by imposing a probabilistic structure: the encoder outputs parameters of a distribution (typically Gaussian), and the decoder learns to reconstruct from samples drawn from it. The loss combines reconstruction error with a KL divergence term that regularises the latent distribution toward a standard Gaussian prior. The reparameterisation trick ($\mathbf{z} = \mu + \sigma \odot \epsilon$, $\epsilon \sim \mathcal{N}(0, I)$) enables gradients to flow through the sampling step.
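The reparameterisation trick and the closed-form Gaussian KL term can be shown in a few lines; the `mu` and `log_var` values below are made-up stand-ins for what an encoder would output:

```python
import numpy as np

rng = np.random.default_rng(1)
mu = np.array([0.5, -1.0])       # encoder mean output (illustrative)
log_var = np.array([0.0, -2.0])  # encoder log sigma^2 output (illustrative)
sigma = np.exp(0.5 * log_var)

# Reparameterisation trick: z = mu + sigma * eps with eps ~ N(0, I).
# The randomness enters as an input, so gradients flow through mu and sigma.
eps = rng.normal(size=mu.shape)
z = mu + sigma * eps

# KL divergence between N(mu, sigma^2) and the standard Gaussian prior,
# per dimension: -0.5 * (1 + log sigma^2 - mu^2 - sigma^2)
kl = -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var))
print(f"z = {z}, KL = {kl:.5f}")
```

Predicting `log_var` rather than `sigma` directly is a common convention because it keeps the variance positive without a constraint.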
Autoencoders are used for dimensionality reduction, denoising, anomaly detection (high reconstruction error signals an outlier), feature learning, and generative modelling. VQ-VAE replaces the continuous latent with a discrete codebook and has become an important building block in modern text-to-image systems; latent diffusion models such as Stable Diffusion run the diffusion process in an autoencoder's compressed latent space (in Stable Diffusion's case a KL-regularised VAE rather than a VQ-VAE) to dramatically reduce compute.
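The anomaly-detection use can be illustrated directly. For brevity the "autoencoder" below is a fixed projection onto the 1-D subspace the normal data is assumed to occupy; a trained encoder/decoder pair plays the same role. All vectors here are invented for the example:

```python
import numpy as np

# Direction of the "normal" data subspace (assumed for illustration).
v = np.array([1.0, 2.0, -1.0])
v = v / np.linalg.norm(v)

def recon_error(x):
    x_hat = (x @ v) * v            # encode to one number, decode back
    return float(np.sum((x - x_hat) ** 2))

inlier = 0.7 * v                       # lies on the subspace: low error
outlier = np.array([2.0, -1.0, 3.0])   # far from it: high error

print(recon_error(inlier), recon_error(outlier))
```

A threshold on reconstruction error then separates normal points from outliers: points the model has learned to compress reconstruct well, while points unlike the training data do not.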
Related terms: Variational Autoencoder, Dimensionality Reduction, Principal Component Analysis
Discussed in:
- Chapter 14: Generative Models — Autoencoders
Also defined in: Textbook of AI