Glossary

Normalising Flow

Normalising Flows construct a generative model by learning an invertible transformation between a simple base distribution (typically a standard Gaussian) and the complex data distribution. If $f_\theta$ is invertible and maps $z \sim p_z(z)$ to $x = f_\theta(z)$, then the density of $x$ is given by the change-of-variables formula:

$$p_x(x) = p_z(f_\theta^{-1}(x)) \left|\det \frac{\partial f_\theta^{-1}}{\partial x}\right|$$
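The formula can be sanity-checked on the simplest invertible map, a 1-D affine flow $x = az + b$, whose pushforward of a standard Gaussian is $\mathcal{N}(b, a^2)$. A minimal sketch (the names `a`, `b`, and `flow_log_density` are illustrative, not from any library):

```python
import math

a, b = 2.0, 1.0  # scale and shift of the invertible map f(z) = a*z + b

def std_normal_logpdf(z):
    # log density of the base distribution p_z = N(0, 1)
    return -0.5 * z * z - 0.5 * math.log(2 * math.pi)

def flow_log_density(x):
    # Change of variables: log p_x(x) = log p_z(f^{-1}(x)) + log|det df^{-1}/dx|
    z = (x - b) / a                # inverse map f^{-1}(x)
    log_det = -math.log(abs(a))    # 1-D Jacobian of the inverse is 1/a
    return std_normal_logpdf(z) + log_det

def gauss_logpdf(x, mu, sigma):
    # analytic log density of N(mu, sigma^2), for comparison
    return (-0.5 * ((x - mu) / sigma) ** 2
            - math.log(sigma) - 0.5 * math.log(2 * math.pi))

# The flow density agrees with the known analytic answer N(b, a^2)
x = 0.7
assert abs(flow_log_density(x) - gauss_logpdf(x, b, a)) < 1e-12
```

The same two-term structure, base log-density plus log-determinant, carries over unchanged to high-dimensional flows; only the cost of the determinant changes.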

Unlike GANs (which define no density) or VAEs (which optimise only a lower bound), normalising flows provide exact log-likelihoods: a major advantage for density estimation, anomaly detection, and principled Bayesian inference.


The central design challenge is constructing transformations that are both expressive and have tractable Jacobian determinants. Computing a general $d \times d$ determinant costs $O(d^3)$, prohibitive in high dimensions. The trick is to use transformations with triangular Jacobians, whose determinant is simply the product of the diagonal entries. Coupling layers (NICE, RealNVP) split the input into two parts and transform one part conditioned on the other, leaving the conditioning part unchanged; this gives both an exact inverse and a triangular Jacobian by construction. Autoregressive flows (MAF, IAF) use causal conditioning, producing triangular Jacobians naturally. Continuous normalising flows instead define the transformation with a neural ODE, with the Hutchinson trace estimator providing scalable density computation.
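An affine coupling layer of the kind described above can be sketched in a few lines. Here the scale and translation "networks" are stood in for by fixed random linear maps (`Ws`, `Wt` are hypothetical placeholders); the point is that inversion is exact and the log-determinant is just the sum of the log-scales:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6  # even dimension; first half conditions, second half is transformed

# Stand-ins for the scale and translation nets of a RealNVP-style layer
Ws = rng.normal(size=(d // 2, d // 2)) * 0.1
Wt = rng.normal(size=(d // 2, d // 2)) * 0.1

def coupling_forward(z):
    z1, z2 = z[: d // 2], z[d // 2 :]
    s, t = Ws @ z1, Wt @ z1            # scale/shift depend only on z1
    x2 = z2 * np.exp(s) + t            # transformed half
    x = np.concatenate([z1, x2])       # conditioning half passes through unchanged
    log_det = s.sum()                  # triangular Jacobian: sum of log-scales
    return x, log_det

def coupling_inverse(x):
    x1, x2 = x[: d // 2], x[d // 2 :]
    s, t = Ws @ x1, Wt @ x1            # recomputable because x1 == z1
    z2 = (x2 - t) * np.exp(-s)         # exact inverse, no iteration needed
    return np.concatenate([x1, z2])

z = rng.normal(size=d)
x, log_det = coupling_forward(z)

# The analytic log-det matches a finite-difference Jacobian determinant
eps = 1e-6
J = np.array([(coupling_forward(z + eps * e)[0] - x) / eps
              for e in np.eye(d)]).T
assert np.isclose(np.log(abs(np.linalg.det(J))), log_det, atol=1e-4)
assert np.allclose(coupling_inverse(x), z)
```

In practice many such layers are stacked, with the roles of the two halves permuted between layers so that every coordinate is eventually transformed.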

Normalising flows excel where exact likelihoods matter: Bayesian posterior approximation, physics simulations of Boltzmann distributions, anomaly detection. For image generation, they typically lag behind GANs and diffusion models due to the expressiveness constraints of invertibility. Recent convergence with diffusion models via the continuous-flow perspective is blurring the boundaries between these generative paradigms.

Related terms: Generative Model, Variational Autoencoder, Diffusion Model

Also defined in: Textbook of AI