Glossary

Gaussian Distribution

Also known as: normal distribution

The Gaussian (or normal) distribution is arguably the most important continuous probability distribution in science and engineering. A univariate Gaussian with mean $\mu$ and variance $\sigma^2$ has density

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$
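As a minimal sketch, the density above can be evaluated directly with the Python standard library. The function name `gaussian_pdf` and its default arguments are illustrative choices, not part of any particular library, and the function is parameterized by the standard deviation $\sigma$ rather than the variance.

```python
import math

def gaussian_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of a univariate Gaussian with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma  # distance from the mean in units of sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Density at the mean of a standard Gaussian, equal to 1 / sqrt(2 * pi).
peak = gaussian_pdf(0.0)
```

The peak height scales as $1/\sigma$, so a wider Gaussian is proportionally flatter.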

Its characteristic bell shape is symmetric about the mean: roughly 68% of its probability mass lies within one standard deviation of the mean, roughly 95% within two, and roughly 99.7% within three.
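These percentages follow from the Gaussian cumulative distribution function: for any $\mu$ and $\sigma$, the mass within $k$ standard deviations of the mean is $\operatorname{erf}(k/\sqrt{2})$. A short check using only the standard library is sketched below; the helper name `mass_within` is an illustrative choice.

```python
import math

def mass_within(k: float) -> float:
    """P(|X - mu| <= k * sigma) for a Gaussian X, equal to erf(k / sqrt(2))."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"within {k} sd: {mass_within(k):.4f}")
# within 1 sd: 0.6827
# within 2 sd: 0.9545
# within 3 sd: 0.9973
```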

The Gaussian's prominence stems from the Central Limit Theorem, which states that the suitably centered and scaled sum of many independent random variables with finite variance converges in distribution to a Gaussian, regardless of the variables' original shapes (the classical statement also assumes the variables are identically distributed). Since many real-world quantities, such as measurement errors, biological traits, and aggregated financial returns, arise as sums of numerous small effects, the Gaussian is a good approximation in a wide range of settings.
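The effect can be illustrated with a small simulation. The sketch below assumes sums of 30 independent Uniform(0, 1) draws, a seed of 0, and 20,000 repetitions, all arbitrary choices made for illustration. Each sum is centered by its mean $n/2$ and scaled by its standard deviation $\sqrt{n/12}$, and the fraction of standardized sums falling within one standard deviation is then compared with the Gaussian value of about 0.68.

```python
import math
import random

def standardized_sum(n: int, rng: random.Random) -> float:
    """Sum n Uniform(0, 1) draws, then center and scale the sum to mean 0 and variance 1."""
    total = sum(rng.random() for _ in range(n))
    mean = n * 0.5             # mean of the sum of n Uniform(0, 1) variables
    std = math.sqrt(n / 12.0)  # standard deviation of that sum
    return (total - mean) / std

rng = random.Random(0)  # fixed seed so the run is reproducible
samples = [standardized_sum(30, rng) for _ in range(20000)]

# Fraction of standardized sums within one standard deviation of zero.
frac_within_one = sum(abs(z) <= 1.0 for z in samples) / len(samples)
sample_mean = sum(samples) / len(samples)
print(f"fraction within 1 sd: {frac_within_one:.3f}")
```

Although each summand is flat rather than bell-shaped, the printed fraction should land close to the Gaussian value.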

In machine learning, Gaussian assumptions underlie linear regression, Gaussian processes, variational autoencoders, and the noise models in diffusion-based generative models. The multivariate Gaussian, with mean vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$, has density involving the inverse and determinant of $\boldsymbol{\Sigma}$, connecting directly to the linear algebra of eigenvectors and eigenvalues. Gaussian mixture models combine several Gaussians to approximate complex multimodal densities and are the basis of classical clustering and generative classifiers.
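For reference, the multivariate density mentioned above can be written out explicitly. Here $d$ denotes the dimension of $\mathbf{x}$ (a symbol introduced for this illustration), and $\boldsymbol{\Sigma}$ is assumed to be symmetric positive definite so that its inverse exists:

$$f(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^d \det \boldsymbol{\Sigma}}} \exp\left(-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^\top \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})\right)$$

With $d = 1$ and $\boldsymbol{\Sigma} = \sigma^2$ this reduces to the univariate density. The eigenvectors of $\boldsymbol{\Sigma}$ give the principal axes of the elliptical contours of constant density, and $\det \boldsymbol{\Sigma}$ equals the product of its eigenvalues.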

Related terms: Probability Distribution, Central Limit Theorem, Variance, Expectation

Also defined in: Textbook of AI