Visualisation

Gaussian mixture models fit clusters via EM

Last reviewed 5 May 2026

Soft assignments and re-fitted Gaussians alternate until they settle.

From Chapter 8: Unsupervised Learning

Glossary: Gaussian mixture model, EM algorithm

Transcript

A Gaussian mixture model assumes the data come from a weighted blend of k Gaussian distributions.
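In symbols, the assumed density is a weighted sum of Gaussian densities (standard notation, not from the transcript):

```latex
p(x) = \sum_{j=1}^{k} \pi_j \, \mathcal{N}(x \mid \mu_j, \Sigma_j),
\qquad \pi_j \ge 0, \quad \sum_{j=1}^{k} \pi_j = 1
```

Here \pi_j, \mu_j, and \Sigma_j are the weight, mean, and covariance of the j-th Gaussian.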

We don't know the parameters. We don't know which Gaussian generated each point.

The expectation-maximisation (EM) algorithm alternates between two steps. Start with random initial Gaussians.

Expectation step. For each data point, compute the probability that it came from each Gaussian. These are soft assignments.
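A minimal Python sketch of this step, assuming NumPy and SciPy; the function name and the arrays `weights`, `means`, and `covs` (the current parameters of the k Gaussians) are illustrative choices, not from the transcript:

```python
import numpy as np
from scipy.stats import multivariate_normal

def e_step(X, weights, means, covs):
    """Soft assignments: each Gaussian's responsibility for each point."""
    # Weighted density of every point under every Gaussian: shape (n, k).
    resp = np.column_stack([
        w * multivariate_normal.pdf(X, mean=m, cov=c)
        for w, m, c in zip(weights, means, covs)
    ])
    # Normalise rows so each point's responsibilities sum to 1.
    return resp / resp.sum(axis=1, keepdims=True)
```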

Maximisation step. Re-estimate each Gaussian's mean, covariance, and weight from the soft-assigned points. Each point contributes to every Gaussian in proportion to its responsibility from the expectation step.
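A matching sketch of the maximisation step, under the same illustrative naming; `resp` is the (n, k) responsibility matrix from the expectation step:

```python
import numpy as np

def m_step(X, resp):
    """Re-fit each Gaussian's weight, mean, and covariance."""
    n, d = X.shape
    nk = resp.sum(axis=0)               # effective number of points per Gaussian
    weights = nk / n                    # mixture weights
    means = (resp.T @ X) / nk[:, None]  # responsibility-weighted means
    covs = np.empty((resp.shape[1], d, d))
    for j in range(resp.shape[1]):
        diff = X - means[j]
        # Responsibility-weighted covariance, plus a small ridge for stability.
        covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + 1e-6 * np.eye(d)
    return weights, means, covs
```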

Iterate. The Gaussians migrate to the cluster centres. The assignments sharpen.
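Put together, the whole loop fits in a few lines. A self-contained sketch on made-up two-blob data (the data and iteration count are arbitrary illustrations):

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
# Toy data: two Gaussian blobs the loop should recover.
X = np.vstack([rng.normal(loc, 0.5, size=(100, 2)) for loc in ((0, 0), (3, 3))])
k, (n, d) = 2, X.shape

# Random initial Gaussians.
means = X[rng.choice(n, k, replace=False)]
covs = np.stack([np.eye(d)] * k)
weights = np.full(k, 1 / k)

for _ in range(50):
    # Expectation: soft assignments.
    resp = np.column_stack([w * multivariate_normal.pdf(X, m, c)
                            for w, m, c in zip(weights, means, covs)])
    resp /= resp.sum(axis=1, keepdims=True)
    # Maximisation: re-fit each Gaussian from the soft-assigned points.
    nk = resp.sum(axis=0)
    weights, means = nk / n, (resp.T @ X) / nk[:, None]
    covs = np.stack([(resp[:, j, None] * (X - means[j])).T @ (X - means[j]) / nk[j]
                     + 1e-6 * np.eye(d) for j in range(k)])

print(means.round(2))  # the means should migrate towards (0, 0) and (3, 3)
```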

Watch the ellipses representing each Gaussian. Initially they overlap. After a few iterations, they pull apart into distinct clusters.

K-means is a hard-assignment limit of GMM with spherical, equal-weight Gaussians. GMM is the soft, full-covariance generalisation.
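A tiny numerical illustration of that limit (the point and the means are made up): with spherical, equal-weight Gaussians of shared variance sigma squared, the responsibilities are a softmax over negative squared distances, and shrinking sigma hardens them into the k-means argmax:

```python
import numpy as np

x = np.array([0.9, 0.1])
means = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
sq_dists = ((x - means) ** 2).sum(axis=1)

for sigma in (1.0, 0.3, 0.05):
    # Responsibilities reduce to softmax(-||x - mu||^2 / (2 sigma^2)).
    logits = -sq_dists / (2 * sigma**2)
    resp = np.exp(logits - logits.max())
    resp /= resp.sum()
    print(sigma, resp.round(3))  # tends to one-hot as sigma shrinks
```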

GMMs handle overlapping clusters and elongated shapes that k-means cannot. They produce probability estimates over cluster membership, not just a label.
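In practice, scikit-learn's GaussianMixture exposes exactly these membership probabilities; a sketch on the same kind of toy data:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc, 0.6, size=(200, 2)) for loc in ((0, 0), (2.5, 2.5))])

gm = GaussianMixture(n_components=2, covariance_type="full", random_state=0).fit(X)
print(gm.predict(X)[:5])        # hard labels
print(gm.predict_proba(X)[:5])  # soft cluster-membership probabilities
```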

The downsides. EM is sensitive to initialisation, can get stuck in poor local optima, and requires choosing k in advance.
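Restarts and an information criterion are the usual mitigations. A sketch using scikit-learn, where n_init re-runs EM from several initialisations and the smallest BIC suggests a k (the three-blob data is an arbitrary illustration):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc, 0.5, size=(150, 2))
               for loc in ((0, 0), (3, 0), (0, 3))])

for k in range(1, 6):
    # n_init restarts guard against poor local optima; BIC trades
    # likelihood against model size across candidate values of k.
    gm = GaussianMixture(n_components=k, n_init=5, random_state=0).fit(X)
    print(k, round(gm.bic(X), 1))
```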

GMMs underpin speaker verification, image segmentation, and the latent-variable modelling that led to variational autoencoders.
