Visualisation

Joint, marginal, and conditional distributions

Last reviewed 5 May 2026

A joint distribution lives over two axes. Marginalise to one axis, condition on a slice.

From the chapter: Chapter 4: Probability

Glossary: joint distribution, marginal distribution, conditional probability

Transcript

Two random variables, height and weight. Plot them on a grid. Each cell holds a probability. Together they form the joint distribution.

To get the marginal distribution of height alone, sum the joint across each row. The probability of being one metre eighty, regardless of weight, is the sum across that whole horizontal strip.

To get the marginal of weight alone, sum down each column instead.

Marginalisation throws information away on purpose. It says: I no longer care about the second variable, give me the first one only.
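The row and column sums can be sketched in Python with NumPy. This is a minimal illustration, assuming a made-up 3 by 3 grid with heights on the rows and weights on the columns. The numbers and the names `heights_cm`, `weights_kg` and `joint` are hypothetical and do not come from the visualisation.

```python
import numpy as np

# Hypothetical grid: rows are heights, columns are weights.
heights_cm = [160, 170, 180]
weights_kg = [60, 70, 80]

# Each cell holds P(height, weight). Illustrative values that sum to 1.
joint = np.array([
    [0.10, 0.05, 0.01],   # height 160
    [0.08, 0.20, 0.10],   # height 170
    [0.02, 0.14, 0.30],   # height 180
])

# Marginal of height: add up each horizontal strip (collapse the weight axis).
marginal_height = joint.sum(axis=1)   # approx [0.16, 0.38, 0.46]

# Marginal of weight: add up each vertical strip (collapse the height axis).
marginal_weight = joint.sum(axis=0)   # approx [0.20, 0.39, 0.41]

print(dict(zip(heights_cm, marginal_height.round(2))))
print(dict(zip(weights_kg, marginal_weight.round(2))))
```

Each marginal sums to one on its own, but neither can rebuild the grid: many different joints share the same pair of marginals.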

Conditioning is a different move. We slice the grid at a fixed value of one variable, say weight equals seventy kilograms, and renormalise. Now we are looking at the conditional distribution of height given weight equals seventy.
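Slicing and renormalising can be sketched the same way. This is an illustrative example only, assuming a hypothetical 3 by 3 grid with heights on the rows and weights on the columns. The values are invented for the sketch.

```python
import numpy as np

# Hypothetical grid: rows are heights, columns are weights.
heights_cm = [160, 170, 180]
weights_kg = [60, 70, 80]
joint = np.array([
    [0.10, 0.05, 0.01],
    [0.08, 0.20, 0.10],
    [0.02, 0.14, 0.30],
])

# Slice the grid at weight = 70: take that one column.
col = weights_kg.index(70)
strip = joint[:, col]                 # [0.05, 0.20, 0.14], sums to 0.39

# Renormalise so the slice sums to 1. The divisor is the marginal P(weight = 70).
height_given_70 = strip / strip.sum()

print(dict(zip(heights_cm, height_given_70.round(3))))
```

The raw slice sums to 0.39, not one, which is why the renormalising step is needed before the strip can be read as a distribution.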

Marginal and conditional answer different questions. Marginals tell us what to expect when we know nothing about the other variable. Conditionals tell us what to expect when we know its value already.

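The definition of conditional probability links them. The conditional of A given B equals the joint of A and B, divided by the marginal of B. Bayes' theorem follows by rewriting the joint as the conditional of B given A times the marginal of A. Three pictures, one identity. The whole machinery of probabilistic reasoning sits on this little grid.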
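The identity can be checked numerically on a small grid. This is a sketch under the same assumptions as before: a hypothetical 3 by 3 joint with heights on the rows and weights on the columns, and invented values. It compares the conditional computed as joint over marginal with the rearranged form P(A|B) = P(B|A) P(A) / P(B).

```python
import numpy as np

# Hypothetical joint: rows are heights (A), columns are weights (B).
joint = np.array([
    [0.10, 0.05, 0.01],
    [0.08, 0.20, 0.10],
    [0.02, 0.14, 0.30],
])

p_height = joint.sum(axis=1)          # marginal of A, one value per row
p_weight = joint.sum(axis=0)          # marginal of B, one value per column

# P(height | weight): divide every column by its column total.
height_given_weight = joint / p_weight

# P(weight | height): divide every row by its row total.
weight_given_height = joint / p_height[:, None]

# Rearranged form: P(A|B) = P(B|A) * P(A) / P(B).
via_bayes = weight_given_height * p_height[:, None] / p_weight

print(np.allclose(via_bayes, height_given_weight))   # True
```

Both routes give the same table because multiplying P(B|A) by P(A) simply rebuilds the joint.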

This site is currently in Beta. Contact: Chris Paton


AI tools used: Claude (research, coding, text), ChatGPT (diagrams, images), Grammarly (editing).