Visualisation

Correlation measures linear association

Last reviewed 5 May 2026

Pearson's r runs from minus one through zero to plus one. Visualise scatterplots at each value.

From the chapter: Chapter 4: Probability

Glossary: correlation, covariance

Transcript

Two variables, X and Y. We want a number that summarises how they move together.

The covariance is the average of a product of deviations: X minus its mean, times Y minus its mean. It is positive when the variables move together, negative when one rises as the other falls.
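Written out, the average described here can be sketched as follows. The notation is an assumption, not part of the transcript: n paired observations x_i and y_i, with x-bar and y-bar for the two means.

```latex
\operatorname{cov}(X, Y) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
```

Some texts divide by n minus one instead of n. That choice cancels out once the covariance is divided by standard deviations computed the same way.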

But covariance has units: the units of X times the units of Y. Rescale either variable and the covariance rescales with it. To compare across pairs, divide by the product of the two standard deviations. The result is Pearson's correlation coefficient, r, which has no units and always lies between minus one and plus one.
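A minimal sketch of that calculation, using only the Python standard library. The function names and the height and weight figures are made up for illustration. It changes the units of one variable to show the covariance moving while r stays put.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Average of (x - mean_x) * (y - mean_y) over all pairs."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def pearson_r(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    sd_x = sqrt(covariance(xs, xs))  # variance is the covariance of x with itself
    sd_y = sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)

# Illustrative data only, not taken from the source.
heights_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weights_kg = [52.0, 58.0, 69.0, 75.0, 88.0]
heights_cm = [h * 100 for h in heights_m]  # same heights, different units

cov_m = covariance(heights_m, weights_kg)
cov_cm = covariance(heights_cm, weights_kg)
r_m = pearson_r(heights_m, weights_kg)
r_cm = pearson_r(heights_cm, weights_kg)

print(f"covariance (m, kg):  {cov_m:.2f}")
print(f"covariance (cm, kg): {cov_cm:.2f}")
print(f"r in metres:         {r_m:.4f}")
print(f"r in centimetres:    {r_cm:.4f}")
```

Switching from metres to centimetres multiplies the covariance by one hundred, while the two values of r agree up to rounding.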

When r equals plus one, the relationship is perfectly positive. Every data point sits exactly on a line that rises to the right.

At r equals zero, no linear pattern remains. The cloud may be round, or it may have a non-linear shape that this measure cannot see.

With r equals minus one, the relationship is perfectly negative. The points fall exactly along a line that drops to the right.

In between, the points spread out around the line, more widely as r moves toward zero.
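The three cases can be checked on small hand-made data sets. This is a sketch with invented numbers: two exact lines and one rising series with some scatter.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from sums of squared and cross deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5, 6]
rising = [2 * x + 1 for x in xs]    # exact line rising to the right
falling = [10 - 3 * x for x in xs]  # exact line dropping to the right
scattered = [3, 4, 8, 8, 12, 12]    # rising overall, but not on a line

r_rising = pearson_r(xs, rising)
r_falling = pearson_r(xs, falling)
r_scattered = pearson_r(xs, scattered)

print(f"exact rising line:   r = {r_rising:.3f}")
print(f"exact falling line:  r = {r_falling:.3f}")
print(f"rising with scatter: r = {r_scattered:.3f}")
```

The steepness of the two exact lines differs, yet both reach the extreme values. r reflects how tightly the points hug a line and which way it tilts, not how steep it is.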

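Crucial caveats. r measures only linear association. A clean parabola that is symmetric about the mean of X has correlation zero, even though Y is completely determined by X. A non-linear monotonic shape has a correlation that is not zero but falls short of plus or minus one, by an amount that depends on the curvature.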
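Both caveats can be seen on a few integer points. This is a sketch under chosen assumptions: X values placed symmetrically around zero, with x squared standing in for the parabola and x cubed and x to the fifth standing in for monotonic curves of increasing curvature.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from sums of squared and cross deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [-3, -2, -1, 0, 1, 2, 3]  # symmetric about zero

parabola = [x ** 2 for x in xs]  # Y fully determined by X, but not linearly
cubic = [x ** 3 for x in xs]     # monotonic, mildly curved
quintic = [x ** 5 for x in xs]   # monotonic, more strongly curved

r_parabola = pearson_r(xs, parabola)
r_cubic = pearson_r(xs, cubic)
r_quintic = pearson_r(xs, quintic)

print(f"parabola: r = {r_parabola:.3f}")
print(f"cubic:    r = {r_cubic:.3f}")
print(f"quintic:  r = {r_quintic:.3f}")
```

The parabola gives zero here because the positive and negative deviations of X cancel exactly. Both monotonic curves give a positive r below one, and the more sharply curved one scores lower.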

And correlation is not causation. Two variables, say ice cream sales and sunburn cases, can be correlated through a hidden cause, summer. Always look at the scatter plot, not just the number.
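A small simulation can illustrate the hidden-cause effect. Everything in it is an assumption made for the sketch: daily temperature stands in for summer, the second variable is a hypothetical count of sunburn cases, and the coefficients and noise levels are arbitrary. Neither series influences the other; both depend only on temperature.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from sums of squared and cross deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

rng = random.Random(42)  # fixed seed so the run is repeatable

temps, ice_cream, sunburn = [], [], []
for _ in range(500):
    t = rng.uniform(10, 30)                     # the hidden cause
    temps.append(t)
    ice_cream.append(5 * t + rng.gauss(0, 10))  # driven by temperature only
    sunburn.append(2 * t + rng.gauss(0, 4))     # driven by temperature only

r_all = pearson_r(ice_cream, sunburn)

# Hold the hidden cause roughly fixed: keep only days between 19 and 21 degrees.
band = [i for i, t in enumerate(temps) if 19 <= t <= 21]
r_band = pearson_r([ice_cream[i] for i in band], [sunburn[i] for i in band])

print(f"all days:         r = {r_all:.2f}")
print(f"19 to 21 degrees: r = {r_band:.2f} ({len(band)} days)")
```

Under these assumptions the correlation across all days should come out strong, and it should shrink markedly once temperature is held nearly constant.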

This site is currently in Beta. Contact: Chris Paton

AI tools used: Claude (research, coding, text), ChatGPT (diagrams, images), Grammarly (editing).