Glossary

Causal Inference

Causal inference is the set of methods for estimating cause-and-effect relationships from data and for distinguishing causation from mere correlation. Classical statistics describes joint distributions of observed variables; causal inference asks how those distributions would change if we intervened.

Two main frameworks, formally equivalent but different in notation and emphasis:

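Potential outcomes (Rubin 1974): for each unit $i$ and treatment $t$, posit a potential outcome $Y_i(t)$, the outcome unit $i$ would show under treatment $t$. For a binary treatment, the average treatment effect is $\mathrm{ATE} = \mathbb{E}[Y(1) - Y(0)]$. The fundamental problem of causal inference: only one potential outcome is observed per unit, namely the one corresponding to the treatment that unit actually received.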
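The definition above can be illustrated with a small simulation. This is a minimal sketch under assumed numbers (a mean effect of 2.0 and standard normal baselines, neither taken from the text): in a simulation both potential outcomes are visible, so the true ATE can be compared with the difference in means from a randomised assignment that reveals only one outcome per unit.

```python
import random

random.seed(0)
n = 10_000

# Each unit has two potential outcomes; the effect sizes are illustrative assumptions.
units = []
for _ in range(n):
    y0 = random.gauss(0.0, 1.0)          # outcome if untreated
    y1 = y0 + random.gauss(2.0, 0.5)     # outcome if treated (heterogeneous effect, mean 2.0)
    units.append((y0, y1))

# The true ATE needs both potential outcomes, which only a simulation provides.
true_ate = sum(y1 - y0 for y0, y1 in units) / n

# Randomised assignment: each unit reveals exactly one potential outcome.
treated, control = [], []
for y0, y1 in units:
    if random.random() < 0.5:
        treated.append(y1)
    else:
        control.append(y0)

# Difference in observed means estimates the ATE under randomisation.
estimated_ate = sum(treated) / len(treated) - sum(control) / len(control)

print(f"true ATE      = {true_ate:.3f}")
print(f"estimated ATE = {estimated_ate:.3f}")
```

The two numbers should agree up to sampling noise, even though no unit ever contributes both $Y(1)$ and $Y(0)$ to the estimate.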

Structural causal models (Pearl 2000): a directed acyclic graph (DAG) of variables in which each variable is a function of its parents and an exogenous noise term. The do-operator $\mathrm{do}(X = x)$ represents intervention: it replaces the structural equation for $X$ with the constant $x$. Conditioning is not intervening, so in general $P(y | \mathrm{do}(x)) \neq P(y | x)$, and Bayes' theorem alone cannot answer interventional queries. The do-calculus provides three rules for translating $P(y | \mathrm{do}(x))$ into expressions over observational distributions, when that is possible.
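The gap between conditioning and intervening can be sketched with a toy structural causal model. Everything here is an assumption made for illustration: the graph $Z \to X$, $Z \to Y$, $X \to Y$, the binary variables, and the probabilities. The `do_x` argument is a hypothetical name for overriding the structural equation of $X$.

```python
import random


def sample(rng, do_x=None):
    """Draw one unit from a toy SCM with graph Z -> X, Z -> Y, X -> Y."""
    z = rng.random() < 0.5                        # exogenous confounder
    if do_x is None:
        x = rng.random() < (0.8 if z else 0.2)    # X listens to its parent Z
    else:
        x = do_x                                  # do(X = x): the equation for X is replaced
    y = rng.random() < 0.1 + 0.3 * x + 0.5 * z    # Y depends on both X and Z
    return z, x, y


rng = random.Random(0)
n = 100_000

# Observational regime: condition on X = 1.
obs = [sample(rng) for _ in range(n)]
y_when_x1 = [y for _, x, y in obs if x]
p_obs = sum(y_when_x1) / len(y_when_x1)

# Interventional regime: force X = 1 for every unit.
intv = [sample(rng, do_x=True) for _ in range(n)]
p_do = sum(y for _, _, y in intv) / n

print(f"P(Y=1 | X=1)     ~ {p_obs:.3f}")
print(f"P(Y=1 | do(X=1)) ~ {p_do:.3f}")
```

Under these assumed parameters the analytic values are 0.8 for the conditional and 0.65 for the interventional probability. The conditional is larger because observing $X = 1$ also makes $Z = 1$ more likely, whereas forcing $X = 1$ leaves $Z$ untouched.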

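Identification: when can $P(y | \mathrm{do}(x))$ be computed from observational data alone? The backdoor and frontdoor criteria are sufficient graphical conditions. Instrumental variables and regression discontinuity are identification strategies for other settings; they rely on additional assumptions and typically identify a local effect rather than the full interventional distribution.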
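As a worked example of identification, when a set of variables $Z$ satisfies the backdoor criterion the interventional distribution is given by the adjustment formula $P(y | \mathrm{do}(x)) = \sum_z P(y | x, z)\, P(z)$. The sketch below evaluates that formula exactly on an assumed toy model with a single binary confounder (graph $Z \to X$, $Z \to Y$, $X \to Y$); all probabilities and helper names are hypothetical. Only the observational joint distribution is used.

```python
from itertools import product

# Assumed toy parameters, used only to build an observational joint P(z, x, y).
p_z1 = 0.5
p_x1_given_z = {0: 0.2, 1: 0.8}


def p_y1_given_xz(x, z):
    return 0.1 + 0.3 * x + 0.5 * z


joint = {}
for z, x, y in product((0, 1), repeat=3):
    pz = p_z1 if z else 1 - p_z1
    px = p_x1_given_z[z] if x else 1 - p_x1_given_z[z]
    py = p_y1_given_xz(x, z) if y else 1 - p_y1_given_xz(x, z)
    joint[(z, x, y)] = pz * px * py


def prob(pred):
    """Probability of the event described by pred(z, x, y) under the joint."""
    return sum(p for (z, x, y), p in joint.items() if pred(z, x, y))


def p_y1_given_x(x0):
    """Plain conditioning: P(Y=1 | X=x0)."""
    return prob(lambda z, x, y: x == x0 and y == 1) / prob(lambda z, x, y: x == x0)


def p_y1_do_x(x0):
    """Backdoor adjustment: sum over z of P(Y=1 | x0, z) * P(z)."""
    total = 0.0
    for z0 in (0, 1):
        p_y_given_xz = (prob(lambda z, x, y: z == z0 and x == x0 and y == 1)
                        / prob(lambda z, x, y: z == z0 and x == x0))
        total += p_y_given_xz * prob(lambda z, x, y: z == z0)
    return total


naive_diff = p_y1_given_x(1) - p_y1_given_x(0)
adjusted_diff = p_y1_do_x(1) - p_y1_do_x(0)
print(f"naive difference    = {naive_diff:.2f}")
print(f"adjusted difference = {adjusted_diff:.2f}")
```

With these assumed numbers the naive difference is 0.6 while the adjusted difference recovers 0.3, the coefficient on $X$ in the model for $Y$.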

Counterfactuals: $P(Y_x | X = x', Y = y')$, "what would $Y$ have been if $X$ had been $x$, given that we observed $X = x'$ and $Y = y'$"? Strictly stronger than intervention; the third rung of Pearl's "ladder of causation" (association → intervention → counterfactual).
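In a fully specified structural model, a counterfactual query is commonly computed in three steps: abduction (infer the noise from the evidence), action (set $X$ to the counterfactual value) and prediction (recompute $Y$). The sketch below assumes a deliberately simple linear equation $Y := \beta X + U_Y$ with $\beta = 2$; the equation, the coefficient and the function names are illustrative assumptions.

```python
def f_y(x, u_y, beta=2.0):
    """Assumed structural equation for Y: Y := beta * X + U_Y."""
    return beta * x + u_y


def counterfactual_y(x_obs, y_obs, x_cf, beta=2.0):
    """What Y would have been under X = x_cf, given that (x_obs, y_obs) was observed."""
    # Abduction: recover the unit's noise term from the evidence.
    u_y = y_obs - beta * x_obs
    # Action and prediction: set X to x_cf and re-evaluate Y with the same noise.
    return f_y(x_cf, u_y, beta)


# Observed X = 1 and Y = 3, so U_Y = 1; had X been 0, Y would have been 1.
y_cf = counterfactual_y(x_obs=1.0, y_obs=3.0, x_cf=0.0)
print(y_cf)
```

Because the noise term is pinned down by the observed unit, the answer is specific to that unit, which is what separates a counterfactual from a population-level interventional query.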

Modern relevance:

  • Causal inference for ML: out-of-distribution generalisation, fairness, transportability all benefit from causal framing.
  • Causal representation learning (Schölkopf et al.): learn representations whose components correspond to disentangled causal variables.
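  • A/B testing: randomised experiments, the gold standard for estimating causal effects in tech companies. Randomisation makes treatment assignment independent of confounders, so it removes confounding in expectation.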
  • Health, social science, policy: where randomised trials are infeasible, observational causal inference (matching, propensity scores, IV) is the workhorse.
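To make the propensity-score idea from the last bullet concrete, here is a minimal sketch of inverse propensity weighting on simulated observational data. The data-generating process (one binary confounder, a true effect of 1.0) is an assumption, and the propensity score is taken as known by construction; in real observational studies it has to be estimated, which this sketch does not attempt.

```python
import random

rng = random.Random(0)
n = 100_000

rows = []
for _ in range(n):
    z = 1 if rng.random() < 0.5 else 0            # confounder
    e = 0.8 if z else 0.2                          # propensity P(X=1 | Z=z), assumed known
    x = 1 if rng.random() < e else 0               # treatment uptake depends on Z
    y = 1.0 * x + 2.0 * z + rng.gauss(0.0, 1.0)    # assumed true effect of X on Y is 1.0
    rows.append((x, y, e))

# Naive comparison of treated and untreated units is confounded by Z.
treated = [y for x, y, _ in rows if x == 1]
control = [y for x, y, _ in rows if x == 0]
naive = sum(treated) / len(treated) - sum(control) / len(control)

# Inverse propensity weighting: reweight each unit by 1 / P(received its own treatment).
ipw = sum(x * y / e - (1 - x) * y / (1 - e) for x, y, e in rows) / n

print(f"naive difference = {naive:.2f}")
print(f"IPW estimate     = {ipw:.2f}")
```

Under these assumptions the naive difference should land near 2.2, while the weighted estimate should be close to the true effect of 1.0.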

Judea Pearl received the 2011 Turing Award for his work on probabilistic and causal reasoning, which includes the do-calculus.

Related terms: Judea Pearl, Bayesian Inference, Bayes' Theorem

This site is currently in Beta. Contact: Chris Paton

AI tools used: Claude (research, coding, text), ChatGPT (diagrams, images), Grammarly (editing).