Bayes' Theorem is the cornerstone of probabilistic reasoning and arguably the single most important result in applied AI. For events $A$ and $B$ with $P(B) > 0$:
$$P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}$$
In the language of Bayesian inference, $P(A)$ is the prior (belief before observing data), $P(B \mid A)$ is the likelihood (probability of the data under hypothesis $A$), $P(B)$ is the evidence or marginal likelihood, and $P(A \mid B)$ is the posterior (updated belief after incorporating the data). The theorem provides a precise calculus for belief revision, optimal in a well-defined decision-theoretic sense.
Applied to model parameters, Bayes' theorem becomes $P(\theta \mid D) \propto P(D \mid \theta) \cdot P(\theta)$, where the omitted normalising constant is the evidence $P(D)$. Full Bayesian inference seeks the entire posterior distribution, enabling uncertainty quantification rather than just a point estimate. Exact posteriors are usually intractable, motivating approximate methods: Markov chain Monte Carlo (MCMC) draws samples from the posterior; variational inference optimises a tractable approximation; the Laplace approximation fits a Gaussian around the posterior mode.
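The proportionality $P(\theta \mid D) \propto P(D \mid \theta) \cdot P(\theta)$ can be made concrete with a minimal grid-approximation sketch. The setup here is purely illustrative: a coin with unknown bias $\theta$, a uniform prior, and hypothetical data of 7 heads in 10 flips.

```python
def grid_posterior(heads, flips, n_grid=1001):
    """Evaluate P(theta | D) on a grid: prior * likelihood, then normalise."""
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    # Uniform prior P(theta) = 1; Bernoulli likelihood theta^h * (1-theta)^(n-h)
    unnorm = [t**heads * (1 - t)**(flips - heads) for t in grid]
    evidence = sum(unnorm)  # discrete approximation of the marginal likelihood P(D)
    return grid, [u / evidence for u in unnorm]

grid, post = grid_posterior(heads=7, flips=10)
# Posterior mode: with a uniform prior it coincides with the MLE, 7/10 = 0.7
mode = grid[max(range(len(post)), key=post.__getitem__)]
```

Unlike a point estimate, the full vector `post` carries uncertainty: its spread narrows as more flips are observed, which is exactly what the approximate-inference methods above recover at scale.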
The classic medical-test example illustrates the subtlety: if a disease afflicts 1 in 1000 and a test has 99% sensitivity and 95% specificity, the probability that a positive test indicates disease is only about 2%, because the 5% false-positive rate applied to the healthy 99.9% of the population yields far more false positives than true positives. This base-rate fallacy—neglecting the prior when interpreting a positive result—is a common failure of intuition that Bayes' theorem corrects. Bayesian thinking is indispensable wherever AI systems must reason about uncertainty.
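The 2% figure falls straight out of Bayes' theorem, with the evidence $P(+)$ expanded by the law of total probability:

```python
prior = 0.001                    # P(disease): 1 in 1000
sensitivity = 0.99               # P(+ | disease)
false_positive_rate = 1 - 0.95   # P(+ | healthy), from 95% specificity

# Evidence P(+) via the law of total probability
p_positive = sensitivity * prior + false_positive_rate * (1 - prior)

# Posterior P(disease | +) = P(+ | disease) * P(disease) / P(+)
posterior = sensitivity * prior / p_positive
print(round(posterior, 4))  # 0.0194, i.e. about 2%
```

The tiny prior dominates: of roughly 51 positives per 1000 people tested, only about 1 is a true positive.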
Related terms: Maximum Likelihood Estimation
Discussed in:
- Chapter 4: Probability — 4.2 Bayes’ Theorem
Also defined in: Textbook of AI, Textbook of Medical AI, Textbook of Medical Statistics