Glossary

Confidence Interval

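A confidence interval is a range of values constructed from sample data by a procedure that, over repeated sampling, contains the true parameter with a specified probability, the confidence level (typically 95%). For the mean of a normally distributed population with known standard deviation $\sigma$, the 95% confidence interval is $\bar{X} \pm 1.96\,\sigma / \sqrt{n}$, where $\bar{X}$ is the sample mean and $n$ is the sample size. With unknown variance, one replaces $\sigma$ with the sample standard deviation $s$ and 1.96 with the corresponding quantile of the $t$-distribution with $n - 1$ degrees of freedom.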
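
As a minimal sketch of the known-variance formula, the following Python snippet computes the interval using only the standard library. The function name `mean_ci_known_sigma`, the sample values, and the assumed $\sigma = 0.2$ are hypothetical and chosen purely for illustration.

```python
from math import sqrt
from statistics import NormalDist, fmean


def mean_ci_known_sigma(sample, sigma, level=0.95):
    """Return (lower, upper) for the population mean, assuming sigma is known."""
    n = len(sample)
    # Two-sided critical value of the standard normal: about 1.96 for level=0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / sqrt(n)
    center = fmean(sample)
    return center - half_width, center + half_width


# Hypothetical measurements with an assumed known sigma of 0.2.
measurements = [4.8, 5.1, 5.0, 4.9, 5.2]
lower, upper = mean_ci_known_sigma(measurements, sigma=0.2)
print(f"95% CI for the mean: {lower:.3f} to {upper:.3f}")
```

The unknown-variance case has the same shape, with the normal quantile swapped for a $t$ quantile; the Python standard library does not provide one, so that variant would need a statistics package.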

The precise interpretation of a confidence interval is subtle. A 95% confidence interval does not mean there is a 95% probability that the true parameter lies in the interval (that would be a Bayesian credible interval). Instead, it means that if we were to repeat the sampling and construction procedure many times, 95% of the resulting intervals would contain the true parameter. The interval itself either contains the parameter or does not; the 95% refers to the long-run success rate of the procedure.
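
The long-run interpretation can be checked with a small simulation. The sketch below assumes a normal population with known $\sigma$; it repeatedly draws a sample, builds the 95% interval, and counts how often the interval contains the true mean. All parameter values (`mu`, `sigma`, `n`, `trials`, `seed`) are arbitrary illustrative choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage_rate(mu=10.0, sigma=2.0, n=30, trials=5000, level=0.95, seed=0):
    """Fraction of simulated known-sigma intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw a fresh sample and build its interval around the sample mean.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        center = fmean(sample)
        # Each individual interval either contains mu or it does not.
        if center - half_width <= mu <= center + half_width:
            hits += 1
    return hits / trials


print(f"empirical coverage: {coverage_rate():.3f}")
```

The printed fraction should land close to 0.95. It describes the procedure's success rate across many samples, not the probability that any single interval is correct.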

Confidence intervals are generally more informative than point estimates or p-values, because they communicate both the magnitude of an effect and its uncertainty. In machine learning, they are used to report performance metrics with appropriate uncertainty: "accuracy 82.3% (95% CI: 80.1% to 84.5%)". Bootstrap confidence intervals, constructed by resampling the data with replacement and computing the statistic on each resample, provide a flexible non-parametric approach when analytic formulas are unavailable.
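
A percentile bootstrap interval for classification accuracy might be sketched as follows. This is one of several bootstrap variants, shown here under simple assumptions: the `outcomes` list (1 for a correct prediction, 0 for an incorrect one) is hypothetical data with 823 correct out of 1000, and the function name `bootstrap_ci` and its defaults are illustrative.

```python
import random
from statistics import fmean


def bootstrap_ci(values, stat=fmean, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for stat(values)."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and recompute the statistic each time.
    stats = sorted(stat(rng.choices(values, k=n)) for _ in range(n_resamples))
    alpha = 1 - level
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return stats[lo_idx], stats[hi_idx]


# Hypothetical per-example outcomes: 823 correct predictions out of 1000.
outcomes = [1] * 823 + [0] * 177
lower, upper = bootstrap_ci(outcomes)
print(f"accuracy {fmean(outcomes):.1%} (95% CI: {lower:.1%} to {upper:.1%})")
```

Passing a different `stat` callable (for example a median) reuses the same resampling loop, which is the flexibility the bootstrap offers when no analytic formula is at hand.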

Related terms: Hypothesis Testing, P-value

Also defined in: Textbook of AI, Textbook of Medical AI, Textbook of Medical Statistics