A Gaussian's tails fall off so fast that three standard deviations cover virtually all the probability.
From the chapter: Chapter 4: Probability
Glossary: gaussian distribution, standard deviation
Transcript
The standard normal distribution. Mean zero, standard deviation one. The textbook bell curve.
About sixty-eight percent of the probability lies within plus or minus one standard deviation of the mean. The middle bulk.
About ninety-five percent within plus or minus two. The familiar confidence-interval cutoff, or 1.96 standard deviations if you want exactly ninety-five.
Ninety-nine point seven percent within plus or minus three. Three sigma covers nearly everything.
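To check these three numbers yourself, here is a minimal sketch using only the Python standard library. It relies on the identity that a standard normal lands within plus or minus k standard deviations with probability erf(k / sqrt(2)). The function name `prob_within` is made up for this illustration.

```python
import math

def prob_within(k: float) -> float:
    """Probability that a standard normal value lies within +/- k standard deviations.

    Uses the identity P(|Z| <= k) = erf(k / sqrt(2)).
    """
    return math.erf(k / math.sqrt(2))

# Print the coverage for one, two and three standard deviations.
for k in (1, 2, 3):
    print(f"within +/- {k} sigma: {prob_within(k):.2%}")
```

The printed values should round to the 68, 95 and 99.7 percent figures quoted above.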
Beyond three sigma, the curve falls off very fast. Five sigma corresponds to a one-sided tail probability of roughly one in three and a half million. Particle physics requires this level for discovery announcements.
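The five-sigma figure can be reproduced the same way. This is a sketch under the assumption that the quoted number is the one-sided (upper) tail; the helper name `upper_tail` is hypothetical. It uses the complementary error function, which stays accurate for very small tail probabilities.

```python
import math

def upper_tail(k: float) -> float:
    """One-sided tail probability P(Z > k) for a standard normal.

    Uses P(Z > k) = 0.5 * erfc(k / sqrt(2)).
    """
    return 0.5 * math.erfc(k / math.sqrt(2))

p = upper_tail(5)
print(f"P(Z > 5) = {p:.3e}")
print(f"about 1 in {1 / p:,.0f}")
```

Counting both tails doubles the probability, so the two-sided odds are about half as long.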
This rapid decay is why the Gaussian under-predicts extreme events in much real-world data. Stock returns, earthquake magnitudes, file sizes, all have heavier tails. Modelling these quantities with a Gaussian dramatically underestimates the risk of extreme values.
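To get a feel for how much a heavier tail matters, the sketch below compares the Gaussian with a Laplace distribution scaled to the same standard deviation of one. The Laplace is an assumption chosen purely for illustration because its tail has a simple closed form; it is not a claim about how stock returns or file sizes are actually distributed. The function names are made up for this example.

```python
import math

def gaussian_two_sided(k: float) -> float:
    """P(|Z| > k) for a standard normal (mean 0, standard deviation 1)."""
    return math.erfc(k / math.sqrt(2))

def laplace_two_sided(k: float) -> float:
    """P(|X| > k) for a zero-mean Laplace with standard deviation 1.

    A Laplace with scale b has variance 2 * b**2, so unit variance
    means b = 1 / sqrt(2), and the two-sided tail is exp(-k / b).
    """
    b = 1 / math.sqrt(2)
    return math.exp(-k / b)

# Compare the chance of landing more than k standard deviations from the mean.
for k in (3, 4, 5):
    g = gaussian_two_sided(k)
    h = laplace_two_sided(k)
    print(f"{k} sigma: gaussian {g:.2e}, laplace {h:.2e}, ratio {h / g:,.0f}x")
```

Both distributions have the same mean and standard deviation, yet by five sigma the Laplace tail is more than a thousand times heavier. Distributions with power-law tails diverge from the Gaussian even faster.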
But within three sigma, the Gaussian is the workhorse of statistics, signal processing, and machine learning. Standardisation, hypothesis testing, sampling distributions, the central limit theorem, Brownian motion, all built on this curve. The 68-95-99.7 numbers are worth memorising.