Glossary

Early Stopping

Early Stopping is a simple yet remarkably effective regularisation technique. During training, the model's performance on a held-out validation set is monitored. When the validation loss stops improving—and especially when it begins to rise—training is halted, even though the training loss may still be decreasing. The model from the epoch with the best validation loss is kept and used.
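This selection rule can be sketched in a few lines. A minimal illustration, assuming validation loss has been recorded once per epoch (the function name, the `patience` parameter, and the loss values are all illustrative, not from any particular library):

```python
def best_epoch(val_losses, patience=3):
    """Return the index of the epoch whose model checkpoint would be kept.

    Training is halted once `patience` consecutive epochs pass without
    the validation loss improving on its best value so far.
    """
    best = 0
    for epoch, loss in enumerate(val_losses):
        if loss < val_losses[best]:
            best = epoch                  # new best: checkpoint this epoch
        elif epoch - best >= patience:
            break                         # no improvement for `patience` epochs: stop
    return best

# Validation loss falls, then rises: the epoch-2 model is the one kept.
print(best_epoch([1.00, 0.80, 0.70, 0.75, 0.90, 0.95], patience=2))
```

In a real training loop the same logic runs online: the current weights are checkpointed whenever a new best validation loss appears, and the last such checkpoint is restored when the patience counter runs out.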

The rationale is that overfitting typically progresses in phases. In early training, both training and validation loss decrease together as the model learns genuine patterns. Beyond a certain point, the model begins to memorise idiosyncrasies of the training set, causing training loss to continue falling while validation loss rises. Early stopping identifies this turning point and commits to the best-generalising model.

Early stopping is essentially free: it requires no changes to the model or optimiser, only monitoring and a stopping rule. Typically a "patience" parameter specifies how many epochs to wait for an improvement before halting. Formally, early stopping in gradient descent on a quadratic loss is equivalent to L2 regularisation whose strength depends on the learning rate and the number of iterations, providing theoretical justification for what practitioners have long used heuristically. Early stopping is almost universally employed in modern deep learning and is particularly valuable when combined with learning rate schedules.
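The equivalence can be made concrete for the quadratic case. A sketch of the standard result (symbols chosen here for illustration): for gradient descent on a quadratic loss whose Hessian has eigenvalues $\lambda_i$, stopping after $\tau$ steps with learning rate $\epsilon$ behaves approximately like L2 regularisation with coefficient $\alpha$, where

```latex
% Approximation valid when \epsilon \lambda_i \ll 1 for all eigenvalues \lambda_i.
\tau \approx \frac{1}{\epsilon \alpha}
\qquad\Longleftrightarrow\qquad
\alpha \approx \frac{1}{\epsilon \tau}
```

so stopping earlier, or using a smaller learning rate, corresponds to a stronger effective L2 penalty.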

Related terms: Overfitting, Regularisation, Training, Validation, and Test Sets

Discussed in:

Also defined in: Textbook of AI