Underfitting is the mirror image of overfitting: the model is too simple or too constrained to capture the patterns in the data, resulting in poor performance on both the training set and the test set. An underfit linear model applied to highly nonlinear data, or a shallow network applied to a complex perceptual task, will exhibit high bias—it systematically misses the truth regardless of how much training data it sees.
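The high-bias claim can be made concrete with a small numerical sketch: a straight-line fit to data drawn from a nonlinear function keeps roughly the same training error no matter how many samples it sees. The sin ground truth, noise level, and sample sizes below are illustrative assumptions, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_error_of_line(n):
    """Fit a straight line to n noisy samples of y = sin(x) and
    return its mean-squared training error."""
    x = rng.uniform(-3, 3, n)                   # illustrative nonlinear target
    y = np.sin(x) + rng.normal(0, 0.1, n)       # noise std 0.1 (assumed)
    slope, intercept = np.polyfit(x, y, deg=1)  # least-squares line
    residuals = y - (slope * x + intercept)
    return float(np.mean(residuals ** 2))

# More data does not help: the training error plateaus at the
# model's bias, far above the noise floor of 0.01.
for n in (50, 500, 5000):
    print(n, round(train_error_of_line(n), 3))
```

The error stays roughly constant across sample sizes because the gap is structural: no straight line can track a sine wave, so collecting more data only confirms the systematic miss.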
The diagnostic signature of underfitting is high training error together with high test error; overfitting, by contrast, shows low training error but high test error. Distinguishing the two is the first step in improving a model. If the model is underfitting, the remedies are to increase its capacity (more layers, more parameters), reduce regularisation, or engineer richer, more informative features; if it is overfitting, the remedies run in the opposite direction.
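The diagnostic can be shown side by side on a synthetic task (the sin ground truth, split sizes, and the two polynomial degrees below are all invented for illustration): the low-capacity degree-1 model exhibits the underfitting signature, high error on both splits, while a higher-capacity degree-7 model drives both down.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_split(n=200, noise=0.1):
    """Draw n noisy samples of y = sin(x) (illustrative setup)."""
    x = rng.uniform(-3, 3, n)
    return x, np.sin(x) + rng.normal(0, noise, n)

x_train, y_train = make_split()
x_test, y_test = make_split()

def train_test_mse(degree):
    """Fit a polynomial of the given degree on the training split
    and return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    tr = float(np.mean((y_train - np.polyval(coeffs, x_train)) ** 2))
    te = float(np.mean((y_test - np.polyval(coeffs, x_test)) ** 2))
    return tr, te

# degree 1: high train AND high test error -> underfitting
# degree 7: low train and low test error   -> adequate capacity
for degree in (1, 7):
    tr, te = train_test_mse(degree)
    print(f"degree {degree}: train MSE {tr:.3f}, test MSE {te:.3f}")
```

Reading the two rows together applies the rule from the text: when both errors are high, raise capacity; only when the train/test gap is large is overfitting the problem.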
In practice, modern deep learning often sidesteps underfitting by starting with very large, highly expressive models and relying on regularisation, large datasets, and careful optimisation to control overfitting. The "double descent" phenomenon observed in overparameterised networks shows that the classical U-shaped curve of test error versus complexity has a second descent beyond the interpolation threshold—a surprising finding that has refined, but not overturned, the bias–variance framework.
Related terms: Overfitting, Bias-Variance Tradeoff, Regularisation
Discussed in:
- Chapter 6: ML Fundamentals — The ML Framework
Also defined in: Textbook of AI