Mikhail Belkin, Daniel Hsu, Siyuan Ma, & Soumik Mandal (2019). Reconciling modern machine-learning practice and the classical bias–variance trade-off.
Proceedings of the National Academy of Sciences.
DOI: https://doi.org/10.1073/pnas.1903070116
Abstract. Introduces and empirically demonstrates the double-descent phenomenon. As a model's capacity grows towards the interpolation threshold (the point at which it can first fit the training data exactly), test error rises, as the classical U-shaped bias–variance curve predicts; past the threshold, test error falls again, often to below the best level attainable in the classical regime. The authors show double descent across decision trees, random-feature models, and small neural networks, challenging the textbook bias–variance trade-off. Subsequent work generalised the picture to dataset size, training time, and many other axes of "complexity".
Tags: generalisation theory deep-learning