References

How Does Batch Normalization Help Optimization?

Shibani Santurkar, Dimitris Tsipras, Andrew Ilyas, & Aleksander Madry (2018)

arXiv.

DOI: https://doi.org/10.48550/arxiv.1805.11604

Abstract. Demonstrates empirically that the training benefits of batch normalisation (BN) do not stem from reducing internal covariate shift, and that BN does not reduce it in any meaningful sense. The authors argue instead that BN smooths the loss landscape by reducing the Lipschitz constants of the loss and its gradient, which makes gradients more predictive and explains the observed training benefits.
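
For readers unfamiliar with the terminology, the two smoothness notions named in the abstract can be stated as below. This is only a sketch of the standard definitions; the symbols f, L, and \beta are illustrative notation and are not taken from this entry.

```latex
% f is L-Lipschitz: the function value cannot change faster than L per unit step.
\[
  |f(x) - f(y)| \le L \, \|x - y\| \quad \text{for all } x, y
\]

% f is \beta-smooth: the gradient of f is itself \beta-Lipschitz.
\[
  \|\nabla f(x) - \nabla f(y)\| \le \beta \, \|x - y\| \quad \text{for all } x, y
\]
```

Smaller values of L and \beta mean the loss and its gradient change less abruptly between nearby parameter settings, which is the sense in which the landscape is described as smoother.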

Tags: regularisation batch-normalisation theory

