Shibani Santurkar, Dimitris Tsipras, Andrew Ilyas, & Aleksander Madry (2018)
How Does Batch Normalization Help Optimization? arXiv.
DOI: https://doi.org/10.48550/arxiv.1805.11604
Abstract. Demonstrates empirically that batch normalisation (BN) does not reduce internal covariate shift in any meaningful sense. The authors argue instead that BN smooths the loss landscape, reducing the Lipschitz constants of both the loss and its gradient, and that this smoothing explains the observed training benefits.
Tags: regularisation batch-normalisation theory
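
The normalisation step whose effect the paper analyses can be sketched as follows (a minimal NumPy illustration of a BN forward pass at training time, not the authors' code; `gamma` and `beta` are the standard learned scale and shift parameters):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalise each feature over the batch dimension, then scale and shift."""
    mean = x.mean(axis=0)          # per-feature mean over the batch
    var = x.var(axis=0)            # per-feature variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)  # standardised activations
    return gamma * x_hat + beta    # learned affine transform

# Example: a batch of 8 samples with 3 features.
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=(8, 3))
out = batch_norm(x, gamma=np.ones(3), beta=np.zeros(3))
# With gamma=1, beta=0, each output feature has ~zero mean and ~unit variance.
```

With identity `gamma`/`beta`, the output is just the standardised activations; the paper's point is that the resulting reparameterisation of the loss surface, rather than any stabilisation of input distributions, is what aids optimisation.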