References

Layer Normalization

Jimmy Lei Ba, Jamie Ryan Kiros, & Geoffrey E. Hinton (2016)

arXiv.

DOI: https://doi.org/10.48550/arxiv.1607.06450

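Abstract. Proposes layer normalisation, which computes the normalisation mean and variance over the features (hidden units) of a single example rather than across the examples in a mini-batch. Because the statistics do not depend on other examples, layer normalisation is independent of batch size and performs the same computation at training and test time. It has become standard in transformer architectures.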
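The computation described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name layer_norm, the eps stabiliser and its default value, and the optional gain and bias arguments (standing in for the learned per-feature scale and shift) are assumptions chosen for this example.

```python
import numpy as np

def layer_norm(x, gain=None, bias=None, eps=1e-5):
    """Normalise each example over its own features (last axis).

    eps is a small stabiliser added for numerical safety (an assumption
    of this sketch); gain and bias are optional per-feature parameters.
    """
    x = np.asarray(x, dtype=np.float64)
    # Statistics are computed per example, never across the batch axis.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    if gain is not None:
        x_hat = x_hat * gain
    if bias is not None:
        x_hat = x_hat + bias
    return x_hat

# Hypothetical data: 4 examples with 8 features each.
rng = np.random.default_rng(0)
batch = rng.normal(size=(4, 8))

full = layer_norm(batch)        # normalise the whole batch
single = layer_norm(batch[:1])  # normalise the first example alone
```

Here the first row of full matches single, because each example is normalised using only its own features. Batch normalisation would generally give different results in the two cases, since its statistics are taken across the batch.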

Tags: regularisation layer-normalisation
