References

Neural Tangent Kernel: Convergence and Generalization in Neural Networks

Arthur Jacot, Franck Gabriel, & Clément Hongler (2018)

Advances in Neural Information Processing Systems 31.

URL: https://arxiv.org/abs/1806.07572

Abstract. Establishes that in the infinite-width limit, training a neural network by gradient descent with an infinitesimally small learning rate (gradient flow) is equivalent to kernel regression with the Neural Tangent Kernel (NTK), a kernel determined by the network architecture at initialisation. In this limit the NTK remains constant throughout training, so the training dynamics become linear and analytically tractable. The paper gave one of the first rigorous theoretical handles on the optimisation and generalisation of overparameterised networks, and seeded a substantial follow-up literature on lazy training, feature learning, and the gap between NTK predictions and finite-width network behaviour.
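To make the object concrete: the empirical NTK is Θ(x, x′) = ∇θ f(x) · ∇θ f(x′), and under gradient flow on squared loss the network outputs on the training set obey df_t/dt = −Θ (f_t − y), which is the linear dynamic the abstract refers to. Below is a minimal sketch in JAX, not code from the paper: the network, widths, and synthetic data are illustrative assumptions, and the final lines show the kernel-regression predictor of the linearised (lazy) regime with the initial-function term omitted for brevity.

    # Minimal empirical-NTK sketch in JAX (illustrative; not code from the paper).
    import jax
    import jax.numpy as jnp

    def init_params(key, widths=(3, 64, 1)):
        # NTK parameterisation: weights drawn from N(0, 1); the 1/sqrt(fan_in)
        # scaling is applied in the forward pass rather than at initialisation.
        params = []
        for d_in, d_out in zip(widths[:-1], widths[1:]):
            key, sub = jax.random.split(key)
            params.append(jax.random.normal(sub, (d_in, d_out)))
        return params

    def mlp(params, x):
        # Scalar-output MLP; returns shape (batch,).
        h = x
        for w in params[:-1]:
            h = jnp.tanh(h @ w / jnp.sqrt(w.shape[0]))
        return (h @ params[-1] / jnp.sqrt(params[-1].shape[0])).squeeze(-1)

    def empirical_ntk(params, x1, x2):
        # Theta(x1, x2) = J(x1) @ J(x2).T, where J is the Jacobian of the
        # network outputs with respect to all parameters, flattened per input.
        j1 = jax.jacobian(mlp)(params, x1)
        j2 = jax.jacobian(mlp)(params, x2)
        flat = lambda j, n: jnp.concatenate(
            [leaf.reshape(n, -1) for leaf in jax.tree_util.tree_leaves(j)], axis=1)
        return flat(j1, x1.shape[0]) @ flat(j2, x2.shape[0]).T

    # Toy regression problem (synthetic data, purely for illustration).
    params = init_params(jax.random.PRNGKey(0))
    x_train = jax.random.normal(jax.random.PRNGKey(1), (16, 3))
    y_train = jnp.sin(x_train[:, 0])
    x_test = jax.random.normal(jax.random.PRNGKey(2), (4, 3))

    # Kernel-regression predictor of the linearised (lazy) regime for squared
    # loss, omitting the initial-function term f_0 for brevity:
    k_train = empirical_ntk(params, x_train, x_train)
    preds = empirical_ntk(params, x_test, x_train) @ jnp.linalg.solve(k_train, y_train)
    print(preds.shape)  # (4,)

In the infinite-width limit the paper shows this kernel is fixed throughout training; at finite width one can re-evaluate empirical_ntk after some gradient steps and observe it drift, which is one common way of probing the gap between NTK predictions and finite-width behaviour mentioned above.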

Tags: theory deep-learning generalisation
