References

Some methods of speeding up the convergence of iteration methods

Boris T. Polyak (1964)

USSR Computational Mathematics and Mathematical Physics.

DOI: https://doi.org/10.1016/0041-5553(64)90137-5

Abstract. Introduces the heavy-ball method: gradient descent augmented with a momentum term, $x_{t+1} = x_t - \alpha \nabla f(x_t) + \beta (x_t - x_{t-1})$, so that each step moves in the direction of a geometrically weighted sum of past gradients. Polyak proved that for strongly convex quadratics with condition number $\kappa$ the heavy-ball method converges at the accelerated rate $\left(\frac{\sqrt{\kappa}-1}{\sqrt{\kappa}+1}\right)^t$, improving the dependence on the condition number from $\kappa$ to $\sqrt{\kappa}$ relative to plain gradient descent's $\left(\frac{\kappa-1}{\kappa+1}\right)^t$. The method is the historical ancestor of every modern momentum-based optimiser, including SGD with momentum, Nesterov's accelerated gradient, Adam and AdamW.
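
A minimal sketch of the update on an ill-conditioned quadratic, for illustration only. The matrix `A`, the constants `mu` and `L`, and the step-size/momentum tuning below are assumptions taken from the standard analysis of quadratics in the later literature, not quoted from the paper itself.

```python
import numpy as np

# Heavy-ball method on f(x) = 0.5 * x^T A x, with A diagonal and
# eigenvalues in [mu, L]; the condition number is kappa = L / mu.
mu, L = 1.0, 100.0
A = np.diag(np.linspace(mu, L, 50))

def grad(x):
    return A @ x  # gradient of the quadratic

# Classical tuning for quadratics (an assumed choice for this sketch):
alpha = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2
beta = ((np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))) ** 2

x = np.ones(50)
x_prev = x.copy()
for t in range(200):
    # Gradient step plus momentum in the direction of the previous step.
    x_next = x - alpha * grad(x) + beta * (x - x_prev)
    x_prev, x = x, x_next

print(f"||x|| after 200 heavy-ball steps: {np.linalg.norm(x):.2e}")
```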

Tags: optimisation history

Cited in:
