Ilya Loshchilov & Frank Hutter (2016)
arXiv.
DOI: https://doi.org/10.48550/arxiv.1608.03983
Abstract. Introduces cosine annealing with warm restarts: a learning-rate schedule that decays the learning rate along a cosine curve to a small value and then resets it to its initial value, enabling the optimiser to escape local basins and explore new regions of the loss landscape.
Tags: optimisation learning-rate cosine-annealing
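
Example. The schedule described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name, the parameter names (eta_max, eta_min, t_0, t_mult) and the default values are assumptions chosen for this example. It uses the cosine form eta_min + 0.5 * (eta_max - eta_min) * (1 + cos(pi * t_cur / t_i)), where t_cur is the time since the last restart and t_i is the length of the current cycle.

```python
import math


def sgdr_learning_rate(epoch, eta_max=0.1, eta_min=0.0, t_0=10, t_mult=2):
    """Learning rate at a (possibly fractional) epoch.

    All names and defaults here are hypothetical, chosen for illustration:
      eta_max / eta_min: upper and lower bounds of the learning rate
      t_0:               length of the first cycle, in epochs
      t_mult:            factor by which each cycle is longer than the last
    """
    if t_0 <= 0 or t_mult < 1:
        raise ValueError("t_0 must be positive and t_mult must be >= 1")

    # Find the current cycle: subtract completed cycles from the epoch count.
    t_i = t_0        # length of the current cycle
    t_cur = epoch    # epochs elapsed since the last restart
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= t_mult

    # Cosine decay from eta_max (t_cur = 0) towards eta_min (t_cur -> t_i).
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * t_cur / t_i))


if __name__ == "__main__":
    # Print the schedule at a few epochs; restarts fall at epochs 10 and 30.
    for e in (0, 5, 9, 10, 20, 29, 30):
        print(e, round(sgdr_learning_rate(e), 5))
```

With these assumed defaults the cycles last 10, 20, 40, ... epochs, so the learning rate returns to eta_max at epochs 10 and 30. Setting t_mult to 1 gives restarts at a fixed interval instead.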