Arthur E. Hoerl & Robert W. Kennard (1970)
Technometrics.
DOI: https://doi.org/10.1080/00401706.1970.10488634
Abstract. The original ridge-regression paper. Identifies that the ordinary least-squares estimator $\hat{\mathbf{w}} = (\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top \mathbf{y}$, although unbiased, has very large variance when $\mathbf{X}^\top\mathbf{X}$ is ill-conditioned, because its small eigenvalues are inverted. Proposes adding $\lambda \mathbf{I}$ with $\lambda > 0$ to $\mathbf{X}^\top\mathbf{X}$ before inversion, giving the ridge estimator $\hat{\mathbf{w}}_\lambda = (\mathbf{X}^\top\mathbf{X} + \lambda \mathbf{I})^{-1}\mathbf{X}^\top \mathbf{y}$. This estimator trades a small bias for a substantial variance reduction, and the paper shows that there always exists a $\lambda > 0$ for which its mean-squared error is lower than that of least squares. The paper introduced what is now called Tikhonov or $\ell_2$ regularisation to the statistics literature and remains the primary citation for the technique.
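Illustration. The bias-variance trade described in the abstract can be checked numerically. The following is a minimal sketch, not code from the paper: the function names (`ols`, `ridge`, `coefficient_mse`), the two-column nearly collinear design, the noise level, and the choice $\lambda = 1$ are all assumptions made for illustration.

```python
import numpy as np


def ols(X, y):
    # Ordinary least squares via the normal equations: (X^T X)^{-1} X^T y.
    return np.linalg.solve(X.T @ X, X.T @ y)


def ridge(X, y, lam):
    # Ridge estimator: (X^T X + lam * I)^{-1} X^T y.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def coefficient_mse(estimator, w_true, n_trials=500, n=50, noise=1.0, seed=0):
    # Monte Carlo estimate of E||w_hat - w_true||^2 for a fixed design matrix.
    # The design is hypothetical: two columns that are almost identical, so
    # X^T X is ill-conditioned (but still invertible).
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)
    X = np.column_stack([z, z + 1e-3 * rng.normal(size=n)])
    errors = []
    for _ in range(n_trials):
        y = X @ w_true + noise * rng.normal(size=n)
        errors.append(np.sum((estimator(X, y) - w_true) ** 2))
    return float(np.mean(errors))


if __name__ == "__main__":
    w_true = np.array([1.0, 1.0])
    print("OLS   MSE:", coefficient_mse(ols, w_true))
    print("ridge MSE:", coefficient_mse(lambda X, y: ridge(X, y, 1.0), w_true))
```

On this design the ridge error should come out far below the least-squares error, since the penalty damps the direction associated with the near-zero eigenvalue of $\mathbf{X}^\top\mathbf{X}$. The gap depends on the assumed design and on $\lambda$; a well-conditioned design would show little difference.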
Tags: regression statistics regularisation
Cited in: