1928–2005, Statistician
Leo Breiman was an American statistician at UC Berkeley who, after a first career in industry and at the RAND Corporation, returned to academia in his fifties and produced a long series of foundational contributions to machine learning. With Friedman, Olshen and Stone he wrote Classification and Regression Trees (1984), introducing CART, the recursive binary-splitting decision-tree algorithm that, together with Quinlan's ID3 / C4.5, defined modern decision-tree learning.
He invented bagging (bootstrap aggregating, 1996), the technique of training many models on bootstrap resamples of the data and averaging their predictions (or, for classification, taking a majority vote), a key ingredient in modern ensemble methods. In 2001 he combined bagging with random feature selection at each split to produce random forests, one of the most widely used machine-learning algorithms ever developed.
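The bagging procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Breiman's own implementation: the base learner is a one-split regression stump on a single input variable (a stand-in for a full CART tree), the function names `fit_stump`, `bagged_fit` and `bagged_predict` are hypothetical, and the toy data are invented for the example. Because there is only one input variable, the random feature selection that distinguishes random forests from plain bagging is not shown.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression stump on 1-D inputs (stand-in base learner)."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue  # a split must leave points on both sides
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:
        # All x values identical: fall back to predicting the overall mean.
        m = mean(ys)
        return lambda x: m
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def bagged_fit(xs, ys, n_models=25, seed=0):
    """Train n_models stumps, each on its own bootstrap resample."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        # Bootstrap resample: draw n indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagged_predict(models, x):
    """Aggregate by averaging the individual models' predictions."""
    return mean(m(x) for m in models)


# Toy data, invented for illustration: a noisy step from about 0 to about 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0.1, -0.2, 0.0, 0.2, 1.1, 0.9, 1.0, 1.2]

models = bagged_fit(xs, ys)
print(bagged_predict(models, 1.0), bagged_predict(models, 6.0))
```

Each stump sees a slightly different resample and so places its split slightly differently; averaging the stumps smooths out that variation, which is the variance-reduction effect bagging is designed to exploit.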
Breiman's 2001 essay Statistical Modeling: The Two Cultures offered an unusually sharp methodological critique of mainstream statistics. He argued that the "data modeling culture" (which assumes the data are generated by a particular probabilistic model and infers its parameters) had been overtaken in many applications by the "algorithmic modeling culture" (which treats the data-generating process as a black box and selects algorithms by their predictive accuracy). The essay anticipated many of the methodological arguments of the deep-learning era.
Related people: Vladimir Vapnik
Works cited in this book:
- Bagging predictors (1996)
- Random Forests (2001)
Discussed in:
- Chapter 7: Supervised Learning