1967–, Computer scientist
Also known as: Josef Hochreiter
Josef "Sepp" Hochreiter is an Austrian computer scientist whose 1991 diploma thesis, supervised by Jürgen Schmidhuber in Munich, identified and analysed the vanishing-gradient problem in recurrent neural networks: gradients of the loss with respect to weights in early layers shrink exponentially with the network's depth or temporal extent, making naive recurrent training effectively impossible.
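The exponential shrinkage can be seen in a few lines of NumPy. This is an illustrative sketch, not Hochreiter's original analysis: it backpropagates a gradient vector through repeated multiplication by a fixed recurrent Jacobian whose spectral norm is scaled to 0.9 (a stand-in for the contractive Jacobians that arise in practice), and the gradient's norm decays roughly like 0.9 to the power of the number of time steps.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 32
W = rng.standard_normal((n, n))
W *= 0.9 / np.linalg.norm(W, 2)   # scale so the largest singular value is 0.9

grad = rng.standard_normal(n)      # stand-in for dL/dh at the final time step
norms = []
for t in range(50):                # backpropagate through 50 time steps
    grad = W.T @ grad              # linearized recurrent Jacobian (tanh' <= 1 omitted)
    norms.append(np.linalg.norm(grad))

print(norms[0], norms[-1])         # the norm has collapsed by orders of magnitude
```

With the spectral norm below 1 every step contracts the gradient, so after 50 steps almost no learning signal reaches the earliest inputs; with spectral norm above 1 the same loop exhibits the mirror-image exploding-gradient problem.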
In 1997 he and Schmidhuber published Long Short-Term Memory, introducing the LSTM unit, a recurrent cell with explicit gating mechanisms (input and output gates in the original paper; the now-standard forget gate was added by Gers, Schmidhuber and Cummins in 2000) and a self-connected "constant error carousel" that lets error signals flow unchanged over long time horizons. LSTMs were the dominant sequence model from roughly 2014 to 2018, powering Google Translate, Apple Siri, Amazon Alexa and many other production systems.
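A single forward step of the gated design can be sketched in NumPy. This is a minimal illustration using the modern formulation (with a forget gate), not code from the 1997 paper; the weight names and parameter layout are assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step: gates decide what to write, erase, and expose."""
    Wi, Wf, Wo, Wg, b_i, b_f, b_o, b_g = params
    z = np.concatenate([x, h_prev])   # current input joined with previous hidden state
    i = sigmoid(Wi @ z + b_i)         # input gate: how much new content to write
    f = sigmoid(Wf @ z + b_f)         # forget gate: how much old cell state to keep
    o = sigmoid(Wo @ z + b_o)         # output gate: how much cell state to expose
    g = np.tanh(Wg @ z + b_g)         # candidate cell update
    c = f * c_prev + i * g            # additive cell-state path: the
                                      # "constant error carousel"
    h = o * np.tanh(c)                # hidden state passed to the rest of the network
    return h, c
```

Because the cell state `c` is updated additively rather than squashed through a nonlinearity at every step, gradients flowing along it avoid the repeated contraction that cripples a plain recurrent network.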
Hochreiter heads the Institute for Machine Learning at Johannes Kepler University Linz and is a co-director of the European Laboratory for Learning and Intelligent Systems (ELLIS) Unit Linz.
Related people: Jürgen Schmidhuber
Works cited in this book:
- Long Short-Term Memory (1997) (with Jürgen Schmidhuber)
- Self-Normalizing Neural Networks (2017) (with Günter Klambauer, Thomas Unterthiner, Andreas Mayr)
Discussed in:
- Chapter 12: Sequence Models