Sepp Hochreiter

1967–, Computer scientist

Also known as: Josef Hochreiter

Josef "Sepp" Hochreiter is an Austrian computer scientist. His 1991 diploma thesis, supervised by Jürgen Schmidhuber in Munich, identified and analysed the vanishing-gradient problem in recurrent neural networks: the gradient of the loss with respect to weights in early layers, or at early time steps, shrinks exponentially with the network's depth or temporal extent, so naive gradient-based training fails to learn long-range dependencies.
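
The exponential shrinkage can be illustrated with a toy calculation. The sketch below is not taken from the thesis; it assumes a hypothetical one-unit recurrence h_t = tanh(w * h_{t-1}) with an illustrative weight w = 0.9, and multiplies the chain-rule factors that backpropagation would accumulate over a given number of time steps.

```python
import math

def backprop_factor(w, steps, h0=0.5):
    """Return |d h_T / d h_0| for the scalar recurrence h_t = tanh(w * h_{t-1}).

    The names w, steps and h0 are illustrative assumptions, not from the source.
    """
    h = h0
    factor = 1.0
    for _ in range(steps):
        h = math.tanh(w * h)
        # Chain rule: d tanh(w * h_prev) / d h_prev = w * (1 - tanh(w * h_prev)^2)
        factor *= w * (1.0 - h * h)
    return abs(factor)

# Each per-step factor has magnitude at most |w|, so the product is
# bounded by |w| ** steps and decays exponentially when |w| < 1.
for steps in (1, 10, 50, 100):
    print(steps, backprop_factor(0.9, steps), 0.9 ** steps)
```

Under these assumptions the printed factor never exceeds 0.9 raised to the number of steps, which is the sense in which the gradient signal from distant time steps vanishes.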

In 1997 he and Schmidhuber published "Long Short-Term Memory", introducing the LSTM unit: a recurrent cell with multiplicative input and output gates built around a "constant error carousel", a self-connected linear unit with a fixed weight of 1 that lets error signals flow back over long time horizons without shrinking. The forget gate found in most modern LSTMs was added later by Felix Gers, Schmidhuber and Fred Cummins. LSTMs were the dominant sequence model from roughly 2014 to 2018, used in Google Translate, Apple Siri, Amazon Alexa and many other production systems.
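
As a rough illustration of why the constant error carousel helps, the following sketch models only the additive cell-state update, with the forget gate and output gate left out so that the recurrent weight on the cell state is exactly 1. It is a simplification under stated assumptions, not the published architecture: the gate here depends only on the external input, and the weights `w_in` and `w_cand` are hypothetical values chosen for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def cec_step(c_prev, x, w_in=0.5, w_cand=1.0):
    """One additive update of a scalar cell state (simplified, hypothetical weights)."""
    i = sigmoid(w_in * x)      # input gate: how much new content to admit
    g = math.tanh(w_cand * x)  # candidate content computed from the input
    return c_prev + i * g      # additive update, so d c / d c_prev == 1

def run(c0, xs):
    """Apply cec_step over a whole input sequence, starting from cell state c0."""
    c = c0
    for x in xs:
        c = cec_step(c, x)
    return c

# Estimate d c_T / d c_0 over 150 time steps by central finite differences.
xs = [0.3, -1.2, 0.8] * 50
eps = 1e-3
grad = (run(1.0 + eps, xs) - run(1.0 - eps, xs)) / (2 * eps)
print(grad)  # stays close to 1 however long the sequence is
```

In a full LSTM the gates also depend on the previous hidden state, so the total gradient has additional paths; the sketch isolates only the direct cell-state path, along which the signal neither shrinks nor grows.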

Hochreiter heads the Institute for Machine Learning at Johannes Kepler University Linz and is a co-director of the European Laboratory for Learning and Intelligent Systems (ELLIS) Unit Linz.

Related people: Jürgen Schmidhuber

This site is currently in Beta. Contact: Chris Paton

AI tools used: Claude (research, coding, text), ChatGPT (diagrams, images), Grammarly (editing).