1963–, Computer scientist
Jürgen Schmidhuber is a German computer scientist whose Swiss-based IDSIA lab produced a large share of the foundational neural-network work of the 1990s and 2000s. With his student Sepp Hochreiter he developed the Long Short-Term Memory (LSTM) architecture (1997), which was designed to overcome the vanishing-gradient problem in recurrent neural networks and was the dominant sequence model from roughly 2014 to 2018, when Transformers superseded it.
Schmidhuber's lab has produced contributions to recurrent networks, reinforcement learning, evolutionary methods, neural Turing machines and other topics. He is also notorious for his detailed and unrelenting public claims that nearly every major modern deep-learning result was either anticipated or directly invented in his lab. These claims range from well-supported (LSTM, the Highway Network's antecedent role for ResNet) to vigorously contested.
Schmidhuber moved to KAUST in 2021. The "deep-learning history wars" between his lab and the Hinton-LeCun-Bengio camp continue intermittently.
Related people: Sepp Hochreiter, Geoffrey Hinton
Works cited in this book:
- Long Short-Term Memory (1997) (with Sepp Hochreiter)
- Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks (2006) (with Alex Graves, Santiago Fernández, Faustino Gomez)
Discussed in:
- Chapter 12: Sequence Models