Sepp Hochreiter & Jürgen Schmidhuber (1997)
Neural Computation, 9(8), 1735-1780.
DOI: https://doi.org/10.1162/neco.1997.9.8.1735
Abstract. Introduces Long Short-Term Memory (LSTM), whose gated cell state provides a 'gradient highway' (the paper's "constant error carousel") that lets error signals flow across many time steps, enabling learning of long-range dependencies in sequences. LSTMs largely mitigated the vanishing gradient problem that crippled simple RNNs, and they dominated sequence modelling for roughly two decades.
Tags: rnn lstm sequence-models
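Note. The 'gradient highway' idea in the abstract can be illustrated with a minimal scalar sketch. It is not the paper's implementation: the function names, weights, and inputs are hypothetical, the gates are assumed to depend only on the current input, and the output gate is omitted. It compares how the derivative of the final state with respect to the initial state behaves for a tanh RNN and for an additive cell update of the form c_t = c_{t-1} + i_t * g_t.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rnn_state_gradient(xs, w=0.9, h0=0.0):
    """Return d h_T / d h_0 for the scalar RNN h_t = tanh(w * h_{t-1} + x_t)."""
    h, grad = h0, 1.0
    for x in xs:
        h = math.tanh(w * h + x)
        # Chain rule through one step: d h_t / d h_{t-1} = w * (1 - h_t^2).
        grad *= w * (1.0 - h * h)
    return grad


def cec_state_gradient(xs, w_in=1.0, w_g=1.0, c0=0.0):
    """Return (c_T, d c_T / d c_0) for the additive cell c_t = c_{t-1} + i_t * g_t.

    Assumption for illustration: the input gate i_t and candidate g_t depend
    only on x_t, so the previous cell state enters each step with weight 1.
    """
    c, grad = c0, 1.0
    for x in xs:
        i = sigmoid(w_in * x)   # input gate, in (0, 1)
        g = math.tanh(w_g * x)  # candidate value, in (-1, 1)
        c = c + i * g           # additive update: no squashing of c_{t-1}
        grad *= 1.0             # d c_t / d c_{t-1} = 1 under the assumption above
    return c, grad


xs = [0.5] * 100  # hypothetical constant input sequence of 100 steps
rnn_grad = rnn_state_gradient(xs)
cell_state, cec_grad = cec_state_gradient(xs)
print(f"simple RNN  d h_T / d h_0 = {rnn_grad:.3e}")
print(f"additive    d c_T / d c_0 = {cec_grad:.3e}")
```

In the RNN every step multiplies the gradient by a factor of magnitude below 1, so it shrinks geometrically with sequence length; in the additive cell the factor is exactly 1, so the signal from the first step survives unchanged. Later LSTM variants add a forget gate, which replaces that factor of 1 with a learned value between 0 and 1.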