Dzmitry Bahdanau, Kyunghyun Cho, & Yoshua Bengio (2014). Neural Machine Translation by Jointly Learning to Align and Translate.
arXiv.
DOI: https://doi.org/10.48550/arXiv.1409.0473
Abstract. Introduces the attention mechanism for neural machine translation: at each decoding step, the decoder computes a weighted combination of the encoder's hidden states (a learned soft alignment) rather than relying on a single fixed-length context vector. Attention dramatically improved translation quality on long sentences; a minimal sketch of the mechanism follows this entry.
Tags: attention nmt seq2seq
Cited in:
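The additive scoring function behind the annotation above is e_ij = v_a^T tanh(W_a s_{i-1} + U_a h_j), with alignment weights α_ij = softmax_j(e_ij) and context c_i = Σ_j α_ij h_j. The NumPy sketch below illustrates one decoding step under those definitions; the function and parameter names (additive_attention, W_a, U_a, v_a) are illustrative, not taken from any released code.

```python
import numpy as np

def additive_attention(decoder_state, encoder_states, W_a, U_a, v_a):
    """One step of Bahdanau-style additive attention (names illustrative).

    decoder_state:  (d_dec,)    previous decoder hidden state s_{i-1}
    encoder_states: (T, d_enc)  encoder annotations h_1 .. h_T
    W_a: (d_att, d_dec)  U_a: (d_att, d_enc)  v_a: (d_att,)
    Returns the context vector c_i (d_enc,) and alignment weights (T,).
    """
    # e_ij = v_a^T tanh(W_a s_{i-1} + U_a h_j), computed for all j at once
    scores = np.tanh(decoder_state @ W_a.T + encoder_states @ U_a.T) @ v_a
    # Softmax over source positions -> alignment distribution alpha_i
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Context vector: expected annotation under the alignment distribution
    return weights @ encoder_states, weights

# Toy usage with random parameters (dimensions are arbitrary).
rng = np.random.default_rng(0)
T, d_enc, d_dec, d_att = 5, 8, 6, 4
h = rng.normal(size=(T, d_enc))          # encoder annotations
s = rng.normal(size=(d_dec,))            # previous decoder state
c, alpha = additive_attention(
    s, h,
    rng.normal(size=(d_att, d_dec)),
    rng.normal(size=(d_att, d_enc)),
    rng.normal(size=(d_att,)),
)
assert np.isclose(alpha.sum(), 1.0)      # weights form a distribution
```

Because the weights are recomputed from the current decoder state at every step, each target word can attend to a different region of the source sentence, which is what relieves the fixed-length bottleneck on long inputs.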