People

Dzmitry Bahdanau

1989–, Computer scientist

Dzmitry Bahdanau is a Belarusian computer scientist whose 2014 paper with Kyunghyun Cho and Yoshua Bengio, "Neural Machine Translation by Jointly Learning to Align and Translate", introduced the attention mechanism for sequence-to-sequence learning. Where standard seq2seq models compressed the entire source sentence into a single fixed-size vector, Bahdanau attention let the decoder look back at every source word at each output step, with learned weights determining how much each source word contributes to the next output.
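The core of the mechanism can be sketched in a few lines. This is a minimal NumPy illustration of additive (Bahdanau-style) attention, not the paper's full model: the dimensions, parameter names, and random values here are invented for the example. A score is computed between the decoder state and each encoder state, a softmax turns the scores into weights, and the weighted sum of encoder states becomes the context vector for the next output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (illustrative only, not from the paper's experiments)
d_dec, d_enc, d_att = 4, 4, 8
T = 5  # number of source positions

# Learned parameters of the additive scoring function
W_s = rng.normal(size=(d_att, d_dec))  # projects the decoder state
W_h = rng.normal(size=(d_att, d_enc))  # projects the encoder states
v = rng.normal(size=(d_att,))          # combines both into a scalar score

def bahdanau_attention(s, H):
    """s: decoder state, shape (d_dec,); H: encoder states, shape (T, d_enc).
    Returns the context vector and the attention weights."""
    # Additive score e_j = v . tanh(W_s s + W_h h_j) for each source position j
    e = np.tanh(s @ W_s.T + H @ W_h.T) @ v   # shape (T,)
    # Softmax turns scores into weights that sum to 1
    a = np.exp(e - e.max())
    a = a / a.sum()                          # shape (T,)
    # Context vector: weighted sum of the encoder states
    c = a @ H                                # shape (d_enc,)
    return c, a

s = rng.normal(size=(d_dec,))      # current decoder state
H = rng.normal(size=(T, d_enc))    # one encoder state per source word
context, weights = bahdanau_attention(s, H)
```

Each entry of `weights` says how much the corresponding source word contributes at this decoding step; in the paper these weights double as a soft word alignment between source and target.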

The mechanism dramatically improved neural machine translation, especially for long sentences, and was the conceptual ancestor of the self-attention at the heart of the Transformer (Vaswani et al., 2017). The lineage from Bahdanau attention to modern LLMs is direct: every attention head in every modern Transformer is a refinement of Bahdanau's 2014 idea.

Bahdanau completed his PhD at the Université de Montréal under Bengio and has worked at Element AI / ServiceNow Research; he is now at Mila.

Related people: Yoshua Bengio, Ashish Vaswani

Works cited in this book:

Discussed in:

