Peter Shaw, Jakob Uszkoreit, & Ashish Vaswani (2018). Self-Attention with Relative Position Representations.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), 464-468.
DOI: https://doi.org/10.18653/v1/n18-2074
Abstract. Introduces relative position representations for self-attention: rather than adding absolute position embeddings to the inputs, the model modifies the attention logits with learned embeddings that depend on the relative distance between positions.
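A minimal NumPy sketch of the key-side mechanism, under stated assumptions (the function and parameter names `relative_attention`, `rel_k`, and `max_dist` are illustrative, not from the paper): relative distances j - i are clipped to a maximum range, mapped to learned embeddings, and added into the scaled dot-product logits alongside the usual content term.

```python
import numpy as np

def relative_attention(q, k, v, rel_k, max_dist):
    """Single-head attention with relative position terms in the logits.

    q, k, v : (seq_len, d) projected queries, keys, values.
    rel_k   : (2 * max_dist + 1, d) learned relative-position key embeddings.
    Logits follow e_ij = (q_i . k_j + q_i . a_{clip(j-i)}) / sqrt(d).
    """
    seq_len, d = q.shape

    # Clipped relative distance j - i for every (i, j) pair, shifted to index rel_k.
    idx = np.arange(seq_len)
    rel = np.clip(idx[None, :] - idx[:, None], -max_dist, max_dist) + max_dist
    a = rel_k[rel]                                    # (seq_len, seq_len, d)

    # Content-content term plus content-position term, then scale.
    logits = (q @ k.T + np.einsum("id,ijd->ij", q, a)) / np.sqrt(d)

    # Softmax over keys.
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Illustrative usage with random inputs.
rng = np.random.default_rng(0)
seq_len, d, max_dist = 6, 8, 3
q, k, v = (rng.standard_normal((seq_len, d)) for _ in range(3))
rel_k = 0.02 * rng.standard_normal((2 * max_dist + 1, d))
out = relative_attention(q, k, v, rel_k, max_dist)    # (seq_len, d)
```

The paper also applies an analogous relative term on the value side and shares the relative embeddings across attention heads; the sketch above covers only the key-side term for one head.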
Tags: transformer positional-encoding attention