Piotr Bojanowski, Edouard Grave, Armand Joulin, & Tomas Mikolov (2017)
Transactions of the Association for Computational Linguistics, 5, 135-146.
DOI: https://doi.org/10.1162/tacl_a_00051
Abstract. Introduces FastText, which represents each word as a bag of character n-grams and computes the word's vector as the sum of the n-gram embeddings. Because vectors are built from subword units, the model can produce embeddings for out-of-vocabulary words, which is particularly valuable for morphologically rich languages.
Tags: embeddings fasttext subword
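
Illustration. The bag-of-character-n-grams idea from the abstract can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names (char_ngrams, make_table, word_vector) are hypothetical, the embedding table is random and untrained, zlib.crc32 stands in for whatever hash function a real implementation uses, and the table sizes are toy values. The boundary markers < and >, the n-gram range of 3 to 6, and the inclusion of the whole marked word follow the paper's description.

```python
import random
import zlib


def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of a word wrapped in boundary markers."""
    marked = f"<{word}>"
    grams = [
        marked[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(marked) - n + 1)
    ]
    # The whole marked word is also kept as a special sequence.
    grams.append(marked)
    # Drop duplicates (possible for short words) while keeping order.
    return list(dict.fromkeys(grams))


def make_table(buckets=1000, dim=8, seed=0):
    """Build a random, untrained embedding table (toy sizes, assumption)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(buckets)]


def word_vector(word, table):
    """Sum the hashed n-gram embeddings to get a vector for any word."""
    vec = [0.0] * len(table[0])
    for gram in char_ngrams(word):
        # crc32 is a stand-in hash mapping each n-gram to a table row.
        row = table[zlib.crc32(gram.encode("utf-8")) % len(table)]
        vec = [v + r for v, r in zip(vec, row)]
    return vec


table = make_table()
print(char_ngrams("where", 3, 3))
print(len(word_vector("unseenword", table)))
```

Because the vector is assembled from n-grams rather than looked up by word, word_vector returns a result for any string, including one never seen in training. As a simplification, the sketch also hashes the whole-word sequence into the same table instead of giving in-vocabulary words their own dedicated vector.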