Tomas Mikolov

1980–, Computer scientist

Tomáš Mikolov is a Czech computer scientist whose 2013 papers "Efficient Estimation of Word Representations in Vector Space" (with Chen, Corrado and Dean) and "Distributed Representations of Words and Phrases and their Compositionality" (with Sutskever, Chen, Corrado and Dean) introduced word2vec, a family of efficient neural-network-based methods for learning continuous vector representations of words from large unlabelled corpora.

The two main word2vec architectures are CBOW (predict a word from its surrounding context) and skip-gram (predict the surrounding context from a word). Trained with hierarchical softmax or, in the second paper, the cheaper negative sampling objective, they made high-quality word embeddings practical to compute on commodity hardware. The famous analogy result, that king − man + woman ≈ queen in embedding space, demonstrated that the learned representations captured semantic and syntactic regularities, and it brought distributed representations to a wide audience.
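The difference between the two architectures, and the analogy arithmetic, can be illustrated with a minimal Python sketch. It is not the word2vec implementation: no model is trained, the function names (`training_pairs`, `skipgram_examples`, `cbow_examples`, `analogy`) are hypothetical, and the three-dimensional vectors in `TOY` are hand-made so that the arithmetic works out exactly, whereas real embeddings are learned and only approximately satisfy such relations.

```python
import math

def training_pairs(tokens, window=2):
    """Yield (centre_word, context_words) for each position in the text."""
    for i, centre in enumerate(tokens):
        lo = max(0, i - window)
        context = tokens[lo:i] + tokens[i + 1:i + 1 + window]
        yield centre, context

def skipgram_examples(tokens, window=2):
    # Skip-gram: the input is the centre word, the target is each context word.
    return [(c, w) for c, ctx in training_pairs(tokens, window) for w in ctx]

def cbow_examples(tokens, window=2):
    # CBOW: the input is the bag of context words, the target is the centre word.
    return [(tuple(ctx), c) for c, ctx in training_pairs(tokens, window)]

# Hand-made toy vectors (assumption): axes are roughly (royalty, male, female).
TOY = {
    "king":  [1.0, 1.0, 0.0],
    "queen": [1.0, 0.0, 1.0],
    "man":   [0.0, 1.0, 0.0],
    "woman": [0.0, 0.0, 1.0],
    "apple": [0.0, 0.1, 0.1],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(a, b, c, vectors):
    """Return the word closest (by cosine) to vec(a) - vec(b) + vec(c)."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    # The query words themselves are excluded, as is usual in analogy tests.
    candidates = [w for w in vectors if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

sentence = ["the", "king", "rules"]
print(skipgram_examples(sentence, window=1))
print(cbow_examples(sentence, window=1))
print(analogy("king", "man", "woman", TOY))
```

Both architectures read the same sliding window over the text; they differ only in which side of each pair is treated as the input and which as the prediction target.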

Mikolov's PhD work at Brno University of Technology introduced the recurrent neural network language model, which substantially outperformed n-gram models on perplexity and was an important precursor to modern neural language modelling. After working at Google, Microsoft Research and Facebook AI Research, he returned to academia at the Czech Institute of Informatics, Robotics and Cybernetics.

Related people: Jeff Dean, Yoshua Bengio
