1986–, Computer scientist
Ashish Vaswani is an Indian-American computer scientist whose 2017 paper Attention Is All You Need (with Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser and Illia Polosukhin at Google Brain) introduced the Transformer, an architecture using only multi-head self-attention and feed-forward layers, with no recurrence or convolution.
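The core operation of that architecture, scaled dot-product attention, is defined in the paper as Attention(Q, K, V) = softmax(QKᵀ/√d_k)V. A minimal NumPy sketch (illustrative only, not the paper's reference implementation; shapes and the toy data below are assumptions for the example):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # softmax over the key axis (numerically stabilized)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output row is a weighted average of value rows

# Toy example: 3 query positions attending over 4 key/value positions.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 8)
```

Multi-head attention simply runs several such attentions in parallel over learned linear projections of Q, K, and V, then concatenates the results.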
The Transformer was originally developed for machine translation but proved highly general. BERT (2018), GPT (2018), T5 (2019), GPT-2 (2019), GPT-3 (2020) and every large language model since are encoder-only, decoder-only, or encoder-decoder Transformers. Vision Transformers (Dosovitskiy et al., 2020) extended the architecture to images. AlphaFold 2 (2021) used Transformer-style attention for protein structure prediction. The Transformer is now the dominant computational primitive of AI.
Vaswani co-founded Adept AI in 2022 and Essential AI in 2023, both working on agentic and tool-using LLMs. The eight authors of the original Transformer paper have collectively become a defining cohort of post-2017 AI: Shazeer co-founded Character.AI; Parmar co-founded Adept and Essential AI with Vaswani; Uszkoreit co-founded Inceptive; Jones co-founded Sakana AI; Polosukhin co-founded NEAR; Gomez co-founded Cohere; Kaiser is at OpenAI.
Related people: Noam Shazeer, Geoffrey Hinton
Works cited in this book:
- Attention Is All You Need (2017) (with Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin)
- Self-Attention with Relative Position Representations (2018) (with Peter Shaw, Jakob Uszkoreit)
Discussed in:
- Chapter 13: Attention and Transformers