Jacob Devlin, Ming-Wei Chang, Kenton Lee, & Kristina Toutanova (2019)
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Proceedings of NAACL-HLT 2019, Volume 1 (Long and Short Papers), 4171-4186.
DOI: https://doi.org/10.18653/v1/N19-1423
Abstract. Introduces BERT, a bidirectional Transformer encoder pre-trained with masked language modelling and next-sentence prediction. Fine-tuning the pre-trained model with just one additional output layer achieved state-of-the-art results on eleven NLP tasks, including GLUE, MultiNLI, and SQuAD, and established the pretrain-then-fine-tune recipe as the dominant paradigm in natural language processing.
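The masked-LM corruption step mentioned above can be sketched in a few lines. This is a minimal illustration of the paper's 80/10/10 masking rule, not the authors' implementation; the `MASK_ID` and `VOCAB_SIZE` values and the `-100` ignore-label convention are assumptions chosen to match common BERT tooling.

```python
import random

MASK_ID = 103       # assumed [MASK] id (matches bert-base-uncased vocab)
VOCAB_SIZE = 30522  # assumed WordPiece vocabulary size for BERT-base

def mask_tokens(token_ids, mask_prob=0.15, seed=None):
    """Return (corrupted_ids, labels) for the masked-LM objective.

    15% of positions are selected for prediction; of those, 80% are
    replaced with [MASK], 10% with a random token, and 10% are left
    unchanged. Unselected positions get label -100, a common convention
    for "ignore this position in the loss".
    """
    rng = random.Random(seed)
    corrupted = list(token_ids)
    labels = [-100] * len(token_ids)
    for i, tok in enumerate(token_ids):
        if rng.random() >= mask_prob:
            continue                  # position not selected for prediction
        labels[i] = tok               # model must recover the original token
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = MASK_ID                     # 80%: [MASK]
        elif roll < 0.9:
            corrupted[i] = rng.randrange(VOCAB_SIZE)   # 10%: random token
        # else: 10% keep the original token unchanged
    return corrupted, labels
```

For example, `mask_tokens([7592, 1010, 2088])` might return `([103, 1010, 2088], [7592, -100, -100])`, i.e. the first token was masked and must be predicted while the rest are ignored by the loss.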
Tags: transformer bert pretraining nlp
Cited in: