Nils Reimers & Iryna Gurevych (2019)
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP-IJCNLP).
URL: https://arxiv.org/abs/1908.10084
Abstract. Introduces Sentence-BERT (SBERT), the standard architecture for fixed-length sentence embeddings. Plain BERT produces token-level contextual embeddings; pooling them naively yields poor sentence similarity. SBERT fine-tunes BERT in a Siamese (twin-network) architecture on natural-language-inference pairs, using classification, regression, and triplet objectives, so that semantically similar sentences map to embeddings with high cosine similarity. SBERT cuts the cost of finding similar sentences from $\mathcal{O}(n^2)$ pairwise BERT forward passes to $\mathcal{O}(n)$ embedding computations plus cheap dot-product search. Its descendants (MPNet-based Sentence-Transformers models, E5, BGE, GTE) power the modern retrieval-augmented-generation stack.
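A minimal sketch of the embed-once, compare-cheaply workflow described above, using the authors' `sentence-transformers` library; the checkpoint name `all-MiniLM-L6-v2` is an illustrative choice, not from the paper itself.

```python
# SBERT-style similarity sketch: one forward pass per sentence (O(n) embeddings),
# then pairwise cosine similarity via a cheap matrix product instead of O(n^2) BERT calls.
from sentence_transformers import SentenceTransformer, util

# Assumed checkpoint for illustration; any Sentence-Transformers model works here.
model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "A man is playing a guitar.",
    "Someone is strumming an instrument.",
    "The stock market fell sharply today.",
]

# Encode each sentence once into a fixed-length vector.
embeddings = model.encode(sentences, convert_to_tensor=True, normalize_embeddings=True)

# Pairwise cosine similarities; semantically close sentences score high.
similarity = util.cos_sim(embeddings, embeddings)
print(similarity)
```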
Tags: language-models retrieval embeddings
Cited in: