Ekin Akyürek, Dale Schuurmans, Jacob Andreas, Tengyu Ma, & Denny Zhou (2023)

International Conference on Learning Representations.

URL: https://arxiv.org/abs/2211.15661

Abstract. Investigates the mechanism of in-context learning by training Transformers on a synthetic linear-regression task and comparing their behaviour with classical learning algorithms. Constructs explicit weight assignments with which a Transformer implements gradient descent and closed-form ridge regression in a forward pass, and shows that the trained Transformers' predictions closely match those of gradient descent, ridge regression, and exact least squares, with the best-matching predictor depending on model depth and dataset noise. The paper is a key data point for the "in-context learning is meta-learned optimisation" hypothesis.
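To make the comparison in the abstract concrete, here is a minimal sketch of two of the classical reference predictors it names, least squares and a single gradient-descent step, evaluated on one synthetic linear-regression prompt. It is not the paper's code: the function names, the dimensions, the noise level, and the learning rate are all illustrative assumptions, and no Transformer is trained here.

```python
import numpy as np


def make_prompt(n_examples, dim, noise_std, rng):
    """Sample one synthetic task y = w.x + noise (hypothetical helper)."""
    w_true = rng.normal(size=dim)
    X = rng.normal(size=(n_examples, dim))
    y = X @ w_true + noise_std * rng.normal(size=n_examples)
    x_query = rng.normal(size=dim)
    return X, y, x_query, w_true


def ols_predict(X, y, x_query):
    """Predict with the (minimum-norm) least-squares weight vector."""
    w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(x_query @ w_hat)


def one_step_gd_predict(X, y, x_query, lr):
    """Predict after one gradient step on (1/2n)||Xw - y||^2 from w = 0.

    The gradient at w = 0 is -(1/n) X^T y, so w_1 = (lr / n) X^T y.
    """
    n = X.shape[0]
    w1 = (lr / n) * (X.T @ y)
    return float(x_query @ w1)


# Assumed settings: 32 in-context examples, 4 features, no label noise.
rng = np.random.default_rng(0)
X, y, x_query, w_true = make_prompt(n_examples=32, dim=4, noise_std=0.0, rng=rng)

print("target      :", float(x_query @ w_true))
print("OLS         :", ols_predict(X, y, x_query))
print("one GD step :", one_step_gd_predict(X, y, x_query, lr=1.0))
```

With noiseless labels and more examples than features, the least-squares prediction recovers the target up to floating-point error, while the single gradient step generally does not. A study of the kind the abstract describes would compare a trained model's output on the same prompt against reference predictors like these.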

Tags: language-models in-context-learning transformers

AI tools used: Claude (research, coding, text), ChatGPT (diagrams, images), Grammarly (editing).

What learning algorithm is in-context learning? Investigations with linear models