References

What learning algorithm is in-context learning? Investigations with linear models

Ekin Akyürek, Dale Schuurmans, Jacob Andreas, Tengyu Ma, & Denny Zhou (2023)

International Conference on Learning Representations.

URL: https://arxiv.org/abs/2211.15661

Abstract. Investigates the mechanism of in-context learning by training Transformers on a synthetic linear-regression task and comparing their behaviour with classical learning algorithms. Shows that the trained Transformers' predictions closely match those of gradient descent, ridge regression, and exact least-squares regression, with the best-matching predictor shifting as model depth and dataset noise vary. Also proves by construction that Transformers can implement learners for linear models based on gradient descent and closed-form ridge regression. The paper is a key data point for the hypothesis that in-context learning is a form of meta-learned optimisation.
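To make the comparison concrete, the following is a minimal sketch of two classical predictors of the kind the abstract refers to: exact least squares and a single gradient-descent step, both fitted to one synthetic linear-regression prompt. It does not train a Transformer and is not the paper's code. The function names, the sizes (d = 4, n = 16), the zero initialisation, and the learning rate are illustrative assumptions, and pure Python is used only to keep the example dependency-free.

```python
import random

def make_prompt(d, n, rng):
    """Sample a hidden weight vector w and n noiseless pairs with y = w . x."""
    w = [rng.gauss(0, 1) for _ in range(d)]
    xs = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
    ys = [sum(wi * xi for wi, xi in zip(w, x)) for x in xs]
    return w, xs, ys

def solve(a, b):
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(xs, ys):
    """Least-squares weights from the normal equations (X^T X) w = X^T y."""
    d = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(d)] for i in range(d)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(d)]
    return solve(xtx, xty)

def one_step_gd(xs, ys, lr):
    """One gradient step on (1/2n) * sum (w . x - y)^2, starting from w = 0."""
    d, n = len(xs[0]), len(xs)
    # At w = 0 the gradient is -(1/n) X^T y, so the update is +(lr/n) X^T y.
    return [lr * sum(x[i] * y for x, y in zip(xs, ys)) / n for i in range(d)]

def predict(w, x):
    """Linear prediction w . x."""
    return sum(wi * xi for wi, xi in zip(w, x))

rng = random.Random(0)  # fixed seed so runs are repeatable
w_true, xs, ys = make_prompt(d=4, n=16, rng=rng)
w_ols = ols(xs, ys)
w_gd = one_step_gd(xs, ys, lr=1.0)

x_query = [rng.gauss(0, 1) for _ in range(4)]
print("target      ", predict(w_true, x_query))
print("OLS         ", predict(w_ols, x_query))
print("one-step GD ", predict(w_gd, x_query))
```

With noiseless data and more examples than dimensions, the least-squares predictor recovers the hidden weights up to floating-point error, while a single gradient step generally gives only a rough estimate. A gap of this kind is what makes the two predictors distinguishable when they are compared against a trained model's outputs.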

Tags: language-models in-context-learning transformers
