Abstract. Investigates the mechanism of in-context learning by training Transformers on a synthetic linear-regression task and comparing their behaviour with that of classical learning algorithms. Shows that the trained Transformer's predictions closely match those of ordinary least squares and of a single step of gradient descent, and constructs explicit weight assignments under which a Transformer implements these algorithms in one forward pass. The paper is a key data point for the "in-context learning is meta-learned optimisation" hypothesis.
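
The two classical baselines are easy to reproduce numerically. Below is a minimal sketch of the synthetic task and of the predictions the Transformer is compared against; the input dimension, context length, and gradient-descent step size are illustrative assumptions (the paper sweeps such settings), and the Transformer itself is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 8, 16  # input dimension and context length (assumed, for illustration)

# One in-context regression task: y_i = <w, x_i> with a fresh w per task.
w = rng.normal(size=d)
X = rng.normal(size=(n, d))   # context inputs
y = X @ w                     # noiseless context targets
x_q = rng.normal(size=d)      # query input to predict on

# Baseline 1: ordinary least squares fit to the context examples.
w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
pred_ols = x_q @ w_ols

# Baseline 2: one step of gradient descent on the squared loss, from w0 = 0.
lr = 0.05                     # assumed step size
w0 = np.zeros(d)
grad = X.T @ (X @ w0 - y) / n # gradient of (1/2n) * ||Xw - y||^2
w_gd = w0 - lr * grad
pred_gd = x_q @ w_gd

print(f"true: {x_q @ w:+.3f}  OLS: {pred_ols:+.3f}  1-step GD: {pred_gd:+.3f}")
```

With noiseless targets and n > d, OLS recovers w exactly, while one-step GD gives a cruder estimate; the paper's comparison asks which of these prediction rules the trained Transformer's output tracks across such regimes.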