Glossary

In-Context Learning

In-Context Learning (ICL) is the remarkable ability of large language models to perform new tasks based solely on examples provided in the prompt, without any updates to the model's weights. First documented prominently in GPT-3, ICL allows a practitioner to specify a task simply by showing the model a few input-output examples followed by a new input, and the model produces the appropriate output—essentially "learning" from the prompt's context.

ICL has several forms: zero-shot (the task is described but no examples given), one-shot (a single example), and few-shot (several examples). The few-shot setting is particularly powerful: with just a handful of demonstrations, LLMs can perform classification, translation, arithmetic, and many other tasks at surprising levels of competence. Chain-of-thought prompting—adding intermediate reasoning steps to the examples—dramatically improves performance on multi-step reasoning tasks.
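The zero-shot, one-shot, and few-shot settings differ only in how the prompt is assembled. The sketch below shows one common way to format such a prompt; the "Input:"/"Output:" template, the sentiment task, and the demonstration pairs are illustrative choices, not a standard.

```python
# Illustrative sketch of few-shot prompt construction.
# The template and examples are invented for demonstration.

def build_few_shot_prompt(examples, query, task_description=None):
    """Assemble a prompt from (input, output) demonstration pairs.

    With an empty `examples` list and a task_description, this is a
    zero-shot prompt; with one pair, one-shot; with several, few-shot.
    """
    parts = []
    if task_description:
        parts.append(task_description)
    for inp, out in examples:
        parts.append(f"Input: {inp}\nOutput: {out}")
    # The model is expected to complete the final "Output:" line.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)

demos = [
    ("The movie was fantastic!", "positive"),
    ("Terrible service, never again.", "negative"),
]
prompt = build_few_shot_prompt(
    demos,
    "I loved every minute of it.",
    task_description="Classify the sentiment of each review.",
)
print(prompt)
```

Chain-of-thought prompting fits the same template: the demonstration outputs simply include the intermediate reasoning steps before the final answer.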

The mechanisms underlying ICL remain poorly understood theoretically. The model's weights do not change, yet its behaviour adapts as if it had been trained on the prompt examples. Several hypotheses have been proposed: the model implements implicit gradient descent on the prompt; it performs a form of Bayesian inference treating the examples as evidence; or attention heads implement pattern-matching "induction" circuits. Whatever the mechanism, ICL has transformed how LLMs are deployed: rather than fine-tuning a model for each task, practitioners can often just write a good prompt. This has democratised access to powerful NLP and enabled rapid prototyping across an enormous range of applications.
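The induction-circuit hypothesis can be caricatured as a literal copy rule: if the current token appeared earlier in the context, predict whatever followed it last time. Real induction heads are learned attention circuits operating on continuous representations; this toy lookup is only a conceptual sketch of the pattern they are thought to implement.

```python
# Toy sketch of the "induction" pattern-matching hypothesis:
# on seeing token A, find A's most recent earlier occurrence in the
# context and predict the token that followed it then.

def induction_predict(tokens):
    """Predict the next token by copying what followed the last token
    the previous time it occurred in the context."""
    if not tokens:
        return None
    current = tokens[-1]
    # Scan backwards through the earlier context for a prior occurrence.
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == current:
            return tokens[i + 1]  # copy its old successor
    return None  # no earlier match: nothing to copy

# Context "A B C A": the earlier A was followed by B, so the rule predicts B.
print(induction_predict(["A", "B", "C", "A"]))
```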

Related terms: Large Language Model, Chain-of-Thought

Discussed in:

Also defined in: Textbook of AI