
Few-shot examples teach the model in the prompt

Last reviewed 5 May 2026

A handful of input-output pairs in the prompt steer a frozen model to a new task.

From the chapter: Chapter 15: Modern AI

Glossary: in-context learning, few-shot learning

Transcript

A pre-trained language model. Frozen weights. We never gradient-update again.

How does it learn a new task? By being shown examples in the prompt.

Task: classify movie reviews as positive or negative.

Zero-shot prompt: "Review: this film is a masterpiece. Sentiment:" The model guesses, sometimes right, sometimes not.

One-shot prompt: prepend one labelled example. "Review: I hated every minute. Sentiment: negative. Review: this film is a masterpiece. Sentiment:" Now the model is more reliable.

Few-shot prompt: prepend three or five labelled examples. The accuracy keeps climbing.
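The zero-, one-, and few-shot prompts above differ only in how many labelled pairs are prepended. A minimal sketch of that prompt construction (the extra labelled reviews beyond the transcript's are illustrative):

```python
# Sketch: building zero-, one-, and few-shot prompts for sentiment
# classification. Only the prompt string changes; the model is untouched.

def build_prompt(examples, query):
    """Prepend labelled (review, sentiment) pairs, then the test review."""
    lines = [f"Review: {review} Sentiment: {sentiment}"
             for review, sentiment in examples]
    lines.append(f"Review: {query} Sentiment:")
    return "\n".join(lines)

labelled = [
    ("I hated every minute.", "negative"),
    ("Charming and beautifully shot.", "positive"),   # illustrative
    ("A tedious, overlong mess.", "negative"),        # illustrative
]
query = "this film is a masterpiece."

zero_shot = build_prompt([], query)            # no labelled examples
one_shot  = build_prompt(labelled[:1], query)  # one labelled example
few_shot  = build_prompt(labelled, query)      # several labelled examples
```

Feeding `few_shot` to the model gives it the same query, but now preceded by the labelled demonstrations it can condition on.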

The model has not been retrained. Its weights are unchanged. The labelled examples sit in its context window, and somehow the forward pass uses them to set the right output for the test query.

What is happening inside? Researchers have shown that attention can implement a form of gradient descent on a small linear regression problem hidden in the prompt. The transformer is acting as a learning algorithm, taking gradient steps inside the activations.
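That equivalence can be checked numerically on a toy case. The sketch below (a hypothetical two-dimensional linear regression, with illustrative numbers) computes a prediction two ways: by taking one explicit gradient step on the in-context examples, and by a linear-attention-style readout that never updates any weight. The two agree exactly.

```python
# Toy check: one gradient-descent step on in-context (x, y) pairs equals
# a linear-attention readout over those pairs. Illustrative numbers.

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

# In-context examples (x_i, y_i) hidden in the "prompt", plus a test query.
examples = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
x_test = [2.0, 3.0]

w0 = [0.0, 0.0]  # initial weights
lr = 0.1         # learning rate

# View 1: explicit gradient step on squared error, then predict.
grad = [0.0, 0.0]
for x, y in examples:
    err = dot(w0, x) - y
    for j in range(2):
        grad[j] += err * x[j]
w1 = [w0[j] - lr * grad[j] for j in range(2)]
pred_gd = dot(w1, x_test)

# View 2: linear attention over the prompt. Keys are the x_i, values are
# scaled residuals, the query is x_test. No weight is ever updated.
pred_attn = dot(w0, x_test) - lr * sum(
    (dot(w0, x) - y) * dot(x, x_test) for x, y in examples
)
```

Expanding `w1 · x_test` shows why: the gradient-step prediction is `w0 · x_test` minus a sum of residuals weighted by inner products `x_i · x_test`, and that sum is exactly what the attention readout computes from the context.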

In-context learning was one of the most surprising properties of GPT-3. It made foundation models suddenly useful for tasks no one had imagined at training time. Prompt engineering, retrieval-augmented generation, and tool use are all built on the same trick.

This site is currently in Beta. Contact: Chris Paton



AI tools used: Claude (research, coding, text), ChatGPT (diagrams, images), Grammarly (editing).