A foundation model (term coined in Bommasani et al. 2021, On the Opportunities and Risks of Foundation Models, Stanford CRFM) is a large model pre-trained on broad data at scale, adaptable to a wide range of downstream tasks. The defining example is the modern large language model (GPT, Claude, Gemini, Llama). The term emphasises a paradigm shift: rather than training task-specific models from scratch, train one general-purpose model and adapt it.
Properties:
- Scale: typically billions to trillions of parameters, trained on trillions of tokens.
- Generality: handles many downstream tasks via fine-tuning, prompting, or in-context learning.
- Emergence: capabilities not present in smaller models appear at scale (chain-of-thought reasoning, in-context learning, instruction following).
- Homogenisation: many applications now share the same underlying model, with adaptation rather than redesign.
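The adaptation-without-retraining idea can be made concrete with few-shot prompting: the model's weights stay fixed and the "adaptation" is just a prompt containing labelled examples. A minimal sketch; `build_few_shot_prompt` is an illustrative helper, not any particular API, and the model call itself is omitted.

```python
# Sketch of in-context learning via few-shot prompting: a general-purpose
# model is steered toward a sentiment task purely by prompt construction,
# with no weight updates. Sending the prompt to a model is left out.

def build_few_shot_prompt(examples, query):
    """Assemble labelled examples and a new query into one prompt string."""
    blocks = [f"Review: {text}\nSentiment: {label}" for text, label in examples]
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(blocks)

examples = [
    ("Great acting and a gripping plot.", "positive"),
    ("Dull, predictable, and far too long.", "negative"),
]
prompt = build_few_shot_prompt(examples, "A warm, funny, memorable film.")
print(prompt)
```

The same pattern underlies prompting-based adaptation generally: the downstream task is specified in the input rather than in the parameters.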
Risks highlighted by the foundation-models report:
- Single point of failure: errors and biases in the foundation model propagate to all downstream uses.
- Concentration of power: only well-resourced organisations can train them.
- Opacity: hard to audit a 70B-parameter model's reasoning.
- Misuse: the same general capability can be applied to harmful tasks.
Multimodal foundation models: CLIP, GPT-4V, Claude, Gemini. These combine text with image, audio, and/or video modalities, and are increasingly the standard rather than the exception.
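CLIP illustrates the shared-embedding-space idea behind many multimodal models: a text encoder and an image encoder map inputs into one vector space, and cosine similarity ranks cross-modal matches. A toy sketch with hand-made vectors standing in for real encoder outputs:

```python
# Toy sketch of CLIP-style text-to-image retrieval. The embeddings below
# are invented toy values, not real encoder outputs; in CLIP they would
# come from a trained text encoder and image encoder.
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Pretend outputs of the shared embedding space (toy values).
image_embeddings = {
    "photo_of_dog.jpg": [0.9, 0.1, 0.0],
    "photo_of_car.jpg": [0.1, 0.9, 0.2],
}
text_embedding = [0.85, 0.15, 0.05]  # toy embedding for the caption "a dog"

best = max(image_embeddings, key=lambda k: cosine(text_embedding, image_embeddings[k]))
print(best)  # -> photo_of_dog.jpg
```

In the real model the two encoders are trained contrastively so that matching text-image pairs score high and mismatched pairs score low.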
Vertical foundation models: Med-PaLM (medicine), ESM-2 (proteins), GraphCast (weather), GNoME (materials). Domain-specific foundation models trained on domain-specific data.
Related terms: Language Model, GPT, Claude, CLIP, In-Context Learning
Discussed in:
- Chapter 15: Modern AI