Multi-agent orchestration is the architectural pattern in which $N \ge 2$ LLM agents collaborate. Each agent has its own context window, system prompt, tool set, and (often) model. Coordination is achieved through messages, shared state, or a supervising agent.
Why multiple agents
- Specialisation: a "Researcher" agent with web-search tools, a "Coder" agent with a Python interpreter, a "Critic" agent with no tools but a strong system prompt.
- Context segmentation: long tasks overflow a single context window; sub-agents work in their own windows and report back summaries.
- Parallelism: independent sub-tasks run concurrently.
- Adversarial dynamics: debate between agents (Du et al., 2023) improves factuality.
- Role-play emergent behaviour: Park et al.'s Generative Agents (2023) demonstrated believable simulated societies.
Topologies
| Topology | Description | Frameworks |
|---|---|---|
| Pipeline | Linear chain: A → B → C | LangChain Sequential |
| Supervisor | One leader spawns workers and gathers their results | OpenAI Swarm, CrewAI, Claude Code's sub-agents |
| Group chat | All agents see a shared transcript; a router picks the next speaker | AutoGen GroupChat |
| Hierarchical | Tree of supervisors and workers | MetaGPT, AutoGen |
| Debate | Two or more agents argue, judge picks winner | Du et al. 2023 |
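The simplest topology above, the pipeline, can be sketched in a few lines. Here `call_llm` is a hypothetical stand-in for a real model call, not any framework's API:

```python
# Minimal pipeline topology: each agent transforms the previous agent's output.
# `call_llm` is a hypothetical stub standing in for a real LLM call.
def call_llm(role: str, text: str) -> str:
    return f"[{role}] {text}"

def pipeline(task: str, roles: list[str]) -> str:
    out = task
    for role in roles:  # A -> B -> C: strictly linear hand-off
        out = call_llm(role, out)
    return out

result = pipeline("summarise the report", ["researcher", "writer", "critic"])
```

Note that each agent sees only its predecessor's output, not the full transcript — the defining constraint of the pipeline topology.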
Pseudocode (supervisor pattern)
```python
def supervisor(task):
    plan = llm("plan", task)               # decompose the task into steps
    results = []
    for step in plan:
        agent = pick_specialist(step)      # route each step to a worker agent
        results.append(agent.run(step))
    return llm("synthesize", task, results)  # merge worker outputs into one answer
```
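A self-contained version of the same pattern, with the planner, synthesiser, and specialists stubbed out — `llm`, `Agent`, and the routing table are illustrative, not a real API:

```python
# Runnable sketch of the supervisor pattern; all names are hypothetical stubs.
class Agent:
    def __init__(self, name):
        self.name = name

    def run(self, step):
        return f"{self.name} did: {step}"

SPECIALISTS = {"search": Agent("researcher"), "code": Agent("coder")}

def llm(mode, *args):
    if mode == "plan":  # stub planner: decompose the task into (kind, step) pairs
        return [("search", "find sources"), ("code", "write script")]
    return " | ".join(args[1])  # stub synthesiser: concatenate worker results

def supervisor(task):
    results = [SPECIALISTS[kind].run(step) for kind, step in llm("plan", task)]
    return llm("synthesize", task, results)
```

In a real system the planner and synthesiser would each be model calls, and the routing rule would likely be model-chosen rather than a fixed table.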
Frameworks
- AutoGen (Microsoft, 2023): group chat + tool use.
- CrewAI: role-based, production-focused.
- MetaGPT: software-team simulation, SOP-driven.
- OpenAI Swarm (2024): minimal handoff-based orchestration.
- LangGraph (LangChain, 2024): stateful directed graph of agents.
When not to use multi-agent
An empirical lesson from 2024–2025 production deployments: multi-agent setups often hurt. Anthropic's 2024 post "Building effective agents" and Cognition's "Don't Build Multi-Agents" both argue that:
- Communication overhead dominates.
- Errors compound across agents.
- A single capable model with good memory management and tool use usually outperforms a pipeline of specialised agents.
The remaining use cases are true parallelism (e.g. searching 100 documents simultaneously) and strict role separation for safety.
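The true-parallelism case can be sketched with a thread pool fanning out over documents. The `summarise` function here is a hypothetical stand-in for an LLM-backed sub-agent call:

```python
from concurrent.futures import ThreadPoolExecutor

# Fan-out: independent sub-agents each summarise one document in parallel,
# keeping each document out of the others' context windows.
def summarise(doc: str) -> str:
    return doc[:10]  # stub: a real sub-agent would call a model here

def parallel_search(docs: list[str]) -> list[str]:
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(summarise, docs))  # results come back in input order
```

This works precisely because the sub-tasks share no state; the moment sub-agents must coordinate mid-task, the communication-overhead problem above returns.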
Related terms: AutoGen, CrewAI, MetaGPT, Tool Use, Memory and Context Management
Discussed in:
- Chapter 15: Modern AI