Glossary

Multi-Agent Orchestration

Multi-agent orchestration is the architectural pattern in which $N \ge 2$ LLM agents collaborate. Each agent has its own context window, system prompt, tool set, and (often) model. Coordination is achieved through messages, shared state, or a supervising agent.

Why multiple agents

  1. Specialisation: a "Researcher" agent with web-search tools, a "Coder" agent with Python, a "Critic" agent with no tools but a strong system prompt.
  2. Context segmentation: long tasks blow up a single context window; sub-agents work in their own windows and report back summaries.
  3. Parallelism: independent sub-tasks run concurrently.
  4. Adversarial dynamics: debate between agents improves factuality (Du et al. 2023).
  5. Emergent role-play behaviour: Park et al.'s Generative Agents (2023) showed believable simulated societies.

Topologies

Topology     | Description                                                       | Frameworks
Pipeline     | Linear chain: A → B → C                                           | LangChain Sequential
Supervisor   | One leader spawns workers and gathers their results               | OpenAI Swarm, CrewAI, Claude Code's sub-agents
Group chat   | All agents see a shared transcript; a router picks the next speaker | AutoGen GroupChat
Hierarchical | Tree of supervisors and workers                                   | MetaGPT, AutoGen
Debate       | Two or more agents argue; a judge picks the winner                | Du et al. 2023
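The pipeline topology is the simplest to sketch in code. The following is a minimal illustration, not any framework's API: `call_llm` is a hypothetical stand-in for a chat-completion call, and each "agent" is reduced to its system prompt.

```python
def call_llm(system_prompt, message):
    # Stubbed for illustration; a real version would call an LLM API
    # with this system prompt and user message.
    return f"[{system_prompt}] processed: {message}"

def pipeline(task, agents):
    """Pass the task through a linear chain A -> B -> C; each agent
    receives the previous agent's full output as its input."""
    message = task
    for system_prompt in agents:
        message = call_llm(system_prompt, message)
    return message

result = pipeline("summarise Q3 sales",
                  ["Researcher", "Coder", "Critic"])
```

Note the characteristic weakness visible even in the stub: every hop wraps (and can distort) the previous output, so errors accumulate along the chain.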

Pseudocode (supervisor pattern)

def supervisor(task):
    plan = llm("plan", task)             # 1. decompose the task into steps
    results = []
    for step in plan:
        agent = pick_specialist(step)    # 2. route each step to a worker
        results.append(agent.run(step))
    return llm("synthesize", task, results)  # 3. merge the step results

Frameworks

  • AutoGen (Microsoft, 2023): group chat plus tool use.
  • CrewAI: role-based, production-focused.
  • MetaGPT: software-team simulation, SOP-driven.
  • OpenAI Swarm (2024): minimal handoff-based orchestration.
  • LangGraph (LangChain, 2024): agents as nodes in a stateful directed graph.

When not to use multi-agent

The empirical lesson from 2024–2025 production deployments is that multi-agent setups often hurt. Anthropic's 2024 post "Building effective agents" and Cognition's "Don't Build Multi-Agents" both argue that:

  • Communication overhead dominates.
  • Errors compound across agents.
  • A single capable model with good memory management and tool use usually outperforms a pipeline of specialised agents.

The remaining use cases are true parallelism (e.g. searching 100 documents simultaneously) and strict role separation for safety.
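The true-parallelism case can be sketched with a thread pool fanning one query out across independent documents. `search_one` is a hypothetical per-document sub-agent call, not a real API; since LLM calls are I/O-bound, threads parallelise them well.

```python
from concurrent.futures import ThreadPoolExecutor

def search_one(doc_id, query):
    # Hypothetical per-document sub-agent; a real one would run an LLM
    # over the document's text and return structured findings.
    return {"doc": doc_id, "hits": query in f"document {doc_id}"}

def parallel_search(query, doc_ids, max_workers=16):
    """Fan the same query out across independent documents and gather
    per-document results -- the topology where multi-agent pays off,
    because the sub-tasks never need to talk to each other."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda d: search_one(d, query), doc_ids))

results = parallel_search("document 7", range(100))
```

The key property is that the fan-out workers share no state, so none of the communication-overhead or error-compounding problems listed above apply.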

Related terms: AutoGen, CrewAI, MetaGPT, Tool Use, Memory and Context Management
