Graph of Thoughts (GoT) (Besta et al., ETH Zürich, 2023) generalises Tree of Thoughts by allowing thoughts to be combined, refined, and looped rather than only branched. The reasoning structure is a directed acyclic graph (DAG); thoughts are vertices, and edges represent dependencies between thoughts.
Why a graph?
Tree of Thoughts cannot naturally express:
- Aggregation: merging insights from two sibling branches.
- Refinement: improving a single thought via self-reflection.
- Reuse: referencing a thought in multiple downstream branches.
GoT models all three by giving thoughts arbitrary in-/out-degree.
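The distinction can be made concrete with a small sketch (my own illustration, not code from the paper): thoughts are vertices whose in- and out-degree is unrestricted, so aggregation (in-degree 2), refinement, and reuse (out-degree 2) all fall out of the data structure.

```python
# Hypothetical sketch: a thought-DAG where vertices may have any
# number of parents (aggregation) or children (reuse), unlike a tree.
from dataclasses import dataclass, field

@dataclass
class Thought:
    text: str
    parents: list = field(default_factory=list)  # in-edges: dependencies

root = Thought("problem statement")
a = Thought("branch A", parents=[root])
b = Thought("branch B", parents=[root])

# Aggregation: in-degree 2, impossible in a tree
merged = Thought("A and B combined", parents=[a, b])

# Refinement: an improved version of an existing thought
refined = Thought("improved merge", parents=[merged])

# Reuse: 'a' also feeds a second downstream thought (out-degree 2)
c = Thought("another use of A", parents=[a])
```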
Operations
Besta et al. define five graph operations:
| Operation | Effect |
|---|---|
| Generate | Create $k$ child thoughts |
| Refine | Improve a thought in-place |
| Aggregate | Merge $n$ thoughts into one |
| Score | Evaluate a thought |
| KeepBest | Prune to top-$k$ |
A Graph of Operations (GoO) is a static schedule of these operations that the developer designs for the task; the LLM then executes each operation step.
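A GoO can be pictured as an ordered composition of the five operations over a frontier of thoughts. The sketch below is an assumption-laden illustration: each operation is a deterministic stand-in for what would be an LLM call, and the scoring heuristic is a placeholder.

```python
# Hypothetical GoO sketch: each function stands in for an LLM-backed
# operation; a real system would prompt the model at every step.

def generate(thoughts, k):
    # Generate: create k child thoughts per input thought
    return [f"{t}/child{i}" for t in thoughts for i in range(k)]

def refine(thoughts):
    # Refine: improve each thought in place
    return [f"{t}(refined)" for t in thoughts]

def aggregate(thoughts):
    # Aggregate: merge n thoughts into one
    return [" + ".join(thoughts)]

def score(t):
    # Score: placeholder heuristic (the paper uses LLM or exact scoring)
    return len(t)

def keep_best(thoughts, k):
    # KeepBest: prune the frontier to the top-k by score
    return sorted(thoughts, key=score, reverse=True)[:k]

# The GoO itself is just the developer-designed order of operations:
frontier = ["root"]
frontier = generate(frontier, k=2)   # branch
frontier = refine(frontier)          # improve each branch
frontier = aggregate(frontier)       # merge insights
frontier = keep_best(frontier, k=1)  # prune
```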
Example: sorting
For sorting a list of 64 numbers (which GPT-4 cannot do reliably), Besta et al. structure the GoT as:
- Split the list into 4 chunks (Generate).
- Sort each chunk (Generate, parallel).
- Merge pairs of sorted chunks (Aggregate, twice).
- Score the result (Score).
- Refine if score is low (Refine).
Result: 70% accuracy on 64-number sorting vs 24% for CoT and 28% for ToT.
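The schedule above can be mimicked end to end with deterministic stand-ins for each LLM call (hypothetical helpers; the actual system prompts GPT-4 at every operation, which is what makes Score and Refine necessary):

```python
# Sketch of the GoT sorting schedule with deterministic stand-ins
# for the LLM-backed operations.
import random

def split(xs, n_chunks):
    # Generate: split the list into n_chunks pieces
    size = len(xs) // n_chunks
    return [xs[i * size:(i + 1) * size] for i in range(n_chunks)]

def sort_chunk(chunk):
    # Generate: stands in for a "sort this chunk" prompt
    return sorted(chunk)

def merge(a, b):
    # Aggregate: merge two sorted lists into one
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return out + a[i:] + b[j:]

def score(xs):
    # Score: count adjacent inversions (0 means fully sorted)
    return sum(xs[i] > xs[i + 1] for i in range(len(xs) - 1))

nums = random.sample(range(1000), 64)
chunks = [sort_chunk(c) for c in split(nums, 4)]      # sort 4 chunks in parallel
halves = [merge(chunks[0], chunks[1]),                # Aggregate, round 1
          merge(chunks[2], chunks[3])]
result = merge(halves[0], halves[1])                  # Aggregate, round 2
# score(result) == 0 here, so no Refine pass is triggered
```

With an LLM doing the chunk sorts, Score would catch mistakes and route the faulty thought through Refine; the deterministic version never needs it.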
Other tasks
- Set intersection (62% reduction in cost vs ToT at same quality).
- Keyword counting in long documents.
- Document merging (multi-document summarisation).
Trade-offs
Pro:
- Strictly more general than ToT.
- Aggregation is the killer feature for tasks involving combination (merging, intersection, comparison).
Con:
- Complexity: designing the GoO requires task-specific engineering.
- Cost: ToT-level expense, multiplied across every operation in the schedule.
- Less mainstream adoption than ToT.
Modern relevance
GoT is rarely used in production for the same reason ToT is not: it is too expensive. But its insight that reasoning has graph structure informs modern thinking. Frameworks like LangGraph make a graph of LLM calls a first-class abstraction, and AlphaProof's lemma-graph search is a GoT-style structure trained into the model.
Relationship
- Generalises Tree of Thoughts (a tree is the special case of a DAG in which every thought has exactly one parent).
- Conceptually related to self-reflection (the Refine operation).
- Influenced LangGraph and other graph-based agent frameworks (multi-agent orchestration).
Citation
Besta, M. et al. (2023). Graph of Thoughts: Solving Elaborate Problems with Large Language Models. AAAI 2024. arXiv:2308.09687.
Related terms: Tree of Thoughts, Chain-of-Thought, Self-Reflection, o1 / Reasoning Models
Discussed in:
- Chapter 15: Modern AI