Planning is the AI subfield concerned with computing sequences of actions to achieve goals. The classical formulation is: given an initial state, a goal condition, and a set of action operators specified by their preconditions and effects, find a plan (sequence of operators) that transforms the initial state into one that satisfies the goal.
Formally, a planning problem is a tuple $\langle S, A, T, s_0, G \rangle$, where $S$ is a set of states, $A$ a set of actions, $T: S \times A \to S$ a (possibly partial) transition function, defined only where an action's preconditions hold, $s_0 \in S$ the initial state, and $G \subseteq S$ the goal set. A plan $\pi = \langle a_1, a_2, \ldots, a_k \rangle$ is valid if executing it from $s_0$, i.e. computing $T(\ldots T(T(s_0, a_1), a_2) \ldots, a_k)$, produces a state in $G$.
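As a minimal executable rendering of this definition (treating $T$ as a partial function that returns `None` when an action does not apply; all names here are illustrative, not from any particular library):

```python
from typing import Any, Callable, Iterable, Optional

State = Any
Action = str

def is_valid_plan(
    plan: Iterable[Action],
    s0: State,
    T: Callable[[State, Action], Optional[State]],
    goal: Callable[[State], bool],
) -> bool:
    """Run the plan from s0; valid iff every action applies and the final state is in G."""
    s = s0
    for a in plan:
        s = T(s, a)
        if s is None:  # action inapplicable: its precondition failed
            return False
    return goal(s)

# Toy usage: integer states, "inc" adds 1, goal is reaching 3.
assert is_valid_plan(["inc"] * 3, 0,
                     lambda s, a: s + 1 if a == "inc" else None,
                     lambda s: s == 3)
```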
History
STRIPS (Stanford Research Institute Problem Solver), introduced by Richard Fikes and Nils Nilsson in 1971 to control the Shakey robot, was the founding formalism. STRIPS represents states as sets of ground atoms and operators as triples $\langle \text{precondition}, \text{add list}, \text{delete list} \rangle$: an operator is applicable when its precondition atoms all hold in the current state, and applying it removes the delete list and adds the add list. Modern planners use PDDL (Planning Domain Definition Language), introduced by Drew McDermott and colleagues in 1998 for the International Planning Competition; successive PDDL versions extend STRIPS with typed objects, conditional effects, durative actions, numeric fluents and preferences.
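A sketch of the STRIPS update rule, assuming states are sets of ground atoms (plain strings here); the blocks-world operator is a toy example, not from the original text:

```python
from typing import FrozenSet, NamedTuple, Optional

Atom = str
State = FrozenSet[Atom]

class Operator(NamedTuple):
    name: str
    pre: FrozenSet[Atom]
    add: FrozenSet[Atom]
    delete: FrozenSet[Atom]

def apply(op: Operator, state: State) -> Optional[State]:
    """STRIPS semantics: applicable iff pre ⊆ state; result is (state - delete) | add."""
    if not op.pre <= state:
        return None
    return (state - op.delete) | op.add

# Toy blocks-world step: pick up block a from the table.
pickup_a = Operator(
    name="pickup(a)",
    pre=frozenset({"clear(a)", "ontable(a)", "handempty"}),
    add=frozenset({"holding(a)"}),
    delete=frozenset({"clear(a)", "ontable(a)", "handempty"}),
)
s0: State = frozenset({"clear(a)", "ontable(a)", "handempty"})
print(apply(pickup_a, s0))  # frozenset({'holding(a)'})
```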
Algorithms
- State-space search: forward (progression from $s_0$) or backward (regression from $G$). Optimal planners use $A^*$ with admissible heuristics; satisficing planners typically use greedy best-first search with inadmissible ones. Classical heuristics include $h^+$ (optimal delete-relaxation, NP-hard to compute exactly), $h_{\max}$, $h_{\text{add}}$, landmark-based heuristics and the FF heuristic based on relaxed-plan extraction. A forward-search sketch follows this list.
- Plan-space search: search over partial plans (least-commitment planning, e.g. POP, UCPOP, SNLP).
- GraphPlan (Blum & Furst 1995): builds a layered planning graph of alternating fact and action levels with mutual-exclusion constraints, then extracts a plan by backward search.
- SAT-based planning (e.g. SATPLAN): encode the planning problem for a bounded horizon as a propositional satisfiability instance, call a SAT solver, and increase the bound until a plan is found.
- Hierarchical task networks (HTN): decompose high-level tasks into sub-tasks via a library of methods (SHOP, SHOP2); a decomposition sketch also follows this list.
- Modern academic planners: FF, Fast Downward, LAMA, Mercury and Delfi have dominated International Planning Competition (IPC) tracks.
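To make the first bullet concrete, here is a minimal forward (progression) search sketch using the set-of-atoms STRIPS representation from above. It uses $A^*$ with the simple goal-count heuristic rather than $h_{\text{add}}$ or the FF heuristic; the encoding and names are illustrative:

```python
import heapq
from itertools import count
from typing import Dict, FrozenSet, List, NamedTuple, Optional, Tuple

Atom = str
State = FrozenSet[Atom]

class Operator(NamedTuple):
    name: str
    pre: FrozenSet[Atom]
    add: FrozenSet[Atom]
    delete: FrozenSet[Atom]

def astar(s0: State, goal: FrozenSet[Atom], ops: List[Operator]) -> Optional[List[str]]:
    h = lambda s: len(goal - s)                # goal-count heuristic (not admissible in general)
    tie = count()                              # tie-breaker so heapq never compares states
    frontier: List[Tuple[int, int, State]] = [(h(s0), next(tie), s0)]
    g: Dict[State, int] = {s0: 0}
    parent: Dict[State, Tuple[State, str]] = {}
    while frontier:
        _, _, s = heapq.heappop(frontier)
        if goal <= s:                          # goal satisfied: walk parents to rebuild the plan
            plan: List[str] = []
            while s in parent:
                s, name = parent[s]
                plan.append(name)
            return plan[::-1]
        for op in ops:
            if op.pre <= s:                    # progression: apply each applicable operator
                t = (s - op.delete) | op.add
                if g[s] + 1 < g.get(t, float("inf")):
                    g[t] = g[s] + 1
                    parent[t] = (s, op.name)
                    heapq.heappush(frontier, (g[t] + h(t), next(tie), t))
    return None                                # reachable state space exhausted: no plan
```

Because the goal-count heuristic is inadmissible, this returns a plan but not necessarily a shortest one; production planners substitute heuristics computed on the delete relaxation, such as $h_{\text{add}}$ or the FF relaxed-plan heuristic.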
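And a toy illustration of HTN decomposition: a method library expands compound tasks into ordered subtasks until only primitive actions remain. Real HTN planners such as SHOP2 also track state, check method preconditions and backtrack over alternative methods; none of that is shown here, and the travel domain is invented for illustration:

```python
from typing import Dict, List

# Each compound task maps to an ordered list of subtasks (one method per task here).
methods: Dict[str, List[str]] = {
    "travel": ["go_to_station", "take_train", "walk_to_office"],
    "take_train": ["buy_ticket", "board", "alight"],
}
primitives = {"go_to_station", "buy_ticket", "board", "alight", "walk_to_office"}

def decompose(task: str) -> List[str]:
    """Recursively expand a task into a sequence of primitive actions."""
    if task in primitives:
        return [task]
    plan: List[str] = []
    for sub in methods[task]:
        plan += decompose(sub)
    return plan

print(decompose("travel"))
# ['go_to_station', 'buy_ticket', 'board', 'alight', 'walk_to_office']
```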
Modern relevance
Classical planners remain dominant for problems with clean formal specifications: operations research (vehicle routing, scheduling), robotics (manipulation, autonomous driving), aerospace (NASA's Remote Agent controlled Deep Space 1; the EUROPA framework supported Mars rover activity planning), software deployment pipelines and automated configuration.
LLM-based agents do "planning" differently, generating plan-like outputs through chain-of-thought, ReAct-style tool calls and self-reflection without explicit STRIPS-style search. This is flexible but brittle: LLMs hallucinate preconditions and produce invalid plans. Hybrid neuro-symbolic approaches use LLMs to propose actions or translate natural-language goals into PDDL, then hand off to classical planners for verification, completion or expansion; a sketch of this propose-and-verify pattern follows below. The PDDL-LLM line of work and systems like LLM+P demonstrate strong empirical gains over pure LLM planning. Reinforcement learning (notably AlphaZero and MuZero) provides a third paradigm, in which planning is performed by Monte Carlo tree search guided by learned value and policy networks.
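A sketch of the propose-and-verify loop: an LLM-proposed action sequence is replayed against a STRIPS model, rejecting hallucinated operators and unmet preconditions. `propose_plan` is a hypothetical stand-in for an LLM call (it returns a canned candidate here), and the blocks-world operators are a toy domain, not from any particular system:

```python
from typing import Dict, FrozenSet, List, NamedTuple, Tuple

class Operator(NamedTuple):
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]

def propose_plan(goal_text: str) -> List[str]:
    # Hypothetical stand-in for an LLM call that maps a natural-language
    # goal to a candidate action sequence; returns a canned answer here.
    return ["pickup(a)", "stack(a,b)"]

def verify(plan: List[str], ops: Dict[str, Operator],
           s: FrozenSet[str], goal: FrozenSet[str]) -> Tuple[bool, str]:
    """Replay the candidate plan under STRIPS semantics; report the first failure."""
    for name in plan:
        op = ops.get(name)
        if op is None:
            return False, f"unknown action: {name}"  # hallucinated operator
        if not op.pre <= s:
            return False, f"{name}: unmet preconditions {set(op.pre - s)}"
        s = (s - op.delete) | op.add
    return goal <= s, "executed to completion"

ops = {
    "pickup(a)": Operator("pickup(a)",
                          frozenset({"clear(a)", "ontable(a)", "handempty"}),
                          frozenset({"holding(a)"}),
                          frozenset({"clear(a)", "ontable(a)", "handempty"})),
    "stack(a,b)": Operator("stack(a,b)",
                           frozenset({"holding(a)", "clear(b)"}),
                           frozenset({"on(a,b)", "clear(a)", "handempty"}),
                           frozenset({"holding(a)", "clear(b)"})),
}
s0 = frozenset({"clear(a)", "clear(b)", "ontable(a)", "ontable(b)", "handempty"})
print(verify(propose_plan("put block a on block b"), ops, s0, frozenset({"on(a,b)"})))
# (True, 'executed to completion')
```

In systems like LLM+P the verifier is a full classical planner rather than a replay check, so a rejected or partial candidate can also be repaired or completed by search.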
Related terms: STRIPS, Heuristic Search, Reinforcement Learning
Discussed in:
- Chapter 4: Probability, Symbolic AI and Planning