Planning
AI planning finds a sequence of actions that transforms an initial state into one satisfying a goal. Unlike search over atomic states, planning uses a factored state representation (sets of logical propositions), which enables domain-independent heuristics and more powerful reasoning.
Classical Planning
STRIPS Representation
State: Set of ground atoms (propositions that are true). Closed-world assumption: unmentioned atoms are false.
Action: Defined by preconditions (what must be true), add effects (what becomes true), delete effects (what becomes false).
Action: Move(robot, from, to)
Precondition: At(robot, from) ∧ Clear(to)
Add: At(robot, to) ∧ Clear(from)
Delete: At(robot, from) ∧ Clear(to)
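The STRIPS semantics above can be sketched in a few lines: states as sets of ground atoms, applicability as subset test, and the successor state as delete-then-add. The atoms below mirror the Move example and are purely illustrative.

```python
def applicable(state, precond):
    """An action is applicable if all its preconditions hold in the state."""
    return precond <= state

def apply_action(state, add, delete):
    """Successor state: remove delete effects, then add add effects."""
    return (state - delete) | add

# Atoms from the Move(robot, a, b) example (illustrative strings).
state   = {"At(robot, a)", "Clear(b)"}
precond = {"At(robot, a)", "Clear(b)"}
add     = {"At(robot, b)", "Clear(a)"}
delete  = {"At(robot, a)", "Clear(b)"}

if applicable(state, precond):
    state = apply_action(state, add, delete)
# state is now {"At(robot, b)", "Clear(a)"}
```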
PDDL (Planning Domain Definition Language)
Standard language for describing planning problems.
(define (domain logistics)
(:action drive
:parameters (?truck ?from ?to)
:precondition (and (truck ?truck) (at ?truck ?from) (connected ?from ?to))
:effect (and (at ?truck ?to) (not (at ?truck ?from)))))
(define (problem deliver)
(:domain logistics)
(:init (truck t1) (at t1 city-a) (package p1) (at p1 city-a) (connected city-a city-b))
(:goal (at p1 city-b)))
State-Space Search
Forward (Progression)
Start from initial state, apply actions, search for goal state. Standard graph search (BFS, A*) on the state space.
Challenge: Branching factor is large (many applicable actions). Heuristics are essential.
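Progression search is plain graph search over sets of atoms. A minimal BFS sketch (which returns a shortest plan), using a toy two-step drive domain as an assumed example:

```python
from collections import deque

def forward_search(init, goal, actions):
    """BFS over frozenset states. Actions are (name, pre, add, delete) tuples."""
    start = frozenset(init)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:
            return plan
        for name, pre, add, dele in actions:
            if pre <= state:                       # action applicable
                nxt = frozenset((state - dele) | add)
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, plan + [name]))
    return None                                    # goal unreachable

# Toy logistics domain (illustrative).
actions = [
    ("drive(a,b)", {"at(t1,a)"}, {"at(t1,b)"}, {"at(t1,a)"}),
    ("drive(b,c)", {"at(t1,b)"}, {"at(t1,c)"}, {"at(t1,b)"}),
]
plan = forward_search({"at(t1,a)"}, {"at(t1,c)"}, actions)
# plan == ["drive(a,b)", "drive(b,c)"]
```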
Backward (Regression)
Start from the goal, regress through actions (find actions whose effects achieve subgoals, and compute what must hold before the action), until a subgoal set is satisfied by the initial state.
Advantage: Focuses on relevant actions (only those achieving goal propositions). Smaller branching factor.
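A single STRIPS regression step can be written directly: an action is relevant if it adds part of the goal and consistent if it deletes none of it; the regressed subgoal is the goal minus the add effects, plus the preconditions.

```python
def regress(goal, pre, add, dele):
    """Regress a goal set through a STRIPS action.
    Returns the subgoal that must hold before the action,
    or None if the action is irrelevant or inconsistent."""
    if not (add & goal):    # relevant: must achieve part of the goal
        return None
    if dele & goal:         # consistent: must not destroy part of the goal
        return None
    return (goal - add) | pre

# Regress at(t1,c) through drive(b,c) (atoms are illustrative).
sub = regress({"at(t1,c)"}, {"at(t1,b)"}, {"at(t1,c)"}, {"at(t1,b)"})
# sub == {"at(t1,b)"}
```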
Planning Graph (Graphplan)
Build a layered graph alternating between proposition levels and action levels.
P0 (props) → A0 (actions) → P1 (props) → A1 (actions) → P2 (props) → ...
Mutex (mutual exclusion): Two actions or propositions that can't coexist at the same level. Track mutexes to detect infeasibility.
Extract solution: Once all goal propositions appear at the same proposition level, pairwise non-mutex, search backward through the graph for a valid plan.
Graphplan: Build the planning graph until goals are reachable, then extract. Very efficient for many classical planning benchmarks.
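The graph-building phase can be sketched as a reachability fixpoint. This sketch omits mutex propagation (the part that makes Graphplan precise), so it only answers when goals become *possibly* reachable; the toy domain is assumed.

```python
def expand_levels(init, actions, goal, max_levels=25):
    """Grow proposition levels of a planning graph (mutexes omitted).
    Returns the first level at which all goals appear, or None."""
    props = set(init)
    for level in range(max_levels):
        if goal <= props:
            return level
        new = set(props)
        for _, pre, add, _ in actions:   # actions: (name, pre, add, delete)
            if pre <= props:
                new |= add               # propositions persist (no-op actions)
        if new == props:                 # fixpoint reached: goals unreachable
            return None
        props = new
    return None

actions = [
    ("drive(a,b)", {"at(t1,a)"}, {"at(t1,b)"}, {"at(t1,a)"}),
    ("drive(b,c)", {"at(t1,b)"}, {"at(t1,c)"}, {"at(t1,b)"}),
]
level = expand_levels({"at(t1,a)"}, actions, {"at(t1,c)"})
# level == 2: at(t1,c) first appears at proposition level 2
```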
Heuristics for Planning
Relaxed Planning Graph
Ignore delete effects — once a proposition is true, it stays true forever. This is a relaxation that makes the problem easier to solve.
h^add: Sum of costs to achieve each goal proposition independently (in the relaxed problem). Inadmissible but informative.
h^max: Maximum cost to achieve any single goal proposition. Admissible but less informative.
h^FF (Fast-Forward heuristic): Extract a plan for the relaxed problem from the relaxed planning graph and count its actions. Solving the delete-relaxation optimally is itself NP-hard, so FF settles for any relaxed plan, not the shortest one. Very effective in practice; powers the FF planner.
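h^add and h^max share one fixpoint computation over delete-relaxed proposition costs; only the aggregation over preconditions differs (sum vs. max). A sketch with unit action costs, using the same toy drive domain as an assumed example:

```python
def relaxed_costs(init, actions, agg):
    """Proposition costs under delete relaxation (unit action costs).
    agg=sum gives h^add-style costs, agg=max gives h^max-style costs."""
    cost = {p: 0 for p in init}
    changed = True
    while changed:
        changed = False
        for _, pre, add, _ in actions:   # actions: (name, pre, add, delete)
            if all(q in cost for q in pre):
                c = 1 + agg([cost[q] for q in pre] or [0])
                for p in add:
                    if c < cost.get(p, float("inf")):
                        cost[p] = c
                        changed = True
    return cost

def h(init, goal, actions, agg=sum):
    cost = relaxed_costs(init, actions, agg)
    return agg([cost.get(p, float("inf")) for p in goal] or [0])

actions = [
    ("drive(a,b)", {"at(t1,a)"}, {"at(t1,b)"}, {"at(t1,a)"}),
    ("drive(b,c)", {"at(t1,b)"}, {"at(t1,c)"}, {"at(t1,b)"}),
]
h_add = h({"at(t1,a)"}, {"at(t1,c)"}, actions, agg=sum)   # 2
h_max = h({"at(t1,a)"}, {"at(t1,c)"}, actions, agg=max)   # 2
```

On this chain domain the two coincide; with conjunctive goals achieved by shared actions, h^add over-counts while h^max stays admissible.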
Landmark Heuristics
A landmark is a proposition or action that must be true/executed at some point in every valid plan.
Counting unachieved landmarks gives the LM-Count heuristic: informative but inadmissible in general (admissible variants exist via cost partitioning). LM-Cut, which charges costs to landmarks, is among the strongest admissible heuristics.
Hierarchical Task Network (HTN) Planning
Decompose compound tasks into subtasks recursively, down to primitive actions.
Task: PrepareTrip(destination)
→ BookFlight(destination) + BookHotel(destination) + PackBags()
Task: BookFlight(dest)
→ SearchFlights(dest) + SelectFlight() + Pay()
Methods: Each compound task has methods (decomposition recipes). The planner selects which method to apply.
Advantages: Encodes domain expertise. Reduces search space dramatically. Plans are hierarchically structured (easier to understand and modify).
Used in: Game AI (HTN-based behavior), robotics, military planning, workflow automation.
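The PrepareTrip decomposition above can be sketched as a recursive expansion. This toy version always takes the first method and ignores preconditions and backtracking, which a real HTN planner (e.g. SHOP2-style) would handle; all task names are illustrative.

```python
# Each compound task maps to a list of methods; a method is a subtask list.
methods = {
    "PrepareTrip": [["BookFlight", "BookHotel", "PackBags"]],
    "BookFlight":  [["SearchFlights", "SelectFlight", "Pay"]],
}

def decompose(tasks):
    """Depth-first expansion of compound tasks into primitive actions.
    Simplification: always applies the first method, never backtracks."""
    plan = []
    for task in tasks:
        if task in methods:                   # compound: expand recursively
            plan += decompose(methods[task][0])
        else:                                 # primitive: emit directly
            plan.append(task)
    return plan

plan = decompose(["PrepareTrip"])
# plan == ["SearchFlights", "SelectFlight", "Pay", "BookHotel", "PackBags"]
```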
Partial-Order Planning
Instead of a total ordering of actions, find a partial order — specify orderings only where necessary. More flexible than total-order planning.
Least commitment: Don't commit to orderings until forced by constraints. Avoids unnecessary backtracking.
Planning Under Uncertainty
Conditional Planning
Plan with branches based on observation outcomes.
If sensor reads X:
Execute plan A
Else:
Execute plan B
Replanning
Execute the plan. If something goes wrong (unexpected state), replan from the current state.
Simple and practical: Used in most real-world systems. Doesn't require modeling all contingencies.
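The execute-monitor-replan loop can be sketched generically. The interfaces `plan_fn(state, goal) -> action list` and `step_fn(state, action) -> (observed_state, succeeded)` are assumptions of this sketch; the corridor example below is illustrative.

```python
def execute_with_replanning(state, goal, plan_fn, step_fn):
    """Plan, execute one action at a time, and replan from the
    observed state whenever execution deviates from expectation."""
    while not goal <= state:
        plan = plan_fn(state, goal)
        if plan is None:
            return False                 # goal unreachable from here
        for action in plan:
            observed, ok = step_fn(state, action)
            state = observed             # trust observation over prediction
            if not ok:
                break                    # unexpected outcome: replan
    return True

# Toy corridor 0..3: state is {"posN"}, an action names the next cell.
def plan_fn(state, goal):
    (pos,) = [int(p[3]) for p in state]
    (tgt,) = [int(p[3]) for p in goal]
    step = 1 if tgt > pos else -1
    return [f"pos{i}" for i in range(pos + step, tgt + step, step)]

def step_fn(state, action):
    return {action}, True                # deterministic execution here

ok = execute_with_replanning({"pos0"}, {"pos3"}, plan_fn, step_fn)
# ok == True
```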
POMDP (Partially Observable MDP)
The most general framework for planning under uncertainty. The agent doesn't know the true state — maintains a belief state (probability distribution over states).
Exact solution: Intractable for all but tiny problems (the value function is defined over a continuous belief space). Approximate methods: point-based value iteration (PBVI, SARSOP), Monte Carlo planning (POMCP).
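The belief-state update is a Bayes filter: predict through the transition model, weight by the observation likelihood, then normalize. The dict-based signatures for `T` and `O` are assumptions of this sketch; the two-state example is in the spirit of the classic Tiger problem.

```python
def belief_update(belief, action, obs, T, O):
    """Bayes filter over a finite state set.
    T[(s, a)]  = dict: next state  -> probability
    O[(s2, a)] = dict: observation -> probability (after reaching s2)"""
    new = {}
    for s, p in belief.items():
        for s2, pt in T[(s, action)].items():
            new[s2] = new.get(s2, 0.0) + p * pt * O[(s2, action)].get(obs, 0.0)
    z = sum(new.values())                       # probability of the observation
    if z == 0:
        raise ValueError("observation impossible under current belief")
    return {s: p / z for s, p in new.items()}

# Two hidden states; "listen" leaves the state unchanged but is 85% reliable.
T = {("left", "listen"): {"left": 1.0}, ("right", "listen"): {"right": 1.0}}
O = {("left", "listen"):  {"hear-left": 0.85, "hear-right": 0.15},
     ("right", "listen"): {"hear-left": 0.15, "hear-right": 0.85}}

b = belief_update({"left": 0.5, "right": 0.5}, "listen", "hear-left", T, O)
# b == {"left": 0.85, "right": 0.15}
```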
Applications in CS
- Robotics: Task and motion planning (TAMP). Combining symbolic planning with geometric reasoning.
- Game AI: HTN planning for NPC behavior. GOAP (Goal-Oriented Action Planning) in game engines.
- Logistics: Delivery routing, warehouse automation, supply chain optimization.
- Automated testing: Generate test sequences that cover specific behaviors.
- Web services: Compose web services to achieve a complex task (service composition as planning).
- Natural language: Dialogue planning, story generation.