
Planning

AI planning finds a sequence of actions that transforms an initial state into one satisfying a goal. Unlike atomic state-space search, planning uses a factored representation of states (sets of logical propositions), which enables domain-independent heuristics and reasoning about action structure.

Classical Planning

STRIPS Representation

State: Set of ground atoms (propositions that are true). Closed-world assumption: unmentioned atoms are false.

Action: Defined by preconditions (what must be true), add effects (what becomes true), delete effects (what becomes false).

Action: Move(robot, from, to)
  Precondition: At(robot, from) ∧ Clear(to)
  Add: At(robot, to) ∧ Clear(from)
  Delete: At(robot, from) ∧ Clear(to)
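The STRIPS semantics above can be sketched in a few lines of Python (a minimal illustration, not any particular planner's API; the `Action` class and atom strings are hypothetical): a state is a set of ground atoms, an action is applicable when its preconditions are a subset of the state, and applying it removes the delete effects and adds the add effects.

```python
# Minimal STRIPS sketch: states are frozensets of ground atoms
# (closed-world: anything not in the set is false).
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset      # atoms that must be true
    add: frozenset      # atoms that become true
    delete: frozenset   # atoms that become false

def applicable(state, action):
    return action.pre <= state

def apply(state, action):
    return (state - action.delete) | action.add

move = Action("Move(r1,a,b)",
              pre=frozenset({"At(r1,a)", "Clear(b)"}),
              add=frozenset({"At(r1,b)", "Clear(a)"}),
              delete=frozenset({"At(r1,a)", "Clear(b)"}))

s0 = frozenset({"At(r1,a)", "Clear(b)"})
s1 = apply(s0, move)   # → {"At(r1,b)", "Clear(a)"}
```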

PDDL (Planning Domain Definition Language)

Standard language for describing planning problems.

(define (domain logistics)
  (:action drive
    :parameters (?truck ?from ?to)
    :precondition (and (truck ?truck) (at ?truck ?from) (connected ?from ?to))
    :effect (and (at ?truck ?to) (not (at ?truck ?from)))))

(define (problem deliver)
  (:domain logistics)
  (:init (truck t1) (at t1 city-a) (package p1) (at p1 city-a) (connected city-a city-b))
  (:goal (at p1 city-b)))

Forward (Progression)

Start from initial state, apply actions, search for goal state. Standard graph search (BFS, A*) on the state space.

Challenge: Branching factor is large (many applicable actions). Heuristics are essential.
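Progression search is just graph search over STRIPS states. A minimal BFS sketch (actions as illustrative `(name, pre, add, del)` tuples of atom sets; no heuristic, so this only scales to toy problems):

```python
# Forward (progression) search: plain BFS over frozenset states.
from collections import deque

def forward_search(init, goal, actions):
    """Return a list of action names reaching a state where goal ⊆ state."""
    start = frozenset(init)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:
            return plan
        for name, pre, add, dele in actions:
            if pre <= state:                      # action applicable
                nxt = (state - dele) | add        # progress the state
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, plan + [name]))
    return None                                   # goal unreachable

acts = [("move-a-b", {"at a"}, {"at b"}, {"at a"}),
        ("move-b-c", {"at b"}, {"at c"}, {"at b"})]
plan = forward_search({"at a"}, {"at c"}, acts)
# → ["move-a-b", "move-b-c"]
```

Swapping the FIFO queue for a priority queue ordered by `g + h` turns this into A*, which is where the heuristics below come in.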

Backward (Regression)

Start from the goal, regress through actions (replace an action's add effects with its preconditions), and search until the regressed subgoal is satisfied by the initial state.

Advantage: Focuses on relevant actions (only those achieving goal propositions). Smaller branching factor.
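The regression step itself is simple to state (a sketch with illustrative atom names): an action is relevant if it adds at least one goal atom and deletes none of them; the regressed subgoal keeps the unachieved goal atoms and adds the action's preconditions.

```python
def regress(goal, pre, add, dele):
    """Regress a goal set through a STRIPS action.
    Relevant iff the action adds some goal atom and deletes none of them."""
    if not (add & goal) or (dele & goal):
        return None                       # action not usable for this goal
    return (goal - add) | pre

# Regress {at b, have key} through move-a-b (pre {at a}, add {at b}, del {at a}):
subgoal = regress({"at b", "have key"}, {"at a"}, {"at b"}, {"at a"})
# subgoal == {"at a", "have key"} — now ask how to reach this from the start
```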

Planning Graph (Graphplan)

Build a layered graph alternating between proposition levels and action levels.

Prop level 0 → Action level 0 → Prop level 1 → Action level 1 → Prop level 2 → ...

Mutex (mutual exclusion): Two actions or propositions that can't coexist at the same level. Track mutexes to detect infeasibility.

Extract solution: Once all goal propositions appear at the same level without mutex, search backward through the graph for a valid plan.

Graphplan: Build the planning graph until goals are reachable, then extract. Very efficient for many classical planning benchmarks.
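The graph-building phase, stripped of mutex tracking for brevity, is just layered reachability (a sketch with hypothetical names; a real Graphplan implementation also records action/proposition mutexes at each level and performs the backward extraction):

```python
def build_layers(init, actions, goal, max_levels=20):
    """Grow Graphplan proposition levels (mutexes omitted for brevity).
    Stops when all goal atoms appear, or when the graph levels off."""
    props = set(init)
    layers = [set(props)]
    for _ in range(max_levels):
        if goal <= props:
            return layers
        new = set(props)                  # no-op actions carry atoms forward
        for name, pre, add, dele in actions:
            if pre <= props:
                new |= add
        if new == props:                  # leveled off without the goal
            return None
        props = new
        layers.append(set(props))
    return None

acts = [("move-a-b", {"at a"}, {"at b"}, {"at a"}),
        ("move-b-c", {"at b"}, {"at c"}, {"at b"})]
layers = build_layers({"at a"}, acts, {"at c"})
# "at c" first appears at proposition level 2
```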

Heuristics for Planning

Relaxed Planning Graph

Ignore delete effects — once a proposition is true, it stays true forever. This is a relaxation that makes the problem easier to solve.

h^add: Sum of costs to achieve each goal proposition independently (in the relaxed problem). Inadmissible (it ignores sharing between subgoals) but informative.

h^max: Maximum cost to achieve any single goal proposition. Admissible but less informative.

h^FF (Fast-Forward heuristic): Extract a plan for the delete-relaxed problem from a relaxed planning graph (no mutexes needed) and count its actions. Inadmissible but very effective — powers the FF planner.
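h^max and h^add share the same fixpoint computation over the delete relaxation; only the way precondition costs are combined differs. A sketch with unit action costs (function names are illustrative, not from any planner's codebase):

```python
import math

def relaxed_costs(state, actions, combine):
    """Atom costs in the delete relaxation (unit action costs).
    combine=sum gives the h^add ingredients, combine=max gives h^max."""
    cost = {p: 0 for p in state}
    changed = True
    while changed:
        changed = False
        for name, pre, add, dele in actions:      # delete effects ignored
            if all(p in cost for p in pre):
                c = 1 + (combine(cost[p] for p in pre) if pre else 0)
                for q in add:
                    if c < cost.get(q, math.inf):
                        cost[q] = c
                        changed = True
    return cost

def h_add(state, goal, actions):
    cost = relaxed_costs(state, actions, sum)
    return sum(cost.get(g, math.inf) for g in goal)

def h_max(state, goal, actions):
    cost = relaxed_costs(state, actions, max)
    return max(cost.get(g, math.inf) for g in goal)

acts = [("move-a-b", {"at a"}, {"at b"}, {"at a"}),
        ("move-b-c", {"at b"}, {"at c"}, {"at b"})]
# h_add sums both subgoal costs; h_max keeps only the hardest one
```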

Landmark Heuristics

A landmark is a proposition or action that must be true/executed at some point in every valid plan.

Counting unachieved landmarks gives a strong heuristic: the plain landmark count (LM-Count, as in the LAMA planner) is inadmissible but very effective, while LM-Cut combines landmarks with cost partitioning to yield one of the strongest admissible heuristics.

Hierarchical Task Network (HTN) Planning

Decompose compound tasks into subtasks recursively, down to primitive actions.

Task: PrepareTrip(destination)
  → BookFlight(destination) + BookHotel(destination) + PackBags()

Task: BookFlight(dest)
  → SearchFlights(dest) + SelectFlight() + Pay()

Methods: Each compound task has methods (decomposition recipes). The planner selects which method to apply.
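The trip example can be sketched as a toy HTN decomposer (all task names hypothetical, echoing the example above): compound tasks map to methods, each method is an ordered list of subtasks, and primitives are the leaves. A real HTN planner would also check preconditions and backtrack over method choices; this sketch always takes the first method.

```python
# Toy HTN decomposition: compound task → list of methods → subtask lists.
METHODS = {
    "PrepareTrip": [["BookFlight", "BookHotel", "PackBags"]],
    "BookFlight":  [["SearchFlights", "SelectFlight", "Pay"]],
}

def decompose(task):
    """Expand a task depth-first into a primitive action sequence."""
    if task not in METHODS:          # primitive action: a leaf
        return [task]
    plan = []
    for sub in METHODS[task][0]:     # real planners choose among methods
        plan.extend(decompose(sub))
    return plan

# decompose("PrepareTrip") →
# ["SearchFlights", "SelectFlight", "Pay", "BookHotel", "PackBags"]
```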

Advantages: Encodes domain expertise. Reduces search space dramatically. Plans are hierarchically structured (easier to understand and modify).

Used in: Game AI (HTN-based behavior), robotics, military planning, workflow automation.

Partial-Order Planning

Instead of a total ordering of actions, find a partial order — specify orderings only where necessary. More flexible than total-order planning.

Least commitment: Don't commit to orderings until forced by constraints. Avoids unnecessary backtracking.
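A partial-order plan is a set of actions plus ordering constraints; any topological sort of the constraints is a valid linearization. A sketch using the standard library (action names are illustrative):

```python
# Linearize a partial-order plan with graphlib's topological sorter.
from graphlib import TopologicalSorter

actions = {"get-key", "open-door", "enter"}
orderings = {("get-key", "open-door"), ("open-door", "enter")}

ts = TopologicalSorter()
for a in actions:
    ts.add(a)
for before, after in orderings:
    ts.add(after, before)            # 'after' depends on 'before'

linearization = list(ts.static_order())
# "get-key" precedes "open-door", which precedes "enter"
```

If the constraints contained a cycle, `static_order` would raise `CycleError`, which is exactly the inconsistency check a partial-order planner needs when adding an ordering.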

Planning Under Uncertainty

Conditional Planning

Plan with branches based on observation outcomes.

If sensor reads X:
    Execute plan A
Else:
    Execute plan B

Replanning

Execute the plan. If something goes wrong (unexpected state), replan from the current state.

Simple and practical: Used in most real-world systems. Doesn't require modeling all contingencies.
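The execute-monitor-replan loop can be sketched abstractly (all three callables are placeholders for a real planner, executor, and action model — nothing here is a specific system's API):

```python
def run_with_replanning(state, goal, plan, execute, predict, budget=20):
    """Execute-monitor-replan loop (sketch).
    plan(state, goal) -> list of actions
    predict(state, a) -> state the model expects
    execute(state, a) -> state actually observed"""
    steps = plan(state, goal)
    for _ in range(budget):
        if goal <= state:
            return state
        if not steps:
            steps = plan(state, goal)        # replan from where we are
            continue
        a = steps.pop(0)
        observed = execute(state, a)
        if observed != predict(state, a):    # surprise → discard the plan
            steps = []
        state = observed
    raise RuntimeError("replanning budget exhausted")
```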

POMDP (Partially Observable MDP)

The most general framework for planning under uncertainty. The agent doesn't know the true state — maintains a belief state (probability distribution over states).

Exact solution: Intractable for all but tiny problems (the belief space is continuous, and even finite-horizon POMDPs are PSPACE-hard). Approximate methods: point-based value iteration (PBVI, SARSOP), Monte Carlo planning (POMCP).
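The belief-state update is a Bayes filter. A sketch on a tiny two-state example (the states, action, observation names, and probabilities are all made up for illustration): after taking action `a` and observing `o`, the new belief is b'(s') ∝ O(o | s', a) · Σ_s T(s' | s, a) · b(s).

```python
def belief_update(belief, T, O, a, o):
    """Bayes-filter update of a discrete POMDP belief state."""
    states = list(belief)
    new = {}
    for s2 in states:
        pred = sum(T[s][a][s2] * belief[s] for s in states)  # transition step
        new[s2] = O[a][s2][o] * pred                         # observation step
    z = sum(new.values())                                    # normalizer
    return {s: p / z for s, p in new.items()}

# Toy example: a "listen" action leaves the state unchanged and hears
# correctly with probability 0.85.
T = {"open":   {"listen": {"open": 1.0, "closed": 0.0}},
     "closed": {"listen": {"open": 0.0, "closed": 1.0}}}
O = {"listen": {"open":   {"hear-open": 0.85, "hear-closed": 0.15},
                "closed": {"hear-open": 0.15, "hear-closed": 0.85}}}

b = belief_update({"open": 0.5, "closed": 0.5}, T, O, "listen", "hear-open")
# b["open"] == 0.85: one observation shifts the uniform prior
```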

Applications in CS

  • Robotics: Task and motion planning (TAMP). Combining symbolic planning with geometric reasoning.
  • Game AI: HTN planning for NPC behavior. GOAP (Goal-Oriented Action Planning) in game engines.
  • Logistics: Delivery routing, warehouse automation, supply chain optimization.
  • Automated testing: Generate test sequences that cover specific behaviors.
  • Web services: Compose web services to achieve a complex task (service composition as planning).
  • Natural language: Dialogue planning, story generation.