Informed Search
Informed search uses domain-specific knowledge (heuristics) to guide the search toward the goal more efficiently.
Heuristic Functions
A heuristic h(n) estimates the cost from state n to the nearest goal. A good heuristic dramatically reduces the search space.
Admissible: h(n) ≤ h*(n) (never overestimates the true cost to the nearest goal). Guarantees optimality for A* tree search.
Consistent (monotone): h(n) ≤ cost(n,n') + h(n') for every successor n'. Implies admissible. Guarantees A* never reopens nodes.
Designing Heuristics
Relaxed problem: Remove constraints from the original problem. The optimal cost of the relaxed problem is an admissible heuristic.
Example (8-puzzle):
- Misplaced tiles (h₁): Count tiles not in goal position. Admissible (each misplaced tile needs at least 1 move).
- Manhattan distance (h₂): Sum of Manhattan distances of each tile to its goal position. Admissible. Dominates h₁ (h₂ ≥ h₁ always → h₂ is more informative → A* expands fewer nodes).
Pattern databases: Precompute exact costs for subproblems. Store in a lookup table. Very effective for puzzles (15-puzzle, Rubik's cube).
Landmark heuristics: Precompute distances to a set of landmark nodes. Use triangle inequality to derive lower bounds.
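The triangle-inequality bound can be sketched in a few lines. Assuming an undirected graph (symmetric distances) and a precomputed table dist_to[L][v] of exact shortest-path distances from each landmark L (e.g. via Dijkstra), |d(L, goal) − d(L, n)| ≤ d(n, goal) for every landmark, so the maximum over landmarks is an admissible lower bound:

```python
def landmark_heuristic(dist_to, n, goal):
    """Lower bound on d(n, goal) from precomputed landmark distances.
    dist_to[L] maps each node v to the exact distance d(L, v).
    By the triangle inequality, |d(L, goal) - d(L, n)| <= d(n, goal)."""
    return max(abs(d[goal] - d[n]) for d in dist_to.values())
```

For a path graph a–b–c–d with landmarks at the two endpoints, the bound is exact: the heuristic from b to d is 2, the true distance.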
Greedy Best-First Search
Expand the node with the smallest h(n) — closest estimated distance to goal.
Not optimal: May find a suboptimal path (h is just an estimate). Incomplete: can get stuck in loops without cycle checking, and can wander forever on infinite graphs even with it. Fast in practice: O(b^m) worst case but often much better with a good heuristic.
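A minimal graph-search sketch, assuming a `successors` function that yields neighbor states (the interface names are illustrative). Note that g(n) never appears — only h drives the expansion order:

```python
import heapq

def greedy_best_first(start, goal, successors, h):
    """Expand the frontier node with the smallest h(n). Path cost is
    ignored, so the returned path may be suboptimal."""
    frontier = [(h(start), start)]
    came_from = {start: None}          # doubles as the visited set
    while frontier:
        _, current = heapq.heappop(frontier)
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        for nxt in successors(current):
            if nxt not in came_from:
                came_from[nxt] = current
                heapq.heappush(frontier, (h(nxt), nxt))
    return None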
A* Search
Expand the node with the smallest f(n) = g(n) + h(n) — actual cost so far + estimated cost to goal.
PROCEDURE A_STAR(start, goal, h) → (path, cost) or NONE
    open ← min-priority queue (ordered by f value)
    g_score ← empty map
    came_from ← empty map
    g_score[start] ← 0.0
    INSERT(open, (f ← h(start), state ← start))
    WHILE open IS NOT empty DO
        current ← EXTRACT_MIN(open)
        IF goal(current) THEN
            RETURN RECONSTRUCT_PATH(came_from, current, g_score[current])
        g_current ← g_score[current]
        FOR EACH (action, neighbor, cost) IN SUCCESSORS(current) DO
            tentative_g ← g_current + cost
            IF tentative_g < g_score.GET(neighbor, default ← INFINITY) THEN
                g_score[neighbor] ← tentative_g
                came_from[neighbor] ← (current, action)
                f ← tentative_g + h(neighbor)
                INSERT(open, (f, neighbor))
    RETURN NONE
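The pseudocode above translates almost directly to Python. This sketch uses a binary heap with lazy deletion — a state may sit in the heap multiple times, and entries whose f value no longer matches the best-known g are skipped on extraction. The `successors` interface (yielding (action, neighbor, step_cost) triples) matches the pseudocode:

```python
import heapq

def a_star(start, is_goal, successors, h):
    """A* with lazy deletion. `successors(s)` yields
    (action, neighbor, step_cost) triples. Returns (path, cost) or None."""
    open_heap = [(h(start), start)]
    g_score = {start: 0.0}
    came_from = {}                      # state -> (parent, action)
    while open_heap:
        f, current = heapq.heappop(open_heap)
        if f > g_score[current] + h(current):   # stale heap entry; skip
            continue
        if is_goal(current):
            path, cost = [current], g_score[current]
            while current in came_from:
                current, _action = came_from[current]
                path.append(current)
            return path[::-1], cost
        for action, neighbor, step_cost in successors(current):
            tentative_g = g_score[current] + step_cost
            if tentative_g < g_score.get(neighbor, float("inf")):
                g_score[neighbor] = tentative_g
                came_from[neighbor] = (current, action)
                heapq.heappush(open_heap, (tentative_g + h(neighbor), neighbor))
    return None
```

With h = 0 everywhere this is exactly Dijkstra's algorithm, which is a useful sanity check for any A* implementation.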
A* Properties
Optimality: If h is admissible (tree search) or consistent (graph search), A* finds the optimal solution.
Completeness: Yes, if branching factor is finite and step costs ≥ ε > 0.
Optimally efficient: For consistent h, no other optimal algorithm that uses the same heuristic information is guaranteed to expand fewer nodes than A* (up to tie-breaking among nodes with f = C*).
Time/Space: O(b^d) worst case. With a perfect heuristic (h = h*) and favorable tie-breaking: O(d). In practice: exponential improvement over uninformed search with a good heuristic.
Space is the bottleneck: A* stores all generated nodes. For large problems, memory is exhausted before time.
A* Optimality Proof (Sketch)
When A* selects a goal node G for expansion, G has the smallest f value on the open list, and f(G) = g(G) since h(G) = 0. Any open node n lying on an optimal path satisfies f(n) = g(n) + h(n) ≤ g*(goal), because admissible h never overestimates the remaining cost. So if g(G) exceeded the optimal cost g*(goal), such an n would have a strictly smaller f and would be expanded first. Hence the first goal node expanded is optimal.
Memory-Bounded Search
IDA* (Iterative Deepening A*)
Like IDDFS but uses f-value cutoff instead of depth cutoff. Increase the cutoff each iteration to the minimum f-value that exceeded the previous cutoff.
Space: O(bd) — like IDDFS. Time: May re-expand many nodes. Efficient when h values have few distinct values.
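A compact recursive sketch, assuming `successors(s)` yields (neighbor, step_cost) pairs (an illustrative interface). Each call to the inner search either finds a goal path within the bound or reports the smallest f value that overflowed it, which becomes the next bound:

```python
def ida_star(start, is_goal, successors, h):
    """IDA*: depth-first search bounded by f = g + h, with the bound
    raised each iteration to the smallest f that exceeded it.
    Returns a path (list of states) or None. Space is linear in depth."""
    def search(path, g, bound):
        node = path[-1]
        f = g + h(node)
        if f > bound:
            return f                       # candidate for the next bound
        if is_goal(node):
            return path                    # success: path is complete
        minimum = float("inf")
        for neighbor, cost in successors(node):
            if neighbor in path:           # avoid cycles along current path
                continue
            path.append(neighbor)
            result = search(path, g + cost, bound)
            if isinstance(result, list):
                return result
            minimum = min(minimum, result)
            path.pop()
        return minimum

    bound = h(start)
    path = [start]
    while True:
        result = search(path, 0, bound)
        if isinstance(result, list):
            return result
        if result == float("inf"):
            return None                    # search space exhausted
        bound = result
```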
RBFS (Recursive Best-First Search)
Linear space. Keeps track of the best alternative f-value. Backs up and re-expands when a better alternative is found.
SMA* (Simplified Memory-Bounded A*)
Like A* but drops the worst node when memory is full. Remembers the dropped node's f-value in the parent.
Local Search
For optimization problems where the path doesn't matter — only the final state.
Hill Climbing
Always move to the best neighbor (highest/lowest evaluation).
current = initial_state
loop:
    neighbor = best_neighbor(current)
    if eval(neighbor) ≤ eval(current): return current // stuck
    current = neighbor
Problems: Local maxima, plateaus, ridges. No guarantee of global optimum.
Variants: Steepest ascent (check all neighbors), stochastic hill climbing (random uphill neighbor), random restarts (run multiple times from random starts — surprisingly effective).
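Steepest-ascent climbing plus random restarts fits in a few lines. A sketch under illustrative assumptions (maximization, a `neighbors` function returning candidate states, a `random_state` generator for restart points):

```python
def hill_climb(start, neighbors, evaluate):
    """Steepest ascent: move to the best neighbor until none improves."""
    current = start
    while True:
        best = max(neighbors(current), key=evaluate)
        if evaluate(best) <= evaluate(current):
            return current                 # stuck at a (local) maximum
        current = best

def random_restart(neighbors, evaluate, random_state, restarts=10):
    """Climb from several random starts and keep the best peak found."""
    peaks = [hill_climb(random_state(), neighbors, evaluate)
             for _ in range(restarts)]
    return max(peaks, key=evaluate)
```

On a unimodal landscape every restart reaches the same peak; restarts pay off precisely when the landscape has many local maxima.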
Simulated Annealing
Like hill climbing but occasionally accepts worse moves (with probability decreasing over time).
P(accept worse move) = e^(-ΔE / T)
T (temperature) decreases over time (cooling schedule). High T → explore freely. Low T → exploit (like hill climbing).
Guaranteed to converge to global optimum if cooled slowly enough (theoretical — impractical for exact convergence). In practice: good solutions for many problems.
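A minimization sketch with a geometric cooling schedule (one of many possible schedules; the parameters here are illustrative, not tuned). A worse candidate (ΔE > 0) is accepted with probability e^(−ΔE / T), exactly the formula above:

```python
import math
import random

def simulated_annealing(start, neighbor, evaluate,
                        t0=10.0, cooling=0.95, steps=1000):
    """Minimize `evaluate`. `neighbor(s)` proposes a random nearby state.
    Worse moves are accepted with probability exp(-delta / T); T decays
    geometrically. Tracks and returns the best state seen."""
    current, best, t = start, start, t0
    for _ in range(steps):
        candidate = neighbor(current)
        delta = evaluate(candidate) - evaluate(current)
        if delta <= 0 or random.random() < math.exp(-delta / t):
            current = candidate
        if evaluate(current) < evaluate(best):
            best = current
        t = max(t * cooling, 1e-9)        # floor avoids division by zero
    return best
```

Early on (high T) almost everything is accepted and the search behaves like a random walk; as T falls it degenerates into greedy descent, matching the explore-then-exploit description above.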
Genetic Algorithms
Population of candidate solutions. Evolve via selection, crossover, and mutation.
1. Initialize random population
2. Evaluate fitness of each individual
3. Select parents (proportional to fitness)
4. Crossover: combine parents to create offspring
5. Mutation: randomly modify some offspring
6. Replace population with offspring
7. Repeat from step 2
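The loop above can be sketched on the classic OneMax toy problem (evolve a bitstring toward all ones, fitness = number of 1 bits). All parameters here (population size, mutation rate, generation count) are illustrative:

```python
import random

def genetic_algorithm(fitness, length=20, pop_size=30,
                      generations=60, p_mut=0.02):
    """Toy GA on bitstrings: fitness-proportional selection,
    one-point crossover, per-bit mutation, full generational replacement."""
    pop = [[random.randint(0, 1) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        weights = [fitness(ind) + 1e-9 for ind in pop]   # avoid all-zero weights
        offspring = []
        for _ in range(pop_size):
            p1, p2 = random.choices(pop, weights=weights, k=2)  # selection
            cut = random.randrange(1, length)                   # crossover point
            child = p1[:cut] + p2[cut:]
            child = [b ^ 1 if random.random() < p_mut else b    # mutation
                     for b in child]
            offspring.append(child)
        pop = offspring                                         # replacement
    return max(pop, key=fitness)

best = genetic_algorithm(sum)   # OneMax: fitness = number of 1 bits
```

A common refinement is elitism (copy the best individual into the next generation unchanged), which prevents the best solution from being lost to crossover or mutation.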
Ant Colony Optimization: Inspired by ant foraging. Pheromone trails guide search. Good for TSP-like problems.
Particle Swarm Optimization: Particles move through solution space, influenced by personal best and global best positions.
Beam Search
BFS but keep only the k best nodes at each level (beam width k).
Not optimal, not complete. But practical for very large search spaces. Used in NLP (machine translation, text generation).
k = 1: Greedy search. k = ∞: BFS.
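A level-by-level sketch, assuming a higher-is-better `score` function and a `successors` generator (illustrative names). The whole algorithm is: expand every beam state, sort the candidates, keep the top k:

```python
def beam_search(start, successors, score, is_goal, k=3, max_depth=20):
    """Keep only the k best states at each level. `score` is
    higher-is-better; with k = 1 this degenerates to greedy search."""
    beam = [start]
    for _ in range(max_depth):
        candidates = [nxt for s in beam for nxt in successors(s)]
        if not candidates:
            return None                    # beam died out
        candidates.sort(key=score, reverse=True)
        beam = candidates[:k]              # prune to beam width k
        for s in beam:
            if is_goal(s):
                return s
    return None
```

Because everything outside the beam is discarded, a state on the only path to the goal can be pruned at an early level, which is exactly why beam search is neither complete nor optimal.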
Applications in CS
- Pathfinding: A* is the standard for game pathfinding, GPS navigation, robot motion planning.
- Puzzle solving: A* + pattern database heuristics for 15-puzzle, Rubik's cube.
- Planning: AI planning systems use A* variants with domain-specific heuristics.
- NLP: Beam search for sequence generation (translation, summarization, code generation).
- Optimization: Simulated annealing for circuit placement, scheduling. Genetic algorithms for design optimization.