Evolutionary Computation
Evolutionary computation uses principles from biological evolution — selection, mutation, crossover, and survival of the fittest — to solve optimization and search problems.
Genetic Algorithms (GAs)
Framework
1. Initialize random POPULATION of candidate solutions.
2. Evaluate FITNESS of each individual.
3. SELECT parents (higher fitness → higher selection probability).
4. Apply CROSSOVER (recombine parent genes) → offspring.
5. Apply MUTATION (random small changes) → mutated offspring.
6. Replace population with offspring (generational or steady-state).
7. Repeat from step 2 until termination (max generations, fitness threshold).
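The framework above can be sketched as a minimal generational GA on the OneMax problem (maximize the number of 1 bits). All parameter values here (population size, tournament size, rates) are illustrative choices, not canonical ones.

```python
import random

L = 20            # chromosome length
POP, GENS = 30, 60
P_MUT = 1.0 / L   # per-bit mutation rate

def fitness(ind):
    return sum(ind)                      # OneMax: count the 1 bits

def tournament(pop, k=3):
    return max(random.sample(pop, k), key=fitness)

def crossover(a, b):
    pt = random.randrange(1, L)          # single-point crossover
    return a[:pt] + b[pt:]

def mutate(ind):
    return [bit ^ (random.random() < P_MUT) for bit in ind]

random.seed(0)
pop = [[random.randint(0, 1) for _ in range(L)] for _ in range(POP)]
for _ in range(GENS):
    best = max(pop, key=fitness)         # elitism: carry the best over
    pop = [best] + [mutate(crossover(tournament(pop), tournament(pop)))
                    for _ in range(POP - 1)]

print(fitness(max(pop, key=fitness)))    # typically reaches the optimum of 20
```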
Encoding
Binary: Solution represented as a bit string. Classic GA encoding.
Integer/Real-valued: Direct representation for numerical optimization.
Permutation: For ordering problems (TSP, scheduling). Requires special crossover operators (order crossover, PMX).
Tree: For program structures (genetic programming).
Selection Methods
Roulette wheel (fitness proportionate): Probability of selection ∝ fitness. Risk: dominant individuals take over (premature convergence).
Tournament selection: Randomly pick k individuals. Select the best. Adjustable selection pressure (larger k → stronger pressure).
Rank selection: Sort by fitness. Selection probability based on rank, not raw fitness. Avoids premature convergence when fitness values vary widely.
Elitism: Always keep the best individual(s) from the previous generation. Ensures the best solution is never lost.
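As a hedged sketch of how these schemes differ in selection pressure, the toy population below has one individual ("D") with dominant raw fitness; roulette selection picks it in proportion to fitness, while rank selection softens that advantage.

```python
import random

random.seed(1)
pop = ["A", "B", "C", "D"]
fit = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 10.0}   # D dominates raw fitness

def roulette(pop, fit):
    # Fitness-proportionate: P(i) = f(i) / sum of f
    return random.choices(pop, weights=[fit[p] for p in pop], k=1)[0]

def tournament(pop, fit, k=2):
    # Pick k at random, keep the best; larger k -> stronger pressure
    return max(random.sample(pop, k), key=lambda p: fit[p])

def rank_select(pop, fit):
    # Linear rank weights: worst gets 1, best gets len(pop)
    ranked = sorted(pop, key=lambda p: fit[p])
    return random.choices(ranked, weights=range(1, len(pop) + 1), k=1)[0]

n = 10_000
share_roulette = sum(roulette(pop, fit) == "D" for _ in range(n)) / n
share_rank = sum(rank_select(pop, fit) == "D" for _ in range(n)) / n
print(share_roulette)   # near 10/16 = 0.625
print(share_rank)       # near 4/10 = 0.40
```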
Crossover Operators
Single-point: Choose a random crossover point. Swap the segments.
Parent1: AAAA|BBBB → AAAA|DDDD
Parent2: CCCC|DDDD → CCCC|BBBB
Two-point: Swap the middle segment.
Uniform: Each bit independently chosen from either parent with 50% probability.
Order crossover (OX, for permutations): Copy a random segment from Parent1. Fill remaining positions from Parent2 in order, skipping duplicates.
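Order crossover can be sketched as follows (the crossover points are passed in explicitly here to keep the example deterministic; a real GA would draw them at random):

```python
def order_crossover(p1, p2, i, j):
    """OX: copy p1[i:j] into the child, fill the rest from p2 in order."""
    n = len(p1)
    child = [None] * n
    child[i:j] = p1[i:j]                                 # segment from parent 1
    seg = set(p1[i:j])
    fill = [g for g in p2[j:] + p2[:j] if g not in seg]  # parent 2 order, from j
    for pos in list(range(j, n)) + list(range(i)):       # wrap around
        child[pos] = fill.pop(0)
    return child

p1 = [1, 2, 3, 4, 5, 6, 7, 8]
p2 = [8, 6, 4, 2, 7, 5, 3, 1]
print(order_crossover(p1, p2, 2, 5))  # → [2, 7, 3, 4, 5, 1, 8, 6]
```

The child keeps the segment [3, 4, 5] from Parent1 and inherits the relative order of the remaining genes from Parent2, so the result is always a valid permutation.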
Mutation Operators
Bit flip: Flip each bit with probability p_m (typically 1/L where L = chromosome length).
Gaussian mutation (real-valued): Add Gaussian noise: xᵢ' = xᵢ + N(0, σ).
Swap mutation (permutation): Swap two random positions.
Inversion (permutation): Reverse a random segment.
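The four mutation operators above can each be written in a few lines; this is a hedged sketch with illustrative default parameters:

```python
import random

def bit_flip(ind, p_m=None):
    p_m = p_m if p_m is not None else 1.0 / len(ind)   # default p_m = 1/L
    return [b ^ (random.random() < p_m) for b in ind]

def gaussian(x, sigma=0.1):
    return [xi + random.gauss(0.0, sigma) for xi in x]

def swap(perm):
    i, j = random.sample(range(len(perm)), 2)
    perm = perm[:]
    perm[i], perm[j] = perm[j], perm[i]
    return perm

def inversion(perm):
    i, j = sorted(random.sample(range(len(perm) + 1), 2))
    return perm[:i] + perm[i:j][::-1] + perm[j:]

random.seed(2)
print(swap([1, 2, 3, 4, 5]))
print(inversion([1, 2, 3, 4, 5]))
```

Note that the permutation operators return new lists that are still valid permutations, which is exactly why bit flip and Gaussian mutation cannot be used for ordering problems.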
Schema Theorem (Holland)
Schema: Pattern with wildcards. 1*0*1 matches 10001, 10011, 11001, 11011.
Building Block Hypothesis: GAs work by combining short, low-order, highly-fit schemas (building blocks) into increasingly fit individuals.
Schema theorem: Schemas with above-average fitness, short defining length, and low order increase exponentially in frequency. (More of a guiding intuition than a precise performance guarantee.)
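A schema with wildcards can be checked with a one-line matcher; enumerating all 5-bit strings recovers exactly the four matches listed above:

```python
def matches(schema, s):
    """'*' is a wildcard; fixed positions must agree exactly."""
    return len(schema) == len(s) and all(
        a == '*' or a == b for a, b in zip(schema, s))

schema = "1*0*1"   # order 3 (three fixed positions), defining length 4
hits = [format(i, "05b") for i in range(32) if matches(schema, format(i, "05b"))]
print(hits)  # → ['10001', '10011', '11001', '11011']
```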
Genetic Programming (GP)
Evolve programs (represented as trees) rather than fixed-length strings.
Program tree for x*x + sin(y):
        +
       / \
      *   sin
     / \    \
    x   x    y
Terminals: Variables (x, y), constants. Functions: +, -, *, /, sin, cos, if, ...
Crossover: Swap subtrees between parents. Mutation: Replace a subtree with a random new subtree.
Applications: Symbolic regression (discover equations from data), circuit design, game strategy evolution.
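A GP individual can be represented as a nested tuple and evaluated recursively; this is a hedged sketch (the tuple encoding and function set are illustrative, not a canonical GP representation):

```python
import math

FUNCS = {'+': lambda a, b: a + b,
         '-': lambda a, b: a - b,
         '*': lambda a, b: a * b,
         'sin': lambda a: math.sin(a)}

def evaluate(tree, env):
    if isinstance(tree, tuple):                     # internal (function) node
        op, *args = tree
        return FUNCS[op](*(evaluate(a, env) for a in args))
    return env.get(tree, tree)                      # terminal: variable or constant

# The program tree drawn above: x*x + sin(y)
tree = ('+', ('*', 'x', 'x'), ('sin', 'y'))
print(evaluate(tree, {'x': 3.0, 'y': 0.0}))  # → 9.0
```

Subtree crossover then amounts to swapping tuples between two such trees, and subtree mutation to replacing a tuple with a randomly generated one.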
Evolution Strategies (ES)
Focus on real-valued optimization. Primarily use mutation (Gaussian perturbation). Selection is based on ranking.
CMA-ES (Covariance Matrix Adaptation)
State-of-the-art for continuous black-box optimization.
Key idea: Maintain and adapt a covariance matrix describing the search distribution. The covariance matrix learns the shape of the fitness landscape (elongated valleys, correlations between variables).
1. Sample λ candidate solutions from N(mean, σ²·C)
2. Evaluate fitness. Select μ best.
3. Update mean toward the best solutions.
4. Update covariance matrix C based on successful steps (increase variance in successful directions).
5. Update step size σ (increase if consecutive steps are correlated).
Self-adaptive: No hyperparameter tuning needed beyond population size. Invariant to monotone transformations of the fitness function. Handles non-separable, ill-conditioned functions.
Used in: Robotics (controller optimization), game AI, reinforcement learning (OpenAI's ES for Atari), hyperparameter tuning.
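A full CMA-ES implementation is involved; as a simplified sketch, the (μ, λ)-ES below illustrates steps 1-3 (sample, select the μ best, update the mean) with a fixed isotropic search distribution, deliberately omitting the covariance and step-size adaptation (steps 4-5) that make CMA-ES powerful.

```python
import random

def sphere(x):                      # minimize the sum of squares
    return sum(xi * xi for xi in x)

def simple_es(dim=5, lam=20, mu=5, sigma=0.3, gens=100, seed=3):
    rng = random.Random(seed)
    mean = [5.0] * dim              # start far from the optimum at 0
    for _ in range(gens):
        # 1. Sample lam candidates from N(mean, sigma^2 * I)
        pop = [[m + rng.gauss(0.0, sigma) for m in mean] for _ in range(lam)]
        # 2. Evaluate and select the mu best
        elite = sorted(pop, key=sphere)[:mu]
        # 3. Move the mean toward the best solutions
        mean = [sum(e[i] for e in elite) / mu for i in range(dim)]
    return mean

m = simple_es()
print(sphere(m) < 1.0)              # mean converges near the optimum
```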
Differential Evolution (DE)
Mutation uses differences between existing population members:
Mutant: v = x_r1 + F × (x_r2 - x_r3) (scale factor F, typically 0.5-1.0)
Trial: u = crossover(v, x_target) (binomial crossover)
Selection: if f(u) ≤ f(x_target): replace
Simple, few parameters (F, CR), effective for many real-valued optimization problems. Competitive with CMA-ES for some problems.
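The three DE steps above (mutate, crossover, greedy selection) fit in a short loop; this is a hedged sketch of classic DE/rand/1/bin on the sphere function, with common but untuned parameter values:

```python
import random

def sphere(x):
    return sum(xi * xi for xi in x)

def de(dim=5, pop_size=20, F=0.7, CR=0.9, gens=200, seed=4):
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(gens):
        for t in range(pop_size):
            # Mutant: v = x_r1 + F * (x_r2 - x_r3), r1 != r2 != r3 != t
            r1, r2, r3 = rng.sample([i for i in range(pop_size) if i != t], 3)
            v = [pop[r1][j] + F * (pop[r2][j] - pop[r3][j]) for j in range(dim)]
            # Binomial crossover; jr forces at least one gene from v
            jr = rng.randrange(dim)
            u = [v[j] if (rng.random() < CR or j == jr) else pop[t][j]
                 for j in range(dim)]
            if sphere(u) <= sphere(pop[t]):     # greedy replacement
                pop[t] = u
    return min(pop, key=sphere)

best = de()
print(sphere(best) < 1e-3)
```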
Swarm Intelligence
Particle Swarm Optimization (PSO)
Particles move through the search space, influenced by their personal best position and the global best position.
vᵢ = w·vᵢ + c₁·r₁·(pbestᵢ - xᵢ) + c₂·r₂·(gbest - xᵢ)
xᵢ = xᵢ + vᵢ
w = inertia weight. c₁, c₂ = cognitive and social coefficients. r₁, r₂ = random [0,1].
Simple, fast convergence. Good for continuous optimization. May converge prematurely.
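The two update equations above translate directly into code; here is a hedged PSO sketch on the sphere function, with commonly used (but not tuned) coefficient values:

```python
import random

def sphere(x):
    return sum(xi * xi for xi in x)

def pso(dim=5, n=20, w=0.7, c1=1.5, c2=1.5, iters=200, seed=5):
    rng = random.Random(seed)
    X = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n)]
    V = [[0.0] * dim for _ in range(n)]
    pbest = [x[:] for x in X]                      # personal bests
    gbest = min(pbest, key=sphere)[:]              # global best
    for _ in range(iters):
        for i in range(n):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                V[i][d] = (w * V[i][d]
                           + c1 * r1 * (pbest[i][d] - X[i][d])
                           + c2 * r2 * (gbest[d] - X[i][d]))
                X[i][d] += V[i][d]
            if sphere(X[i]) < sphere(pbest[i]):
                pbest[i] = X[i][:]
                if sphere(X[i]) < sphere(gbest):
                    gbest = X[i][:]
    return gbest

print(sphere(pso()) < 1e-2)
```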
Ant Colony Optimization (ACO)
Inspired by ant foraging. Ants deposit pheromone on the paths they take; other ants preferentially follow paths with more pheromone, so good paths get reinforced. Pheromone also evaporates over time, so paths that stop being used fade away.
P(edge ij) ∝ τᵢⱼ^α × ηᵢⱼ^β
τ = pheromone level. η = heuristic desirability (e.g., 1/distance). α, β = relative importance.
Best for: Combinatorial optimization (TSP, vehicle routing, scheduling). Adapts to dynamic environments.
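As a hedged sketch, here is a basic Ant System on a tiny 4-city symmetric TSP; the α/β/ρ values are common textbook defaults, not tuned, and the distance matrix is made up for illustration:

```python
import random

D = [[0, 2, 9, 10],
     [2, 0, 6, 4],
     [9, 6, 0, 8],
     [10, 4, 8, 0]]                      # distances; optimal tour length is 23
n = len(D)
tau = [[1.0] * n for _ in range(n)]      # pheromone levels
alpha, beta, rho = 1.0, 2.0, 0.5
rng = random.Random(6)

def tour_length(tour):
    return sum(D[tour[i]][tour[(i + 1) % n]] for i in range(n))

def build_tour():
    tour = [rng.randrange(n)]
    while len(tour) < n:
        i = tour[-1]
        choices = [j for j in range(n) if j not in tour]
        # P(edge ij) proportional to tau^alpha * eta^beta, eta = 1/distance
        w = [tau[i][j] ** alpha * (1.0 / D[i][j]) ** beta for j in choices]
        tour.append(rng.choices(choices, weights=w)[0])
    return tour

best = None
for _ in range(50):
    tours = [build_tour() for _ in range(10)]
    for i in range(n):                   # evaporation on every edge
        for j in range(n):
            tau[i][j] *= (1 - rho)
    for t in tours:                      # deposit: shorter tours deposit more
        for i in range(n):
            a, b = t[i], t[(i + 1) % n]
            tau[a][b] += 1.0 / tour_length(t)
            tau[b][a] = tau[a][b]        # keep the matrix symmetric
    cand = min(tours, key=tour_length)
    if best is None or tour_length(cand) < tour_length(best):
        best = cand
print(tour_length(best))  # → 23
```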
Neuroevolution
Evolve neural network weights, architectures, or both.
NEAT (NeuroEvolution of Augmenting Topologies)
Evolves both network topology AND weights.
Key innovations: Historical markings (track gene origins for meaningful crossover between different topologies). Speciation (protect innovations by competing within species). Complexification (start simple, add complexity through mutation).
HyperNEAT
Evolve a pattern-generating network (CPPN) that defines the weights of a larger network. Exploits geometric regularities (symmetry, repetition).
Multi-Objective Optimization
Optimize multiple conflicting objectives simultaneously. No single best solution — find the Pareto front (set of non-dominated solutions).
NSGA-II
Non-dominated Sorting Genetic Algorithm II. The standard multi-objective GA.
- Fast non-dominated sorting: Rank solutions into Pareto fronts.
- Crowding distance: Prefer solutions in less crowded regions of the front (diversity).
- Select parents by: lower front rank is better; within same rank, higher crowding distance is better.
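Non-dominated sorting can be sketched as below (this is the straightforward repeated-filtering version, not the bookkeeping-heavy O(MN²) "fast" variant NSGA-II describes, but it produces the same fronts; both objectives are minimized):

```python
def dominates(a, b):
    """a dominates b: no worse in every objective, strictly better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def non_dominated_sort(points):
    fronts, remaining = [], list(range(len(points)))
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i]) for j in remaining)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts

# Two objectives, both minimized; indices 0-2 are mutually non-dominated.
pts = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
print(non_dominated_sort(pts))  # → [[0, 1, 2], [3], [4]]
```

Within each front, NSGA-II would then break ties by crowding distance to keep the front spread out.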
MOEA/D
Decompose the multi-objective problem into many single-objective subproblems. Each subproblem optimized by one member of the population. Neighbors share information.
Quality Diversity (QD)
Find a diverse collection of high-performing solutions, not just one optimal solution.
MAP-Elites: Divide the behavior space into cells. For each cell, maintain the highest-performing solution. Explore by mutating existing solutions; place offspring in the appropriate cell.
Applications: Robot design (diverse gaits), game content generation, creative search.
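The MAP-Elites loop above can be sketched on a toy 1-D behavior space. Everything here is an illustrative choice: the genome is 10 reals in [0, 1], the behavior descriptor is the binned genome mean, and fitness is negative variance (flatter genomes score higher).

```python
import random

rng = random.Random(7)
BINS = 10

def behavior(g):                     # descriptor: mean of the genome, binned
    m = sum(g) / len(g)
    return min(BINS - 1, max(0, int(m * BINS)))

def fitness(g):                      # prefer flat genomes (low variance)
    m = sum(g) / len(g)
    return -sum((x - m) ** 2 for x in g)

archive = {}                         # cell index -> best genome seen there
for step in range(5000):
    if archive and rng.random() < 0.9:
        parent = rng.choice(list(archive.values()))          # mutate an elite
        g = [min(1.0, max(0.0, x + rng.gauss(0, 0.1))) for x in parent]
    else:
        g = [rng.random() for _ in range(10)]                # random bootstrap
    cell = behavior(g)
    if cell not in archive or fitness(g) > fitness(archive[cell]):
        archive[cell] = g            # keep one elite per cell

print(len(archive))                  # cells filled: diverse, each near-optimal
```

The result is not one best genome but an archive: for every behavior bin, the flattest genome found with that mean.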
Applications in CS
- Engineering design: Optimize aerodynamic shapes, antenna design, circuit layout, structural design.
- Machine learning: Neural architecture search (evolving network topologies). Hyperparameter optimization. Feature selection.
- Robotics: Controller optimization, gait learning, morphology design.
- Game AI: Evolve game strategies, procedural content generation (levels, characters, rules).
- Bioinformatics: Protein folding, sequence alignment, gene regulatory network inference.
- Scheduling: Job-shop scheduling, vehicle routing, timetabling.