Combinatorial Optimization
Overview
Combinatorial optimization seeks an optimal element from a finite (or countably infinite) set. The feasible set typically has combinatorial structure (subsets, permutations, matchings, paths) that prevents simple enumeration.
minimize f(S)
subject to S in F ⊆ 2^E
where E is a ground set and F is a family of feasible subsets.
Matroid Theory
A matroid M = (E, I) consists of a ground set E and a family of independent sets I satisfying:
- Empty set is in I.
- Hereditary: if A in I and B ⊆ A, then B in I.
- Exchange: if A, B in I and |A| < |B|, then there exists e in B \ A with A ∪ {e} in I.
Key Examples
| Matroid | Ground Set | Independent Sets |
|---------|------------|------------------|
| Uniform U_{k,n} | n elements | All subsets of size <= k |
| Graphic | Edges of a graph | Acyclic edge subsets (forests) |
| Linear/Vectorial | Vectors | Linearly independent subsets |
| Partition | Elements partitioned into groups | Sets with <= k_i elements from group i |
| Transversal | One vertex class of a bipartite graph | Subsets matchable into the other class |
Why Matroids Matter
The greedy algorithm (add elements in decreasing weight order whenever independence is preserved) finds a maximum-weight independent set for every weight function if and only if (E, I) is a matroid (Rado-Edmonds). This explains why greedy works for minimum spanning trees (graphic matroid) and for scheduling unit-time jobs with deadlines on one machine (a matroid on the job set).
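A minimal sketch of matroid greedy on the graphic matroid: independence ("adding this edge keeps the set acyclic") is tested with union-find, which yields the maximum-weight spanning forest. The edge list is illustrative.

```python
# Greedy maximum-weight basis of a graphic matroid = maximum spanning
# forest. Independence test: does the new edge keep the set acyclic?

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False  # edge would create a cycle: not independent
        self.parent[ra] = rb
        return True

def greedy_max_weight_forest(n, weighted_edges):
    """Scan edges by decreasing weight; keep each one that stays independent."""
    uf = UnionFind(n)
    basis = []
    for w, u, v in sorted(weighted_edges, reverse=True):
        if uf.union(u, v):
            basis.append((w, u, v))
    return basis

edges = [(4, 0, 1), (2, 1, 2), (5, 0, 2), (1, 2, 3)]
forest = greedy_max_weight_forest(4, edges)
```

Run on a minimum-cost instance by negating weights; correctness for every weight vector is exactly the matroid property.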
Matroid intersection: optimizing over the intersection of two matroids is polynomial (Edmonds). This captures bipartite matching, arborescences, and colorful spanning trees. Three-matroid intersection is NP-hard.
Matroid union: the union of k matroids on the same ground set is again a matroid, and maximum-rank sets can be computed in polynomial time (used, for example, to find k edge-disjoint spanning trees).
Submodular Functions
A set function f: 2^E -> R is submodular if for all A ⊆ B ⊆ E and e not in B:
f(A ∪ {e}) - f(A) >= f(B ∪ {e}) - f(B)
This is the diminishing returns property. Equivalently: f(A) + f(B) >= f(A ∪ B) + f(A ∩ B).
Key Examples
- Cut function of a graph: f(S) = edges crossing (S, V\S).
- Coverage functions: f(S) = |∪_{i in S} A_i|.
- Entropy: f(S) = H(X_S) for jointly distributed random variables.
- Rank function of a matroid.
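The diminishing-returns inequality can be verified exhaustively for a small coverage function; the sets A_i below are illustrative.

```python
# Check f(A ∪ {e}) - f(A) >= f(B ∪ {e}) - f(B) for all A ⊆ B ⊆ E, e ∉ B,
# where f(S) = |union of A_i for i in S| is a coverage function.
from itertools import chain, combinations

A = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6}}

def f(S):
    return len(set().union(*(A[i] for i in S))) if S else 0

def marginal(S, e):
    return f(S | {e}) - f(S)

def subsets(s):
    s = list(s)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

ground = set(A)
ok = all(
    marginal(set(Asub), e) >= marginal(set(Bsub), e)
    for Bsub in subsets(ground)
    for Asub in subsets(Bsub)
    for e in ground - set(Bsub)
)
```

`ok` comes out `True`, as it must: every coverage function is submodular.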
Optimization of Submodular Functions
Maximization (NP-hard in general):
- Unconstrained: a randomized 1/2-approximation (tight).
- Monotone + cardinality constraint: greedy gives (1 - 1/e) ≈ 0.632 approximation (tight).
- Monotone + matroid constraint: greedy gives 1/2; continuous greedy gives (1 - 1/e).
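The greedy rule behind the (1 - 1/e) guarantee is short enough to sketch directly, here for a monotone coverage objective under a cardinality constraint; the candidate sets are illustrative.

```python
# Greedy for max f(S) s.t. |S| <= k with f monotone submodular
# (here a coverage function): repeatedly add the element of largest
# marginal gain. Achieves a (1 - 1/e)-approximation.

universe_sets = {  # which elements each candidate covers (illustrative)
    "a": {1, 2, 3}, "b": {3, 4, 5, 6}, "c": {6, 7}, "d": {1, 4, 7, 8, 9},
}

def coverage(S):
    return len(set().union(*(universe_sets[i] for i in S))) if S else 0

def greedy(k):
    S = set()
    for _ in range(k):
        best = max(universe_sets.keys() - S,
                   key=lambda e: coverage(S | {e}) - coverage(S))
        S.add(best)
    return S

S = greedy(2)  # picks "d" (gain 5), then "b" (gain 3)
```

Lazy evaluation (caching marginal gains in a priority queue) makes this practical at scale without changing the guarantee.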
Minimization: polynomial via the ellipsoid method or combinatorial algorithms (Iwata-Fleischer-Fujishige, Schrijver). Practically solved by minimum norm point algorithms.
Network Optimization
Minimum Cost Flow
minimize sum_{(i,j)} c_ij * x_ij
subject to sum_j x_ij - sum_j x_ji = b_i for all i (flow conservation)
l_ij <= x_ij <= u_ij for all (i,j) (capacity)
Subsumes shortest path (single source-sink, unit flow), max flow (maximize total flow, zero cost), assignment (bipartite, unit capacities), and transportation problems.
Algorithms: network simplex (specialized simplex exploiting network structure, very fast in practice), successive shortest paths, cost scaling. The constraint matrix is totally unimodular, so LP relaxations have integral optimal solutions; dedicated combinatorial algorithms run in polynomial time.
Maximum Flow / Minimum Cut
Max-flow min-cut theorem: the maximum flow from s to t equals the minimum capacity of an s-t cut.
Algorithms:
- Ford-Fulkerson: O(E · F) with integral capacities and maximum flow value F; Edmonds-Karp (BFS augmenting paths): O(V E^2).
- Push-relabel (Goldberg-Tarjan): O(V^2 E) generic; O(V^3) with FIFO vertex selection, O(V^2 sqrt(E)) with highest-label selection.
- Orlin: O(VE).
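A compact Edmonds-Karp sketch, with the residual graph stored as a dict of dicts; the capacities are illustrative.

```python
# Edmonds-Karp: Ford-Fulkerson with BFS (shortest) augmenting paths.
from collections import deque

def edmonds_karp(cap, s, t):
    # Build residual graph: copy capacities, add zero-capacity reverse arcs.
    res = {u: dict(nbrs) for u, nbrs in cap.items()}
    for u in list(res):
        for v in list(res[u]):
            res.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow  # no augmenting path: flow is maximum
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        aug = min(res[u][v] for u, v in path)  # bottleneck capacity
        for u, v in path:
            res[u][v] -= aug
            res[v][u] += aug
        flow += aug

capacities = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}}
maxflow = edmonds_karp(capacities, "s", "t")  # value 5
```

By max-flow min-cut, the value 5 matches the capacity of the cut {a→t, b→t}.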
Matching
- Bipartite: maximum matching via augmenting paths (Hopcroft-Karp, O(E sqrt(V))). Weighted: Hungarian algorithm O(n^3).
- General: Edmonds' blossom algorithm for maximum matching. Weighted: O(n^3).
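For bipartite graphs, the augmenting-path idea reduces to Kuhn's O(V·E) algorithm, which Hopcroft-Karp accelerates; a sketch on a toy instance:

```python
# Kuhn's augmenting-path algorithm for maximum bipartite matching.
# adj maps each left vertex to its right neighbors (toy instance).

def max_bipartite_matching(adj, right_size):
    match_right = [-1] * right_size  # right vertex -> matched left vertex

    def try_augment(u, visited):
        # Try to match u, recursively evicting current partners.
        for v in adj[u]:
            if v not in visited:
                visited.add(v)
                if match_right[v] == -1 or try_augment(match_right[v], visited):
                    match_right[v] = u
                    return True
        return False

    return sum(try_augment(u, set()) for u in adj)

adj = {0: [0, 1], 1: [0], 2: [1, 2]}
size = max_bipartite_matching(adj, 3)  # perfect matching of size 3
```

Note the eviction step: left vertex 1 forces vertex 0 off right vertex 0, which re-matches to right vertex 1 along an augmenting path.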
Facility Location
Given clients and potential facility locations, open a subset of facilities and assign clients to minimize total cost (opening + connection).
Uncapacitated Facility Location (UFL):
minimize sum_i f_i y_i + sum_{i,j} c_ij x_ij
subject to sum_i x_ij = 1 for all j, x_ij <= y_i, x,y in {0,1}
Best known approximation: 1.488 (Li, 2013); dual fitting gives 1.61 (Jain-Mahdian-Saberi). The problem is MAX-SNP-hard, so no PTAS exists unless P = NP, and no factor better than 1.463 is achievable under standard complexity assumptions (Guha-Khuller).
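On tiny instances the UFL objective above can be evaluated by brute force, which makes the formulation concrete; all costs here are illustrative.

```python
# Brute-force UFL: enumerate nonempty facility subsets S, assign each
# client to its cheapest open facility, keep the cheapest total.
from itertools import chain, combinations

f = [3, 4]                      # opening costs f_i for facilities 0, 1
c = [[1, 5],                    # c[j][i]: connection cost, client j -> facility i
     [4, 1],
     [2, 2]]

best = None
facilities = range(len(f))
for S in chain.from_iterable(combinations(facilities, r)
                             for r in range(1, len(f) + 1)):
    cost = sum(f[i] for i in S) + sum(min(c[j][i] for i in S)
                                      for j in range(len(c)))
    if best is None or cost < best[0]:
        best = (cost, S)
```

Here opening only facility 0 (total cost 10) beats opening facility 1 or both, showing the opening/connection trade-off the approximation algorithms must balance.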
k-median, k-means, k-center: variants with different objectives and cardinality constraints on open facilities.
Vehicle Routing Problem (VRP)
Generalization of TSP: multiple vehicles serve customers from a depot, minimizing total travel cost.
Capacitated VRP (CVRP): each vehicle has capacity Q. Exact methods solve instances up to ~100 customers. Heuristics (LKH, adaptive large neighborhood search) handle thousands.
Variants: time windows (VRPTW), pickup and delivery, heterogeneous fleet, dynamic/stochastic demands.
State-of-the-art: branch-cut-and-price combining column generation (routes as columns) with cutting planes and branching.
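A classical CVRP heuristic, sketched on a toy instance with illustrative distances and demands, is Clarke-Wright savings: start with one route per customer and merge route ends in decreasing order of the saving s(i,j) = d(0,i) + d(0,j) - d(i,j).

```python
# Clarke-Wright savings heuristic for CVRP (node 0 is the depot).

d = {(0, 1): 4, (0, 2): 4, (0, 3): 3, (1, 2): 2, (1, 3): 5, (2, 3): 3}
def dist(i, j):
    return 0 if i == j else d[(min(i, j), max(i, j))]

demand = {1: 1, 2: 1, 3: 1}
Q = 2  # vehicle capacity

routes = [[i] for i in demand]  # one route per customer initially

def find_route(customer):
    for r in routes:
        if customer in r:
            return r

savings = sorted(((dist(0, i) + dist(0, j) - dist(i, j), i, j)
                  for i in demand for j in demand if i < j), reverse=True)
for s, i, j in savings:
    ri, rj = find_route(i), find_route(j)
    if ri is rj:
        continue
    # merge only at route ends, respecting vehicle capacity
    if ri[-1] == i and rj[0] == j and sum(demand[c] for c in ri + rj) <= Q:
        ri.extend(rj)
        routes.remove(rj)
    elif rj[-1] == j and ri[0] == i and sum(demand[c] for c in rj + ri) <= Q:
        rj.extend(ri)
        routes.remove(ri)
```

The merge of customers 1 and 2 (saving 6) is accepted; merging customer 3 into that route would exceed the capacity, so it stays on its own route.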
Constraint Programming (CP)
CP models problems using variables with finite domains and constraints (not necessarily linear).
Core Techniques
- Domain propagation: reduce variable domains by enforcing arc consistency (and higher levels).
- Global constraints: alldifferent, cumulative, circuit, table—encode common patterns with specialized propagation.
- Search: systematic backtracking with variable/value selection heuristics.
- Nogood learning: record and propagate infeasibility explanations (inspired by SAT CDCL).
CP excels at scheduling, rostering, and highly constrained feasibility problems where LP relaxations are weak.
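The propagation-plus-search loop can be sketched in a few lines; here forward checking on n-queens stands in for full arc consistency, with one variable per row and pairwise no-attack constraints. The structure is illustrative, not a real CP solver API.

```python
# Minimal CP-style solver: backtracking search with domain propagation
# (forward checking) and first-fail variable selection, on n-queens.

def solve_queens(n):
    def propagate(domains, row, col):
        # Prune values inconsistent with a queen placed at (row, col).
        new = {}
        for r, dom in domains.items():
            kept = {c for c in dom
                    if c != col and abs(c - col) != abs(r - row)}
            if not kept:
                return None  # domain wipe-out: prune this branch
            new[r] = kept
        return new

    def search(domains, assignment):
        if len(assignment) == n:
            return assignment
        # first-fail heuristic: branch on the smallest remaining domain
        row = min(domains, key=lambda r: len(domains[r]))
        for col in sorted(domains[row]):
            rest = {r: dm for r, dm in domains.items() if r != row}
            pruned = propagate(rest, row, col)
            if pruned is None:
                continue
            result = search(pruned, {**assignment, row: col})
            if result is not None:
                return result
        return None

    return search({r: set(range(n)) for r in range(n)}, {})

sol = solve_queens(6)  # dict: row -> column
```

Production solvers replace forward checking with stronger propagators (e.g., alldifferent via matching) and add nogood learning on top of this skeleton.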
CP vs MIP
| Aspect | CP | MIP |
|--------|----|-----|
| Bounds | Weak (domain-based) | Strong (LP relaxation) |
| Feasibility | Strong (propagation) | Moderate |
| Logical constraints | Native | Requires linearization |
| Objective optimization | Weaker | Strong |
Hybrid: CP-SAT (Google OR-Tools) and CPLEX CP Optimizer integrate CP and MIP reasoning.
SAT-Based Optimization
MaxSAT
Given a CNF formula with hard clauses and weighted soft clauses, find an assignment satisfying all hard clauses that maximizes the total weight of satisfied soft clauses; this general form is known as weighted partial MaxSAT.
Algorithms:
- Core-guided (e.g., Fu-Malik, OLL): iteratively find unsatisfiable cores and add relaxation variables.
- Linear search: fix a bound, solve SAT, tighten.
- Branch and bound with unit propagation.
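The MaxSAT objective itself is easy to state in code; the sketch below evaluates it by brute-force enumeration on a toy instance (the clauses are illustrative), where a real solver would instead make iterated SAT calls as in the schemes above.

```python
# Weighted partial MaxSAT objective, evaluated by brute force.
# A clause is a list of literals: positive/negative nonzero ints.
from itertools import product

hard = [[1, 2]]                                # x1 or x2 must hold
soft = [([-1], 3), ([-2], 2), ([1, -2], 1)]    # (clause, weight) pairs

def satisfies(assign, clause):
    return any(assign[abs(lit)] == (lit > 0) for lit in clause)

n_vars = 2
best = -1
for bits in product([False, True], repeat=n_vars):
    assign = {i + 1: b for i, b in enumerate(bits)}
    if all(satisfies(assign, c) for c in hard):  # hard clauses are mandatory
        weight = sum(w for c, w in soft if satisfies(assign, c))
        best = max(best, weight)
```

The two soft clauses with weights 3 and 2 conflict through the hard clause, so the optimum here is 3, not 5: exactly the tension MaxSAT solvers resolve.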
Pseudo-Boolean Optimization
Optimize a linear function of binary variables subject to pseudo-Boolean constraints (linear inequalities over {0,1}). Generalizes MaxSAT. Solvers extend SAT with cutting planes and conflict-driven search.
Applications of SAT-Based Methods
- Verification and model checking
- Configuration problems
- Scheduling with complex logical constraints
- Combinatorial design
Approximation Algorithms
For NP-hard problems, approximation algorithms provide provable guarantees:
| Problem | Best Known Ratio | Technique |
|---------|------------------|-----------|
| Vertex Cover | 2 | LP rounding / primal-dual |
| Set Cover | O(ln n) | Greedy (tight) |
| TSP (metric) | 3/2 | Christofides-Serdyukov |
| Max-Cut | 0.878 | SDP rounding (Goemans-Williamson) |
| Steiner Tree | ln(4) + epsilon | Iterative randomized rounding |
| Facility Location | 1.488 | LP rounding |
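The greedy set cover entry from the table is a few lines of code: repeatedly pick the set covering the most uncovered elements, which achieves the H_n <= ln n + 1 guarantee. The instance is illustrative.

```python
# Greedy set cover: O(ln n)-approximation, and this analysis is tight.

def greedy_set_cover(universe, sets):
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # pick the set with the largest number of still-uncovered elements
        name, best = max(sets.items(), key=lambda kv: len(kv[1] & uncovered))
        if not best & uncovered:
            raise ValueError("instance is infeasible")
        chosen.append(name)
        uncovered -= best
    return chosen

sets = {"A": {1, 2, 3}, "B": {2, 4}, "C": {3, 4, 5}, "D": {5}}
cover = greedy_set_cover({1, 2, 3, 4, 5}, sets)  # ["A", "C"]
```

This is the same diminishing-returns greedy as in the submodular maximization section: coverage is monotone submodular, which is exactly why the logarithmic guarantee holds.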
Inapproximability: PCP theorem and unique games conjecture establish limits on achievable ratios.
Connections
- Integer programming formulations for combinatorial problems (→ 02-integer-programming.md).
- SDP relaxations for approximation algorithms (→ 05-conic-programming.md).
- Submodular maximization via stochastic methods (→ 08-stochastic-optimization.md).
- Solver landscape for combinatorial problems (→ 09-optimization-solvers.md).