
Combinatorial Optimization

Overview

Combinatorial optimization seeks an optimal element from a finite (or countably infinite) set. The feasible set typically has combinatorial structure (subsets, permutations, matchings, paths) that prevents simple enumeration.

minimize    f(S)
subject to  S in F ⊆ 2^E

where E is a ground set and F is a family of feasible subsets.

Matroid Theory

A matroid M = (E, I) consists of a ground set E and a family of independent sets I satisfying:

  1. Empty set is in I.
  2. Hereditary: if A in I and B ⊆ A, then B in I.
  3. Exchange: if A, B in I and |A| < |B|, then there exists e in B \ A with A ∪ {e} in I.

Key Examples

| Matroid | Ground Set | Independent Sets |
|---------|-----------|------------------|
| Uniform U_{k,n} | n elements | All subsets of size <= k |
| Graphic | Edges of a graph | Acyclic edge subsets (forests) |
| Linear/Vectorial | Vectors | Linearly independent subsets |
| Partition | Elements partitioned into groups | Sets with <= k_i elements from group i |
| Transversal | One vertex class of a bipartite graph | Subsets matchable into the other class |

Why Matroids Matter

For an independence system (E, I), the greedy algorithm solves max { w(S) : S in I } for every nonnegative weight function w if and only if (E, I) is a matroid (Rado-Edmonds). This explains why greedy works for minimum spanning trees (graphic matroid) and single-machine scheduling with deadlines (partition matroid).
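
The greedy-on-a-matroid principle can be sketched on the graphic matroid: sort edges by weight and keep an edge exactly when it preserves independence (acyclicity), checked with a union-find structure. Sorting in ascending order instead gives Kruskal's MST algorithm; the example data is illustrative.

```python
# Greedy maximum-weight independent set in the graphic matroid.
# Independence test: adding edge (u, v) keeps the set a forest,
# i.e. u and v lie in different union-find components.

def find(parent, x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x

def greedy_spanning_forest(n, edges):
    """edges: list of (weight, u, v); returns a max-weight spanning forest."""
    parent = list(range(n))
    chosen = []
    for w, u, v in sorted(edges, reverse=True):  # heaviest edges first
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:                 # independent: no cycle is created
            parent[ru] = rv
            chosen.append((w, u, v))
    return chosen

forest = greedy_spanning_forest(4, [(5, 0, 1), (4, 1, 2), (3, 0, 2), (2, 2, 3)])
print(sum(w for w, _, _ in forest))  # 11: edges of weight 5, 4, 2 are kept
```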

Matroid intersection: optimizing over the intersection of two matroids is polynomial (Edmonds). This captures bipartite matching, arborescences, and colorful spanning trees. Three-matroid intersection is NP-hard.

Matroid union: the union of k matroids on a common ground set is again a matroid, and efficient algorithms exist. This underlies results such as packing edge-disjoint spanning trees (Nash-Williams/Tutte).

Submodular Functions

A set function f: 2^E -> R is submodular if for all A ⊆ B ⊆ E and e not in B:

f(A ∪ {e}) - f(A) >= f(B ∪ {e}) - f(B)

This is the diminishing returns property. Equivalently: f(A) + f(B) >= f(A ∪ B) + f(A ∩ B).

Key Examples

  • Cut function of a graph: f(S) = edges crossing (S, V\S).
  • Coverage functions: f(S) = |∪_{i in S} A_i|.
  • Entropy: f(S) = H(X_S) for jointly distributed random variables.
  • Rank function of a matroid.
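
For a small instance, the diminishing-returns inequality can be verified exhaustively. The sketch below checks it for a coverage function over illustrative sets:

```python
from itertools import chain, combinations

# Brute-force check that a coverage function is submodular: verify
# f(A + e) - f(A) >= f(B + e) - f(B) for all A subset of B and e outside B.

cover_sets = {0: {1, 2}, 1: {2, 3}, 2: {3, 4, 5}}  # illustrative data

def f(S):
    return len(set().union(*(cover_sets[i] for i in S))) if S else 0

def powerset(items):
    items = list(items)
    return chain.from_iterable(combinations(items, r)
                               for r in range(len(items) + 1))

ground = list(cover_sets)
ok = all(
    f(set(A) | {e}) - f(set(A)) >= f(set(B) | {e}) - f(set(B))
    for B in powerset(ground)
    for A in powerset(B)
    for e in ground if e not in B
)
print(ok)  # True: marginal gains only shrink as the set grows
```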

Optimization of Submodular Functions

Maximization (NP-hard in general):

  • Unconstrained (possibly non-monotone): randomized 1/2-approximation (tight).
  • Monotone + cardinality constraint: greedy gives (1 - 1/e) ≈ 0.632 approximation (tight).
  • Monotone + matroid constraint: greedy gives 1/2; continuous greedy gives (1 - 1/e).
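
The cardinality-constrained greedy is a few lines: repeatedly add the element with the largest marginal gain. A minimal sketch with an illustrative coverage objective (the (1 - 1/e) guarantee holds for any monotone submodular f):

```python
# Greedy for monotone submodular maximization subject to |S| <= k.

def greedy_max(f, ground, k):
    S = set()
    for _ in range(k):
        # element with the largest marginal gain f(S + e) - f(S)
        best = max((e for e in ground if e not in S),
                   key=lambda e: f(S | {e}) - f(S))
        S.add(best)
    return S

cover_sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6}, 3: {1, 6}}

def coverage(S):
    return len(set().union(*(cover_sets[i] for i in S))) if S else 0

S = greedy_max(coverage, cover_sets, 2)
print(S, coverage(S))  # picks {0, 2}, covering all 6 universe elements
```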

Minimization: polynomial via the ellipsoid method or combinatorial algorithms (Iwata-Fleischer-Fujishige, Schrijver). Practically solved by minimum norm point algorithms.

Network Optimization

Minimum Cost Flow

minimize    sum_{(i,j)} c_ij * x_ij
subject to  sum_j x_ij - sum_j x_ji = b_i   for all i  (flow conservation)
            l_ij <= x_ij <= u_ij              for all (i,j)  (capacity)

Subsumes shortest path (single source-sink, unit flow), max flow (maximize total flow, zero cost), assignment (bipartite, unit capacities), and transportation problems.

Algorithms: network simplex (specialized simplex for network structure, very fast in practice), successive shortest paths, cost scaling. Polynomial time due to total unimodularity.
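
The successive shortest paths idea fits in a short sketch: repeatedly augment along a cheapest s-t path in the residual network, found with Bellman-Ford (which tolerates the negative-cost reverse arcs). Intended for small instances, not as a tuned implementation:

```python
# Min-cost flow via successive shortest paths on an adjacency-list residual
# network. Paired arc indices: reverse of arc eid is eid ^ 1.

def min_cost_flow(n, edges, s, t, target):
    """edges: list of (u, v, capacity, cost). Returns (flow sent, total cost)."""
    graph = []                      # arc records: [head, residual_cap, cost]
    adj = [[] for _ in range(n)]
    for u, v, cap, cost in edges:
        adj[u].append(len(graph)); graph.append([v, cap, cost])
        adj[v].append(len(graph)); graph.append([u, 0, -cost])  # reverse arc
    flow = total_cost = 0
    while flow < target:
        INF = float('inf')
        dist = [INF] * n
        dist[s] = 0
        prev_edge = [-1] * n
        for _ in range(n - 1):      # Bellman-Ford on the residual network
            for u in range(n):
                if dist[u] == INF:
                    continue
                for eid in adj[u]:
                    v, cap, cost = graph[eid]
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v] = dist[u] + cost
                        prev_edge[v] = eid
        if dist[t] == INF:
            break                   # no augmenting path remains
        push = target - flow        # bottleneck residual capacity on the path
        v = t
        while v != s:
            push = min(push, graph[prev_edge[v]][1])
            v = graph[prev_edge[v] ^ 1][0]
        v = t
        while v != s:               # apply the augmentation
            graph[prev_edge[v]][1] -= push
            graph[prev_edge[v] ^ 1][1] += push
            v = graph[prev_edge[v] ^ 1][0]
        flow += push
        total_cost += push * dist[t]
    return flow, total_cost

# two units via 0->1->2 (cost 2 per unit), one unit via 0->2 (cost 5)
print(min_cost_flow(3, [(0, 1, 2, 1), (0, 2, 1, 5), (1, 2, 2, 1)], 0, 2, 3))
```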

Maximum Flow / Minimum Cut

Max-flow min-cut theorem: the maximum flow from s to t equals the minimum capacity of an s-t cut.

Algorithms:

  • Ford-Fulkerson: O(E f*) with integer capacities; Edmonds-Karp (BFS augmenting paths): O(V E^2).
  • Push-relabel (Goldberg-Tarjan): O(V^2 E), O(V^3) with highest-label.
  • Orlin: O(VE).
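
Edmonds-Karp in a compact form, assuming a dict-of-dicts capacity graph (illustrative input at the end):

```python
from collections import deque

# Ford-Fulkerson with BFS-chosen (shortest) augmenting paths = Edmonds-Karp.

def max_flow(capacity, s, t):
    """capacity: {u: {v: cap}} adjacency dict; returns the max s-t flow value."""
    res = {}                               # residual capacities, both directions
    for u in capacity:
        for v, c in capacity[u].items():
            res.setdefault(u, {})[v] = res.get(u, {}).get(v, 0) + c
            res.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        parent = {s: None}                 # BFS tree of a shortest augmenting path
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in res.get(u, {}).items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow                    # no augmenting path: flow is maximum
        push = float('inf')
        v = t
        while parent[v] is not None:       # bottleneck capacity along the path
            push = min(push, res[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:       # update residual capacities
            res[parent[v]][v] -= push
            res[v][parent[v]] += push
            v = parent[v]
        flow += push

print(max_flow({0: {1: 3, 2: 2}, 1: {2: 2, 3: 2}, 2: {3: 3}}, 0, 3))  # 5
```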

Matching

  • Bipartite: maximum matching via augmenting paths (Hopcroft-Karp, O(E sqrt(V))). Weighted: Hungarian algorithm O(n^3).
  • General: Edmonds' blossom algorithm for maximum matching. Weighted: O(n^3).
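
The augmenting-path idea for bipartite matching can be sketched with Kuhn's algorithm, a simpler O(V E) relative of Hopcroft-Karp (illustrative graph at the end):

```python
# Maximum bipartite matching by repeated augmenting-path search (Kuhn's
# algorithm): for each left vertex, try to place it, displacing matched
# right vertices recursively.

def max_bipartite_matching(adj, n_right):
    """adj[u]: right-side vertices adjacent to left vertex u."""
    match_right = [-1] * n_right  # match_right[v] = left partner of v, or -1

    def try_augment(u, seen):
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                # v is free, or its current partner can move elsewhere
                if match_right[v] == -1 or try_augment(match_right[v], seen):
                    match_right[v] = u
                    return True
        return False

    return sum(try_augment(u, set()) for u in range(len(adj)))

# 3 left and 3 right vertices; a perfect matching exists here
print(max_bipartite_matching([[0, 1], [0], [1, 2]], 3))  # 3
```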

Facility Location

Given clients and potential facility locations, open a subset of facilities and assign clients to minimize total cost (opening + connection).

Uncapacitated Facility Location (UFL):

minimize  sum_i f_i y_i + sum_{i,j} c_ij x_ij
subject to  sum_i x_ij = 1  for all j,  x_ij <= y_i,  x,y in {0,1}

Best known approximation: 1.488 (Li 2013), via LP rounding. Primal-dual and greedy methods (Jain-Mahdian-Saberi) achieve 1.61. The problem is MAX-SNP hard, so no PTAS exists unless P = NP; the known approximation lower bound is 1.463 (Guha-Khuller).
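
For intuition, UFL can be solved exactly on tiny instances by enumerating the open-facility subset; once the open set is fixed, each client simply uses its cheapest open facility. Costs below are illustrative:

```python
from itertools import combinations

# Exact UFL by subset enumeration (exponential: tiny instances only).

open_cost = [3, 4]                  # f_i: cost of opening facility i
conn = [[1, 3], [2, 2], [4, 1]]     # conn[j][i]: client j to facility i

def ufl_cost(S):
    return (sum(open_cost[i] for i in S)
            + sum(min(row[i] for i in S) for row in conn))

candidates = (S for r in range(1, len(open_cost) + 1)
              for S in combinations(range(len(open_cost)), r))
best = min(candidates, key=ufl_cost)
print(best, ufl_cost(best))  # opening facility 0 alone costs 3 + (1+2+4) = 10
```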

k-median, k-means, k-center: variants with different objectives and cardinality constraints on open facilities.

Vehicle Routing Problem (VRP)

Generalization of TSP: multiple vehicles serve customers from a depot, minimizing total travel cost.

Capacitated VRP (CVRP): each vehicle has capacity Q. Exact methods solve instances up to ~100 customers. Heuristics (LKH, adaptive large neighborhood search) handle thousands.

Variants: time windows (VRPTW), pickup and delivery, heterogeneous fleet, dynamic/stochastic demands.

State-of-the-art: branch-cut-and-price combining column generation (routes as columns) with cutting planes and branching.

Constraint Programming (CP)

CP models problems using variables with finite domains and constraints (not necessarily linear).

Core Techniques

  • Domain propagation: reduce variable domains by enforcing arc consistency (and higher levels).
  • Global constraints: alldifferent, cumulative, circuit, table—encode common patterns with specialized propagation.
  • Search: systematic backtracking with variable/value selection heuristics.
  • Nogood learning: record and propagate infeasibility explanations (inspired by SAT CDCL).

CP excels at scheduling, rostering, and highly constrained feasibility problems where LP relaxations are weak.
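
The search component can be sketched as systematic backtracking with a first-fail variable heuristic; real solvers layer domain propagation (arc consistency, global-constraint propagators) on top of this skeleton. The all-different instance below is illustrative, expressed as pairwise disequalities:

```python
import operator

# Minimal CP-style backtracking search over finite domains.

def consistent(assignment, constraints):
    return all(pred(assignment[x], assignment[y])
               for x, y, pred in constraints
               if x in assignment and y in assignment)

def solve(domains, constraints, assignment=None):
    assignment = assignment if assignment is not None else {}
    if len(assignment) == len(domains):
        return dict(assignment)
    # first-fail: branch on the unassigned variable with the smallest domain
    var = min((v for v in domains if v not in assignment),
              key=lambda v: len(domains[v]))
    for val in sorted(domains[var]):
        assignment[var] = val
        if consistent(assignment, constraints):
            result = solve(domains, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]
    return None

domains = {'x': {1, 2}, 'y': {1, 2, 3}, 'z': {1, 2, 3}}
constraints = [('x', 'y', operator.ne), ('y', 'z', operator.ne),
               ('x', 'z', operator.ne)]
print(solve(domains, constraints))  # one all-different assignment
```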

CP vs MIP

| Aspect | CP | MIP |
|--------|----|-----|
| Bounds | Weak (domain-based) | Strong (LP relaxation) |
| Feasibility | Strong (propagation) | Moderate |
| Logical constraints | Native | Requires linearization |
| Objective optimization | Weaker | Strong |

Hybrid: CP-SAT (Google OR-Tools) and CPLEX CP Optimizer integrate CP and MIP reasoning.

SAT-Based Optimization

MaxSAT

Given a CNF formula with hard clauses and weighted soft clauses, find an assignment that satisfies all hard clauses and maximizes the total weight of satisfied soft clauses; this general form is known as weighted partial MaxSAT.

Algorithms:

  • Core-guided (Fu-Malik, OLL): iteratively find unsatisfiable cores and add relaxation variables.
  • Linear search: encode a bound on the objective, solve SAT, and tighten the bound until optimal.
  • Branch and bound with unit propagation.
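
On a tiny formula the objective can be evaluated by brute force, which makes the problem definition concrete (real solvers replace the scan with SAT calls plus bounding; the clauses here are illustrative):

```python
from itertools import product

# Brute-force weighted partial MaxSAT. Literals: positive int v means
# variable v, negative int means its negation.

hard = [[1, 2], [-1, -2]]            # x1 or x2, but not both
soft = [([1], 5), ([2], 3)]          # prefer x1 (weight 5) over x2 (weight 3)

def satisfied(clause, assign):
    return any(assign[abs(lit)] == (lit > 0) for lit in clause)

best = -1
for bits in product([False, True], repeat=2):
    assign = dict(enumerate(bits, start=1))
    if all(satisfied(c, assign) for c in hard):   # hard clauses must hold
        weight = sum(w for c, w in soft if satisfied(c, assign))
        best = max(best, weight)
print(best)  # 5: set x1 true, x2 false
```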

Pseudo-Boolean Optimization

Optimize a linear function of binary variables subject to pseudo-Boolean constraints (linear inequalities over {0,1}). Generalizes MaxSAT. Solvers extend SAT with cutting planes and conflict-driven search.

Applications of SAT-Based Methods

  • Verification and model checking
  • Configuration problems
  • Scheduling with complex logical constraints
  • Combinatorial design

Approximation Algorithms

For NP-hard problems, approximation algorithms provide provable guarantees:

| Problem | Best Known Ratio | Technique |
|---------|------------------|-----------|
| Vertex Cover | 2 | LP rounding / primal-dual |
| Set Cover | O(ln n) | Greedy (tight) |
| TSP (metric) | 3/2 | Christofides-Serdyukov |
| Max-Cut | 0.878 | SDP rounding (GW) |
| Steiner Tree | ln(4) + epsilon | Iterative rounding |
| Facility Location | 1.488 | LP rounding |

Inapproximability: PCP theorem and unique games conjecture establish limits on achievable ratios.

Connections

  • Integer programming formulations for combinatorial problems (→ 02-integer-programming.md).
  • SDP relaxations for approximation algorithms (→ 05-conic-programming.md).
  • Submodular maximization via stochastic methods (→ 08-stochastic-optimization.md).
  • Solver landscape for combinatorial problems (→ 09-optimization-solvers.md).