Combinatorial Optimization
Overview
Combinatorial optimization seeks an optimal element from a finite (or countably infinite) set. The feasible set typically has combinatorial structure (subsets, permutations, matchings, paths) that prevents simple enumeration.
minimize f(S)
subject to S in F ⊆ 2^E
where E is a ground set and F is a family of feasible subsets.
Matroid Theory
A matroid M = (E, I) consists of a ground set E and a family of independent sets I satisfying:
- Empty set is in I.
- Hereditary: if A in I and B ⊆ A, then B in I.
- Exchange: if A, B in I and |A| < |B|, then there exists e in B \ A with A ∪ {e} in I.
Key Examples
| Matroid | Ground Set | Independent Sets |
|---------|------------|------------------|
| Uniform U_{k,n} | n elements | All subsets of size <= k |
| Graphic | Edges of a graph | Acyclic edge subsets (forests) |
| Linear/Vectorial | Vectors | Linearly independent subsets |
| Partition | Elements partitioned into groups | Sets with <= k_i elements from group i |
| Transversal | One vertex class of a bipartite graph | Subsets matchable into the other class |
Why Matroids Matter
The greedy algorithm (add elements in decreasing weight order whenever independence is preserved) finds a maximum-weight independent set for every weight function if and only if (E, I) is a matroid (Rado-Edmonds). This explains why greedy works for minimum spanning trees (graphic matroid) and for scheduling unit-time jobs with deadlines on one machine (a matroid on the job set).
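A minimal sketch of matroid greedy on the graphic matroid: independence ("adding this edge keeps the set acyclic") is tested with union-find, which yields the maximum-weight spanning forest. The edge list is illustrative.

```python
# Greedy maximum-weight basis of a graphic matroid = maximum spanning
# forest. Independence test: does the new edge keep the set acyclic?

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False  # edge would create a cycle: not independent
        self.parent[ra] = rb
        return True

def greedy_max_weight_forest(n, weighted_edges):
    """Scan edges by decreasing weight; keep each one that stays independent."""
    uf = UnionFind(n)
    basis = []
    for w, u, v in sorted(weighted_edges, reverse=True):
        if uf.union(u, v):
            basis.append((w, u, v))
    return basis

edges = [(4, 0, 1), (2, 1, 2), (5, 0, 2), (1, 2, 3)]
forest = greedy_max_weight_forest(4, edges)
```

Run on a minimum-cost instance by negating weights; correctness for every weight vector is exactly the matroid property.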
Matroid intersection: optimizing over the intersection of two matroids is polynomial (Edmonds). This captures bipartite matching, arborescences, and colorful spanning trees. Three-matroid intersection is NP-hard.
Matroid union: the union of k matroids on the same ground set is again a matroid, and maximum-rank sets can be computed in polynomial time (used, for example, to find k edge-disjoint spanning trees).
Submodular Functions
A set function f: 2^E -> R is submodular if for all A ⊆ B ⊆ E and e not in B:
f(A ∪ {e}) - f(A) >= f(B ∪ {e}) - f(B)
This is the diminishing returns property. Equivalently: f(A) + f(B) >= f(A ∪ B) + f(A ∩ B).
Key Examples
- Cut function of a graph: f(S) = edges crossing (S, V\S).
- Coverage functions: f(S) = |∪_{i in S} A_i|.
- Entropy: f(S) = H(X_S) for jointly distributed random variables.
- Rank function of a matroid.
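The diminishing-returns inequality can be verified exhaustively for a small coverage function; the sets A_i below are illustrative.

```python
# Check f(A ∪ {e}) - f(A) >= f(B ∪ {e}) - f(B) for all A ⊆ B ⊆ E, e ∉ B,
# where f(S) = |union of A_i for i in S| is a coverage function.
from itertools import chain, combinations

A = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6}}

def f(S):
    return len(set().union(*(A[i] for i in S))) if S else 0

def marginal(S, e):
    return f(S | {e}) - f(S)

def subsets(s):
    s = list(s)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

ground = set(A)
ok = all(
    marginal(set(Asub), e) >= marginal(set(Bsub), e)
    for Bsub in subsets(ground)
    for Asub in subsets(Bsub)
    for e in ground - set(Bsub)
)
```

`ok` comes out `True`, as it must: every coverage function is submodular.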
Optimization of Submodular Functions
Maximization (NP-hard in general):
- Unconstrained: a randomized 1/2-approximation (tight).
- Monotone + cardinality constraint: greedy gives (1 - 1/e) ≈ 0.632 approximation (tight).
- Monotone + matroid constraint: greedy gives 1/2; continuous greedy gives (1 - 1/e).
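The greedy rule behind the (1 - 1/e) guarantee is short enough to sketch directly, here for a monotone coverage objective under a cardinality constraint; the candidate sets are illustrative.

```python
# Greedy for max f(S) s.t. |S| <= k with f monotone submodular
# (here a coverage function): repeatedly add the element of largest
# marginal gain. Achieves a (1 - 1/e)-approximation.

universe_sets = {  # which elements each candidate covers (illustrative)
    "a": {1, 2, 3}, "b": {3, 4, 5, 6}, "c": {6, 7}, "d": {1, 4, 7, 8, 9},
}

def coverage(S):
    return len(set().union(*(universe_sets[i] for i in S))) if S else 0

def greedy(k):
    S = set()
    for _ in range(k):
        best = max(universe_sets.keys() - S,
                   key=lambda e: coverage(S | {e}) - coverage(S))
        S.add(best)
    return S

S = greedy(2)  # picks "d" (gain 5), then "b" (gain 3)
```

Lazy evaluation (caching marginal gains in a priority queue) makes this practical at scale without changing the guarantee.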
Minimization: polynomial via the ellipsoid method or combinatorial algorithms (Iwata-Fleischer-Fujishige, Schrijver). Practically solved by minimum norm point algorithms.
Network Optimization
Minimum Cost Flow
minimize sum_{(i,j)} c_ij * x_ij
subject to sum_j x_ij - sum_j x_ji = b_i for all i (flow conservation)
l_ij <= x_ij <= u_ij for all (i,j) (capacity)
Subsumes shortest path (single source-sink, unit flow), max flow (maximize total flow, zero cost), assignment (bipartite, unit capacities), and transportation problems.
Algorithms: network simplex (specialized simplex exploiting network structure, very fast in practice), successive shortest paths, cost scaling. The constraint matrix is totally unimodular, so LP relaxations have integral optimal solutions; dedicated combinatorial algorithms run in polynomial time.
Maximum Flow / Minimum Cut
Max-flow min-cut theorem: the maximum flow from s to t equals the minimum capacity of an s-t cut.
Algorithms:
- Ford-Fulkerson: O(E · F) with integral capacities and maximum flow value F; Edmonds-Karp (BFS augmenting paths): O(V E^2).
- Push-relabel (Goldberg-Tarjan): O(V^2 E) generic; O(V^3) with FIFO vertex selection, O(V^2 sqrt(E)) with highest-label selection.
- Orlin: O(VE).
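A compact Edmonds-Karp sketch, with the residual graph stored as a dict of dicts; the capacities are illustrative.

```python
# Edmonds-Karp: Ford-Fulkerson with BFS (shortest) augmenting paths.
from collections import deque

def edmonds_karp(cap, s, t):
    # Build residual graph: copy capacities, add zero-capacity reverse arcs.
    res = {u: dict(nbrs) for u, nbrs in cap.items()}
    for u in list(res):
        for v in list(res[u]):
            res.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow  # no augmenting path: flow is maximum
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        aug = min(res[u][v] for u, v in path)  # bottleneck capacity
        for u, v in path:
            res[u][v] -= aug
            res[v][u] += aug
        flow += aug

capacities = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}}
maxflow = edmonds_karp(capacities, "s", "t")  # value 5
```

By max-flow min-cut, the value 5 matches the capacity of the cut {a→t, b→t}.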
Matching
- Bipartite: maximum matching via augmenting paths (Hopcroft-Karp, O(E sqrt(V))). Weighted: Hungarian algorithm O(n^3).
- General: Edmonds' blossom algorithm for maximum matching. Weighted: O(n^3).
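For bipartite graphs, the augmenting-path idea reduces to Kuhn's O(V·E) algorithm, which Hopcroft-Karp accelerates; a sketch on a toy instance:

```python
# Kuhn's augmenting-path algorithm for maximum bipartite matching.
# adj maps each left vertex to its right neighbors (toy instance).

def max_bipartite_matching(adj, right_size):
    match_right = [-1] * right_size  # right vertex -> matched left vertex

    def try_augment(u, visited):
        # Try to match u, recursively evicting current partners.
        for v in adj[u]:
            if v not in visited:
                visited.add(v)
                if match_right[v] == -1 or try_augment(match_right[v], visited):
                    match_right[v] = u
                    return True
        return False

    return sum(try_augment(u, set()) for u in adj)

adj = {0: [0, 1], 1: [0], 2: [1, 2]}
size = max_bipartite_matching(adj, 3)  # perfect matching of size 3
```

Note the eviction step: left vertex 1 forces vertex 0 off right vertex 0, which re-matches to right vertex 1 along an augmenting path.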
Facility Location
Given clients and potential facility locations, open a subset of facilities and assign clients to minimize total cost (opening + connection).
Uncapacitated Facility Location (UFL):
minimize sum_i f_i y_i + sum_{i,j} c_ij x_ij
subject to sum_i x_ij = 1 for all j, x_ij <= y_i, x,y in {0,1}
Best known approximation: 1.488 (Li, 2013); dual fitting gives 1.61 (Jain-Mahdian-Saberi). The problem is MAX-SNP-hard, so no PTAS exists unless P = NP, and no factor better than 1.463 is achievable under standard complexity assumptions (Guha-Khuller).
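On tiny instances the UFL objective above can be evaluated by brute force, which makes the formulation concrete; all costs here are illustrative.

```python
# Brute-force UFL: enumerate nonempty facility subsets S, assign each
# client to its cheapest open facility, keep the cheapest total.
from itertools import chain, combinations

f = [3, 4]                      # opening costs f_i for facilities 0, 1
c = [[1, 5],                    # c[j][i]: connection cost, client j -> facility i
     [4, 1],
     [2, 2]]

best = None
facilities = range(len(f))
for S in chain.from_iterable(combinations(facilities, r)
                             for r in range(1, len(f) + 1)):
    cost = sum(f[i] for i in S) + sum(min(c[j][i] for i in S)
                                      for j in range(len(c)))
    if best is None or cost < best[0]:
        best = (cost, S)
```

Here opening only facility 0 (total cost 10) beats opening facility 1 or both, showing the opening/connection trade-off the approximation algorithms must balance.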
k-median, k-means, k-center: variants with different objectives and cardinality constraints on open facilities.
Vehicle Routing Problem (VRP)
Generalization of TSP: multiple vehicles serve customers from a depot, minimizing total travel cost.
Capacitated VRP (CVRP): each vehicle has capacity Q. Exact methods solve instances up to ~100 customers. Heuristics (LKH, adaptive large neighborhood search) handle thousands.
Variants: time windows (VRPTW), pickup and delivery, heterogeneous fleet, dynamic/stochastic demands.
State-of-the-art: branch-cut-and-price combining column generation (routes as columns) with cutting planes and branching.
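A classical CVRP heuristic, sketched on a toy instance with illustrative distances and demands, is Clarke-Wright savings: start with one route per customer and merge route ends in decreasing order of the saving s(i,j) = d(0,i) + d(0,j) - d(i,j).

```python
# Clarke-Wright savings heuristic for CVRP (node 0 is the depot).

d = {(0, 1): 4, (0, 2): 4, (0, 3): 3, (1, 2): 2, (1, 3): 5, (2, 3): 3}
def dist(i, j):
    return 0 if i == j else d[(min(i, j), max(i, j))]

demand = {1: 1, 2: 1, 3: 1}
Q = 2  # vehicle capacity

routes = [[i] for i in demand]  # one route per customer initially

def find_route(customer):
    for r in routes:
        if customer in r:
            return r

savings = sorted(((dist(0, i) + dist(0, j) - dist(i, j), i, j)
                  for i in demand for j in demand if i < j), reverse=True)
for s, i, j in savings:
    ri, rj = find_route(i), find_route(j)
    if ri is rj:
        continue
    # merge only at route ends, respecting vehicle capacity
    if ri[-1] == i and rj[0] == j and sum(demand[c] for c in ri + rj) <= Q:
        ri.extend(rj)
        routes.remove(rj)
    elif rj[-1] == j and ri[0] == i and sum(demand[c] for c in rj + ri) <= Q:
        rj.extend(ri)
        routes.remove(ri)
```

The merge of customers 1 and 2 (saving 6) is accepted; merging customer 3 into that route would exceed the capacity, so it stays on its own route.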
Constraint Programming (CP)
CP models problems using variables with finite domains and constraints (not necessarily linear).
Core Techniques
- Domain propagation: reduce variable domains by enforcing arc consistency (and higher levels).
- Global constraints: alldifferent, cumulative, circuit, table—encode common patterns with specialized propagation.
- Search: systematic backtracking with variable/value selection heuristics.
- Nogood learning: record and propagate infeasibility explanations (inspired by SAT CDCL).
CP excels at scheduling, rostering, and highly constrained feasibility problems where LP relaxations are weak.
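The propagation-plus-search loop can be sketched in a few lines; here forward checking on n-queens stands in for full arc consistency, with one variable per row and pairwise no-attack constraints. The structure is illustrative, not a real CP solver API.

```python
# Minimal CP-style solver: backtracking search with domain propagation
# (forward checking) and first-fail variable selection, on n-queens.

def solve_queens(n):
    def propagate(domains, row, col):
        # Prune values inconsistent with a queen placed at (row, col).
        new = {}
        for r, dom in domains.items():
            kept = {c for c in dom
                    if c != col and abs(c - col) != abs(r - row)}
            if not kept:
                return None  # domain wipe-out: prune this branch
            new[r] = kept
        return new

    def search(domains, assignment):
        if len(assignment) == n:
            return assignment
        # first-fail heuristic: branch on the smallest remaining domain
        row = min(domains, key=lambda r: len(domains[r]))
        for col in sorted(domains[row]):
            rest = {r: dm for r, dm in domains.items() if r != row}
            pruned = propagate(rest, row, col)
            if pruned is None:
                continue
            result = search(pruned, {**assignment, row: col})
            if result is not None:
                return result
        return None

    return search({r: set(range(n)) for r in range(n)}, {})

sol = solve_queens(6)  # dict: row -> column
```

Production solvers replace forward checking with stronger propagators (e.g., alldifferent via matching) and add nogood learning on top of this skeleton.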
CP vs MIP
| Aspect | CP | MIP |
|--------|----|-----|
| Bounds | Weak (domain-based) | Strong (LP relaxation) |
| Feasibility | Strong (propagation) | Moderate |
| Logical constraints | Native | Requires linearization |
| Objective optimization | Weaker | Strong |
Hybrid: CP-SAT (Google OR-Tools) and CPLEX CP Optimizer integrate CP and MIP reasoning.
SAT-Based Optimization
MaxSAT
Given a CNF formula with hard clauses and weighted soft clauses, find an assignment satisfying all hard clauses that maximizes the total weight of satisfied soft clauses; this general form is known as weighted partial MaxSAT.
Algorithms:
- Core-guided (e.g., Fu-Malik, OLL): iteratively find unsatisfiable cores and add relaxation variables.
- Linear search: fix a bound, solve SAT, tighten.
- Branch and bound with unit propagation.
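The MaxSAT objective itself is easy to state in code; the sketch below evaluates it by brute-force enumeration on a toy instance (the clauses are illustrative), where a real solver would instead make iterated SAT calls as in the schemes above.

```python
# Weighted partial MaxSAT objective, evaluated by brute force.
# A clause is a list of literals: positive/negative nonzero ints.
from itertools import product

hard = [[1, 2]]                                # x1 or x2 must hold
soft = [([-1], 3), ([-2], 2), ([1, -2], 1)]    # (clause, weight) pairs

def satisfies(assign, clause):
    return any(assign[abs(lit)] == (lit > 0) for lit in clause)

n_vars = 2
best = -1
for bits in product([False, True], repeat=n_vars):
    assign = {i + 1: b for i, b in enumerate(bits)}
    if all(satisfies(assign, c) for c in hard):  # hard clauses are mandatory
        weight = sum(w for c, w in soft if satisfies(assign, c))
        best = max(best, weight)
```

The two soft clauses with weights 3 and 2 conflict through the hard clause, so the optimum here is 3, not 5: exactly the tension MaxSAT solvers resolve.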
Pseudo-Boolean Optimization
Optimize a linear function of binary variables subject to pseudo-Boolean constraints (linear inequalities over {0,1}). Generalizes MaxSAT. Solvers extend SAT with cutting planes and conflict-driven search.
Applications of SAT-Based Methods
- Verification and model checking
- Configuration problems
- Scheduling with complex logical constraints
- Combinatorial design
Approximation Algorithms
For NP-hard problems, approximation algorithms provide provable guarantees:
| Problem | Best Known Ratio | Technique |
|---------|------------------|-----------|
| Vertex Cover | 2 | LP rounding / primal-dual |
| Set Cover | O(ln n) | Greedy (tight) |
| TSP (metric) | 3/2 | Christofides-Serdyukov |
| Max-Cut | 0.878 | SDP rounding (Goemans-Williamson) |
| Steiner Tree | ln(4) + epsilon | Iterative randomized rounding |
| Facility Location | 1.488 | LP rounding |
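The greedy set cover entry from the table is a few lines of code: repeatedly pick the set covering the most uncovered elements, which achieves the H_n <= ln n + 1 guarantee. The instance is illustrative.

```python
# Greedy set cover: O(ln n)-approximation, and this analysis is tight.

def greedy_set_cover(universe, sets):
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # pick the set with the largest number of still-uncovered elements
        name, best = max(sets.items(), key=lambda kv: len(kv[1] & uncovered))
        if not best & uncovered:
            raise ValueError("instance is infeasible")
        chosen.append(name)
        uncovered -= best
    return chosen

sets = {"A": {1, 2, 3}, "B": {2, 4}, "C": {3, 4, 5}, "D": {5}}
cover = greedy_set_cover({1, 2, 3, 4, 5}, sets)  # ["A", "C"]
```

This is the same diminishing-returns greedy as in the submodular maximization section: coverage is monotone submodular, which is exactly why the logarithmic guarantee holds.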
Inapproximability: PCP theorem and unique games conjecture establish limits on achievable ratios.
Connections
- Integer programming formulations for combinatorial problems (→ 02-integer-programming.md).
- SDP relaxations for approximation algorithms (→ 05-conic-programming.md).
- Submodular maximization via stochastic methods (→ 08-stochastic-optimization.md).
- Solver landscape for combinatorial problems (→ 09-optimization-solvers.md).