Combinatorial Optimization
Combinatorial optimization finds the best solution from a finite (but often exponentially large) set of possibilities. Many fundamental problems are NP-hard, driving the study of approximation algorithms, heuristics, and structural insights.
Traveling Salesman Problem (TSP)
Problem: Given n cities and distances between every pair, find the shortest tour that visits each city exactly once and returns to the starting city.
Formally: Find a minimum-weight Hamiltonian cycle in a complete weighted graph.
Complexity
- Decision version ("Is there a tour of length ≤ k?") is NP-complete.
- No polynomial-time algorithm is known (and none exists if P ≠ NP).
- The number of possible tours: (n-1)!/2 (for symmetric TSP).
Metric TSP
If the distances satisfy the triangle inequality (d(a,c) ≤ d(a,b) + d(b,c)):
- Nearest-neighbor heuristic: Greedily visit the closest unvisited city. Approximation ratio: O(log n).
- Christofides' algorithm (1976): 3/2-approximation. Uses MST + minimum-weight perfect matching on odd-degree vertices.
- Best known: Slightly below 3/2 — a (3/2 − ε)-approximation for a tiny constant ε — by Karlin, Klein, and Oveis Gharan (2020).
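The nearest-neighbor heuristic above can be sketched in a few lines. This is a minimal illustration, assuming cities given as 2-D points with Euclidean distances (the function names are my own):

```python
import math

def nearest_neighbor_tour(points, start=0):
    """Greedy TSP heuristic: repeatedly visit the closest unvisited city.
    `points` is a list of (x, y) coordinates; returns the visiting order."""
    n = len(points)
    unvisited = set(range(n)) - {start}
    tour = [start]
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda c: math.dist(points[last], points[c]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour  # the cycle closes by returning from tour[-1] to tour[0]

def tour_length(points, tour):
    """Total length of the closed tour."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))
```

Despite its O(log n) worst-case ratio, nearest-neighbor is popular as a fast way to seed local-search improvements such as 2-opt.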
Exact Algorithms
- Brute force: O(n!) — impractical for n > 15.
- Dynamic programming (Held-Karp): O(n² · 2ⁿ) time, O(n · 2ⁿ) space. Practical for n ≤ 25.
- Branch and bound: With good bounds (e.g., LP relaxation, 1-tree), can solve instances with hundreds of cities.
- Concorde solver: The state-of-the-art exact TSP solver. Has solved instances with over 85,000 cities.
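The Held-Karp recurrence can be sketched with bitmask subsets — a direct (unoptimized) rendering of the O(n² · 2ⁿ) dynamic program, using my own function name:

```python
from itertools import combinations

def held_karp(dist):
    """Held-Karp dynamic program: exact TSP in O(n^2 * 2^n) time.
    dist[i][j] is the distance from city i to city j; returns min tour cost."""
    n = len(dist)
    # dp[(S, j)]: cheapest path from city 0 that visits exactly the cities
    # in bitmask S and ends at city j (city 0 itself is not in S).
    dp = {(1 << j, j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            S = sum(1 << j for j in subset)
            for j in subset:
                prev = S & ~(1 << j)
                dp[(S, j)] = min(dp[(prev, k)] + dist[k][j]
                                 for k in subset if k != j)
    full = (1 << n) - 2  # every city except 0
    return min(dp[(full, j)] + dist[j][0] for j in range(1, n))
```

The 2ⁿ memory is the real bottleneck in practice, which is why branch and bound takes over beyond roughly 25 cities.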
TSP Variants
- Asymmetric TSP: d(a,b) ≠ d(b,a). Harder — a constant-factor approximation was found only recently (Svensson, Tarnawski, and Végh, 2018); before that, the best known ratio was O(log n / log log n).
- TSP with time windows: Each city must be visited within a specific time interval.
- Euclidean TSP: Cities are points in the plane. Admits a PTAS (Arora, Mitchell).
Minimum Vertex Cover
Problem: Find the smallest set of vertices such that every edge has at least one endpoint in the set.
Complexity: NP-hard.
Approximation:
- 2-approximation: Find a maximal matching M. Take both endpoints of every edge in M. This gives a vertex cover of size 2|M|, which is at most twice the minimum (since any vertex cover must include at least one endpoint of each matched edge).
- Assuming the Unique Games Conjecture, no polynomial-time (2 − ε)-approximation exists, so this simple algorithm is essentially optimal.
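The matching-based 2-approximation is short enough to write out. A minimal sketch (the function name is my own): scanning the edges and greedily keeping any edge whose endpoints are both still uncovered builds a maximal matching implicitly.

```python
def vertex_cover_2approx(edges):
    """2-approximation for vertex cover: greedily build a maximal matching
    and take both endpoints of every matched edge."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))  # edge (u, v) joins the maximal matching
    return cover
```

Any optimal cover must pick at least one endpoint per matched edge, so the result has at most twice the optimal size.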
Relationship:
- A set is a vertex cover iff its complement is an independent set; the two sets partition V.
- Minimum vertex cover = n - maximum independent set.
- In bipartite graphs: minimum vertex cover = maximum matching (König's theorem).
Maximum Independent Set
Problem: Find the largest set of vertices with no edges between them.
Complexity: NP-hard. Hard to approximate within n^(1-ε) for any ε > 0 (unless P = NP).
Special cases:
- Bipartite graphs: Maximum independent set = n - maximum matching (by König's theorem). Polynomial.
- Interval graphs: Greedy algorithm (sort by endpoint, greedily select non-overlapping). Polynomial.
- Planar graphs: PTAS exists (Baker's technique).
- Trees: Dynamic programming in O(n).
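The tree case can be sketched with the standard include/exclude DP, processing children before parents. A minimal illustration assuming an adjacency-list representation (function name my own):

```python
def max_independent_set_tree(n, adj, root=0):
    """O(n) DP on a tree: for each vertex track the best independent set
    in its subtree, with the vertex included vs. excluded."""
    incl = [1] * n   # best size with the vertex in the set
    excl = [0] * n   # best size with the vertex out of the set
    visited = [False] * n
    order, parent = [], [-1] * n
    stack = [root]
    visited[root] = True
    while stack:                     # iterative DFS for a processing order
        u = stack.pop()
        order.append(u)
        for w in adj[u]:
            if not visited[w]:
                visited[w] = True
                parent[w] = u
                stack.append(w)
    for u in reversed(order):        # children are processed before parents
        p = parent[u]
        if p >= 0:
            incl[p] += excl[u]                 # if p is in, u must be out
            excl[p] += max(incl[u], excl[u])   # if p is out, take u's best
    return max(incl[root], excl[root])
```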
Relationship to clique: Maximum independent set in G = maximum clique in complement G̅.
Maximum Clique
Problem: Find the largest complete subgraph.
Complexity: NP-hard. Same inapproximability as independent set.
ω(G) = size of maximum clique. For perfect graphs, ω(G) = χ(G).
Graph Partitioning
Minimum Cut
Problem: Partition V into two non-empty sets minimizing the number (or weight) of edges between them.
Algorithms:
- Stoer-Wagner: O(VE + V² log V) for undirected graphs.
- Karger's randomized algorithm: Based on random edge contraction; the Karger-Stein refinement runs in O(V² log³ V).
- Network flow methods (max-flow min-cut for s-t cuts).
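Karger's contraction idea can be sketched with a union-find structure — a minimal illustration, not a performance-tuned implementation (function name and trial count are my own; each trial succeeds with probability at least 2/(V(V−1)), so many trials are needed):

```python
import random

def karger_min_cut(n, edges, trials=200, seed=0):
    """Karger's randomized contraction: repeatedly merge the endpoints of a
    random edge until two super-vertices remain; the edges between them form
    a cut. The best cut over many trials is the min cut w.h.p."""
    rng = random.Random(seed)
    best = len(edges)
    for _ in range(trials):
        parent = list(range(n))          # union-find over vertices

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]   # path halving
                x = parent[x]
            return x

        components = n
        pool = edges[:]
        rng.shuffle(pool)                # random contraction order
        for u, v in pool:
            if components == 2:
                break
            ru, rv = find(u), find(v)
            if ru != rv:                 # contract edge (u, v)
                parent[ru] = rv
                components -= 1
        # count original edges crossing the two remaining super-vertices
        cut = sum(1 for u, v in edges if find(u) != find(v))
        best = min(best, cut)
    return best
```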
Balanced Partitioning
Minimum bisection: Partition V into two equal halves minimizing crossing edges.
NP-hard. Spectral methods (using the Fiedler vector of the Laplacian) provide practical heuristics.
Kernighan-Lin algorithm: Local search heuristic that iteratively swaps pairs of vertices between partitions.
Multilevel partitioning (METIS, KaHIP): Coarsen graph, partition the small graph, uncoarsen with refinement. Very effective in practice.
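The swap idea behind Kernighan-Lin can be sketched as a local search. This is a deliberately simplified variant (assumption: plain first-improvement swaps, without Kernighan-Lin's gain tables and vertex locking; function names are my own):

```python
def cut_size(edges, side):
    """Number of edges crossing the bisection given by side[v] in {0, 1}."""
    return sum(1 for u, v in edges if side[u] != side[v])

def swap_local_search(n, edges, side):
    """Swap cross-partition vertex pairs while the cut improves.
    Swapping (rather than moving) vertices keeps the bisection balanced."""
    improved = True
    while improved:
        improved = False
        current = cut_size(edges, side)
        A = [v for v in range(n) if side[v] == 0]
        B = [v for v in range(n) if side[v] == 1]
        for u in A:
            for v in B:
                side[u], side[v] = 1, 0          # tentative swap
                if cut_size(edges, side) < current:
                    improved = True               # keep the improving swap
                    break
                side[u], side[v] = 0, 1           # undo
            if improved:
                break
    return side
```

Real Kernighan-Lin also accepts temporarily worsening swaps within a pass, which helps it escape shallow local minima.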
Set Cover
Problem: Given a universe U and a collection S of subsets, find the minimum number of subsets whose union is U.
Complexity: NP-hard. Best approximation ratio: O(ln n) (greedy algorithm).
Greedy algorithm:
- While uncovered elements remain:
- Select the subset covering the most uncovered elements.
This achieves Hₙ ≈ ln n approximation ratio, which is essentially optimal (cannot do better than (1-ε) ln n unless P = NP).
Weighted version: Greedy by cost-effectiveness (cost / elements covered).
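The unweighted greedy rule sketches directly into code (function name my own; subsets are given as Python sets):

```python
def greedy_set_cover(universe, subsets):
    """Greedy H_n-approximation: repeatedly pick the subset covering the
    most still-uncovered elements. Returns indices of the chosen subsets."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        i = max(range(len(subsets)), key=lambda i: len(uncovered & subsets[i]))
        if not uncovered & subsets[i]:
            raise ValueError("universe cannot be covered by these subsets")
        chosen.append(i)
        uncovered -= subsets[i]
    return chosen
```

The weighted variant only changes the selection key: pick the subset minimizing cost divided by newly covered elements.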
Ramsey Theory
Ramsey theory studies the conditions under which order must appear in large enough structures.
Ramsey Numbers
R(s, t): The minimum number n such that any 2-coloring of the edges of Kₙ contains either a red Kₛ or a blue Kₜ.
R(3, 3) = 6: Among 6 people, there must be 3 mutual friends or 3 mutual strangers.
Proof: Take any person v. Of v's 5 edges, by pigeonhole at least 3 have the same color (say red). Among those 3 people:
- If any edge between them is red → red triangle with v.
- If no edge between them is red → they form a blue triangle. ∎
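The claim R(3, 3) = 6 is small enough to verify exhaustively — K₆ has only 2¹⁵ = 32,768 two-colorings. A brute-force check (function names my own):

```python
from itertools import combinations, product

def has_mono_triangle(n, coloring):
    """coloring maps each edge (i, j) with i < j to a color 0 or 1."""
    return any(coloring[(a, b)] == coloring[(a, c)] == coloring[(b, c)]
               for a, b, c in combinations(range(n), 3))

def forces_mono_triangle(n):
    """True iff every 2-coloring of K_n contains a monochromatic triangle."""
    edges = list(combinations(range(n), 2))
    return all(has_mono_triangle(n, dict(zip(edges, colors)))
               for colors in product((0, 1), repeat=len(edges)))
```

K₅ fails the check: coloring the pentagon red and the pentagram blue avoids any monochromatic triangle, so R(3, 3) > 5.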
Known Values and Bounds
| R(s,t) | t=3 | t=4 | t=5 |
|---|---|---|---|
| s=3 | 6 | 9 | 14 |
| s=4 | 9 | 18 | 25 |
| s=5 | 14 | 25 | 43–48 |
Upper bound (Erdős-Szekeres): R(s, t) ≤ C(s+t-2, s-1).
Lower bound (Erdős probabilistic argument): R(s, s) ≥ 2^(s/2). This was one of the first applications of the probabilistic method.
Current best for R(5,5): Known to be between 43 and 48. Despite decades of work, exact Ramsey numbers are notoriously hard to compute. As Erdős said: "Suppose aliens invade and demand R(5,5) or they'll destroy Earth. We should marshal all computers and mathematicians. But if they demand R(6,6), we should attack."
Ramsey Theory Applications
- Communication complexity: Ramsey-theoretic arguments give lower bounds.
- Coding theory: Connections to error-correcting codes.
- Geometry: Ramsey-type results in combinatorial geometry (Happy Ending Theorem).
Integer Programming Formulations
Many combinatorial optimization problems can be formulated as integer programs:
Vertex cover:
minimize Σ xᵥ
subject to xᵤ + xᵥ ≥ 1 for all edges (u,v)
xᵥ ∈ {0, 1}
Independent set:
maximize Σ xᵥ
subject to xᵤ + xᵥ ≤ 1 for all edges (u,v)
xᵥ ∈ {0, 1}
TSP (Miller-Tucker-Zemlin):
minimize Σ cᵢⱼ xᵢⱼ
subject to Σⱼ xᵢⱼ = 1 for all i
Σᵢ xᵢⱼ = 1 for all j
uᵢ − uⱼ + n·xᵢⱼ ≤ n − 1 for all i, j ∈ {2, …, n}, i ≠ j (MTZ subtour elimination)
1 ≤ uᵢ ≤ n − 1 for all i ∈ {2, …, n}
xᵢⱼ ∈ {0, 1}
The LP relaxation (allowing 0 ≤ x ≤ 1) provides lower bounds and guides branch-and-bound algorithms.
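On small instances the integer programs above can be solved by plain enumeration, which makes the formulation concrete. A sketch for the vertex-cover IP (function name my own; exponential in n, for illustration only):

```python
from itertools import product

def solve_vertex_cover_ip(n, edges):
    """Brute-force the 0/1 program: minimize sum(x) subject to
    x_u + x_v >= 1 for every edge (u, v), x_v in {0, 1}."""
    best = None
    for x in product((0, 1), repeat=n):
        if all(x[u] + x[v] >= 1 for u, v in edges):   # feasibility check
            if best is None or sum(x) < sum(best):
                best = x
    return best
```

A real solver would instead branch on fractional variables of the LP relaxation, pruning with its lower bound.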
Real-World Applications
- Logistics: TSP and vehicle routing for delivery, garbage collection, and circuit board drilling.
- Telecommunications: Network design (Steiner tree, minimum-cost connectivity).
- Scheduling: Job scheduling, exam timetabling, sports league scheduling.
- VLSI design: Partitioning circuits across chips, minimizing interconnect.
- Bioinformatics: Genome assembly, protein folding.
- Social networks: Community detection (graph partitioning), influence maximization.
- Resource allocation: Cloud computing task assignment, crew scheduling for airlines.