Complexity Theory Introduction
Complexity theory studies the resources (time, space) required to solve computational problems. While decidability asks "can it be solved?", complexity asks "how efficiently can it be solved?"
Time Complexity
Definitions
The time complexity of a TM M on input w is the number of steps M takes before halting.
The worst-case time complexity of M: T(n) = max over all inputs of length n of the number of steps.
A language L is in TIME(f(n)) if some TM decides L in O(f(n)) steps.
Complexity Classes
P (Polynomial Time)
P = ⋃_k TIME(n^k) = TIME(n) ∪ TIME(n²) ∪ TIME(n³) ∪ ...
Problems solvable in polynomial time by a deterministic TM.
Interpretation: P represents "efficiently solvable" problems.
Examples: Sorting (O(n log n)), shortest path (Dijkstra: O(V²), or O(E log V) with a binary heap), primality testing (in P by AKS; refined variants run in roughly O(log⁶ n)), linear programming, matching.
NP (Nondeterministic Polynomial Time)
NP = ⋃_k NTIME(n^k)
Problems solvable in polynomial time by a nondeterministic TM.
Equivalent definition (verifier): L ∈ NP iff there exists a polynomial-time verifier V such that:
w ∈ L iff ∃ certificate c with |c| = poly(|w|) and V(w, c) accepts
Interpretation: Problems where a solution can be verified in polynomial time, even if finding the solution might be hard.
Examples: SAT (certificate: satisfying assignment), Hamiltonian path (certificate: the path), graph coloring (certificate: the coloring), integer factoring (certificate: the factors).
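The verifier definition can be made concrete. Below is a minimal sketch of a polynomial-time SAT verifier in Python; the CNF encoding (lists of signed integers, DIMACS-style) and the dict-based assignment are illustrative choices, not part of any standard library.

```python
# Polynomial-time verifier for SAT.
# A CNF formula is a list of clauses; each clause is a list of nonzero
# ints, where k means variable k and -k means its negation (the DIMACS
# convention). The certificate is an assignment dict {var: bool}.

def verify_sat(clauses, assignment):
    """Return True iff `assignment` satisfies every clause.

    Runs in time linear in the number of literals -- polynomial in the
    input size, as the NP verifier definition requires.
    """
    return all(
        any(assignment.get(abs(lit), False) == (lit > 0) for lit in clause)
        for clause in clauses
    )

# (x1 or not x2) and (x2 or x3)
phi = [[1, -2], [2, 3]]
print(verify_sat(phi, {1: True, 2: False, 3: True}))    # True
print(verify_sat(phi, {1: False, 2: False, 3: False}))  # False
```

Note the asymmetry that defines NP: checking the certificate is trivial, while finding one may require searching 2ⁿ assignments.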
coNP
The complement class: L ∈ coNP iff L̄ ∈ NP.
Interpretation: Problems where "no" answers have short proofs.
Examples: TAUTOLOGY (a non-tautology has a short certificate, namely a falsifying assignment, so the complement is in NP), primality (actually in P, but naturally in coNP: a nontrivial factor is a short proof of compositeness, i.e. of a "no" answer).
P ⊆ NP ∩ coNP: Every problem in P is in both NP and coNP.
P vs NP
The central open problem in theoretical computer science: Is P = NP?
P = NP would mean: Every problem whose solution can be efficiently verified can also be efficiently solved.
P ≠ NP (widely believed) would mean: There exist problems where finding a solution is inherently harder than verifying one.
Consequences of P = NP:
- Cryptography collapses (RSA, Diffie-Hellman breakable in polynomial time).
- Optimization problems become easy (scheduling, routing, design).
- Proof search becomes efficient (mathematical discovery automated).
- AI and machine learning: many hard problems become tractable.
Why most believe P ≠ NP: Decades of effort haven't found polynomial algorithms for NP-complete problems. But no proof of separation exists.
Clay Millennium Prize: $1M for proof of P = NP or P ≠ NP.
NP-Completeness
Definition
A problem L is NP-complete if:
- L ∈ NP (solution verifiable in polynomial time)
- Every problem in NP is polynomial-time reducible to L (L is NP-hard)
NP-hard: At least as hard as everything in NP. May or may not be in NP itself.
NP-hard ⊇ NP-complete = NP-hard ∩ NP
Significance: If ANY NP-complete problem has a polynomial algorithm, then P = NP (all NP problems become polynomial via reduction).
Cook-Levin Theorem (1971)
Theorem: SAT (Boolean satisfiability) is NP-complete.
SAT: Given a Boolean formula φ, is there an assignment of variables that makes φ true?
Proof sketch: Every NP computation can be encoded as a Boolean formula. The formula is satisfiable iff the NP machine accepts. The encoding is polynomial.
This was the first NP-complete problem. All subsequent NP-completeness proofs reduce from SAT (or from another known NP-complete problem).
Polynomial-Time Reductions
A polynomial-time reduction from L₁ to L₂ (L₁ ≤_P L₂) is a polynomial-time computable function f such that:
w ∈ L₁ iff f(w) ∈ L₂
If L₂ ∈ P and L₁ ≤_P L₂, then L₁ ∈ P. If L₁ is NP-hard and L₁ ≤_P L₂, then L₂ is NP-hard.
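The transfer principle is mechanical: compose the reduction with a decider. A deliberately tiny toy example in Python (EVEN ≤_P ODD via f(n) = n + 1; the names are invented for illustration):

```python
# Toy reduction illustrating L1 <=_P L2: n is even iff n + 1 is odd,
# and f(n) = n + 1 is computable in polynomial time, so any decider
# for ODD immediately yields a decider for EVEN.

def f(n):
    return n + 1

def decide_odd(n):
    return n % 2 == 1

def decide_even(n):
    # Decide L1 by mapping the instance into L2 and using L2's decider.
    return decide_odd(f(n))

print(decide_even(4))  # True
print(decide_even(7))  # False
```

Real reductions (SAT to 3-SAT, 3-SAT to CLIQUE) follow exactly this shape; only the function f is harder to construct.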
NP-Complete Problems
3-SAT
SAT restricted to formulas in CNF where each clause has exactly 3 literals.
Reduction: SAT ≤_P 3-SAT. Split long clauses using auxiliary variables.
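The clause-splitting step can be sketched directly. A minimal Python version, again using DIMACS-style signed-integer literals (an encoding chosen here for illustration):

```python
# Splitting one long clause into 3-literal clauses with fresh
# auxiliary variables: the core step of SAT <=_P 3-SAT.
# Literals: k = variable k, -k = not k.

def split_clause(clause, next_var):
    """Turn (l1 v ... v lk), k > 3, into the equisatisfiable chain
    (l1 v l2 v y1)(-y1 v l3 v y2)...(-y_{k-3} v l_{k-1} v lk).
    `next_var` is the first unused variable index."""
    if len(clause) <= 3:
        return [clause], next_var
    out = [[clause[0], clause[1], next_var]]
    for lit in clause[2:-2]:
        out.append([-next_var, lit, next_var + 1])
        next_var += 1
    out.append([-next_var, clause[-2], clause[-1]])
    return out, next_var + 1

clauses, _ = split_clause([1, 2, 3, 4, 5], next_var=6)
print(clauses)  # [[1, 2, 6], [-6, 3, 7], [-7, 4, 5]]
```

The chain is satisfiable iff the original clause is: if some lᵢ is true, set the y's before it true and after it false; if all lᵢ are false, the implications force a contradiction.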
CLIQUE
Given graph G and integer k, does G have a clique (complete subgraph) of size k?
Reduction: 3-SAT ≤_P CLIQUE. Create a vertex for each literal occurrence in each clause. Add edges between vertices in different clauses whose literals are not negations of each other; the formula is satisfiable iff the graph has a clique of size equal to the number of clauses.
VERTEX-COVER
Given graph G and integer k, is there a set of k vertices that covers all edges?
Reduction: CLIQUE ≤_P VERTEX-COVER via complement graph. Clique of size k in G ↔ vertex cover of size n-k in Ḡ.
INDEPENDENT-SET
Given graph G and integer k, is there a set of k pairwise non-adjacent vertices?
Relationship: Independent set of size k in G ↔ clique of size k in Ḡ ↔ vertex cover of size n-k in G.
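This three-way correspondence is easy to check computationally. A small Python sketch on a path graph (the frozenset edge encoding is an illustrative choice):

```python
# S is an independent set in G  iff  S is a clique in the complement
# of G  iff  V \ S is a vertex cover of G.

from itertools import combinations

def complement(vertices, edges):
    return {frozenset(e) for e in combinations(vertices, 2)} - edges

def is_clique(S, edges):
    return all(frozenset({u, v}) in edges for u, v in combinations(S, 2))

def is_independent(S, edges):
    return all(frozenset({u, v}) not in edges for u, v in combinations(S, 2))

def is_vertex_cover(C, edges):
    return all(e & C for e in edges)  # every edge touches C

V = {1, 2, 3, 4}
E = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4)]}  # the path 1-2-3-4
S = {1, 3}

print(is_independent(S, E))            # True
print(is_clique(S, complement(V, E)))  # True
print(is_vertex_cover(V - S, E))       # True
```

Because the translations are trivial to compute, all three problems reduce to each other and are NP-complete together.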
HAMILTONIAN-PATH / HAMILTONIAN-CYCLE
Does the graph have a path/cycle visiting every vertex exactly once?
SUBSET-SUM
Given integers a₁, ..., aₙ and target t, is there a subset summing to t?
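SUBSET-SUM has a well-known pseudo-polynomial dynamic program, sketched below for positive integers. This does not contradict NP-completeness: t is written with only O(log t) bits, so an O(n·t) algorithm is exponential in the input length.

```python
# Pseudo-polynomial algorithm for SUBSET-SUM (positive integers),
# O(n * t) time: track the set of sums reachable so far.

def subset_sum(nums, t):
    reachable = {0}  # sums achievable using a prefix of nums
    for a in nums:
        reachable |= {s + a for s in reachable if s + a <= t}
    return t in reachable

print(subset_sum([3, 34, 4, 12, 5, 2], 9))   # True  (4 + 5)
print(subset_sum([3, 34, 4, 12, 5, 2], 30))  # False
```

Problems like this, hard only when the numbers are large, are called weakly NP-complete; KNAPSACK behaves the same way.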
TSP (Decision Version)
Given cities with distances and bound k, is there a tour of total length ≤ k?
Other NP-Complete Problems
3-COLORING, SET-COVER, KNAPSACK (decision version), BIN-PACKING, INTEGER PROGRAMMING, MAX-CUT, STEINER TREE, LONGEST PATH, ...
Thousands of NP-complete problems are known, spanning virtually every area of CS and applied mathematics.
Space Complexity
PSPACE
PSPACE = ⋃_k SPACE(n^k)
Problems solvable using polynomial space (but possibly exponential time).
PSPACE-complete problems: TQBF (True Quantified Boolean Formula), generalized versions of many games (e.g. generalized geography, Hex on an n×n board). Generalized chess is harder: EXPTIME-complete.
L (Logarithmic Space)
L = SPACE(log n)
Problems solvable using only O(log n) extra space (beyond the read-only input).
Examples: undirected graph reachability (shown to be in L by Reingold, 2005). Directed reachability (PATH) is NL-complete and not known to be in L.
NL (Nondeterministic Log Space)
NL = NSPACE(log n)
NL-complete: PATH (directed s-t connectivity).
Theorem (Immerman–Szelepcsényi): NL = coNL.
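PATH illustrates the gap between time and space resources: it is trivial in polynomial time (breadth-first search, sketched below), while the NL question is whether it can be solved in deterministic O(log n) space, where you cannot afford a visited set.

```python
# PATH (directed s-t reachability) via breadth-first search: O(V + E)
# time, but Theta(V) space for the `seen` set -- far more than the
# O(log n) space a deterministic log-space algorithm would be allowed.

from collections import deque

def path(adj, s, t):
    """adj maps each vertex to a list of out-neighbors."""
    seen, queue = {s}, deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False

g = {1: [2], 2: [3], 3: [], 4: [1]}
print(path(g, 1, 3))  # True
print(path(g, 3, 4))  # False
```

A nondeterministic log-space machine instead guesses the path one vertex at a time, storing only the current vertex and a step counter, which is why PATH lands in NL.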
Containment
L ⊆ NL ⊆ P ⊆ NP ⊆ PSPACE ⊆ EXPTIME
We know L ≠ PSPACE and P ≠ EXPTIME (by hierarchy theorems), but the relationships between adjacent classes are open.
EXPTIME
EXPTIME = ⋃_k TIME(2^(n^k))
P ≠ EXPTIME (provable by the time hierarchy theorem). So at least one of P ≠ NP or NP ≠ EXPTIME must hold (actually both are believed true).
EXPTIME-complete problems: Generalized chess (on n×n board), certain planning problems.
Relationship Summary
L ⊆ NL ⊆ P ⊆ NP ⊆ PSPACE = NPSPACE ⊆ EXPTIME ⊆ NEXPTIME ⊆ EXPSPACE
Known strict containments:
L ⊊ PSPACE (space hierarchy theorem)
NL ⊊ PSPACE (Savitch's theorem puts NL inside SPACE(log² n), then the space hierarchy theorem)
P ⊊ EXPTIME (time hierarchy theorem)
Known, but not known to be strict: NL ⊆ P (a log-space TM has only polynomially many configurations, so it can be simulated in polynomial time).
Unknown:
L = NL? P = NP? NP = PSPACE? NP = coNP?
Coping with NP-Hardness
When faced with an NP-hard problem in practice:
- Approximation algorithms: Find a near-optimal solution in polynomial time (e.g., 2-approximation for vertex cover).
- Parameterized algorithms: Polynomial in n, exponential only in a small parameter k (fixed-parameter tractability).
- Heuristics: Algorithms that work well in practice without theoretical guarantees (simulated annealing, genetic algorithms).
- Special cases: The problem may be polynomial for the specific instances you encounter (planar graphs, trees, bounded treewidth).
- Exact exponential algorithms: Better than brute force (e.g., O(1.2^n) instead of O(2^n)).
- SAT solvers: Modern SAT solvers handle many practical instances efficiently despite worst-case exponential time.
- Randomized algorithms: Polynomial expected time for some problems.
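The vertex-cover approximation mentioned above fits in a few lines. A minimal Python sketch of the classic matching-based algorithm:

```python
# 2-approximation for VERTEX-COVER: scan the edges, and whenever an
# edge is uncovered, add BOTH endpoints. The chosen edges form a
# matching, and any optimal cover must contain at least one endpoint
# of each, so |cover| <= 2 * OPT.

def vertex_cover_2approx(edges):
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover |= {u, v}
    return cover

E = [(1, 2), (2, 3), (3, 4), (4, 5)]
C = vertex_cover_2approx(E)
print(C)
print(all(u in C or v in C for u, v in E))  # True: every edge covered
```

On the path above the algorithm returns 4 vertices while the optimum is {2, 4}, hitting the factor-2 bound exactly; assuming P ≠ NP, no polynomial algorithm achieves a ratio below 1.36 for this problem.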
Applications in CS
- Algorithm design: Knowing a problem is NP-hard tells you to seek approximation algorithms or heuristics instead of exact polynomial algorithms.
- Cryptography: Security of RSA, AES, etc. relies on assumed hardness of certain problems (factoring, discrete log). If P = NP, modern cryptography is broken.
- AI planning: Many planning problems are PSPACE-complete. Practical planners use heuristics.
- Compiler optimization: Many optimization problems (register allocation, instruction scheduling) are NP-hard. Compilers use heuristics.
- Operations research: Scheduling, routing, resource allocation — often NP-hard. Solved with approximation and integer programming.
- Verification: Model checking is PSPACE-complete in general but tractable for many practical instances.