7 min read
On this page

Probabilistically Checkable Proofs

PCP: Definition

A probabilistically checkable proof (PCP) system for a language L provides:

  • A proof string pi (of polynomial length) written by an omniscient prover
  • A polynomial-time verifier V that uses r(n) random bits and queries q(n) bits of pi

The system satisfies:

  1. Completeness: If x in L, there exists pi such that V^pi(x) accepts with probability 1
  2. Soundness: If x not in L, for all pi, V^pi(x) accepts with probability at most 1/2

PCP[r(n), q(n)] denotes the class of languages with such verifiers.

The PCP Theorem

Theorem (Arora-Safra, 1998; Arora-Lund-Motwani-Sudan-Szegedy, 1998):

NP = PCP[O(log n), O(1)]

Every NP statement has a proof that can be verified by reading only a constant number of bits, using only O(log n) random bits.

This is extraordinary: a polynomial-length proof of a SAT instance can be checked by examining O(1) positions, with constant soundness error. The O(log n) random bits select which positions to examine (giving polynomially many possible checks, covering the proof).

Proof Overview

The proof proceeds in stages, combining algebraic and combinatorial techniques:

  1. Arithmetization: Encode the NP witness as evaluations of a low-degree polynomial over a finite field
  2. Linearity testing: Use Blum-Luby-Rubinfeld linearity tests to ensure the proof is "close to" a valid low-degree polynomial
  3. Low-degree testing: Verify that the proof string represents evaluations of a polynomial of bounded degree
  4. Consistency checking: Combine the above with the constraint structure of the original NP problem
  5. Alphabet reduction and composition: Use PCP composition to reduce the query complexity to O(1)

Dinur (2007) gave a substantially simpler combinatorial proof using gap amplification: starting from a "trivial" PCP (NP = PCP[poly(n), poly(n)]), iteratively amplify the soundness gap through graph powering and alphabet reduction, maintaining O(log n) randomness.

Inapproximability: The Fundamental Connection

The PCP theorem is equivalent to the existence of a gap-introducing reduction from SAT to itself. Specifically:

There exists epsilon > 0 and a polynomial-time reduction f such that:

  • If phi is satisfiable, then f(phi) is satisfiable
  • If phi is unsatisfiable, then no assignment satisfies more than (1 - epsilon) fraction of clauses of f(phi)

This transforms the PCP theorem from a proof-theoretic statement into a tool for hardness of approximation.

Hardness of Approximation

MAX-3SAT

Theorem (Hastad, 2001): It is NP-hard to distinguish between 3SAT instances where:

  • At least (1 - epsilon) fraction of clauses are satisfiable
  • At most (7/8 + epsilon) fraction of clauses are satisfiable

Since a random assignment satisfies 7/8 of all 3SAT clauses in expectation, this means: no polynomial-time algorithm can beat random assignment for MAX-3SAT (unless P = NP). This is an optimal inapproximability result.

Hastad's proof uses the 3-query PCP with specific algebraic structure (linearity testing over GF(2), Fourier analysis of Boolean functions).

MAX-CLIQUE

Theorem (Hastad, 1996): For every epsilon > 0, it is NP-hard to approximate the maximum clique in an n-vertex graph within a factor of n^{1-epsilon}.

The proof uses the connection between PCPs and graph products. If V is a PCP verifier making q queries, construct a graph where vertices are (randomness, proof-bits) pairs and edges connect consistent pairs. The clique number relates to the satisfiability of the original instance, with gap amplification via graph powering.

Stronger result (Khot, 2001, assuming UGC): Clique cannot be approximated within n^{1-epsilon} even with quasi-polynomial time.

Set Cover

Theorem (Dinur-Steurer, 2014): Set Cover cannot be approximated within (1 - epsilon) ln n for any epsilon > 0 unless P = NP.

This matches the greedy algorithm's O(ln n) approximation ratio, showing the greedy algorithm is essentially optimal.

Vertex Cover

Theorem (Dinur-Safra, 2005): Vertex Cover is NP-hard to approximate within 1.3606. Assuming UGC (Khot-Regev, 2008), it is hard to approximate within 2 - epsilon, matching the LP-based 2-approximation.

Traveling Salesman Problem

  • Metric TSP: NP-hard to approximate within 123/122 (Karpinski-Lampis-Schmied, 2015). Christofides' algorithm gives 3/2. Karlin-Klein-Oveis Gharan (2021) slightly beat 3/2.
  • General TSP: NP-hard to approximate within any polynomial factor.

The Unique Games Conjecture (UGC)

Conjecture (Khot, 2002): For every epsilon, delta > 0, there exists a constant k such that it is NP-hard to determine, given a Unique Games instance with alphabet size k:

  • (YES) There is an assignment satisfying >= (1-epsilon) fraction of constraints
  • (NO) No assignment satisfies more than delta fraction of constraints

A Unique Game is a 2-variable CSP where each constraint is a bijection between label sets.

Implications of UGC

If UGC is true:

  • MAX-CUT: The Goemans-Williamson SDP-based 0.878-approximation is optimal
  • Vertex Cover: The LP-based 2-approximation is optimal
  • Every Max-CSP: The optimal approximation ratio is exactly characterized by a semidefinite programming relaxation (Raghavendra, 2008)
  • Sparsest Cut, Minimum Linear Arrangement: Current algorithms are optimal

Status of UGC

  • Not proven, but widely studied as a working hypothesis
  • Subexponential algorithms exist for Unique Games (Arora-Barak-Steurer, 2010), which do not refute UGC but constrain potential proofs
  • The 2-to-2 Games Theorem (Khot-Minzer-Safra, 2018) proves a variant with imperfect completeness, a major step toward UGC

Gap Amplification

Starting from a weak PCP (large query complexity, small gap), gap amplification transforms it into a strong PCP (small query complexity, large gap).

Dinur's approach:

  1. Represent the constraint graph as an expander (via graph powering)
  2. Amplify the gap: if alpha fraction of constraints are violated, after powering, Omega(alpha) fraction are violated even locally
  3. Reduce alphabet size through composition with an "inner verifier"
  4. Iterate O(log n) times to achieve constant gap

Each iteration maintains O(log n) randomness and constant query complexity. The key technical insight: expander-based amplification preserves the structure needed for composition.

PCP Variants

PCPP (PCP of Proximity)

A PCP of proximity verifies that an input is close to being in a language, rather than exactly in it. The verifier reads a few bits of the input and a few bits of the proof.

Used in PCP composition: the "inner verifier" is a PCPP for the outer verifier's constraint checking problem.

Assignment Testers

An assignment tester checks that a proof encodes a valid assignment to a CSP. Combined with gap amplification, assignment testers yield the PCP theorem modularly.

The Sliding Scale Conjecture

Conjecture (Bellare-Goldreich-Sudan, 1998): NP = PCP[O(log n), O(1)] with soundness 2^{-q} where q is the query complexity. That is, soundness error decreases exponentially with queries.

This would give optimal hardness for many problems. Partial results exist (Moshkovitz-Raz, 2010: near-linear size PCPs with polylogarithmic query complexity and nearly-exponential soundness).

Selected Inapproximability Results

| Problem | Hardness Ratio | Best Algorithm | Match? | |---|---|---|---| | MAX-3SAT | 7/8 + epsilon | 7/8 (random) | Yes (optimal) | | MAX-CLIQUE | n^{1-epsilon} | O(n/log^2 n) | Near (both ~n) | | Set Cover | (1-epsilon) ln n | (1+o(1)) ln n (greedy) | Yes | | Vertex Cover | 1.3606 (2-epsilon under UGC) | 2 (LP) | Under UGC | | MAX-CUT | ~0.878 (under UGC) | 0.878 (GW) | Under UGC | | Chromatic Number | n^{1-epsilon} | O(n/log n) | Near | | Shortest Vector (CVP) | 2^{log^{1-epsilon} n} | Exponential | Gap |

Connections to Other Areas

  • Coding theory: PCP proofs are locally testable codes (LTCs). The PCP theorem is equivalent to the existence of certain LTCs.
  • Property testing: PCP verifiers are property testers for proof validity.
  • Cryptography: PCPs underlie succinct arguments (SNARGs) and verifiable computation.
  • Proof complexity: PCPs provide strong lower bounds on proof systems (automatizability).

Open Problems

  • Resolve the Unique Games Conjecture
  • Prove the Sliding Scale Conjecture
  • Determine the exact approximation threshold for Metric TSP
  • Construct linear-size PCPs with O(1) queries (currently, PCPs have quasi-linear size)
  • Can Dinur-style amplification be made to work for 2-query PCPs (for Label Cover with perfect completeness)?