
Randomized Algorithms

Overview

Randomized algorithms use random choices during execution to achieve good expected performance or high-probability guarantees. They often yield simpler, faster solutions than their deterministic counterparts for problems where worst-case deterministic bounds are expensive.

Classification

Las Vegas Algorithms

  • Always correct, runtime is a random variable
  • Expected running time is bounded; worst case may be unbounded
  • Example: Randomized quicksort always sorts correctly; expected O(n log n)

Monte Carlo Algorithms

  • Always terminate within a time bound, but may produce incorrect results
  • Error probability can be reduced by repetition
  • One-sided error: only errs in one direction (e.g., Miller-Rabin)
  • Two-sided error: can err in both directions (class BPP)

| Property      | Las Vegas      | Monte Carlo      |
|---------------|----------------|------------------|
| Correctness   | Always correct | Probabilistic    |
| Runtime       | Expected bound | Worst-case bound |
| Amplification | Run longer     | Repeat and vote  |
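The "repeat and vote" column can be made concrete. Here is a minimal Python sketch of majority-vote amplification for a two-sided-error test; the `amplify` and `mc_test` names are illustrative, not from any library:

```python
import random

def amplify(mc_test, x, trials):
    """Majority vote over independent runs of a two-sided-error
    Monte Carlo test: error 1/2 - eps drops exponentially in trials."""
    yes_votes = sum(1 for _ in range(trials) if mc_test(x))
    return 2 * yes_votes > trials
```

With a test that answers correctly with probability 0.75, a majority over ~100 independent runs is wrong only with astronomically small probability.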

Randomized Quicksort

Choose pivot uniformly at random from the subarray.

RandomizedQuicksort(A, lo, hi):
    if lo < hi:
        pivot_index = Random(lo, hi)
        swap A[pivot_index] with A[hi]
        q = Partition(A, lo, hi)
        RandomizedQuicksort(A, lo, q - 1)
        RandomizedQuicksort(A, q + 1, hi)
  • Expected comparisons: 2n ln n ≈ 1.39 n log2 n
  • No adversarial input can force quadratic behavior: the expected running time is O(n log n) for every input
  • The running time is O(n log n) with high probability: the chance of exceeding it by a constant factor is polynomially small in n
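The pseudocode above translates directly to Python; the Lomuto partition used here is one common choice, assumed for illustration:

```python
import random

def randomized_quicksort(a, lo=0, hi=None):
    """In-place quicksort with a uniformly random pivot."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        # Move a random element into the pivot position.
        p = random.randint(lo, hi)
        a[p], a[hi] = a[hi], a[p]
        q = partition(a, lo, hi)
        randomized_quicksort(a, lo, q - 1)
        randomized_quicksort(a, q + 1, hi)

def partition(a, lo, hi):
    """Lomuto partition around the pivot a[hi]."""
    pivot = a[hi]
    i = lo - 1
    for j in range(lo, hi):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[hi] = a[hi], a[i + 1]
    return i + 1
```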

Randomized Selection (QuickSelect)

Find the k-th smallest element in expected O(n) time by recursing into only one side of a random pivot. Worst case is O(n^2); falling back to the deterministic median-of-medians pivot rule guarantees O(n) in the worst case.
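A short sketch of the expected-linear-time version. For clarity it partitions into three lists rather than working in place:

```python
import random

def quickselect(a, k):
    """Return the k-th smallest element (1-indexed) of list a.
    Expected O(n): recurse into only one side of a random pivot."""
    assert 1 <= k <= len(a)
    pivot = random.choice(a)
    lows   = [x for x in a if x < pivot]
    pivots = [x for x in a if x == pivot]
    highs  = [x for x in a if x > pivot]
    if k <= len(lows):
        return quickselect(lows, k)
    if k <= len(lows) + len(pivots):
        return pivot
    return quickselect(highs, k - len(lows) - len(pivots))
```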

Skip Lists

Probabilistic alternative to balanced BSTs.

Structure: Multiple sorted linked-list levels. Each element is promoted to the next level with probability p = 1/2.

  • Search/Insert/Delete: O(log n) expected time
  • Space: O(n) expected
  • Simpler to implement than red-black or AVL trees
  • Used in Redis sorted sets, LevelDB/RocksDB memtables

Search(key):
    node = header
    for level = max_level down to 0:
        while node.forward[level] != null and node.forward[level].key < key:
            node = node.forward[level]
    node = node.forward[0]
    if node != null and node.key == key:
        return node
    return null
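A compact Python sketch covering both search and randomized insertion; the level cap of 16 is an arbitrary assumption (it supports roughly 2^16 elements at p = 1/2):

```python
import random

class Node:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * (level + 1)

class SkipList:
    MAX_LEVEL = 16

    def __init__(self):
        self.header = Node(None, self.MAX_LEVEL)
        self.level = 0

    def random_level(self):
        # Promote with probability 1/2 per level.
        lvl = 0
        while random.random() < 0.5 and lvl < self.MAX_LEVEL:
            lvl += 1
        return lvl

    def search(self, key):
        node = self.header
        for i in range(self.level, -1, -1):
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.key == key

    def insert(self, key):
        # Record the rightmost node visited at each level.
        update = [self.header] * (self.MAX_LEVEL + 1)
        node = self.header
        for i in range(self.level, -1, -1):
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        lvl = self.random_level()
        self.level = max(self.level, lvl)
        new = Node(key, lvl)
        for i in range(lvl + 1):
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new
```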

Treaps

Combination of BST (by key) and heap (by random priority).

  • Assign each node a uniformly random priority at insertion
  • Maintain BST property on keys, min-heap property on priorities
  • Expected depth O(log n); supports split/merge in O(log n)
  • Equivalent to a random insertion-order BST
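Treaps are often implemented via split and merge, which also gives the O(log n) split/merge operations for free. A minimal sketch (insertion by split-then-merge, min-heap on priorities):

```python
import random

class TreapNode:
    def __init__(self, key):
        self.key = key
        self.priority = random.random()   # uniform random priority
        self.left = self.right = None

def merge(a, b):
    """Merge treaps a and b, assuming every key in a is below every key in b."""
    if a is None:
        return b
    if b is None:
        return a
    if a.priority < b.priority:           # min-heap on priorities
        a.right = merge(a.right, b)
        return a
    b.left = merge(a, b.left)
    return b

def split(t, key):
    """Split t into (keys < key, keys >= key)."""
    if t is None:
        return None, None
    if t.key < key:
        less, geq = split(t.right, key)
        t.right = less
        return t, geq
    less, geq = split(t.left, key)
    t.left = geq
    return less, t

def treap_insert(t, key):
    less, geq = split(t, key)
    return merge(merge(less, TreapNode(key)), geq)

def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)
```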

Karger's Min-Cut Algorithm

Repeatedly contract a uniformly random edge until two vertices remain.

KargerMinCut(G):
    while |V| > 2:
        pick edge (u, v) uniformly at random
        contract u and v (merge, remove self-loops)
    return the number of remaining edges
  • Probability of finding min-cut in one trial: >= 2 / (n(n-1))
  • Repeat O(n^2 log n) times for high probability: overall O(n^4 log n) naive
  • Karger-Stein improvement: recursive contraction gives O(n^2 log^3 n)
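One way to implement contraction is union-find over a random permutation of the edges, which is equivalent to repeatedly contracting a uniformly random surviving edge. A sketch under that assumption:

```python
import random

def karger_min_cut(edges, trials):
    """Run randomized contraction `trials` times on an undirected
    multigraph given as a list of (u, v) pairs; return the smallest cut seen."""
    vertices = {v for e in edges for v in e}
    n = len(vertices)
    best = float("inf")
    for _ in range(trials):
        parent = {}
        def find(x):
            while parent.get(x, x) != x:
                x = parent[x]
            return x
        # Contracting edges in random order == uniform random contraction.
        pool = list(edges)
        random.shuffle(pool)
        remaining = n
        for u, v in pool:
            if remaining == 2:
                break
            ru, rv = find(u), find(v)
            if ru != rv:                  # skip self-loops
                parent[ru] = rv
                remaining -= 1
        # Edges whose endpoints landed in different super-vertices cross the cut.
        cut = sum(1 for u, v in edges if find(u) != find(v))
        best = min(best, cut)
    return best
```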

Reservoir Sampling

Select k items uniformly at random from a stream of unknown length n.

ReservoirSample(stream, k):
    reservoir = first k items
    for i = k+1 to n:
        j = Random(1, i)
        if j <= k:
            reservoir[j] = stream[i]
    return reservoir
  • Each element appears in the final sample with probability k/n
  • Single pass, O(k) memory
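The same algorithm in Python, consuming any iterable so the stream length truly need not be known in advance:

```python
import random

def reservoir_sample(stream, k):
    """Uniform random sample of k items from an iterable of unknown length."""
    reservoir = []
    for i, item in enumerate(stream, start=1):
        if i <= k:
            reservoir.append(item)        # fill the reservoir first
        else:
            j = random.randint(1, i)      # keep item with probability k/i
            if j <= k:
                reservoir[j - 1] = item
    return reservoir
```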

Miller-Rabin Primality Test

Monte Carlo algorithm for primality with one-sided error (may say "probably prime" for a composite).

Write n - 1 = 2^s * d with d odd. For random base a in [2, n-2]:

  1. Compute x = a^d mod n
  2. If x == 1 or x == n - 1, pass this round
  3. Repeat s - 1 times: x = x^2 mod n; if x == n - 1, pass
  4. If no pass, n is composite
  • Error per round: <= 1/4
  • With t rounds: error <= 4^(-t)
  • Used in practice for cryptographic key generation
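The four steps above, using Python's built-in three-argument `pow` for modular exponentiation. Trial division by a few small primes is an assumed shortcut for small inputs:

```python
import random

def is_probably_prime(n, rounds=20):
    """Miller-Rabin: False means definitely composite; True means prime
    with error probability at most 4**(-rounds)."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7):
        if n % p == 0:
            return n == p
    # Write n - 1 = 2**s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)    # random base in [2, n-2]
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue                      # this round passes
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break                     # this round passes
        else:
            return False                  # no pass: a witnesses compositeness
    return True
```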

Schwartz-Zippel Lemma

For a non-zero polynomial P(x1, ..., xn) of total degree d over a field F, and a finite subset S of F:

Pr[P(r1, ..., rn) = 0] <= d / |S| when each ri is chosen uniformly from S.

Applications:

  • Polynomial identity testing: evaluate at random point to check if P = Q
  • Perfect matching detection (Tutte matrix)
  • Verifying matrix multiplication: check ABr = Cr for random vector r
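The matrix-multiplication check (Freivalds' algorithm) is a one-liner application of the lemma: comparing ABr with Cr costs O(n^2) per trial instead of multiplying the matrices. A sketch over integer matrices with r drawn from {0, 1}^n:

```python
import random

def freivalds(A, B, C, trials=20):
    """Probabilistically check A @ B == C; each trial errs with
    probability at most 1/2, so overall error is <= 2**(-trials)."""
    n = len(A)
    for _ in range(trials):
        r = [random.randint(0, 1) for _ in range(n)]
        Br  = [sum(B[i][j] * r[j]  for j in range(n)) for i in range(n)]
        ABr = [sum(A[i][j] * Br[j] for j in range(n)) for i in range(n)]
        Cr  = [sum(C[i][j] * r[j]  for j in range(n)) for i in range(n)]
        if ABr != Cr:
            return False   # definitely A @ B != C
    return True            # probably equal
```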

Random Walks and Markov Chains

A random walk on a graph: at each step, move to a uniformly random neighbor.

  • Hitting time H(u,v): expected steps to reach v from u
  • Cover time: expected steps to visit all vertices; O(n^3) for general graphs
  • Commute time: H(u,v) + H(v,u) = 2m * R_eff(u,v) where R_eff is effective resistance
  • s-t connectivity in RL: random walks solve undirected connectivity in O(log n) space
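These quantities are easy to estimate empirically. A minimal simulation that samples the cover time of a walk on an adjacency-list graph (for the cycle C_n, the expected cover time is n(n-1)/2):

```python
import random

def cover_time(adj, start):
    """Simulate a random walk until every vertex has been visited;
    return the number of steps taken (one sample of the cover time)."""
    visited = {start}
    v = start
    steps = 0
    while len(visited) < len(adj):
        v = random.choice(adj[v])   # move to a uniformly random neighbor
        visited.add(v)
        steps += 1
    return steps
```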

Markov Chain Monte Carlo (MCMC)

Sample from a target distribution pi by constructing a Markov chain whose stationary distribution is pi.

Metropolis-Hastings

state = x0
for each iteration:
    propose y ~ Q(y | state)
    alpha = min(1, pi(y)Q(state|y) / pi(state)Q(y|state))
    if Random() < alpha:
        state = y
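A runnable version of the loop above. The toy target (pi(x) proportional to x + 1 on states {0, ..., 9}) and the symmetric random-walk proposal are assumptions for illustration; for a symmetric proposal the Q-ratio is identically 1:

```python
import random

def metropolis_hastings(pi, propose, q_ratio, x0, n_steps):
    """Generic Metropolis-Hastings: pi is an unnormalized target,
    propose(x) draws y ~ Q(y | x), q_ratio(x, y) = Q(x | y) / Q(y | x)."""
    x, samples = x0, []
    for _ in range(n_steps):
        y = propose(x)
        alpha = min(1.0, pi(y) * q_ratio(x, y) / pi(x))
        if random.random() < alpha:
            x = y          # accept the proposal
        samples.append(x)  # a rejected proposal repeats the current state
    return samples

# Symmetric +/-1 random-walk proposal on {0, ..., 9}.
def walk_propose(x):
    return min(9, max(0, x + random.choice((-1, 1))))
```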

Gibbs Sampling

Special case: sample each variable conditioned on the current values of all others.

Mixing time: the number of steps until the chain is close to its stationary distribution. Fast mixing is essential for practical use. Conductance bounds the spectral gap via Cheeger's inequality, which in turn bounds the mixing time.

Applications: Bayesian inference, counting problems (#P), sampling combinatorial structures (random spanning trees, colorings).

Key Complexity Classes

| Class | Definition |
|-------|------------|
| RP    | One-sided error, poly time; NO instances always rejected, YES instances accepted with probability >= 1/2 |
| coRP  | One-sided error on NO instances (YES instances always accepted) |
| BPP   | Two-sided error <= 1/3, amplifiable to 2^(-poly) |
| ZPP   | Zero error, expected poly time (ZPP = RP intersect coRP) |

Derandomization Techniques

  • Pairwise independence: many algorithms only need limited independence
  • Expander graphs: deterministic amplification replacing independent repetitions
  • Pseudorandom generators: the Nisan-Wigderson PRG from circuit lower bounds; cryptographic PRGs exist if one-way functions exist
  • Method of conditional expectations: derandomize by fixing random bits greedily
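The method of conditional expectations has a classic worked example: max-cut. A uniformly random 2-coloring cuts m/2 edges in expectation; fixing vertices one at a time, always on the side that cuts more already-fixed edges, never decreases that conditional expectation, so the deterministic result cuts at least m/2 edges. A sketch (function name is illustrative):

```python
def derandomized_max_cut(n, edges):
    """Greedy derandomization of the random-coloring max-cut bound:
    the returned cut has at least len(edges) / 2 edges."""
    side = {}
    for v in range(n):
        # Count already-fixed neighbors of v on each side.
        on_zero = sum(1 for u, w in edges
                      if (u == v and side.get(w) == 0)
                      or (w == v and side.get(u) == 0))
        on_one = sum(1 for u, w in edges
                     if (u == v and side.get(w) == 1)
                     or (w == v and side.get(u) == 1))
        # Place v opposite the heavier side, cutting the larger count.
        side[v] = 1 if on_zero >= on_one else 0
    cut = sum(1 for u, w in edges if side[u] != side[w])
    return side, cut
```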

Summary

Randomized algorithms provide elegant solutions with strong probabilistic guarantees. Key design patterns include random sampling, random partitioning, hashing, and Markov chain simulation. Understanding when Las Vegas vs Monte Carlo is appropriate and how to amplify success probability are fundamental skills.