Probabilistic Analysis
Probabilistic analysis determines the expected performance of algorithms, either over random inputs (average-case) or using randomization in the algorithm itself.
Average-Case Analysis
Analyze the expected running time over a distribution of inputs (typically uniform random).
Quicksort Average Case
With a random pivot, the partition position is uniformly distributed.
Recurrence: T(n) = (1/n) × Σᵢ₌₀ⁿ⁻¹ [T(i) + T(n-1-i)] + O(n).
Solution: T(n) = O(n log n).
Intuition: Even "mediocre" pivots (anywhere in the middle 50%) give a roughly balanced partition. This happens often enough that the expected depth is O(log n).
Precise result: Expected comparisons ≈ 2n ln n ≈ 1.386 n log₂ n. About 39% more comparisons than merge sort's n log₂ n, but faster in practice (better constants, cache behavior, in-place).
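A quick empirical check of the comparison count (a minimal sketch: three-way partition by a random pivot, counting one comparison per non-pivot element in each call):

```python
import math
import random

def quicksort(a, stats):
    """Randomized quicksort; tallies pivot comparisons in stats['cmp']."""
    if len(a) <= 1:
        return a
    pivot = random.choice(a)
    stats['cmp'] += len(a) - 1          # each non-pivot element vs the pivot
    left = [x for x in a if x < pivot]
    mid = [x for x in a if x == pivot]
    right = [x for x in a if x > pivot]
    return quicksort(left, stats) + mid + quicksort(right, stats)

n = 10_000
stats = {'cmp': 0}
data = random.sample(range(n), n)
assert quicksort(data, stats) == sorted(data)
print(stats['cmp'], round(2 * n * math.log(n)))  # observed vs the 2n ln n leading term
```

The observed count sits near the 2n ln n leading term (the exact expectation is slightly lower because of the -O(n) correction).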
Hash Table Average Case
With universal hashing and load factor α = n/m:
Expected chain length: α (under uniform hashing assumption).
Expected search time: O(1 + α). If α = O(1) (e.g., resize when α > 0.75), then O(1) expected.
Worst case: O(n) when all keys hash to the same bucket, but this is extremely unlikely with universal hashing.
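A balls-into-bins simulation makes the chain-length claim concrete (a sketch: a uniformly random bucket index stands in for a universal hash function):

```python
import random

def chain_lengths(n, m, seed=0):
    """Hash n keys into m buckets uniformly at random; return bucket sizes."""
    rng = random.Random(seed)
    buckets = [0] * m
    for _ in range(n):
        # stand-in for a universal hash: a uniformly random bucket
        buckets[rng.randrange(m)] += 1
    return buckets

n, m = 10_000, 13_334           # load factor α ≈ 0.75
sizes = chain_lengths(n, m)
print(sum(sizes) / m, n / m)    # average chain length equals α
print(max(sizes))               # longest chain stays small (Θ(log n / log log n) w.h.p.)
```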
Randomized Algorithms Analysis
The algorithm itself uses randomness. The analysis bounds the expected performance over the algorithm's random choices (not the input).
Indicator Random Variables
An indicator random variable X_A = 1 if event A occurs, 0 otherwise.
Key property: E[X_A] = P(A).
Linearity of expectation: E[Σ Xᵢ] = Σ E[Xᵢ] — works even when variables are dependent!
Randomized Quicksort
Expected comparisons: Let Xᵢⱼ = 1 if elements zᵢ and zⱼ are compared (where z₁ < z₂ < ... < zₙ is the sorted order).
P(zᵢ compared with zⱼ) = 2/(j - i + 1)
// zᵢ and zⱼ are compared iff one of them is chosen as pivot
// before any element between them (j-i+1 candidates, 2 trigger comparison)
E[total comparisons] = Σᵢ<ⱼ 2/(j - i + 1)
= 2(n+1)Hₙ - 4n
= 2n ln n + O(n)
= O(n log n)
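The pairwise sum can be evaluated exactly and compared with 2n ln n (a sketch; pairs are grouped by their gap d = j - i, of which there are n - d):

```python
import math
from fractions import Fraction

def expected_comparisons(n):
    """Exact E[comparisons] = sum over pairs i < j of 2/(j - i + 1)."""
    total = Fraction(0)
    for d in range(1, n):                 # n - d pairs have gap d
        total += Fraction(2 * (n - d), d + 1)
    return total

n = 1000
print(float(expected_comparisons(n)), 2 * n * math.log(n))
```

The exact value equals the closed form 2(n+1)Hₙ - 4n, where Hₙ is the n-th harmonic number.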
Skip List Analysis
Expected height: Each element is promoted with probability p = 1/2. Expected height of an element: 1/(1-p) = 2 levels. Maximum level of any element: O(log n) with high probability.
Expected search time: O(log n). At each level, the expected number of horizontal steps before dropping down a level is O(1) (by the standard backward analysis of the search path).
Expected space: O(n). Each element has expected 2 levels → n elements × 2 = 2n pointers.
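Simulating node levels confirms both claims (a sketch: a node's level is one plus a Geometric(p) number of promotions):

```python
import random

def node_levels(n, p=0.5, seed=1):
    """Level of each of n skip-list nodes under promotion probability p."""
    rng = random.Random(seed)
    levels = []
    for _ in range(n):
        lvl = 1
        while rng.random() < p:   # promote with probability p
            lvl += 1
        levels.append(lvl)
    return levels

levels = node_levels(100_000)
print(sum(levels) / len(levels))  # ≈ 1/(1-p) = 2
print(max(levels))                # O(log n) w.h.p. (around log2 n ≈ 17)
```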
Treap Analysis
A treap is a BST with random priorities. The expected depth of any node = O(log n) (same as a random BST).
Expected time for search/insert/delete: O(log n).
Randomized Selection (Quickselect)
Find the k-th smallest element. Like quicksort but recurse on one side only.
Expected time: E[T(n)] ≤ cn + E[T(3n/4)], since the pivot lands in the middle 50% with probability 1/2, so an expected constant number of partitions shrinks the problem to size at most 3n/4.
Solving: the geometric series cn(1 + 3/4 + (3/4)² + ...) = 4cn gives T(n) = O(n) expected.
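A minimal quickselect sketch (three-way partition around a random pivot; not in-place, for clarity):

```python
import random

def quickselect(a, k):
    """Return the k-th smallest element (k = 1 .. len(a)) in expected O(n)."""
    assert 1 <= k <= len(a)
    pivot = random.choice(a)
    lo = [x for x in a if x < pivot]
    hi = [x for x in a if x > pivot]
    if k <= len(lo):
        return quickselect(lo, k)                      # answer is below the pivot
    if k > len(a) - len(hi):
        return quickselect(hi, k - (len(a) - len(hi))) # answer is above the pivot
    return pivot                                       # answer equals the pivot

data = random.sample(range(1000), 101)
print(quickselect(data, 51) == sorted(data)[50])  # True: agrees with the median
```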
Indicator Variable Technique
Expected Fixed Points in Random Permutation
X = number of fixed points (positions where π(i) = i).
Xᵢ = 1 if π(i) = i. P(π(i) = i) = 1/n.
E[X] = Σ E[Xᵢ] = n × (1/n) = 1.
Regardless of n, the expected number of fixed points is 1. Surprisingly simple proof using linearity of expectation.
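A simulation check (assumptions: 20,000 shuffles of a length-100 permutation, fixed seed):

```python
import random

def fixed_points(perm):
    """Number of positions i with perm[i] == i."""
    return sum(1 for i, v in enumerate(perm) if v == i)

rng = random.Random(0)
n, trials = 100, 20_000
total = 0
for _ in range(trials):
    perm = list(range(n))
    rng.shuffle(perm)
    total += fixed_points(perm)
print(total / trials)  # ≈ 1, regardless of n
```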
Expected Inversions
An inversion in a permutation: pair (i, j) where i < j but π(i) > π(j). Counts how "unsorted" the permutation is.
Xᵢⱼ = 1 if (i, j) is an inversion. P(Xᵢⱼ = 1) = 1/2 (each pair equally likely to be in either order).
E[inversions] = C(n, 2) × 1/2 = n(n-1)/4.
Application: Insertion sort performs exactly one adjacent swap per inversion, so its average-case swap count is n(n-1)/4 = Θ(n²).
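The swap/inversion correspondence can be verified directly (the O(n²) inversion count is kept deliberately naive for clarity):

```python
import random

def count_inversions(a):
    """Pairs (i, j) with i < j and a[i] > a[j]."""
    n = len(a)
    return sum(1 for i in range(n) for j in range(i + 1, n) if a[i] > a[j])

def insertion_sort_swaps(a):
    """Sort a copy with insertion sort; return the number of adjacent swaps."""
    a, swaps = list(a), 0
    for i in range(1, len(a)):
        j = i
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]
            swaps += 1
            j -= 1
    return swaps

rng = random.Random(1)
perm = rng.sample(range(200), 200)
print(count_inversions(perm), insertion_sort_swaps(perm))  # always equal
print(200 * 199 / 4)                                       # expected inversions
```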
Linearity of Expectation
The most powerful tool in probabilistic analysis:
E[X₁ + X₂ + ... + Xₙ] = E[X₁] + E[X₂] + ... + E[Xₙ]
Works even when Xᵢ are dependent! (No independence requirement.)
This is why indicator variables are so powerful: decompose a complex quantity into simple indicators, then sum expectations.
Birthday Problem via Linearity
Expected number of colliding pairs among n people, with m = 365 equally likely birthdays:
Let Xᵢⱼ = 1 if persons i and j share a birthday; P(Xᵢⱼ = 1) = 1/365.
E[X] = C(n, 2) × 1/365 = n(n-1)/730.
For n = 23: E[X] = 23 × 22 / 730 ≈ 0.69 — explains why the 50% threshold is around 23.
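Both the expected-pairs count and the exact collision probability take a few lines (uses math.comb and math.prod, Python 3.8+):

```python
from math import comb, prod

n, days = 23, 365
expected_pairs = comb(n, 2) / days           # C(23, 2) / 365
print(expected_pairs)                        # ≈ 0.693 expected shared-birthday pairs

# exact probability that at least one pair collides
p_all_distinct = prod((days - i) / days for i in range(n))
print(1 - p_all_distinct)                    # ≈ 0.507, just past the 50% threshold
```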
Tail Bounds (Review)
How tightly concentrated is a random variable around its expectation?
Markov: P(X ≥ a) ≤ E[X]/a. For non-negative X.
Chebyshev: P(|X - μ| ≥ k) ≤ Var(X)/k². Requires variance.
Chernoff: P(X ≥ (1+δ)μ) ≤ e^(-μδ²/3) for a sum of independent 0/1 variables (0 < δ ≤ 1). Exponentially tight.
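Comparing the three bounds on P(X ≥ 600) for X ~ Binomial(1000, 1/2) shows the gap (a sketch; the empirical estimate uses 5,000 trials with a fixed seed):

```python
import math
import random

n, mu = 1000, 500            # X ~ Binomial(1000, 1/2), E[X] = 500, Var(X) = 250
a = 600                      # threshold: (1 + δ)μ with δ = 0.2
delta = a / mu - 1

print("Markov:   ", mu / a)                          # 0.833...
print("Chebyshev:", (n * 0.25) / (a - mu) ** 2)      # 0.025
print("Chernoff: ", math.exp(-mu * delta ** 2 / 3))  # ≈ 0.0013

rng = random.Random(0)
hits = sum(sum(rng.random() < 0.5 for _ in range(n)) >= a
           for _ in range(5000))
print("empirical:", hits / 5000)                     # the true tail is ≈ 1e-10
```

The true tail probability is far smaller than even the Chernoff bound; the point is the exponential improvement of Chernoff over Markov and Chebyshev.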
High Probability Bounds
"With high probability" (w.h.p.) means probability ≥ 1 - 1/n^c for some constant c > 0.
Example: Random quicksort takes O(n log n) time with high probability (not just in expectation). The probability of taking ω(n log n) time is polynomially small.
Derandomization
Technique: convert a randomized algorithm into a deterministic one. By the probabilistic method, if a random choice achieves a property with positive probability, some fixed choice achieving it exists; derandomization finds that choice explicitly.
Methods: Method of conditional expectations, pessimistic estimators, derandomized rounding.
Applications in CS
- Algorithm design: Randomized algorithms are often simpler and faster than deterministic ones (quicksort, skip list, treap, Miller-Rabin).
- Hash function analysis: Expected performance of hash tables depends on hash function quality.
- Load balancing: Power of two choices: placing each item in the less-loaded of two random bins achieves O(log log n) max load, vs Θ(log n / log log n) for a single uniformly random choice.
- Streaming algorithms: Probabilistic guarantees for count-min sketch, HyperLogLog.
- Distributed systems: Randomized consensus, gossip protocols, probabilistic broadcast.
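The power-of-two-choices gap is easy to see empirically (a sketch with n = 100,000 balls and bins; max_load is an illustrative helper):

```python
import random

def max_load(n, choices, seed=0):
    """Throw n balls into n bins; each ball goes to the least-loaded
    of `choices` uniformly random bins. Return the maximum bin load."""
    rng = random.Random(seed)
    bins = [0] * n
    for _ in range(n):
        picks = [rng.randrange(n) for _ in range(choices)]
        best = min(picks, key=lambda b: bins[b])
        bins[best] += 1
    return max(bins)

n = 100_000
print(max_load(n, 1))  # one choice:  Θ(log n / log log n), typically 8-9 here
print(max_load(n, 2))  # two choices: Θ(log log n), typically around 4
```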