Number-Theoretic Algorithms

Number theory algorithms are the backbone of modern cryptography, hashing, and competitive programming. This extends the mathematical foundations (topic 01) with algorithmic implementations.

Primality Testing

Trial Division

Test if n is divisible by any integer from 2 to √n.

FUNCTION IS_PRIME(n)
    IF n < 2 THEN RETURN false
    IF n < 4 THEN RETURN true
    IF n MOD 2 = 0 OR n MOD 3 = 0 THEN RETURN false
    i ← 5
    WHILE i * i ≤ n
        IF n MOD i = 0 OR n MOD (i + 2) = 0 THEN RETURN false
        i ← i + 6
    RETURN true

Time: O(√n). Sufficient for n up to ~10¹⁸ (but slow for very large n).

Miller-Rabin Primality Test

A probabilistic test. Based on Fermat's little theorem and the observation that the only square roots of 1 modulo a prime are ±1.

Algorithm: Write n - 1 = 2^s × d (factor out powers of 2). For a random base a:

Compute x = a^d mod n.
If x = 1 or x = n - 1: probably prime for this base.
Repeat s - 1 times: x = x² mod n. If x = n - 1: probably prime.
If we never saw 1 or n - 1: composite.

FUNCTION MILLER_RABIN(n, a)
    IF n < 2 THEN RETURN false
    IF n = a THEN RETURN true
    IF n MOD 2 = 0 THEN RETURN false

    d ← n - 1
    s ← 0
    WHILE d MOD 2 = 0
        d ← d / 2; s ← s + 1

    x ← MOD_POW(a, d, n)
    IF x = 1 OR x = n - 1 THEN RETURN true

    FOR i ← 1 TO s - 1
        x ← MOD_MUL(x, x, n)
        IF x = n - 1 THEN RETURN true
    RETURN false

FUNCTION IS_PRIME_MILLER_RABIN(n)
    IF n < 2 THEN RETURN false
    // Deterministic for n < 3,317,044,064,679,887,385,961,981
    FOR EACH a IN [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
        IF n = a THEN RETURN true
        IF NOT MILLER_RABIN(n, a) THEN RETURN false
    RETURN true

Probabilistic: Each base has ≤ 1/4 probability of false positive (for composite n). With k bases: error ≤ 4^(-k).

Deterministic: For n < 3.3 × 10²⁴, testing bases {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37} is deterministic (no false positives).

Time: O(k log² n) for k bases (using fast modular exponentiation).

AKS Primality Test

The first deterministic polynomial-time primality test (Agrawal, Kayal, Saxena, 2002). O(log^(6+ε) n).

Theoretical breakthrough but too slow for practice. Miller-Rabin is used universally.

Integer Factorization

Trial Division

Divide by primes up to √n. O(√n).

Pollard's Rho Algorithm

A randomized algorithm for finding a non-trivial factor of n.

Idea: Use Floyd's cycle detection on the sequence x_{i+1} = x_i² + c (mod n). When gcd(|x_i - x_j|, n) > 1, we've found a factor.

FUNCTION POLLARD_RHO(n)
    IF n MOD 2 = 0 THEN RETURN 2
    LOOP
        c ← RANDOM(1, n - 1)
        x ← RANDOM(2, n - 1)
        y ← x
        d ← 1

        WHILE d = 1
            x ← (x * x + c) MOD n             // tortoise
            y ← (y * y + c) MOD n              // hare (2 steps)
            y ← (y * y + c) MOD n
            d ← GCD(|x - y|, n)

        IF d ≠ n THEN RETURN d
        // else: try different c

Expected time: O(n^(1/4)) for finding the smallest prime factor. Very fast in practice for numbers up to ~10⁶⁰.

Brent's improvement: Use Brent's cycle detection instead of Floyd's. Faster in practice.

Quadratic Sieve

For larger numbers (60-100 digits). Finds smooth relations and uses linear algebra over GF(2) to find a factorization.

Time: O(e^(√(ln n · ln ln n))). Sub-exponential but super-polynomial.

Number Field Sieve (NFS)

The fastest known algorithm for general integer factorization.

Time: O(e^((64/9)^(1/3) · (ln n)^(1/3) · (ln ln n)^(2/3))). Used to factor RSA challenge numbers. The 829-bit RSA-250 was factored in 2020 using NFS.

Modular Exponentiation

Compute a^b mod m efficiently using exponentiation by squaring (binary method). Covered in topic 01.

FUNCTION MOD_POW(base, exp, modulus)
    result ← 1
    base ← base MOD modulus
    WHILE exp > 0
        IF exp is odd
            result ← (result * base) MOD modulus
        exp ← exp / 2   // right shift
        base ← (base * base) MOD modulus
    RETURN result

// Multiplication avoiding overflow
FUNCTION MOD_MUL(a, b, m)
    RETURN (a * b) MOD m

Time: O(log b) multiplications.

Modular Inverse

Find x such that ax ≡ 1 (mod m). Exists iff gcd(a, m) = 1.

Method 1: Extended Euclidean algorithm. O(log m). Method 2: Fermat/Euler: a^(φ(m)-1) mod m. O(log m) with modular exponentiation. Method 3: For prime m: a^(m-2) mod m.

Euler's Totient Function

FUNCTION EULER_TOTIENT(n)
    result ← n
    p ← 2
    WHILE p * p ≤ n
        IF n MOD p = 0
            WHILE n MOD p = 0
                n ← n / p
            result ← result - result / p
        p ← p + 1
    IF n > 1 THEN result ← result - result / n
    RETURN result

Time: O(√n).

Sieve of Eratosthenes

Find all primes up to n.

FUNCTION SIEVE(n)
    is_prime ← array of size n + 1, all true
    is_prime[0] ← false
    is_prime[1] ← false
    i ← 2
    WHILE i * i ≤ n
        IF is_prime[i]
            j ← i * i
            WHILE j ≤ n
                is_prime[j] ← false
                j ← j + i
        i ← i + 1
    RETURN is_prime

Time: O(n log log n). Space: O(n).

Segmented Sieve

For primes in range [L, R] when R is large but R - L is small:

Sieve primes up to √R using standard sieve.
Use those primes to mark composites in [L, R].

Space: O(√R + (R - L)).

Linear Sieve

Each composite is marked by its smallest prime factor exactly once.

FUNCTION LINEAR_SIEVE(n)
    is_prime ← array of size n + 1, all true
    primes ← empty list
    spf ← array of size n + 1, all 0   // smallest prime factor

    FOR i ← 2 TO n
        IF is_prime[i]
            APPEND i TO primes
            spf[i] ← i
        FOR EACH p IN primes
            IF i * p > n THEN BREAK
            is_prime[i * p] ← false
            spf[i * p] ← p
            IF i MOD p = 0 THEN BREAK   // key: stop when p divides i
    RETURN (is_prime, primes)

Time: O(n) — strictly linear! Also computes the smallest prime factor for fast factorization.

Möbius Function

μ(n) = { 1    if n = 1
        { (-1)^k if n = p₁p₂...pₖ (product of k distinct primes)
        { 0    if n has a squared prime factor

Möbius inversion: If g(n) = Σ_{d|n} f(d), then f(n) = Σ_{d|n} μ(d) · g(n/d).

Used in competitive programming for inclusion-exclusion over divisors.

Discrete Logarithm

Find x such that g^x ≡ h (mod p).

Baby-step Giant-step (Shanks): O(√p) time and space.

Compute baby steps: g^j mod p for j = 0, 1, ..., m-1 (m = ⌈√p⌉). Store in hash table.
Compute giant steps: h · (g^(-m))^i mod p for i = 0, 1, .... Look up in hash table.

Pollard's rho for discrete logarithm: O(√p) expected time, O(1) space.

RSA Basics

Choose large primes p, q. Let n = pq.
φ(n) = (p-1)(q-1).
Choose e coprime to φ(n) (typically e = 65537).
Compute d = e⁻¹ mod φ(n) (using extended GCD).
Public key: (n, e). Private key: (n, d).
Encrypt: c = m^e mod n.
Decrypt: m = c^d mod n.

Correctness: m^(ed) ≡ m (mod n) by Euler's theorem (since ed ≡ 1 mod φ(n)).

Security: Based on the difficulty of factoring n = pq. If you can factor n, you can compute φ(n) and then d.

Applications in CS

Cryptography: RSA, Diffie-Hellman, elliptic curve crypto — all built on number theory.
Hashing: Hash functions use modular arithmetic. Universal hashing families use prime moduli.
Error detection: CRC uses polynomial arithmetic over GF(2). Reed-Solomon codes use finite field arithmetic.
Random number generation: Linear congruential generators, Mersenne Twister — based on modular arithmetic.
Competitive programming: Modular arithmetic, sieve, GCD/LCM, Chinese Remainder Theorem, combinatorics modulo primes.
Blockchain: Proof of work involves computing hashes. Digital signatures use number-theoretic algorithms.
Database sharding: Consistent hashing, hash partitioning use modular arithmetic.