Fuzz Testing

What Is Fuzz Testing?

Fuzz testing (fuzzing) feeds random, unexpected, or malformed input to your program to find crashes, panics, and security vulnerabilities that you would never think to test for manually.

Where unit tests verify expected behavior with known inputs, fuzz testing explores the vast space of unexpected inputs that real-world attackers and edge cases produce.

Why It Matters

Finds edge cases that human testers miss — inputs no developer would think to try
Discovers buffer overflows, parsing errors, integer overflows, and panic-inducing inputs
Especially valuable for parsers, deserializers, network protocols, and any code handling untrusted input
Has found thousands of real security vulnerabilities in production software

Coverage-Guided Fuzzing

Modern fuzzers don't just throw random bytes. They use coverage-guided fuzzing — tracking which code paths each input exercises, then generating new inputs that explore unexplored paths.

How It Works

1. Start with a corpus of seed inputs (valid examples)
2. Mutate an input (flip bits, insert bytes, splice inputs together)
3. Run the target function with the mutated input
4. Track code coverage — did this input reach new code paths?
5. If YES → save the input to the corpus (it's interesting)
6. If NO → discard it
7. Repeat millions of times

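This feedback loop means the fuzzer progressively discovers deeper and more complex code paths. It doesn't just test the surface: each saved input becomes a stepping stone toward code that purely random inputs would almost never reach. It cannot guarantee that every reachable path is explored, but coverage tends to grow the longer the fuzzer runs.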
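The loop above can be sketched in plain Rust. This is only a toy model under stated assumptions: `toy_target` reports its own "branches" by hand (real fuzzers get this from compiler instrumentation), and `XorShift`, `mutate`, and `fuzz_loop` are hypothetical names, not part of libFuzzer or cargo-fuzz.

```rust
use std::collections::HashSet;

/// Toy target that reports which "branches" an input reached.
/// This hand-written list stands in for real coverage instrumentation.
fn toy_target(data: &[u8]) -> Vec<u8> {
    let mut hit = vec![0u8]; // branch 0: function entered
    if data.first() == Some(&b'F') {
        hit.push(1);
        if data.get(1) == Some(&b'U') {
            hit.push(2);
            if data.get(2) == Some(&b'Z') {
                hit.push(3); // deepest branch: where a bug might hide
            }
        }
    }
    hit
}

/// Small deterministic xorshift PRNG, so the sketch needs no external crates.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// Step 2: mutate an input by overwriting one byte or appending one.
fn mutate(input: &[u8], rng: &mut XorShift) -> Vec<u8> {
    let mut out = input.to_vec();
    let byte = (rng.next_u64() % 256) as u8;
    if out.is_empty() || rng.next_u64() % 4 == 0 {
        out.push(byte);
    } else {
        let idx = (rng.next_u64() as usize) % out.len();
        out[idx] = byte;
    }
    out
}

/// Steps 1 and 3 to 7: run the feedback loop, return the corpus and covered branches.
fn fuzz_loop(seeds: Vec<Vec<u8>>, iterations: usize, seed: u64) -> (Vec<Vec<u8>>, HashSet<u8>) {
    let mut rng = XorShift(seed.max(1)); // xorshift state must be non-zero
    let mut corpus = if seeds.is_empty() { vec![Vec::new()] } else { seeds };
    let mut covered: HashSet<u8> = HashSet::new();
    for input in &corpus {
        covered.extend(toy_target(input)); // coverage of the seed inputs
    }
    for _ in 0..iterations {
        let pick = (rng.next_u64() as usize) % corpus.len();
        let candidate = mutate(&corpus[pick], &mut rng);
        let branches = toy_target(&candidate);
        // Keep the candidate only if it reached a branch not seen before.
        if branches.iter().any(|b| !covered.contains(b)) {
            covered.extend(branches);
            corpus.push(candidate);
        }
    }
    (corpus, covered)
}
```

Calling `fuzz_loop(vec![b"F".to_vec()], 100_000, 42)` grows the corpus only when a mutation unlocks a new branch, so the corpus stays small while coverage climbs one step at a time.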

The Engine: libFuzzer

libFuzzer is LLVM's coverage-guided fuzzing engine. It instruments your code at compile time to track coverage, then uses that feedback to guide input generation.

In Rust, cargo-fuzz wraps libFuzzer with a convenient interface.

Fuzz Testing in Rust with cargo-fuzz

Setup

# Install cargo-fuzz (requires nightly Rust)
cargo install cargo-fuzz

# Initialize fuzzing in your project
cargo fuzz init

# Add a fuzz target
cargo fuzz add parse_config

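Together, these commands create the following directory structure (cargo fuzz init also generates a default target, fuzz_target_1.rs):

fuzz/
  Cargo.toml
  fuzz_targets/
    fuzz_target_1.rs
    parse_config.rs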

Writing a Fuzz Target

A fuzz target is a function that takes arbitrary bytes and feeds them to the code under test:

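// fuzz/fuzz_targets/parse_config.rs
#![no_main]

use libfuzzer_sys::fuzz_target;

// `my_crate::parse_config` is a placeholder for the function under test.
fuzz_target!(|data: &[u8]| {
    // Convert raw bytes to a string (skip invalid UTF-8)
    if let Ok(input) = std::str::from_utf8(data) {
        // The fuzzer will try millions of inputs.
        // If parse_config panics on ANY input, the fuzzer reports it.
        let _ = my_crate::parse_config(input);
    }
});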

Running the Fuzzer

# Run the fuzzer (runs indefinitely until stopped or crash found)
cargo +nightly fuzz run parse_config

# Run with a time limit
cargo +nightly fuzz run parse_config -- -max_total_time=300  # 5 minutes

# Run with a corpus directory (seed inputs)
cargo +nightly fuzz run parse_config fuzz/corpus/parse_config/

Using Structured Inputs with Arbitrary

For more targeted fuzzing, use the arbitrary crate to generate structured inputs instead of raw bytes:

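// Assumes `arbitrary` (with its "derive" feature) is a dependency of the fuzz crate.
#![no_main]

use arbitrary::Arbitrary;
use libfuzzer_sys::fuzz_target;

// Built automatically from the fuzzer's raw bytes
#[derive(Arbitrary, Debug)]
struct FuzzInput {
    name: String,
    age: u32,
    tags: Vec<String>,
    nested: Option<Box<FuzzInput>>, // Box is required for a recursive type
}

fuzz_target!(|input: FuzzInput| {
    // The fuzzer generates FuzzInput values with random but well-typed data.
    // `my_crate::process_user_input` is a placeholder for the code under test.
    let _ = my_crate::process_user_input(&input.name, input.age, &input.tags);
});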

This produces more meaningful inputs than raw bytes, reaching deeper into your application logic.
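To make the idea concrete, here is a rough sketch of how raw fuzzer bytes can be turned into a typed value. It is a much-simplified illustration of the principle, not how the arbitrary crate is actually implemented; `ByteSource` and `build_input` are hypothetical names, and the struct omits the nested field for brevity.

```rust
/// Hypothetical structured input, mirroring the FuzzInput fields above.
#[derive(Debug, PartialEq)]
struct FuzzInput {
    name: String,
    age: u32,
    tags: Vec<String>,
}

/// Cursor over the raw bytes supplied by the fuzzer.
struct ByteSource<'a> {
    data: &'a [u8],
}

impl<'a> ByteSource<'a> {
    /// Take up to `n` bytes; fewer if the input is exhausted.
    fn take(&mut self, n: usize) -> &'a [u8] {
        let n = n.min(self.data.len());
        let (head, tail) = self.data.split_at(n);
        self.data = tail;
        head
    }

    /// Next byte, or 0 when the input has run out.
    fn next_byte(&mut self) -> u8 {
        self.take(1).first().copied().unwrap_or(0)
    }

    /// Next little-endian u32, zero-padded when the input has run out.
    fn next_u32(&mut self) -> u32 {
        let mut buf = [0u8; 4];
        let bytes = self.take(4);
        buf[..bytes.len()].copy_from_slice(bytes);
        u32::from_le_bytes(buf)
    }

    /// Length-prefixed string of at most 15 bytes; invalid UTF-8 is replaced.
    fn next_string(&mut self) -> String {
        let len = (self.next_byte() % 16) as usize;
        String::from_utf8_lossy(self.take(len)).into_owned()
    }
}

/// Every byte string maps to some well-typed FuzzInput, so no input is wasted.
fn build_input(raw: &[u8]) -> FuzzInput {
    let mut src = ByteSource { data: raw };
    let name = src.next_string();
    let age = src.next_u32();
    let tag_count = (src.next_byte() % 4) as usize;
    let tags = (0..tag_count).map(|_| src.next_string()).collect();
    FuzzInput { name, age, tags }
}
```

The design point is that decoding never fails: the fuzzer still mutates plain bytes, but every mutation lands on a well-typed value instead of being rejected by early validation.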

Practical Examples

Fuzzing a JSON Parser

#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    if let Ok(input) = std::str::from_utf8(data) {
        // Should never panic, regardless of input (json_parse is a placeholder)
        let _ = my_crate::json_parse(input);
    }
});

This checks that your JSON parser does not panic on any input the fuzzer generates. Real JSON parsers have been found to panic on deeply nested structures, specific Unicode sequences, and extremely long number literals.

Fuzzing a Network Protocol Parser

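#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    // Network protocols receive untrusted bytes from the network,
    // so the parser must handle ANY input safely (parse_message is a placeholder).
    let _ = my_crate::parse_message(data);
});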

Fuzzing with Round-Trip Invariants

A powerful pattern: verify that encoding then decoding produces the original value.

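#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|original: &[u8]| {
    let encoded = my_crate::encode(original); // encode/decode are placeholders for your codec
    let decoded = my_crate::decode(&encoded).expect("valid encoded data must decode");
    assert_eq!(original, &decoded[..], "round-trip must preserve data");
});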

If this assertion fails on any input, you have found a real bug in your codec.
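As a self-contained illustration of the round-trip property, the sketch below uses a small hex codec. The codec and the names `encode`, `decode`, and `check_round_trip` are assumptions made for this example; substitute your own codec when writing a real fuzz target.

```rust
/// Encode bytes as lowercase hex (illustrative codec only).
fn encode(data: &[u8]) -> String {
    data.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Value of one ASCII hex digit, or None for any other byte.
fn hex_val(c: u8) -> Option<u8> {
    (c as char).to_digit(16).map(|d| d as u8)
}

/// Decode hex text; returns None on odd length or non-hex characters.
fn decode(text: &str) -> Option<Vec<u8>> {
    let bytes = text.as_bytes();
    if bytes.len() % 2 != 0 {
        return None;
    }
    bytes
        .chunks(2)
        .map(|pair| Some(hex_val(pair[0])? * 16 + hex_val(pair[1])?))
        .collect()
}

/// The invariant a fuzz target would assert for every generated input.
fn check_round_trip(original: &[u8]) {
    let encoded = encode(original);
    let decoded = decode(&encoded).expect("decoding our own output must succeed");
    assert_eq!(original, decoded.as_slice(), "round-trip must preserve data");
}
```

The invariant needs no hand-written expected values, which is why it suits fuzzing: any input at all can serve as a test case.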

Fuzzing Comparison: Differential Testing

Compare two implementations of the same function to find discrepancies:

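#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    if let Ok(input) = std::str::from_utf8(data) {
        // parser_v1 and parser_v2 are placeholders for the two implementations.
        let result_v1 = my_crate::parser_v1::parse(input);
        let result_v2 = my_crate::parser_v2::parse(input);
        assert_eq!(result_v1, result_v2, "v1 and v2 disagree on input: {input:?}");
    }
});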

This finds inputs where your new implementation differs from the old one — invaluable during rewrites.
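A small, self-contained case shows the kind of discrepancy this catches. Here the standard library's integer parser plays the old implementation and a hand-rolled digit loop plays the rewrite; `parse_v1`, `parse_v2`, and `first_disagreement` are hypothetical names chosen for this sketch.

```rust
/// "Old" implementation: the standard library's u32 parser.
fn parse_v1(input: &str) -> Option<u32> {
    input.parse::<u32>().ok()
}

/// "New" implementation: a hand-rolled digit loop with overflow checks.
fn parse_v2(input: &str) -> Option<u32> {
    if input.is_empty() {
        return None;
    }
    let mut value: u32 = 0;
    for c in input.chars() {
        let digit = c.to_digit(10)?;
        value = value.checked_mul(10)?.checked_add(digit)?;
    }
    Some(value)
}

/// Return the first input on which the two implementations disagree.
fn first_disagreement<'a>(inputs: &[&'a str]) -> Option<&'a str> {
    inputs
        .iter()
        .copied()
        .find(|s| parse_v1(s) != parse_v2(s))
}
```

The two agree on ordinary digits, empty input, and overflow, but the standard parser accepts a leading plus sign while the rewrite rejects it, so `"+5"` is reported as a disagreement. A developer writing unit tests by hand could easily miss that case; a fuzzer mutating digit strings is likely to hit it.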

Security Bugs Found by Fuzzing

Fuzzing has found critical vulnerabilities in real-world software:

Heartbleed-Class Bugs

Buffer over-reads where the program reads past the end of allocated memory. Fuzzers find these by generating inputs with length fields that don't match actual data lengths:

Input: "LENGTH=1000" but only 5 bytes of data follow
Result: Program reads 995 bytes of adjacent memory (information leak)
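The defense is to check the declared length against the bytes actually present before reading. The sketch below assumes a hypothetical record layout of one length byte followed by the payload; `read_payload` is an illustrative name. Safe Rust cannot over-read memory, so the unchecked equivalent there would panic on slice indexing rather than leak data, which is still a crash a fuzzer will report.

```rust
/// Parse a record laid out as [1-byte declared length][payload...].
/// Returns the payload only if the declared length fits in the packet.
fn read_payload(packet: &[u8]) -> Result<&[u8], String> {
    let (&declared, rest) = packet.split_first().ok_or("empty packet")?;
    let declared = declared as usize;
    // `get` returns None instead of reading past the end of the slice.
    rest.get(..declared).ok_or_else(|| {
        format!("declared {} bytes, only {} present", declared, rest.len())
    })
}
```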

Integer Overflows

// Vulnerable code
fn allocate_buffer(width: u32, height: u32) -> Vec<u8> {
    let size = width * height * 4; // Can overflow!
    vec![0u8; size as usize]
}

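A fuzzer might generate width = 65536 and height = 65536. With 32-bit unsigned arithmetic, width * height is 2^32, which wraps around to 0, so the function allocates an empty buffer. In C or in unsafe code, subsequent writes then run past the end of that buffer. In safe Rust, the multiplication panics when overflow checks are enabled and wraps silently when they are not.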
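One possible fix for the allocation function above is to use checked arithmetic and reject sizes that overflow. This is a sketch, not the only option; the `MAX_BYTES` cap is a hypothetical limit added so that large but non-overflowing sizes cannot exhaust memory either.

```rust
/// Hypothetical upper bound on a single allocation (64 MiB).
const MAX_BYTES: usize = 64 * 1024 * 1024;

/// Allocate a zeroed RGBA-style buffer, or return None if the size
/// overflows u32 or exceeds MAX_BYTES.
fn allocate_buffer(width: u32, height: u32) -> Option<Vec<u8>> {
    // checked_mul returns None on overflow instead of wrapping or panicking.
    let size = width.checked_mul(height)?.checked_mul(4)? as usize;
    if size > MAX_BYTES {
        return None;
    }
    Some(vec![0u8; size])
}
```

With this version, `allocate_buffer(65536, 65536)` returns `None` instead of producing an empty buffer.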

Panic in Production

In Rust, panics don't cause memory corruption, but they do crash the program. For servers, a panic in a request handler crashes the worker or the entire process:

// Found by fuzzing: panic on empty input
fn parse_header(data: &[u8]) -> Header {
    let version = data[0];                                           // panics if data is empty
    let length = u32::from_be_bytes(data[1..5].try_into().unwrap()); // panics if < 5 bytes (big-endian assumed)
    // ...
}

A fuzzer finds this almost immediately, because empty and very short inputs are among the first it tries. A unit test might never cover empty input because the developer assumed "the network layer validates minimum length."
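A panic-free version of the header parser might look like the following sketch. The `Header` fields, the `ParseError` type, and the big-endian byte order are assumptions made for illustration, since the original does not specify them.

```rust
/// Hypothetical header layout: 1 version byte followed by a 4-byte length.
#[derive(Debug, PartialEq)]
struct Header {
    version: u8,
    length: u32,
}

/// Error returned instead of panicking on short input.
#[derive(Debug, PartialEq)]
enum ParseError {
    TooShort { needed: usize, got: usize },
}

fn parse_header(data: &[u8]) -> Result<Header, ParseError> {
    const HEADER_LEN: usize = 5;
    // `get` yields None for short input, which becomes an error value.
    let bytes = data.get(..HEADER_LEN).ok_or(ParseError::TooShort {
        needed: HEADER_LEN,
        got: data.len(),
    })?;
    let version = bytes[0];
    // Byte order is an assumption: big-endian (network order).
    let length = u32::from_be_bytes([bytes[1], bytes[2], bytes[3], bytes[4]]);
    Ok(Header { version, length })
}
```

The indexing after the `get` call cannot panic, because `bytes` is known to hold exactly five elements at that point.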

OSS-Fuzz Successes

Google's OSS-Fuzz project continuously fuzzes open-source software. As of May 2025, it has helped identify and fix over 13,000 vulnerabilities and 50,000 bugs across 1,000 projects, including:

Memory safety bugs in OpenSSL, SQLite, and curl
Logic errors in compression libraries (zlib, brotli)
Parsing crashes in image libraries (libpng, libjpeg-turbo)
Protocol handling bugs in networking libraries

Managing a Fuzz Corpus

The corpus is the set of inputs the fuzzer has found interesting (they reached new code paths). Managing it well improves fuzzing effectiveness.

Seed Corpus

Start with real-world examples of valid input:

# Create seed corpus directory
mkdir -p fuzz/corpus/parse_config/

# Add real config files as seeds
cp examples/valid_config.toml fuzz/corpus/parse_config/seed1
cp examples/minimal_config.toml fuzz/corpus/parse_config/seed2

Seeds give the fuzzer a head start — it mutates them to find crashes faster than starting from random bytes.

Minimizing Crash Inputs

When the fuzzer finds a crash, the triggering input is often large and contains irrelevant bytes. Minimize it:

# Minimize a crash-triggering input
cargo +nightly fuzz tmin parse_config fuzz/artifacts/parse_config/crash-abc123

This produces the smallest input that still triggers the crash, making debugging easier.

Reproducing Crashes

# Reproduce a specific crash
cargo +nightly fuzz run parse_config fuzz/artifacts/parse_config/crash-abc123

Then write a regression test so the bug stays fixed:

#[test]
fn regression_crash_empty_header() {
    // Found by fuzzing: parse_header panicked on empty input
    let result = parse_header(&[]);
    assert!(result.is_err()); // Must return Err, not panic
}

Common Pitfalls

Fuzzing Too Broadly

Fuzzing raw bytes against a high-level API is inefficient. The fuzzer spends most of its time generating invalid inputs that are rejected immediately. Use Arbitrary for structured fuzzing to get past input validation and into the interesting logic.

Ignoring Timeouts

Some inputs cause the program to run for minutes (e.g., pathological regex patterns, deeply nested structures). Set timeouts:

cargo +nightly fuzz run parse_config -- -timeout=5  # 5-second timeout per input

When to Use Fuzz Testing

Use for:

Parsers and deserializers (JSON, XML, TOML, custom formats)
Network protocol handlers (anything receiving bytes from the network)
Codecs (compression, encryption, encoding)
Any code processing untrusted or user-supplied input
Security-sensitive code

Avoid for:

Business logic (use unit tests with known inputs)
UI code (fuzzing clicks doesn't find meaningful bugs)
Code that only processes trusted, internally-generated data

Key Takeaways

Fuzz testing finds bugs you would never think to test for. It systematically explores inputs humans can't imagine.
Coverage-guided fuzzing (libFuzzer, cargo-fuzz) is far more effective than random input generation.
Use Arbitrary for structured fuzzing to bypass input validation and test deeper logic.
Always write regression tests for crashes found by fuzzing. They are real bugs, and without a test they can quietly return.
Run fuzzers for hours or days, not seconds. Deep bugs require sustained exploration.
Fuzzing is particularly high-value for any code that handles untrusted input — parsers, protocols, and codecs.