Fuzzing & Benchmarks

Go 1.18 introduced built-in fuzz testing alongside generics. Fuzz tests generate randomized inputs to find bugs that hand-written test cases miss. Benchmarks, available since Go 1.0, measure execution time and memory allocation. Combined with profiling via pprof, these tools cover both correctness and efficiency.

Fuzz Testing

Fuzz testing feeds random, mutated inputs to your function looking for crashes, panics, or incorrect behavior. Go's fuzzer is integrated into go test.

Writing a Fuzz Test

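Fuzz test functions live in _test.go files, start with Fuzz, and take *testing.F. The function passed to f.Fuzz takes *testing.T followed by the fuzzed arguments, and every f.Add call must supply values of those argument types in the same order.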

func FuzzParseAmount(f *testing.F) {
    // Seed corpus: known inputs to start from
    f.Add("100.00")
    f.Add("0.01")
    f.Add("-50.00")
    f.Add("")
    f.Add("not a number")

    f.Fuzz(func(t *testing.T, input string) {
        amount, err := ParseAmount(input)
        if err != nil {
            return // rejecting invalid input is fine; only accepted input is checked further
        }
        // Round-trip check: if parsing succeeded, formatting should too
        formatted := FormatAmount(amount)
        reparsed, err := ParseAmount(formatted)
        if err != nil {
            // Fatalf stops here so a failed re-parse is not also reported as a mismatch below.
            t.Fatalf("round-trip failed: ParseAmount(%q) -> %v -> FormatAmount -> %q -> error: %v",
                input, amount, formatted, err)
        }
        if reparsed != amount {
            t.Errorf("round-trip mismatch: %v != %v", amount, reparsed)
        }
    })
}
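
The fuzz test above relies on ParseAmount and FormatAmount, which are not shown. As a minimal sketch, assuming amounts are stored as a signed count of cents, the pair might look like the following. The Amount type, the ErrInvalidAmount value, the allDigits helper, and the accepted syntax are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// Amount is a hypothetical money type: a signed number of cents.
type Amount int64

// ErrInvalidAmount is a hypothetical sentinel error for rejected input.
var ErrInvalidAmount = errors.New("invalid amount")

// allDigits reports whether s consists only of ASCII digits.
// The empty string counts as all digits.
func allDigits(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i] < '0' || s[i] > '9' {
			return false
		}
	}
	return true
}

// ParseAmount accepts an optional leading "-", one or more digits, and an
// optional "." followed by one or two digits: "100", "0.5", "-50.00".
func ParseAmount(s string) (Amount, error) {
	neg := strings.HasPrefix(s, "-")
	s = strings.TrimPrefix(s, "-")
	whole, frac, hasDot := strings.Cut(s, ".")
	if whole == "" || (hasDot && frac == "") || len(frac) > 2 ||
		!allDigits(whole) || !allDigits(frac) {
		return 0, ErrInvalidAmount
	}
	frac = (frac + "00")[:2] // pad "5" to "50" and "" to "00"
	cents, err := strconv.ParseInt(whole+frac, 10, 64)
	if err != nil {
		return 0, ErrInvalidAmount // value does not fit in int64
	}
	if neg {
		cents = -cents
	}
	return Amount(cents), nil
}

// FormatAmount renders an Amount with exactly two fractional digits.
func FormatAmount(a Amount) string {
	sign, u := "", uint64(a)
	if a < 0 {
		sign, u = "-", -u // unsigned negation yields the magnitude
	}
	return fmt.Sprintf("%s%d.%02d", sign, u/100, u%100)
}

func main() {
	a, err := ParseAmount("-50.5")
	fmt.Println(a, FormatAmount(a), err) // prints: -5050 -50.50 <nil>
}
```

With this sketch the seeds "" and "not a number" are rejected with an error, so the fuzz target returns early for them, while "100.00", "0.01", and "-50.00" go through the round-trip check.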

Running Fuzz Tests

# Run the seed corpus as regular tests
go test -run FuzzParseAmount

# Run the fuzzer (generates random inputs) until it finds a failure or you interrupt it
go test -fuzz FuzzParseAmount

# Run with a time limit
go test -fuzz FuzzParseAmount -fuzztime 2m

$ go test -fuzz FuzzParseAmount -fuzztime 30s
fuzz: elapsed: 0s, gathering baseline coverage
fuzz: elapsed: 3s, execs: 28451 (9483/sec), new interesting: 12
fuzz: elapsed: 6s, execs: 61203 (10200/sec), new interesting: 14
--- FAIL: FuzzParseAmount (8.12s)
    --- FAIL: FuzzParseAmount/abc123... (0.00s)
        parse_test.go:22: round-trip failed...

The Seed Corpus

The seed corpus provides starting inputs that the fuzzer mutates. Good seeds cover typical cases, edge cases, and known problematic inputs.

f.Add("0")
f.Add("999999999")
f.Add("-1")
f.Add("1.23456789")
f.Add("  100  ")  // whitespace
f.Add("\x00")     // null byte

When the fuzzer finds a failing input, it saves it in testdata/fuzz/<TestName>/. These files become part of the permanent corpus and run as regression tests on every go test invocation.

testdata/
  fuzz/
    FuzzParseAmount/
      abc123def456    # auto-generated corpus entry

What Fuzzing Finds

Fuzz testing excels at finding:

Panics from unexpected input (nil dereferences, index out of range)
Infinite loops or excessive memory allocation
Round-trip inconsistencies (parse then format then parse again)
Edge cases in string parsing, encoding, and decoding
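
Round-trip properties are not limited to strings. As a sketch, the fuzz target below takes a []byte and checks that decoding a hex-encoded value returns the original bytes. encoding/hex stands in for whatever codec you actually want to test, and the names checkHexRoundTrip and FuzzHexRoundTrip are illustrative.

```go
package main

import (
	"bytes"
	"encoding/hex"
	"fmt"
	"testing"
)

// checkHexRoundTrip returns an error if decoding the hex encoding of data
// does not reproduce data exactly.
func checkHexRoundTrip(data []byte) error {
	encoded := hex.EncodeToString(data)
	decoded, err := hex.DecodeString(encoded)
	if err != nil {
		return fmt.Errorf("decode %q: %w", encoded, err)
	}
	if !bytes.Equal(decoded, data) {
		return fmt.Errorf("round-trip mismatch: %x != %x", decoded, data)
	}
	return nil
}

// FuzzHexRoundTrip belongs in a _test.go file; run it with
// go test -fuzz FuzzHexRoundTrip.
func FuzzHexRoundTrip(f *testing.F) {
	f.Add([]byte{})
	f.Add([]byte("hello"))
	f.Add([]byte{0x00, 0xff})
	f.Fuzz(func(t *testing.T, data []byte) {
		if err := checkHexRoundTrip(data); err != nil {
			t.Error(err)
		}
	})
}

// main only exercises the property on a few fixed inputs so that the sketch
// also runs outside go test.
func main() {
	for _, in := range [][]byte{nil, []byte("hello"), {0x00, 0xff}} {
		fmt.Println(checkHexRoundTrip(in)) // prints <nil> for each input
	}
}
```

Keeping the property in a plain function such as checkHexRoundTrip makes it reusable from ordinary table-driven tests as well as from the fuzz target.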

Benchmarks

Benchmarks measure how fast your code runs and how much memory it allocates. Benchmark functions start with Benchmark and take *testing.B.

Writing Benchmarks

func BenchmarkSortSlice(b *testing.B) {
    data := generateRandomSlice(10000)
    b.ResetTimer() // exclude setup time

    for i := 0; i < b.N; i++ {
        input := make([]int, len(data))
        copy(input, data)
        sort.Ints(input)
    }
}

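The b.N loop is essential. The framework increases b.N until the benchmark runs long enough to produce a stable measurement, so never hardcode the iteration count. Note that the allocation and copy inside the loop are timed along with the sort; they are there so that every iteration sorts unsorted data.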

Running Benchmarks

# Run all benchmarks
go test -bench=.

# Run specific benchmarks
go test -bench=BenchmarkSort

# Include memory allocation stats
go test -bench=. -benchmem

# Run for a minimum duration
go test -bench=. -benchtime=5s

# Run a specific number of iterations
go test -bench=. -benchtime=1000x

$ go test -bench=. -benchmem
BenchmarkSortSlice-8      1234    967432 ns/op    81920 B/op    1 allocs/op
BenchmarkSearchMap-8    5432109     221 ns/op        0 B/op      0 allocs/op

The output columns are: benchmark name (the -8 suffix is the GOMAXPROCS value used for the run), iterations, nanoseconds per operation, bytes allocated per operation, and allocations per operation.
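
The same numbers are available programmatically. As a sketch, testing.Benchmark runs a benchmark function outside go test and returns a testing.BenchmarkResult whose fields and methods correspond to the columns above. The sortCopy and measureSort helpers and the fixed random seed are assumptions for illustration.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
	"testing"
)

// sortCopy sorts a fresh copy of data, leaving data itself untouched.
func sortCopy(data []int) []int {
	input := make([]int, len(data))
	copy(input, data)
	sort.Ints(input)
	return input
}

// measureSort benchmarks sortCopy on a pseudo-random permutation of n ints.
func measureSort(n int) testing.BenchmarkResult {
	rng := rand.New(rand.NewSource(1))
	data := rng.Perm(n)
	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sortCopy(data)
		}
	})
}

func main() {
	r := measureSort(10000)
	fmt.Println("iterations:", r.N)
	fmt.Println("ns/op:     ", r.NsPerOp())
	fmt.Println("B/op:      ", r.AllocedBytesPerOp())
	fmt.Println("allocs/op: ", r.AllocsPerOp())
}
```

The printed values vary by machine and Go version, so treat them as relative measurements rather than absolute ones.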

Memory Allocation Benchmarks

Use b.ReportAllocs() to always report allocation stats, even without the -benchmem flag.

func BenchmarkStringConcat(b *testing.B) {
    b.ReportAllocs()
    for i := 0; i < b.N; i++ {
        s := ""
        for j := 0; j < 100; j++ {
            s += "x"
        }
    }
}

func BenchmarkStringBuilder(b *testing.B) {
    b.ReportAllocs()
    for i := 0; i < b.N; i++ {
        var sb strings.Builder
        for j := 0; j < 100; j++ {
            sb.WriteString("x")
        }
        _ = sb.String()
    }
}

BenchmarkStringConcat-8      12345    97200 ns/op    5840 B/op    99 allocs/op
BenchmarkStringBuilder-8    543210     2180 ns/op     512 B/op     4 allocs/op

In this run, strings.Builder is roughly 45 times faster (2180 ns/op versus 97200 ns/op) and makes 4 allocations per operation instead of 99.
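
When the final length is known in advance, Builder.Grow can reserve the buffer once instead of letting it grow step by step. The sketch below uses testing.AllocsPerRun to count allocations for both approaches. The function names and the package-level sink are illustrative, and exact counts depend on the Go version.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink keeps results reachable so the compiler cannot discard the work.
var sink string

// concat builds an n-character string with repeated +=.
func concat(n int) string {
	s := ""
	for i := 0; i < n; i++ {
		s += "x"
	}
	return s
}

// build produces the same string with a strings.Builder sized up front.
func build(n int) string {
	var sb strings.Builder
	sb.Grow(n) // reserve capacity once instead of growing repeatedly
	for i := 0; i < n; i++ {
		sb.WriteString("x")
	}
	return sb.String()
}

// allocCounts reports the average allocations per call for both approaches.
func allocCounts(n int) (concatAllocs, buildAllocs float64) {
	concatAllocs = testing.AllocsPerRun(100, func() { sink = concat(n) })
	buildAllocs = testing.AllocsPerRun(100, func() { sink = build(n) })
	return concatAllocs, buildAllocs
}

func main() {
	c, b := allocCounts(100)
	fmt.Println("concat allocs/run: ", c)
	fmt.Println("builder allocs/run:", b)
}
```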

Sub-Benchmarks

Like tests, benchmarks support sub-benchmarks for comparing different parameters.

func BenchmarkLookup(b *testing.B) {
    for _, size := range []int{10, 100, 1000, 10000} {
        m := buildMap(size)
        b.Run(fmt.Sprintf("size=%d", size), func(b *testing.B) {
            for i := 0; i < b.N; i++ {
                _ = m[size/2]
            }
        })
    }
}

BenchmarkLookup/size=10-8       200000000    5.92 ns/op
BenchmarkLookup/size=100-8      200000000    6.15 ns/op
BenchmarkLookup/size=1000-8     180000000    6.84 ns/op
BenchmarkLookup/size=10000-8    150000000    7.21 ns/op
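
buildMap is not defined above. One plausible definition, assuming the map has int keys 0 through size-1 so that the lookup m[size/2] always hits an existing key, is sketched here:

```go
package main

import "fmt"

// buildMap is a hypothetical helper: it returns a map with keys 0..size-1,
// each mapped to its own value.
func buildMap(size int) map[int]int {
	m := make(map[int]int, size) // pre-size to avoid rehashing during setup
	for i := 0; i < size; i++ {
		m[i] = i
	}
	return m
}

func main() {
	m := buildMap(1000)
	v, ok := m[1000/2]
	fmt.Println(len(m), v, ok) // prints: 1000 500 true
}
```

Because the map is built before b.Run is called, its construction cost is not part of any sub-benchmark's timing.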

Comparing Benchmarks with benchstat

The benchstat tool compares benchmark results across runs to determine if a change is statistically significant.

# Install benchstat
go install golang.org/x/perf/cmd/benchstat@latest

# Run benchmarks before your change
go test -bench=. -count=10 > old.txt

# Make your optimization
# ...

# Run benchmarks after your change
go test -bench=. -count=10 > new.txt

# Compare
benchstat old.txt new.txt

name           old time/op  new time/op  delta
SortSlice-8    967us +/- 2% 812us +/- 1% -16.03% (p=0.000 n=10+10)
SearchMap-8    221ns +/- 1% 219ns +/- 1%   ~     (p=0.245 n=10+10)

The -count=10 flag runs each benchmark 10 times. More runs give benchstat more samples, which makes it easier to separate real changes from noise. The ~ symbol means the difference is not statistically significant.

Profiling with pprof

Go has built-in profiling support through the runtime/pprof and net/http/pprof packages.
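
For a program that is not a server and has no benchmarks, runtime/pprof can write a CPU profile directly. A minimal sketch follows, in which the profileCPU helper, the busyWork function, and the output location are assumptions for illustration:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime/pprof"
)

// busyWork is a stand-in for the code you want to profile.
func busyWork() int {
	total := 0
	for i := 0; i < 50000000; i++ {
		total += i % 7
	}
	return total
}

// profileCPU runs work while a CPU profile is written to path.
func profileCPU(path string, work func()) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	if err := pprof.StartCPUProfile(f); err != nil {
		return err
	}
	// Deferred calls run in reverse order, so the profile is flushed
	// before the file is closed.
	defer pprof.StopCPUProfile()

	work()
	return nil
}

func main() {
	path := filepath.Join(os.TempDir(), "cpu.out")
	if err := profileCPU(path, func() { busyWork() }); err != nil {
		fmt.Fprintln(os.Stderr, "profile failed:", err)
		os.Exit(1)
	}
	fmt.Println("wrote", path)
}
```

The resulting file can be opened with go tool pprof in the same way as a profile produced by go test.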

CPU Profiling from Benchmarks

# Generate a CPU profile
go test -bench=BenchmarkSortSlice -cpuprofile=cpu.out

# Analyze with pprof
go tool pprof cpu.out

Inside the pprof interactive shell:

(pprof) top10
(pprof) list SortSlice
(pprof) web

Memory Profiling from Benchmarks

# Generate a memory profile
go test -bench=BenchmarkSortSlice -memprofile=mem.out

# Analyze
go tool pprof mem.out

HTTP Profiling for Running Services

Add net/http/pprof to a running service to profile it live.

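package main

import (
    "log"
    "net/http"
    _ "net/http/pprof" // registers the /debug/pprof/ handlers on http.DefaultServeMux
)

func main() {
    go func() {
        // Bind to localhost so the profiling endpoints are not reachable from outside.
        log.Println(http.ListenAndServe("localhost:6060", nil))
    }()
    // ... rest of your application
}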

# Capture a 30-second CPU profile
go tool pprof "http://localhost:6060/debug/pprof/profile?seconds=30"

# View heap allocations
go tool pprof http://localhost:6060/debug/pprof/heap

# View goroutines
go tool pprof http://localhost:6060/debug/pprof/goroutine
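
Importing net/http/pprof registers its handlers on http.DefaultServeMux, so any server that serves the default mux exposes them. The package's exported handler functions can also be mounted on a dedicated mux, which keeps the debug listener separate from the application's routes, provided the application server uses its own mux rather than the default one. In this sketch newDebugMux and statusOf are illustrative names, and httptest is used so the example runs and exits without binding a port:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// newDebugMux returns a mux that serves only the pprof endpoints.
func newDebugMux() *http.ServeMux {
	mux := http.NewServeMux()
	// Index also serves the named profiles such as heap and goroutine.
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	return mux
}

// statusOf sends an in-memory GET request to h and returns the status code.
func statusOf(h http.Handler, path string) int {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	mux := newDebugMux()
	fmt.Println("index:    ", statusOf(mux, "/debug/pprof/"))
	fmt.Println("goroutine:", statusOf(mux, "/debug/pprof/goroutine?debug=1"))
	// In a real service, serve it on a local address instead:
	//   go http.ListenAndServe("localhost:6060", mux)
}
```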

Reading pprof Output

The key columns in pprof output are:

      flat  flat%   sum%        cum   cum%
   420ms 42.00% 42.00%    980ms 98.00%  sort.Ints
   310ms 31.00% 73.00%    310ms 31.00%  runtime.memmove
   250ms 25.00% 98.00%    250ms 25.00%  sort.partition

flat: time spent in the function itself. cum: time spent in the function and everything it calls. High flat time means the function itself is slow. High cum but low flat means the function calls something slow.

Common Pitfalls

Benchmarking code the compiler eliminates. If the result of a computation is never used, the compiler may optimize it away. Assign results to a package-level variable so the work stays observable. b.StopTimer() and b.StartTimer() do not help here: they control what is timed, not what the compiler removes.
Including setup in benchmark timing. Use b.ResetTimer() after expensive setup to exclude it from the measurement.
Too few benchmark runs. A single run is noisy. Use -count=10 and benchstat for reliable comparisons.
Fuzzing without a property. The most effective fuzz tests verify a property (such as a parse-format-parse round trip), not just "doesn't crash", although crash-finding alone is still valuable.
Not saving fuzz corpus entries. Commit the testdata/fuzz/ directory so discovered edge cases are checked in CI.
Profiling unoptimized builds. Always profile optimized builds (go test -bench does this by default). Profiling code compiled with optimizations disabled, for example with -gcflags="all=-N -l", gives misleading results.
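
The first pitfall can be sketched as follows. fib is a hypothetical function under test. The first benchmark discards its result, which leaves the compiler free to remove the call if it can prove the call has no side effects; the second stores the result in a package-level variable.

```go
package main

import (
	"fmt"
	"testing"
)

// fib is a hypothetical pure function used as the benchmark subject.
func fib(n int) int {
	a, b := 0, 1
	for i := 0; i < n; i++ {
		a, b = b, a+b
	}
	return a
}

// sink is package-level, so a store to it counts as observable work.
var sink int

// BenchmarkFibDiscarded may end up measuring an empty loop,
// because the result of fib is never used.
func BenchmarkFibDiscarded(b *testing.B) {
	for i := 0; i < b.N; i++ {
		fib(30)
	}
}

// BenchmarkFibSink keeps the call alive by storing its result.
func BenchmarkFibSink(b *testing.B) {
	var r int
	for i := 0; i < b.N; i++ {
		r = fib(30)
	}
	sink = r
}

// main runs both benchmarks outside go test; in a project they would live
// in a _test.go file and be run with go test -bench=Fib.
func main() {
	fmt.Println("discarded:", testing.Benchmark(BenchmarkFibDiscarded))
	fmt.Println("sink:     ", testing.Benchmark(BenchmarkFibSink))
}
```

Whether the discarded variant is actually eliminated depends on the compiler version and inlining decisions, which is the reason not to rely on it.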

Key Takeaways

Fuzz testing generates random inputs to find bugs that unit tests miss. Use *testing.F and provide a seed corpus.
Failing fuzz inputs are saved as regression tests in testdata/fuzz/.
Benchmarks use *testing.B and the b.N loop. Never hardcode iteration counts.
Use b.ReportAllocs() to track memory allocations per operation.
Compare benchmarks statistically with benchstat and at least 10 runs.
Profile with pprof to find where time and memory are actually spent.
The combination of fuzzing, benchmarks, and profiling gives you a complete quality and performance toolkit that ships with the Go toolchain; only benchstat needs a separate install.