Testing

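Testing checks that software behaves as intended. It is the primary quality assurance mechanism in modern software development, although passing tests show only that the tested cases work, not that the software is free of defects.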

Testing Levels

Unit Testing

Test individual functions, methods, or modules in isolation.

TEST MODULE
    TEST test_add
        ASSERT ADD(2, 3) = 5
        ASSERT ADD(-1, 1) = 0
        ASSERT ADD(0, 0) = 0

    TEST test_divide
        ASSERT DIVIDE(10, 2) = Ok(5.0)
        ASSERT DIVIDE(1, 0) is Error

    TEST test_overflow (should panic with "overflow")
        MULTIPLY(MAX_INT, 2)

Characteristics: Fast (milliseconds). No external dependencies (database, network). Test one thing per test. Run on every commit.
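
As a rough illustration, the pseudocode above might look like this with Python's built-in `unittest` module. The `add` and `divide` functions are hypothetical stand-ins for the code under test, and `divide` raises `ZeroDivisionError` instead of returning an error value, which is a substitution for the `Ok`/`Error` style used above.

```python
import unittest


def add(a: int, b: int) -> int:
    """Hypothetical function under test."""
    return a + b


def divide(a: float, b: float) -> float:
    """Hypothetical function under test; Python raises on division by zero."""
    return a / b


class TestArithmetic(unittest.TestCase):
    def test_add(self):
        self.assertEqual(add(2, 3), 5)
        self.assertEqual(add(-1, 1), 0)
        self.assertEqual(add(0, 0), 0)

    def test_divide(self):
        self.assertEqual(divide(10, 2), 5.0)

    def test_divide_by_zero(self):
        # The error case gets its own test so a failure points at one behavior.
        with self.assertRaises(ZeroDivisionError):
            divide(1, 0)
```

Saved to a file, the tests can be run with `python -m unittest <file>`. No database or network is involved, so the whole class runs in milliseconds.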

Integration Testing

Test interactions between components (modules, services, database).

ASYNC TEST test_user_creation_flow
    db ← AWAIT SETUP_TEST_DATABASE()
    service ← new UserService(db)

    user ← AWAIT service.CREATE_USER("alice", "alice@mail.com")
    ASSERT user.username = "alice"

    found ← AWAIT service.FIND_BY_EMAIL("alice@mail.com")
    ASSERT found.id = user.id

    AWAIT TEARDOWN_TEST_DATABASE(db)

Characteristics: Slower (seconds). May use real databases (test containers). Test component interactions.
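
A minimal sketch of the flow above in Python, assuming an in-memory SQLite database stands in for the test database. The `UserService` class, its methods, and the table layout are hypothetical, and the example is synchronous rather than async to keep it short.

```python
import sqlite3


class UserService:
    """Hypothetical service that stores users in a SQL database."""

    def __init__(self, db: sqlite3.Connection):
        self.db = db

    def create_user(self, username: str, email: str) -> dict:
        cur = self.db.execute(
            "INSERT INTO users (username, email) VALUES (?, ?)", (username, email)
        )
        self.db.commit()
        return {"id": cur.lastrowid, "username": username, "email": email}

    def find_by_email(self, email: str):
        row = self.db.execute(
            "SELECT id, username, email FROM users WHERE email = ?", (email,)
        ).fetchone()
        if row is None:
            return None
        return {"id": row[0], "username": row[1], "email": row[2]}


def setup_test_database() -> sqlite3.Connection:
    """Create a fresh, empty database for one test."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, email TEXT UNIQUE)"
    )
    return db


def test_user_creation_flow():
    db = setup_test_database()
    try:
        service = UserService(db)
        user = service.create_user("alice", "alice@mail.com")
        assert user["username"] == "alice"

        found = service.find_by_email("alice@mail.com")
        assert found is not None and found["id"] == user["id"]
    finally:
        db.close()  # teardown runs even if an assertion fails
```

Unlike a unit test, this exercises real SQL against a real database engine, so it can catch mismatches between the service code and the schema. The `try`/`finally` keeps a failed assertion from skipping teardown.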

System Testing

Test the entire system end-to-end. Verify that all components work together correctly.

Types: Functional testing, performance testing, security testing, usability testing.

Acceptance Testing

Verify the system meets business requirements. Often written with stakeholders.

Feature: User Registration
  Scenario: Successful registration
    Given I am on the registration page
    When I fill in "username" with "alice"
    And I fill in "email" with "alice@mail.com"
    And I click "Register"
    Then I should see "Registration successful"
    And I should receive a confirmation email

BDD (Behavior-Driven Development): Write acceptance tests in natural language (Given-When-Then). Tools: Cucumber, Behave.

Testing Approaches

Test-Driven Development (TDD)

Red → Green → Refactor:

  1. Red: Write a failing test for the desired behavior.
  2. Green: Write the minimum code to pass the test.
  3. Refactor: Improve the code without changing behavior (tests still pass).

Example cycle:

1. Write test: test_fibonacci(10) == 55  → FAILS (function doesn't exist)
2. Implement fibonacci() minimally      → PASSES
3. Refactor for clarity/performance     → Still PASSES
4. Repeat for next feature

Benefits: Drives design (testable by construction). High test coverage. Confidence in refactoring. Living documentation.
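
The Fibonacci cycle can be sketched in Python roughly as follows. The function names are illustrative, and both the Green and the Refactor versions are shown side by side so the same test can be run against each; in a real TDD session the second would replace the first.

```python
def fibonacci_v1(n: int) -> int:
    """Green: the simplest implementation that passes (naive recursion)."""
    if n < 2:
        return n
    return fibonacci_v1(n - 1) + fibonacci_v1(n - 2)


def fibonacci(n: int) -> int:
    """Refactor: same behavior, linear time instead of exponential."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def check_fibonacci(fib) -> None:
    """Red: this test is written first and fails until an implementation exists."""
    assert fib(0) == 0
    assert fib(1) == 1
    assert fib(10) == 55


check_fibonacci(fibonacci_v1)  # passes after the Green step
check_fibonacci(fibonacci)     # still passes after the Refactor step
```

The test does not change between steps 2 and 3, which is what makes the refactor safe.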

Behavior-Driven Development (BDD)

Extension of TDD focused on behavior described in business language. Bridge between developers and stakeholders.

Property-Based Testing

Instead of specific examples, define properties that must hold for all inputs.

// Property-based tests: each property is checked against many random inputs v (integer lists of length 0..100)

PROPERTY TEST sort_preserves_length
    FOR ALL v: random list of integers (size 0..100)
        sorted ← SORT(copy of v)
        ASSERT length(sorted) = length(v)

PROPERTY TEST sort_is_idempotent
    FOR ALL v: random list of integers (size 0..100)
        sorted1 ← SORT(copy of v)
        sorted2 ← SORT(copy of sorted1)
        ASSERT sorted1 = sorted2

PROPERTY TEST sort_produces_ordered_output
    FOR ALL v: random list of integers (size 0..100)
        sorted ← SORT(copy of v)
        FOR i ← 1 TO length(sorted) - 1
            ASSERT sorted[i-1] ≤ sorted[i]

Frameworks: proptest (Rust), QuickCheck (Haskell), Hypothesis (Python). They generate random inputs and shrink failures to minimal examples.
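
To show the idea without any third-party dependency, here is a hand-rolled sketch using only Python's `random` module. It checks the three sort properties above against the built-in `sorted`. It is not how the frameworks listed work internally and it does no shrinking; the helper names, value range, and run count are arbitrary choices for illustration.

```python
import random


def random_int_list(rng: random.Random) -> list:
    """Generate a list of 0..100 random integers."""
    size = rng.randint(0, 100)
    return [rng.randint(-1000, 1000) for _ in range(size)]


def check_sort_properties(runs: int = 200, seed: int = 0) -> int:
    """Check each property on `runs` random lists; return the number checked."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for _ in range(runs):
        v = random_int_list(rng)
        result = sorted(v)

        # Property 1: sorting preserves length.
        assert len(result) == len(v)
        # Property 2: sorting is idempotent.
        assert sorted(result) == result
        # Property 3: output is in non-decreasing order.
        assert all(result[i - 1] <= result[i] for i in range(1, len(result)))
    return runs
```

A real framework adds smarter input generation (edge cases such as empty lists and extreme values) and shrinks a failing input down to a minimal counterexample, which this sketch leaves out.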

Fuzzing

Feed random/malformed input to find crashes, hangs, and security vulnerabilities.

// Fuzz testing: feed random bytes
FUZZ_TARGET(data: random bytes)
    IF data is valid UTF-8 string s
        PARSE_JSON(s)   // should never panic

Coverage-guided fuzzing: Track code coverage. Mutate inputs that reach new code paths. Tools: AFL, libFuzzer, honggfuzz, cargo-fuzz.

Finds: Buffer overflows, integer overflows, null dereferences, infinite loops, denial of service.
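
The fuzz target above can be approximated in plain Python as below. This is a blind random fuzzer, not a coverage-guided one like the tools listed, and it targets the standard-library `json` parser purely for illustration. The function names, input size limit, and iteration count are assumptions.

```python
import json
import random


def fuzz_target(data: bytes) -> None:
    """Parse arbitrary bytes as JSON; only documented errors are acceptable."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return  # not valid UTF-8: outside the parser's input domain
    try:
        json.loads(text)
    except json.JSONDecodeError:
        pass  # rejecting malformed input is correct behavior
    # Any other exception propagates to the caller and counts as a finding.


def run_fuzzer(iterations: int = 1000, seed: int = 0) -> int:
    """Feed random byte strings to the target; return how many were tried."""
    rng = random.Random(seed)
    for _ in range(iterations):
        size = rng.randint(0, 64)
        fuzz_target(bytes(rng.randrange(256) for _ in range(size)))
    return iterations
```

Purely random bytes rarely form valid UTF-8, let alone near-valid JSON, so most inputs are rejected early. That weakness is what coverage-guided mutation addresses: inputs that reach new code are kept and mutated further.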

Mutation Testing

Introduce small changes (mutations) to the code and check if tests detect them.

Original:  if x > 0 { ... }
Mutation:  if x >= 0 { ... }   ← If no test fails, the mutant survived: no test checks x = 0.

Mutation score = killed mutations / total mutations. Higher is better.

Tools: cargo-mutants (Rust), Stryker (JS/C#), PITest (Java).
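
A small hand-written sketch of the idea, using the `>` to `>=` mutation from above. Real tools generate and run mutants automatically; here the mutant and both test suites are written out by hand, and all names are hypothetical.

```python
def is_positive(x: int) -> bool:
    return x > 0   # original


def is_positive_mutant(x: int) -> bool:
    return x >= 0  # mutation: > replaced by >=


def weak_suite(fn) -> bool:
    """Return True if all checks pass. Misses the boundary value 0."""
    return fn(5) is True and fn(-5) is False


def strong_suite(fn) -> bool:
    """Adds the boundary check that distinguishes > from >=."""
    return weak_suite(fn) and fn(0) is False


def mutation_score(suite, original, mutants) -> float:
    """Fraction of mutants for which the suite fails (killed / total)."""
    assert suite(original), "suite must pass on the original code"
    killed = sum(1 for m in mutants if not suite(m))
    return killed / len(mutants)


weak = mutation_score(weak_suite, is_positive, [is_positive_mutant])
strong = mutation_score(strong_suite, is_positive, [is_positive_mutant])
```

With this single mutant, `weak` is 0.0 (the mutant survives) and `strong` is 1.0 (the boundary check kills it). Both suites execute every line of `is_positive`, so line coverage alone cannot tell them apart.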

Test Doubles

Mock

Simulates an object with pre-programmed behavior and expectations.

// Mock testing
INTERFACE Database
    FUNCTION FIND_USER(id) -> User or NIL

TEST test_get_user
    mock_db ← MOCK(Database)
    EXPECT mock_db.FIND_USER(42)
        TO RETURN User(id: 42, name: "Alice")

    service ← new UserService(mock_db)
    user ← service.GET_USER(42)
    ASSERT user.name = "Alice"
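
The same test might be written with Python's standard `unittest.mock` as follows. The `UserService` class and its `get_user` method are hypothetical, and the user is represented as a plain dictionary for brevity.

```python
from unittest.mock import Mock


class UserService:
    """Hypothetical service that delegates lookups to a database object."""

    def __init__(self, db):
        self.db = db

    def get_user(self, user_id: int):
        return self.db.find_user(user_id)


def test_get_user():
    # Pre-programmed behavior: find_user returns a canned user.
    mock_db = Mock()
    mock_db.find_user.return_value = {"id": 42, "name": "Alice"}

    service = UserService(mock_db)
    user = service.get_user(42)

    assert user["name"] == "Alice"
    # Expectation: find_user was called exactly once, with 42.
    mock_db.find_user.assert_called_once_with(42)
```

The final assertion is what makes this a mock rather than a stub: the test fails if the service calls the database with the wrong argument or not at all.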

Stub

Returns canned answers to calls. No expectations about how it's called.
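
A minimal sketch of a stub in Python, assuming a hypothetical `convert` function that depends on an exchange-rate provider:

```python
class StubRateProvider:
    """Stub: always returns the same canned exchange rate."""

    def get_rate(self, currency: str) -> float:
        return 2.0


def convert(amount: float, currency: str, rates) -> float:
    """Hypothetical code under test."""
    return amount * rates.get_rate(currency)


def test_convert_uses_rate():
    # The test checks only the result, not how the stub was called.
    assert convert(10.0, "EUR", StubRateProvider()) == 20.0
```

The stub replaces a dependency that would otherwise need a network call, and the test makes no assertion about how often or with what arguments `get_rate` is invoked.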

Fake

A working implementation that's simpler than the real one (in-memory database, fake file system).

CLASS InMemoryUserRepo IMPLEMENTS UserRepository
    FIELDS: users (map of id -> User)
    FUNCTION FIND(id)    RETURN users[id] or NIL
    PROCEDURE SAVE(user) users[user.id] ← user
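
In Python, the in-memory repository above could look roughly like this. The `User` type and the `rename_user` function are hypothetical additions that show the fake being used by code under test.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    id: int
    name: str


class InMemoryUserRepo:
    """Fake: a working repository backed by a dict instead of a database."""

    def __init__(self):
        self.users = {}

    def find(self, user_id: int) -> Optional[User]:
        return self.users.get(user_id)

    def save(self, user: User) -> None:
        self.users[user.id] = user


def rename_user(repo, user_id: int, new_name: str) -> bool:
    """Hypothetical code under test that reads and writes through the repo."""
    user = repo.find(user_id)
    if user is None:
        return False
    repo.save(User(user.id, new_name))
    return True
```

Because the fake really stores and retrieves data, a test can save a user, call `rename_user`, and read the result back, which a stub with canned answers could not support.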

Spy

A wrapper that records calls for later verification.
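
One way to build a spy in Python is `unittest.mock.Mock` with the `wraps` argument, which forwards calls to a real object while recording them. The `EmailSender` class and `register` function below are hypothetical.

```python
from unittest.mock import Mock


class EmailSender:
    """Hypothetical real collaborator."""

    def send(self, to: str, subject: str) -> bool:
        return True  # a real implementation would talk to a mail server


def register(email: str, sender) -> None:
    """Hypothetical code under test."""
    sender.send(email, "Welcome")


def test_register_sends_welcome_email():
    spy = Mock(wraps=EmailSender())  # calls pass through and are recorded
    register("alice@mail.com", spy)
    # Verification happens after the fact, against the recorded calls.
    spy.send.assert_called_once_with("alice@mail.com", "Welcome")
```

Unlike a mock with a canned return value, the spy still runs the wrapped object's real `send` method.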

Code Coverage

Coverage Metrics

| Metric | Measures |
|---|---|
| Line coverage | Which lines were executed |
| Branch coverage | Which branches (if/else) were taken |
| Path coverage | Which execution paths were followed |
| MC/DC | Whether each condition independently affects its decision's outcome (Modified Condition/Decision Coverage; avionics: DO-178C) |

Line coverage: Easy to measure. 80-90% is a common target. 100% doesn't guarantee correctness.

Branch coverage: More thorough. Measures whether both outcomes of every conditional were exercised by the tests.

Tools: cargo-tarpaulin (Rust), gcov/lcov (C/C++), JaCoCo (Java), coverage.py (Python).
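
The gap between line and branch coverage can be seen in a small, contrived Python example. The `average` function is hypothetical and deliberately buggy.

```python
def average(total: float, count: int) -> float:
    """Deliberately buggy: the count <= 0 case is never handled."""
    divisor = 0
    if count > 0:
        divisor = count
    return total / divisor


def test_average():
    # This single test executes every line of average(): 100% line coverage.
    assert average(10.0, 2) == 5.0


def test_average_zero_count():
    # Only a test for the untaken branch (count <= 0) exposes the bug.
    try:
        average(10.0, 0)
    except ZeroDivisionError:
        return "bug found"
    return "no bug"
```

`test_average` alone reaches full line coverage, but the `if` is never false, so branch coverage is incomplete and the division by zero goes unnoticed.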

Test Pyramid

        /\
       /  \      E2E Tests (few, slow, expensive)
      /    \
     /------\    Integration Tests (moderate)
    /        \
   /----------\  Unit Tests (many, fast, cheap)
  /____________\

Many unit tests: Fast, cheap, catch most bugs. Run on every commit.

Moderate integration tests: Verify component interactions. Run on every PR.

Few E2E tests: Verify critical user flows. Slow, flaky, expensive. Run before release.

Anti-pattern: "Ice cream cone" — mostly E2E tests, few unit tests. Slow, fragile, expensive.

Contract Testing

Verify that services comply with their API contracts (request/response schemas, status codes).

// Provider: UserService guarantees this contract
// Consumer: OrderService depends on this contract

// Contract: GET /users/42 returns { "id": 42, "name": "Alice" }
// Provider verifies it fulfills the contract
// Consumer verifies it handles the contract correctly

Tools: Pact, Spring Cloud Contract. Especially important in microservices.
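
A simplified sketch of the idea in Python, without any contract-testing tool. The contract is reduced to a mapping of required field names to types; the provider handler and consumer function are hypothetical, and real tools such as Pact use richer contract formats and a broker to share them.

```python
# Contract for GET /users/{id}: required fields and their types.
CONTRACT = {"id": int, "name": str}


def satisfies_contract(response: dict, contract: dict) -> bool:
    """True if every contracted field is present with the expected type."""
    return all(
        field in response and isinstance(response[field], expected)
        for field, expected in contract.items()
    )


def provider_get_user(user_id: int) -> dict:
    """Hypothetical provider handler; extra fields are allowed."""
    return {"id": user_id, "name": "Alice", "email": "alice@mail.com"}


def consumer_display_name(response: dict) -> str:
    """Hypothetical consumer code; relies only on contracted fields."""
    return f"{response['name']} (#{response['id']})"


# Provider side: the real response fulfills the contract.
assert satisfies_contract(provider_get_user(42), CONTRACT)

# Consumer side: the code works with a minimal contract-shaped response.
assert consumer_display_name({"id": 42, "name": "Alice"}) == "Alice (#42)"
```

Each side is tested against the shared contract rather than against the other service, so neither test needs the other service to be running.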

Applications in CS

  • CI/CD: Tests run automatically on every commit/PR. Failing tests block deployment.
  • Refactoring: Tests provide a safety net. Refactor with confidence.
  • Documentation: Tests serve as executable documentation of expected behavior.
  • Security: Fuzzing finds vulnerabilities. Security-focused tests verify access control.
  • Compliance: Regulated industries require specific coverage levels (for example, DO-178C requires MC/DC coverage for the most safety-critical avionics software).
  • Open source: Tests enable confident contributions from strangers. CI ensures quality.