6 min read
On this page

Concurrency Verification

Overview

Concurrent programs are notoriously difficult to get right. Bugs depend on specific thread interleavings, making them hard to reproduce and test. Formal methods for concurrency verification provide mathematical guarantees about all possible interleavings, addressing safety, liveness, and correctness of concurrent data structures and protocols.

Safety and Liveness

Every correctness property of a concurrent system is a combination of safety and liveness.

Safety Properties

"Nothing bad ever happens." A safety violation is witnessed by a finite prefix of an execution.

Examples:

  • Mutual exclusion: at most one thread in a critical section at any time
  • Absence of data races: no unsynchronized concurrent accesses to shared memory
  • Absence of deadlock: the system never reaches a state where no thread can proceed
  • Invariant preservation: shared data structure invariants always hold

Verification: safety properties can be checked by exploring reachable states. If no bad state is reachable, the property holds.

Liveness Properties

"Something good eventually happens." A liveness violation requires an infinite execution where the good thing never occurs.

Examples:

  • Starvation freedom: every requesting thread eventually enters its critical section
  • Termination: every thread eventually completes
  • Lock freedom: some thread always makes progress
  • Wait freedom: every thread makes progress in finite steps

Verification: liveness requires fairness assumptions -- without them, a scheduler could starve a thread forever. Common fairness conditions:

  • Weak fairness: a continuously enabled action is eventually taken
  • Strong fairness: an infinitely often enabled action is eventually taken

Linearizability

Linearizability is the gold standard correctness criterion for concurrent data structures.

Definition: an execution is linearizable if each operation appears to take effect atomically at some point between its invocation and response (the linearization point).

Thread 1:  |--enqueue(A)--|        |--dequeue()-->A--|
Thread 2:       |--enqueue(B)--|
                    ↑              ↑
              linearization    linearization
              points           point

Equivalent sequential execution: enqueue(A), enqueue(B), dequeue()→A

Verification approaches:

  • Specify linearization points: annotate where each operation takes logical effect
  • Forward/backward simulation: show the concurrent implementation refines a sequential spec
  • Prophecy variables: handle cases where the linearization point depends on future events

Challenge: some data structures have non-fixed linearization points (e.g., the Herlihy-Wing queue), requiring more sophisticated proof techniques.

Concurrent Separation Logic (CSL)

CSL, introduced by O'Hearn (2007, based on Brookes), extends separation logic to concurrent programs.

Core Idea

The separating conjunction P * Q naturally models disjoint ownership of memory by different threads:

{P₁ * P₂}
  {P₁} C₁ {Q₁}  ||  {P₂} C₂ {Q₂}
{Q₁ * Q₂}

If thread 1 operates on memory described by P₁ and thread 2 on P₂, and these are disjoint (*), then both threads can execute concurrently without interference.

Resource Invariants

Shared resources are protected by resource invariants associated with locks:

// Resource invariant R: x ↦ v * v ≥ 0
// (the lock protects location x, which holds a non-negative value)

{emp}
acquire(lock);       // gain R: now have x ↦ v * v ≥ 0
x := x + 1;          // modify under lock
release(lock);       // give back R: must re-establish x ↦ v' * v' ≥ 0
{emp}

Modern CSL Variants

Iris (Coq-based framework): the dominant modern framework for concurrent verification. Supports:

  • Impredicative invariants
  • Ghost state and resource algebras
  • Logical atomicity for linearizability proofs
  • Higher-order reasoning
  • Used in RustBelt (Rust soundness proof) and many other projects

FCSL (Fine-grained Concurrent Separation Logic): focuses on compositional reasoning about fine-grained concurrent data structures.

Verified concurrent data structures (using CSL-based tools):

  • Concurrent hash maps, queues, stacks
  • Read-copy-update (RCU)
  • Lock-free algorithms (Michael-Scott queue, Harris list)

Rely-Guarantee Reasoning

Rely-guarantee (Jones, 1983) handles fine-grained concurrency where threads share memory without explicit locks.

Specification format: {P} C {Q} rely R guar G

  • Rely R: what the environment (other threads) may do to shared state
  • Guarantee G: what this thread promises to do to shared state
// Thread 1 increments x:
{x = 0}
x := x + 1
{x ≥ 1}
rely: other threads only increment x     (R: x' ≥ x)
guar: this thread only increments x      (G: x' ≥ x)

Compositionality: thread 1's guarantee must imply thread 2's rely, and vice versa.

Advantages: handles interference without requiring lock-based synchronization. Limitation: does not inherently support heap reasoning (but can be combined with separation logic → RGSep, LRG).

Owicki-Gries Method

A classical approach (1976) for verifying shared-variable concurrent programs.

Steps:

  1. Annotate each thread with Hoare-logic assertions (including intermediate points)
  2. Prove each thread correct in isolation using sequential Hoare logic
  3. Check non-interference: every assignment in thread i preserves all assertions in thread j (and vice versa)
// Thread 1:        // Thread 2:
{x = 0}             {x = 0}
x := x + 1          x := x + 2
{x ≥ 1}             {x ≥ 2}

Non-interference check: does x := x + 2 preserve {x ≥ 1}? If x ≥ 1 before, then x + 2 ≥ 3 ≥ 1 after. Yes, preserved.

Limitations:

  • Not compositional: adding a thread requires rechecking all assertions
  • Annotation burden grows quadratically with thread count
  • Largely superseded by rely-guarantee and CSL, but important historically

Model Checking Concurrent Programs

State Space Explosion

With n threads and k states each, the interleaved state space is up to k^n * (n!)^k. Partial order reduction (POR) and symmetry reduction (see model checking chapter) are essential.

Tools

SPIN: explicit-state model checker for concurrent Promela models. Uses DFS with partial order reduction. Checks LTL properties, deadlock, assertion violations.

TLA+ / TLC: models concurrent/distributed systems as interleaved state machines. TLC explores all reachable states. Widely used for distributed protocol verification (Paxos, Raft, etc.).

CBMC: bounded model checker for C. Handles pthreads-style concurrency with bounded context switches.

GenMC / RCMC: stateless model checkers that explore concurrent executions under weak memory models (e.g., C11 memory model).

Weak Memory Models

Real hardware does not provide sequential consistency. Memory operations may be reordered.

Common models:

  • SC (Sequential Consistency): ideal, interleaved semantics
  • TSO (Total Store Order): x86 model, store buffers can delay writes
  • ARM/POWER relaxed: extensive reordering, requires barriers
  • C11/C++11 memory model: language-level model with atomics and ordering constraints

Model checkers for weak memory explore behaviors allowed by the memory model, not just SC interleavings.

Dynamic Analysis: ThreadSanitizer

ThreadSanitizer (TSan) is a dynamic data race detector, built into Clang/GCC and used at Google on virtually all C++ code.

How It Works

TSan uses a vector clock algorithm to detect data races:

Thread 1: write x  at time T1=(2,0)
Thread 2: read  x  at time T2=(0,1)

No happens-before relationship between T1 and T2 → DATA RACE

Key concepts:

  • Each thread maintains a vector clock: a vector of logical timestamps
  • Synchronization (mutex lock/unlock, thread join) updates vector clocks
  • On each memory access, TSan checks if there is an unordered concurrent access

Capabilities

  • Detects data races, use-after-free (in concurrent context)
  • Reports the two conflicting accesses with stack traces
  • 5-15x runtime overhead, 5-10x memory overhead
  • Supports C, C++, Go, and Rust

Limitations

  • Dynamic: only finds races on executed paths (not exhaustive)
  • Cannot prove absence of races (unlike CSL or model checking)
  • Some overhead makes it unsuitable for production (typically used in testing/CI)

Complementary tools: Helgrind (Valgrind-based), Intel Inspector, Go race detector.

Comparison of Approaches

| Method | Soundness | Completeness | Automation | Scalability | |--------|-----------|-------------|------------|-------------| | CSL (Iris) | Sound | N/A (proof-based) | Manual proofs | Moderate | | Rely-Guarantee | Sound | N/A (proof-based) | Manual proofs | Moderate | | Owicki-Gries | Sound | N/A (proof-based) | Manual proofs | Low | | SPIN | Sound | Complete (finite) | Automatic | Moderate | | CBMC (bounded) | Sound | Bounded | Automatic | Low-moderate | | ThreadSanitizer | Unsound | Incomplete | Automatic | High |

Key Takeaways

  • Safety (no bad states) and liveness (eventually good states) are the two pillars of concurrent correctness
  • Linearizability provides a clean correctness criterion for concurrent data structures
  • CSL's separating conjunction naturally models thread-local and shared state
  • Rely-guarantee handles fine-grained sharing without locks
  • Model checking concurrent programs faces combinatorial explosion but partial order reduction helps significantly
  • ThreadSanitizer provides practical, scalable race detection for production codebases but cannot provide formal guarantees
  • Weak memory models add another layer of complexity that formal methods must address