Concurrency Verification

Overview

Concurrent programs are notoriously difficult to get right. Bugs depend on specific thread interleavings, making them hard to reproduce and test. Formal methods for concurrency verification provide mathematical guarantees about all possible interleavings, addressing safety, liveness, and correctness of concurrent data structures and protocols.

Safety and Liveness

Every correctness property of a concurrent system is a combination of safety and liveness.

Safety Properties

"Nothing bad ever happens." A safety violation is witnessed by a finite prefix of an execution.

Examples:

Mutual exclusion: at most one thread in a critical section at any time
Absence of data races: no unsynchronized concurrent accesses to shared memory
Absence of deadlock: the system never reaches a state where no thread can proceed
Invariant preservation: shared data structure invariants always hold

Verification: safety properties can be checked by exploring reachable states. If no bad state is reachable, the property holds.

Liveness Properties

"Something good eventually happens." A liveness violation requires an infinite execution where the good thing never occurs.

Examples:

Starvation freedom: every requesting thread eventually enters its critical section
Termination: every thread eventually completes
Lock freedom: some thread always makes progress
Wait freedom: every thread makes progress in finite steps

Verification: liveness requires fairness assumptions -- without them, a scheduler could starve a thread forever. Common fairness conditions:

Weak fairness: a continuously enabled action is eventually taken
Strong fairness: an infinitely often enabled action is eventually taken

Linearizability

Linearizability is the gold standard correctness criterion for concurrent data structures.

Definition: an execution is linearizable if each operation appears to take effect atomically at some point between its invocation and response (the linearization point).

Thread 1:  |--enqueue(A)--|        |--dequeue()-->A--|
Thread 2:       |--enqueue(B)--|
                    ↑              ↑
              linearization    linearization
              points           point

Equivalent sequential execution: enqueue(A), enqueue(B), dequeue()→A

Verification approaches:

Specify linearization points: annotate where each operation takes logical effect
Forward/backward simulation: show the concurrent implementation refines a sequential spec
Prophecy variables: handle cases where the linearization point depends on future events

Challenge: some data structures have non-fixed linearization points (e.g., the Herlihy-Wing queue), requiring more sophisticated proof techniques.

Concurrent Separation Logic (CSL)

CSL, introduced by O'Hearn (2007, based on Brookes), extends separation logic to concurrent programs.

Core Idea

The separating conjunction P * Q naturally models disjoint ownership of memory by different threads:

{P₁ * P₂}
  {P₁} C₁ {Q₁}  ||  {P₂} C₂ {Q₂}
{Q₁ * Q₂}

If thread 1 operates on memory described by P₁ and thread 2 on P₂, and these are disjoint (*), then both threads can execute concurrently without interference.

Resource Invariants

Shared resources are protected by resource invariants associated with locks:

// Resource invariant R: x ↦ v * v ≥ 0
// (the lock protects location x, which holds a non-negative value)

{emp}
acquire(lock);       // gain R: now have x ↦ v * v ≥ 0
x := x + 1;          // modify under lock
release(lock);       // give back R: must re-establish x ↦ v' * v' ≥ 0
{emp}

Modern CSL Variants

Iris (Coq-based framework): the dominant modern framework for concurrent verification. Supports:

Impredicative invariants
Ghost state and resource algebras
Logical atomicity for linearizability proofs
Higher-order reasoning
Used in RustBelt (Rust soundness proof) and many other projects

FCSL (Fine-grained Concurrent Separation Logic): focuses on compositional reasoning about fine-grained concurrent data structures.

Verified concurrent data structures (using CSL-based tools):

Concurrent hash maps, queues, stacks
Read-copy-update (RCU)
Lock-free algorithms (Michael-Scott queue, Harris list)

Rely-Guarantee Reasoning

Rely-guarantee (Jones, 1983) handles fine-grained concurrency where threads share memory without explicit locks.

Specification format: {P} C {Q} rely R guar G

Rely R: what the environment (other threads) may do to shared state
Guarantee G: what this thread promises to do to shared state

// Thread 1 increments x:
{x = 0}
x := x + 1
{x ≥ 1}
rely: other threads only increment x     (R: x' ≥ x)
guar: this thread only increments x      (G: x' ≥ x)

Compositionality: thread 1's guarantee must imply thread 2's rely, and vice versa.

Advantages: handles interference without requiring lock-based synchronization. Limitation: does not inherently support heap reasoning (but can be combined with separation logic → RGSep, LRG).

Owicki-Gries Method

A classical approach (1976) for verifying shared-variable concurrent programs.

Steps:

Annotate each thread with Hoare-logic assertions (including intermediate points)
Prove each thread correct in isolation using sequential Hoare logic
Check non-interference: every assignment in thread i preserves all assertions in thread j (and vice versa)

// Thread 1:        // Thread 2:
{x = 0}             {x = 0}
x := x + 1          x := x + 2
{x ≥ 1}             {x ≥ 2}

Non-interference check: does x := x + 2 preserve {x ≥ 1}? If x ≥ 1 before, then x + 2 ≥ 3 ≥ 1 after. Yes, preserved.

Limitations:

Not compositional: adding a thread requires rechecking all assertions
Annotation burden grows quadratically with thread count
Largely superseded by rely-guarantee and CSL, but important historically

Model Checking Concurrent Programs

State Space Explosion

With n threads and k states each, the interleaved state space is up to k^n * (n!)^k. Partial order reduction (POR) and symmetry reduction (see model checking chapter) are essential.

Tools

SPIN: explicit-state model checker for concurrent Promela models. Uses DFS with partial order reduction. Checks LTL properties, deadlock, assertion violations.

TLA+ / TLC: models concurrent/distributed systems as interleaved state machines. TLC explores all reachable states. Widely used for distributed protocol verification (Paxos, Raft, etc.).

CBMC: bounded model checker for C. Handles pthreads-style concurrency with bounded context switches.

GenMC / RCMC: stateless model checkers that explore concurrent executions under weak memory models (e.g., C11 memory model).

Weak Memory Models

Real hardware does not provide sequential consistency. Memory operations may be reordered.

Common models:

SC (Sequential Consistency): ideal, interleaved semantics
TSO (Total Store Order): x86 model, store buffers can delay writes
ARM/POWER relaxed: extensive reordering, requires barriers
C11/C++11 memory model: language-level model with atomics and ordering constraints

Model checkers for weak memory explore behaviors allowed by the memory model, not just SC interleavings.

Dynamic Analysis: ThreadSanitizer

ThreadSanitizer (TSan) is a dynamic data race detector, built into Clang/GCC and used at Google on virtually all C++ code.

How It Works

TSan uses a vector clock algorithm to detect data races:

Thread 1: write x  at time T1=(2,0)
Thread 2: read  x  at time T2=(0,1)

No happens-before relationship between T1 and T2 → DATA RACE

Key concepts:

Each thread maintains a vector clock: a vector of logical timestamps
Synchronization (mutex lock/unlock, thread join) updates vector clocks
On each memory access, TSan checks if there is an unordered concurrent access

Capabilities

Detects data races, use-after-free (in concurrent context)
Reports the two conflicting accesses with stack traces
5-15x runtime overhead, 5-10x memory overhead
Supports C, C++, Go, and Rust

Limitations

Dynamic: only finds races on executed paths (not exhaustive)
Cannot prove absence of races (unlike CSL or model checking)
Some overhead makes it unsuitable for production (typically used in testing/CI)

Complementary tools: Helgrind (Valgrind-based), Intel Inspector, Go race detector.

Comparison of Approaches

Method	Soundness	Completeness	Automation	Scalability
CSL (Iris)	Sound	N/A (proof-based)	Manual proofs	Moderate
Rely-Guarantee	Sound	N/A (proof-based)	Manual proofs	Moderate
Owicki-Gries	Sound	N/A (proof-based)	Manual proofs	Low
SPIN	Sound	Complete (finite)	Automatic	Moderate
CBMC (bounded)	Sound	Bounded	Automatic	Low-moderate
ThreadSanitizer	Unsound	Incomplete	Automatic	High

Key Takeaways

Safety (no bad states) and liveness (eventually good states) are the two pillars of concurrent correctness
Linearizability provides a clean correctness criterion for concurrent data structures
CSL's separating conjunction naturally models thread-local and shared state
Rely-guarantee handles fine-grained sharing without locks
Model checking concurrent programs faces combinatorial explosion but partial order reduction helps significantly
ThreadSanitizer provides practical, scalable race detection for production codebases but cannot provide formal guarantees
Weak memory models add another layer of complexity that formal methods must address