Formal Semantics

Overview

Comparison of Operational, Denotational, and Axiomatic Semantics

Formal semantics assigns rigorous mathematical meaning to programs, enabling reasoning about correctness, equivalence, and behavior. Three major frameworks exist: operational, denotational, and axiomatic semantics.

Operational Semantics

Operational semantics defines meaning by describing how programs execute on an abstract machine.

Big-Step (Natural) Semantics

Relates an expression directly to its final value. Judgments take the form e => v (expression e evaluates to value v).

          e1 => n1    e2 => n2    n1 + n2 = n
          ─────────────────────────────────────
                    e1 + e2 => n

Advantages: clean, compositional rules; natural for interpreters
Limitations: cannot express non-termination (no derivation tree exists); cannot observe intermediate states
Co-big-step semantics extends the framework to handle divergence via coinductive derivations

Small-Step (Structural) Semantics

Defines a single-step reduction relation e -> e'. Execution is the reflexive-transitive closure ->*.

          e1 -> e1'                    v1 value, e2 -> e2'
          ──────────────               ──────────────────────
          e1 + e2 -> e1' + e2         v1 + e2 -> v1 + e2'

Evaluation contexts: factor out congruence rules. Define E ::= [] | E + e | v + E, then one rule: E[r] -> E[r'] when r -> r'
Advantages: models intermediate states, non-termination is an infinite reduction sequence, composable with concurrency
Relates to abstract machines: reflexive-transitive closure of small-step gives the same result as big-step (when both are defined)

Relating Big-Step and Small-Step

Theorem (equivalence): e => v if and only if e ->* v and v is a value. Proved by induction on derivation/reduction length.

Denotational Semantics

Maps programs to mathematical objects (denotations) in a semantic domain, typically using domain theory.

Semantic Domains

Flat domains: discrete values lifted with a bottom element. Z_bot = {bot, ..., -1, 0, 1, ...}
Function domains: continuous functions between domains. D -> E is the space of Scott-continuous functions
Recursive domains: solved via domain equations. D ≅ D -> D requires techniques from domain theory (inverse limit construction)

Semantic Functions

A compositional mapping [[·]] from syntax to domains:

[[n]]         = n
[[e1 + e2]]   = [[e1]] +_Z [[e2]]
[[if e1 then e2 else e3]] = cond([[e1]], [[e2]], [[e3]])

Compositionality: the meaning of a compound expression depends only on the meanings of its subexpressions.

Fixed Points and Recursion

Recursive definitions are interpreted as least fixed points:

[[while b do c]] = fix(F)
  where F(f) = lambda sigma. if [[b]]sigma then f([[c]]sigma) else sigma

Kleene's fixed-point theorem: every Scott-continuous function on a pointed CPO has a least fixed point, computed as sup{F^n(bot) | n >= 0}
Scott continuity: F preserves directed suprema. Required for the fixed-point theorem to hold

Adequacy and Full Abstraction

Adequacy: if [[e]] != bot, then e terminates operationally
Full abstraction: [[e1]] = [[e2]] iff e1 and e2 are observationally equivalent. PCF's standard denotational semantics is adequate but not fully abstract (parallel-or problem)

Axiomatic Semantics

Specifies program behavior through logical assertions, abstracting away execution details.

Hoare Logic

Judgments (Hoare triples): {P} c {Q} means if precondition P holds before c, and c terminates, then postcondition Q holds after.

Core rules:

Assignment:    {Q[e/x]} x := e {Q}

Sequence:      {P} c1 {R}    {R} c2 {Q}
               ──────────────────────────
                     {P} c1; c2 {Q}

Conditional:   {P ∧ b} c1 {Q}    {P ∧ ¬b} c2 {Q}
               ─────────────────────────────────────
                   {P} if b then c1 else c2 {Q}

While:         {P ∧ b} c {P}
               ─────────────────────────
               {P} while b do c {P ∧ ¬b}
               (P is the loop invariant)

Consequence:   P' => P    {P} c {Q}    Q => Q'
               ──────────────────────────────────
                        {P'} c {Q'}

Partial correctness: {P} c {Q} assumes termination
Total correctness: [P] c [Q] additionally requires termination; the while rule needs a variant (decreasing measure)

Weakest Precondition (wp)

Dijkstra's predicate transformer semantics. wp(c, Q) is the weakest precondition such that executing c establishes Q.

wp(x := e, Q)              = Q[e/x]
wp(c1; c2, Q)              = wp(c1, wp(c2, Q))
wp(if b then c1 else c2, Q) = (b => wp(c1,Q)) ∧ (¬b => wp(c2,Q))
wp(while b do c, Q)         = requires finding invariant

Properties: wp(c, Q) is the strongest condition P such that {P} c {Q} holds. It is monotone, conjunctive, and (for deterministic programs) disjunctive.

Separation Logic

Extends Hoare logic for heap-manipulating programs:

Separating conjunction P * Q: heap splits into disjoint parts satisfying P and Q
Frame rule: {P} c {Q} implies {P * R} c {Q * R} when c does not modify R's footprint
Enables local reasoning about mutable state and concurrency (concurrent separation logic)

Definitional Interpreters

Use a metalanguage (often a functional language) as the semantic framework.

Reynolds' Approach

Write an interpreter in a well-understood language; the semantics of the object language is defined by the interpreter plus the semantics of the metalanguage.

eval (Num n)     env = n
eval (Var x)     env = lookup env x
eval (Add e1 e2) env = eval e1 env + eval e2 env
eval (Lam x e)   env = Closure x e env
eval (App e1 e2) env = let Closure x body cenv = eval e1 env
                           arg = eval e2 env
                       in eval body (extend cenv x arg)

Key Considerations

If the metalanguage is call-by-value, the interpreter naturally defines call-by-value semantics. Using thunks in the interpreter shifts to call-by-name/need
Defunctionalization of the interpreter yields an abstract machine (connection to CEK, SECD)
Monadic interpreters: parameterize over effects (state, exceptions, nondeterminism) by abstracting the interpreter into a monad

Mechanized Semantics

Modern practice uses proof assistants (Coq, Isabelle, Lean) to define semantics and prove properties:

CompCert: verified C compiler with formal semantics in Coq
CakeML: verified ML compiler with big-step semantics
K Framework: executable semantics from rewriting rules

Connections Between Frameworks

Property	Operational	Denotational	Axiomatic
Focus	How programs execute	What programs compute	What programs guarantee
Abstraction	Low (step-by-step)	High (mathematical)	Specification-level
Compositionality	Structural rules	By construction	Via proof rules
Concurrency	Natural extension	Challenging	Separation logic

The frameworks are complementary. Soundness results connect them: Hoare logic is sound with respect to operational semantics; denotational semantics is adequate for operational semantics. A complete understanding typically uses all three.