
Program Synthesis

Foundations

Program synthesis is the automatic generation of programs from high-level specifications. The specification may be a logical formula, input-output examples, natural language, or a combination. Synthesis searches for a program P in some search space such that P satisfies the specification. The challenge is that the search space is typically vast (all programs up to some size) and the specification may be incomplete or ambiguous.

Formal Problem Statement

Given a background theory T, a specification φ(x, P(x)) relating inputs to outputs, and a set of candidate programs L (the search space, often defined by a grammar), find P ∈ L such that:

∀x. T ⊨ φ(x, P(x))

This is a second-order problem (existential quantification over programs, universal over inputs) — inherently harder than standard constraint solving.

Syntax-Guided Synthesis (SyGuS)

The SyGuS Framework

SyGuS (Alur et al., 2013) standardizes the synthesis problem: given a background theory, a semantic specification (a logical formula), and a syntactic template (a context-free grammar defining the search space), find a term derivable from the grammar that satisfies the specification.

The grammar restricts the search space, making synthesis tractable. Example grammar for linear arithmetic expressions over x, y:

E → x | y | 0 | 1 | E + E | E - E | ite(B, E, E)
B → E ≤ E | E = E | B ∧ B | ¬B

CEGIS (Counterexample-Guided Inductive Synthesis)

The dominant solving approach iterates between two phases:

  1. Synthesis phase: Find a candidate program P that satisfies the specification on a finite set of examples {(x₁,y₁), ..., (xₖ,yₖ)}.
  2. Verification phase: Check ∀x. φ(x, P(x)) using an SMT solver. If valid, return P. If not, the solver provides a counterexample x* where P fails. Add x* to the example set and repeat.

CEGIS typically converges quickly because counterexamples efficiently prune the search space. The verification query is a standard SMT problem; the synthesis phase is the bottleneck.
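The two-phase loop above can be sketched in Python. This is a toy, with exhaustive checking over a bounded integer domain standing in for the SMT verification query; the spec `phi`, the linear candidate family, and all ranges are invented for illustration:

```python
from itertools import product

# Toy CEGIS: synthesize coefficients (a, b) so that P(x) = a*x + b
# satisfies the spec phi on every x in a bounded domain. The exhaustive
# check below stands in for an SMT verification query.
DOMAIN = range(-20, 21)

def phi(x, y):
    # Specification: output must be twice the input plus three.
    return y == 2 * x + 3

def synthesize(examples):
    # Synthesis phase: find a candidate consistent with the examples so far.
    for a, b in product(range(-5, 6), repeat=2):
        if all(a * x + b == y for x, y in examples):
            return a, b
    return None

def verify(a, b):
    # Verification phase: return a counterexample input, or None if valid.
    for x in DOMAIN:
        if not phi(x, a * x + b):
            return x
    return None

def cegis():
    examples = [(0, 3)]          # seed example consistent with phi
    while True:
        cand = synthesize(examples)
        if cand is None:
            return None          # candidate space exhausted
        cex = verify(*cand)
        if cex is None:
            return cand          # verified on the whole domain
        # phi is functional here, so the expected output is known
        examples.append((cex, 2 * cex + 3))

print(cegis())   # (2, 3)
```

Note how the first (wrong) candidate triggers a counterexample that immediately pins down the coefficients; this is the pruning behavior the text describes.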

Search Strategies

Enumerate candidate programs in order of increasing size (bottom-up). Evaluate each candidate against the specification. Optimizations:

  • Observational equivalence: Two programs are equivalent on the current example set if they produce identical outputs. Keep only one representative from each equivalence class — dramatic pruning.
  • Divide and conquer: Decompose the synthesis problem into subproblems, synthesize components independently, compose.
  • Probabilistic guidance: Use learned probabilities to prioritize likely-correct candidates.
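A minimal bottom-up enumerator with observational-equivalence pruning, over a fragment of the grammar from the SyGuS section (ite and the Boolean nonterminal omitted; the I/O examples and size metric are invented for illustration):

```python
from itertools import product

# Bottom-up enumeration over E -> x | y | 0 | 1 | E + E | E - E,
# pruned by observational equivalence on the example inputs.
EXAMPLES = [((1, 2), 3), ((4, 1), 5), ((0, 7), 7)]   # spec as I/O pairs: x + y
INPUTS = [xy for xy, _ in EXAMPLES]

def signature(f):
    # A program's behavior on the current examples: its equivalence class.
    return tuple(f(x, y) for x, y in INPUTS)

def enumerate_programs(max_size):
    seen = {}                      # signature -> (expr, fn): one rep per class
    by_size = {1: []}
    for expr, f in [("x", lambda x, y: x), ("y", lambda x, y: y),
                    ("0", lambda x, y: 0), ("1", lambda x, y: 1)]:
        sig = signature(f)
        if sig not in seen:
            seen[sig] = (expr, f)
            by_size[1].append((expr, f))
    for size in range(2, max_size + 1):        # size counts leaves, for brevity
        by_size[size] = []
        for ls in range(1, size):              # split size among subterms
            for (le, lf), (re_, rf) in product(by_size[ls], by_size[size - ls]):
                for op, g in [("+", lambda a, b: a + b), ("-", lambda a, b: a - b)]:
                    h = lambda x, y, lf=lf, rf=rf, g=g: g(lf(x, y), rf(x, y))
                    sig = signature(h)
                    if sig not in seen:        # keep one representative per class
                        seen[sig] = (f"({le} {op} {re_})", h)
                        by_size[size].append(seen[sig])
    return seen

def solve():
    target = tuple(out for _, out in EXAMPLES)
    return enumerate_programs(3).get(target, ("no program found",))[0]

print(solve())   # (x + y)
```

Observe that (x - x), (0 + 0), etc. collapse into the class already represented by "0" and are never stored, which is exactly the "dramatic pruning" claimed above.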

EUSolver (Alur et al., 2017): Enumerative synthesizer that handles conditional expressions by separately synthesizing the conditions (decision tree learning) and the leaf expressions (term enumeration).

Constraint-Based Synthesis

Encode the synthesis problem as a constraint satisfaction problem. Represent the unknown program as a template with unknown coefficients:

P(x, y) = c₁·x + c₂·y + c₃

The synthesis problem becomes: find c₁, c₂, c₃ such that ∀x,y. φ(x, y, c₁x + c₂y + c₃). This is a quantified SMT problem, solvable by CEGIS or specialized quantifier elimination.

Advantages: Works well when the program structure is known (linear expressions, polynomial expressions, bit-manipulation patterns). Limitations: Requires a fixed template; cannot discover new program structures.
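A sketch of template-based search for the coefficients, with an exhaustive check over a bounded grid standing in for the quantified SMT query; the spec and all ranges are invented for illustration:

```python
from itertools import product

# Template-based synthesis: fix the shape P(x, y) = c1*x + c2*y + c3 and
# search for integer coefficients satisfying the spec on a bounded grid.
GRID = list(product(range(-10, 11), repeat=2))

def phi(x, y, out):
    # Spec: P must exceed x + y, but by at most 3, for all x and y.
    return x + y < out <= x + y + 3

def find_coefficients():
    for c1, c2, c3 in product(range(-3, 4), repeat=3):
        if all(phi(x, y, c1 * x + c2 * y + c3) for x, y in GRID):
            return c1, c2, c3
    return None   # the template cannot express any solution

print(find_coefficients())   # (1, 1, 1)
```

The `None` branch illustrates the stated limitation: if no coefficient assignment works, the fixed template simply fails, and no new program structure can be discovered.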

Deductive Synthesis

Derive programs from proofs of their specifications. If you have a constructive proof that ∀x. ∃y. φ(x,y), the proof contains an algorithm for computing y from x. Proof-theoretic synthesis extracts programs from formal proofs.

Leon/Stainless (EPFL): Synthesizes Scala functions from postcondition specifications. Uses a deductive approach combined with constraint solving. The specification is a postcondition; the synthesizer searches for a function body satisfying it, using algebraic reasoning and recursive decomposition.

Component-Based Synthesis

Given a library of components (functions/operators), synthesize a program by composing them. Each component has a known semantics. The synthesis problem: find a composition (wiring) of components that satisfies the specification.

Jha et al. (2010): Formulate as an SMT problem. Introduce variables for the wiring (which component's output connects to which component's input). The correctness constraint and wiring constraints form an SMT query.

Prospector, InSynth: Component-based synthesis for API usage. Given a target type, find a sequence of API calls that constructs a value of that type. Uses type-directed search with heuristic prioritization.
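The wiring idea can be sketched with explicit enumeration in place of the SMT encoding of Jha et al.; the component library and I/O examples below are invented for illustration:

```python
from itertools import product

# Component-based synthesis: search over wirings of a small component
# library to build a straight-line program matching I/O examples.
COMPONENTS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}
EXAMPLES = [((2, 3), 10), ((1, 4), 5), ((0, 5), 0)]   # target: (a + b) * a

def run(lines, inputs):
    # Each line: (component name, index of arg1, index of arg2), where
    # indices refer to the inputs followed by earlier line results.
    vals = list(inputs)
    for name, i, j in lines:
        vals.append(COMPONENTS[name](vals[i], vals[j]))
    return vals[-1]

def synthesize(max_lines=2):
    n_in = 2
    for n in range(1, max_lines + 1):
        # Enumerate every wiring: each line picks a component and two
        # previously available values.
        slots = []
        for k in range(n):
            avail = n_in + k
            slots.append(list(product(COMPONENTS, range(avail), range(avail))))
        for lines in product(*slots):
            if all(run(lines, xs) == y for xs, y in EXAMPLES):
                return lines
    return None

print(synthesize())
```

In the SMT formulation the `slots`/`product` loop disappears: the wiring indices become integer variables and the solver finds a satisfying assignment directly.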

Program Sketching

Sketch (Solar-Lezama, 2008)

The programmer provides a sketch — a partial program with holes (unknown expressions or control flow). The synthesizer fills the holes to satisfy an assertion-based specification.

Example sketch:

int abs(int x) {
    if (x < ??) return ??*x + ??;
    else return x;
}

The synthesizer determines ?? = 0, ?? = -1, ?? = 0, yielding the standard absolute value function.

Solving: Sketches are compiled to Boolean SAT problems via bounded bit-width encoding. CEGIS iterates between finding candidate hole values and verifying against assertions. Bounded integers (e.g., 5-bit) keep the SAT problem tractable.
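A Python analogue of solving the abs() sketch above: enumerate hole values over a small range (mimicking the bounded bit-width encoding) and keep an assignment that passes the assertion on every input. The hole ranges are invented for illustration:

```python
from itertools import product

def sketch(x, h1, h2, h3):
    # if (x < h1) return h2*x + h3; else return x;
    return h2 * x + h3 if x < h1 else x

def solve_sketch():
    inputs = range(-16, 16)            # 5-bit signed domain, as in the text
    for h1, h2, h3 in product(range(-2, 3), repeat=3):
        # The assertion: sketch must agree with abs on every bounded input.
        if all(sketch(x, h1, h2, h3) == abs(x) for x in inputs):
            return h1, h2, h3
    return None

print(solve_sketch())   # (0, -1, 0): if x < 0 return -x
```

A real Sketch solver compiles this search to SAT and runs CEGIS rather than enumerating, but the contract is the same: holes in, verified constants out.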

Impact: Sketching bridges the gap between manual programming and full synthesis. The programmer provides the algorithmic structure; the synthesizer handles low-level details (constants, boundary conditions, bit-manipulation tricks).

Programming by Examples (PBE)

FlashFill (Gulwani, 2011)

FlashFill synthesizes string transformations from input-output examples, deployed in Microsoft Excel. Given examples like:

  • "John Smith" → "J. Smith"
  • "Jane Doe" → "J. Doe"

FlashFill infers the transformation: concatenate the first character of the input, the literal string ". ", and the last word.

Technical approach: A domain-specific language (DSL) for string transformations (concatenation, substring, regular expression matches, conditionals). A version space algebra compactly represents the set of all consistent programs. The version space is intersected across examples, and a ranking function selects the most likely program.
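A drastically simplified version of this search, with a four-atom string DSL and plain enumeration of all consistent programs standing in for the version space algebra (the atom set is invented for illustration):

```python
from itertools import product

# Simplified FlashFill-style PBE: programs are concatenations of atomic
# extractors; we collect every program consistent with all examples.
EXAMPLES = [("John Smith", "J. Smith"), ("Jane Doe", "J. Doe")]

ATOMS = {
    "first_char": lambda s: s[0],
    "last_word":  lambda s: s.split()[-1],
    "first_word": lambda s: s.split()[0],
    "const_dot":  lambda s: ". ",       # a constant-string atom
}

def run(prog, s):
    return "".join(ATOMS[a](s) for a in prog)

def consistent_programs(max_atoms=3):
    found = []
    for n in range(1, max_atoms + 1):
        for prog in product(ATOMS, repeat=n):
            if all(run(prog, inp) == out for inp, out in EXAMPLES):
                found.append(prog)
    return found

print(consistent_programs())   # [('first_char', 'const_dot', 'last_word')]
```

With one example the set of consistent programs is large; intersecting across the second example shrinks it, here to a single program, which is why FlashFill asks for more examples when the ranking is ambiguous.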

FlashMeta (Polozov and Gulwani, 2015): Generalizes PBE to arbitrary DSLs via witness functions that decompose specifications top-down through DSL operators.

Prose Framework

Microsoft's PROSE (PROgram Synthesis using Examples) generalizes FlashFill to a framework for building PBE synthesizers. The developer defines a DSL and witness functions; PROSE handles the search, ranking, and user interaction. Powers features in Excel, PowerBI, and other Microsoft products.

Program Repair

Automatic Program Repair

Given a buggy program and a specification (usually a test suite), modify the program minimally to satisfy the specification.

GenProg (Le Goues et al., 2012): Uses genetic programming to evolve patches. Mutations: delete a statement, insert a statement copied from elsewhere in the program, swap statements. Fitness function: number of passing tests. Found patches for real bugs in large C programs but raised concerns about patch quality — many patches overfit the test suite (pass tests but do not fix the underlying bug).
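A toy version of this mutation-based search, with random single-edit mutations in place of the full genetic algorithm; the buggy program, test suite, and mutation ranges are invented for illustration:

```python
import random

# GenProg-style repair sketch: the "program" is a list of statement
# strings, mutations delete/copy/swap statements, and the test suite
# is the fitness function.
BUGGY = [
    "r = 0",
    "for v in xs:",
    "    r += v",
    "r = r - xs[0]",   # the bug: spuriously drops the first element
    "return r",
]
TESTS = [([1, 2, 3], 6), ([5], 5), ([2, 2], 4)]

def compile_stmts(stmts):
    src = "def f(xs):\n" + "\n".join("    " + s for s in stmts)
    env = {}
    try:
        exec(src, env)
        return env["f"]
    except SyntaxError:            # mutations often break indentation
        return None

def fitness(stmts):
    f = compile_stmts(stmts)
    if f is None:
        return 0
    score = 0
    for xs, want in TESTS:
        try:
            if f(list(xs)) == want:
                score += 1
        except Exception:          # runtime crashes count as failures
            pass
    return score

def mutate(stmts, rng):
    s = list(stmts)
    op = rng.choice(["delete", "insert", "swap"])
    i = rng.randrange(len(s))
    if op == "delete" and len(s) > 1:
        del s[i]
    elif op == "insert":
        s.insert(i, rng.choice(stmts))   # copy an existing statement
    else:
        j = rng.randrange(len(s))
        s[i], s[j] = s[j], s[i]
    return s

def repair(seed=0, tries=2000):
    rng = random.Random(seed)
    for _ in range(tries):
        cand = mutate(BUGGY, rng)
        if fitness(cand) == len(TESTS):
            return cand
    return None

print(repair())
```

The search may return the clean fix (delete the buggy line) or a test-passing oddity such as dead code moved after the return, which is the overfitting problem discussed below.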

SemFix/Angelix: Constraint-based repair. Use symbolic execution to extract a repair constraint (what value should the buggy expression produce?). Synthesize a new expression satisfying the constraint via component-based synthesis.

Prophet/HDRepair: Use learned models of correct patches (from historical bug-fix data) to guide search. Rank candidate patches by likelihood of being correct.

APR quality concerns: Overfitting to the test specification is the central challenge. A weak test suite admits many "patches" that pass all tests but are semantically wrong. Formal specification or independent test suites are needed to validate patches.

Neural and LLM-Guided Synthesis

Neural Program Synthesis

DeepCoder (Balog et al., 2017): Predict which DSL components are likely needed from input-output examples, then use the predictions to prune enumerative search. Neural network guides combinatorial search.

RobustFill (Devlin et al., 2017): Sequence-to-sequence model directly generates string transformation programs from I/O examples. End-to-end neural approach, competitive with FlashFill on many benchmarks.

Neurosymbolic synthesis: Combine neural pattern recognition with symbolic search. The neural component proposes candidates or guides search; the symbolic component verifies correctness. This addresses the core weakness of neural approaches (no correctness guarantees) while leveraging their ability to capture patterns.
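The guide-then-search pattern can be sketched as follows; a hand-coded heuristic scorer stands in for DeepCoder's trained network, and the DSL and examples are invented for illustration:

```python
from itertools import product

# Neurosymbolic search sketch: a "predictor" scores which DSL operations
# look relevant to the I/O examples; enumeration tries high-scoring
# operations first, and the symbolic search still verifies candidates.
OPS = {
    "double":  lambda v: [x * 2 for x in v],
    "sort":    sorted,
    "reverse": lambda v: list(reversed(v)),
    "negate":  lambda v: [-x for x in v],
}
EXAMPLES = [([3, 1, 2], [2, 4, 6]), ([5, 4], [8, 10])]

def run(prog, v):
    for op in prog:
        v = OPS[op](v)
    return v

def predict_scores(examples):
    # Stub predictor: cheap heuristics over the I/O pairs. A real model
    # would map example features to per-op probabilities.
    scores = dict.fromkeys(OPS, 0.1)
    for inp, out in examples:
        if sorted(inp) != inp and sorted(out) == out:
            scores["sort"] += 0.5
        if all(y % 2 == 0 for y in out):
            scores["double"] += 0.5
    return scores

def guided_search(examples, max_len=3):
    scores = predict_scores(examples)
    ranked = sorted(OPS, key=lambda op: -scores[op])   # likely ops first
    tried = 0
    for n in range(1, max_len + 1):
        for prog in product(ranked, repeat=n):
            tried += 1
            if all(run(prog, inp) == out for inp, out in examples):
                return prog, tried
    return None, tried

prog, tried = guided_search(EXAMPLES)
print(prog, tried)   # ('double', 'sort') found after 6 candidates
```

With an unguided ordering the irrelevant `reverse`/`negate` candidates would be interleaved into the search; the predictor pushes them to the back without sacrificing the verifier's correctness check.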

LLM-Guided Synthesis

Large language models (Codex/GPT-4, Claude, Gemini) generate code from natural language specifications. While not classical synthesis (no formal guarantees), LLMs have transformed practical program generation.

AlphaCode/AlphaCode 2 (DeepMind): Generate massive numbers of candidate programs for competitive programming problems, then filter and cluster. Achieves competitive performance by compensating for uncertain per-candidate quality with sheer quantity.

Verified LLM synthesis: Use LLMs to propose candidates, then verify with formal methods. If verification fails, feed the counterexample back to the LLM. This CEGIS-like loop combines LLM generality with formal correctness. Tools like Copilot + Dafny explore this direction.
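The loop's shape can be sketched as below. The `mock_llm` is a stub returning hand-written candidates (a real system would call a model API with the spec and accumulated counterexamples in the prompt), and bounded exhaustive checking stands in for a formal verifier:

```python
# Verify-and-refine loop around an LLM proposer (proposer mocked here).
def mock_llm(spec_text, feedback):
    # Plausible-but-wrong first attempt, corrected after feedback.
    if not feedback:
        return lambda x: x          # forgets the negative case
    return lambda x: x if x >= 0 else -x

def verify(candidate):
    # Bounded exhaustive check standing in for a formal verifier (SMT, Dafny).
    for x in range(-100, 101):
        if candidate(x) != abs(x):
            return x                # counterexample
    return None

def synthesize(spec_text):
    feedback = []
    for _ in range(5):              # cap the number of refinement rounds
        cand = mock_llm(spec_text, feedback)
        cex = verify(cand)
        if cex is None:
            return cand             # verified
        feedback.append(f"fails on input {cex}: got {cand(cex)}, want {abs(cex)}")
    return None

f = synthesize("return the absolute value of x")
print(f(-7) if f else None)   # 7
```

Structurally this is CEGIS with the LLM in the synthesis phase: the counterexample message plays the role of the new example added to the constraint set.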

Limitations: LLMs produce plausible but potentially incorrect code. For safety-critical applications, LLM-generated code must be formally verified. The gap between "looks right" and "is right" remains significant.

Synthesis from Demonstrations and Traces

Programming by Demonstration (PBD)

Record user actions (clicks, edits, transformations) and infer a generalizable program. Application: robotic process automation (RPA), web scraping macros, spreadsheet transformations.

Trace-Based Synthesis

Given execution traces (not just I/O but intermediate states), synthesize a program that produces those traces. More information than I/O examples; helps disambiguate programs.

Reactive Synthesis

Synthesize reactive systems (controllers, protocols) from temporal logic specifications (LTL, CTL). The problem: given an LTL formula φ over input and output signals, construct a finite-state transducer that satisfies φ for all possible input sequences.

Reactive synthesis is 2EXPTIME-complete for LTL. Practical approaches: bounded synthesis (search for transducers up to k states), GR(1) fragment (generalized reactivity, polynomial synthesis for an expressive fragment), and compositional approaches.
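Bounded synthesis can be illustrated by brute force at toy scale: enumerate Mealy machines with up to k states over binary signals and check the specification on every input sequence up to a length bound (standing in for a full game-theoretic LTL solution). The unit-delay spec below is invented for illustration:

```python
from itertools import product

# Bounded synthesis sketch. Spec: at every step t >= 1,
# output[t] == input[t-1] (a unit delay); output[0] is unconstrained.
def satisfies(delta, lam, max_len=5):
    for n in range(1, max_len + 1):
        for seq in product((0, 1), repeat=n):
            s, outs = 0, []
            for i in seq:
                outs.append(lam[(s, i)])   # Mealy output on (state, input)
                s = delta[(s, i)]          # transition
            if any(outs[t] != seq[t - 1] for t in range(1, n)):
                return False
    return True

def bounded_synthesis(n_states=2):
    keys = [(s, i) for s in range(n_states) for i in (0, 1)]
    for d_vals in product(range(n_states), repeat=len(keys)):
        delta = dict(zip(keys, d_vals))
        for l_vals in product((0, 1), repeat=len(keys)):
            lam = dict(zip(keys, l_vals))
            if satisfies(delta, lam):
                return delta, lam
    return None

print(bounded_synthesis(1) is None)   # True: one state cannot remember the input
print(bounded_synthesis() is not None)   # True: two states suffice
```

The failure at one state and success at two is the bounded-synthesis pattern: increase k until a transducer realizes the specification.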

Applications and Impact

  • End-user programming: FlashFill, PROSE, and PBE tools empower non-programmers to automate data transformations.
  • Superoptimization: Synthesize optimal instruction sequences (STOKE, Souper). Search for the shortest or fastest sequence of machine instructions implementing a given function.
  • Invariant inference: Synthesize loop invariants for verification (ICE learning, data-driven invariant inference).
  • API migration: Synthesize code transformations to migrate between API versions.
  • Concurrency: Synthesize synchronization (locks, barriers) to make concurrent programs correct.
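The superoptimization entry above can be illustrated by a brute-force, shortest-first search over a tiny invented ISA (a single accumulator plus the original input), testing candidates on samples and then on a wider range in place of an SMT equivalence check:

```python
from itertools import product

# Toy superoptimizer in the STOKE/Souper spirit: find the shortest
# straight-line sequence computing the target function x * 9.
OPS = {
    "shl1":  lambda acc, x: acc << 1,   # acc *= 2
    "add_x": lambda acc, x: acc + x,    # acc += original input
    "inc":   lambda acc, x: acc + 1,    # acc += 1
}

def target(x):
    return x * 9

def run(seq, x):
    acc = x
    for op in seq:
        acc = OPS[op](acc, x)
    return acc

def superoptimize(max_len=5, samples=(0, 1, 2, 7)):
    for n in range(1, max_len + 1):               # shortest first
        for seq in product(OPS, repeat=n):
            if all(run(seq, x) == target(x) for x in samples):
                # wider verification pass (stands in for an SMT check)
                if all(run(seq, x) == target(x) for x in range(-50, 51)):
                    return seq
    return None

print(superoptimize())   # x*9 as ((x << 3) + x): three shifts, one add
```

The cheap sample test filters almost all candidates before the expensive check runs, the same test-then-verify structure real superoptimizers use.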