Simulation Optimization
Overview
Simulation optimization finds the best input parameters for a system that can only be evaluated through simulation. Unlike classical optimization where the objective function has a closed form, here each function evaluation requires running a stochastic simulation — expensive, noisy, and often black-box.
┌──────────────┐   parameters    ┌────────────┐  noisy output
│ Optimization │────────────────►│ Simulation │─────────────► statistics
│  Algorithm   │◄────────────────│   Model    │
└──────────────┘   performance   └────────────┘
Ranking and Selection
Given k alternative system configurations, identify the best (highest or lowest mean performance) with a statistical guarantee.
Indifference-Zone Procedures
Guarantee: select the best system with probability >= 1 - alpha, whenever the best is at least delta better than the second best.
Procedure (Rinott's two-stage):
- Take n_0 replications of each system
- Compute sample means and variances
- Calculate total replications needed per system: n_i = max(n_0, ceil((h·S_i / delta)²))
- Take additional replications as needed
- Select system with best sample mean
// Select the system with the smallest mean from simulation results
PROCEDURE RANKING_AND_SELECTION(means, std_errs, delta) → index
    // Simple: return index of minimum mean
    // In practice: apply Rinott or KN procedure for guarantees
    best_index ← 0
    FOR i ← 1 TO LENGTH(means) - 1 DO
        IF means[i] < means[best_index] THEN
            best_index ← i
    RETURN best_index
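The second-stage sample-size rule from Rinott's procedure can be sketched in Python. The constant h depends on n_0, k, and alpha and comes from published tables or numerical integration; it is taken as given here:

```python
import math
from statistics import variance

def rinott_second_stage(first_stage, h, delta):
    """first_stage: list of first-stage observation lists, one per system.
    h: Rinott's constant (from tables, given n_0, k, and alpha).
    delta: indifference-zone parameter.
    Returns the total replications n_i required for each system."""
    n0 = len(first_stage[0])
    totals = []
    for obs in first_stage:
        s_i = math.sqrt(variance(obs))        # first-stage sample std dev S_i
        totals.append(max(n0, math.ceil((h * s_i / delta) ** 2)))
    return totals
```

After taking the additional n_i - n_0 replications, select the system with the best overall sample mean.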
Procedures for Many Systems
- KN procedure (Kim-Nelson): Sequential elimination. After each batch of replications, eliminate systems that are statistically dominated.
- OCBA (Optimal Computing Budget Allocation): Allocate replications to maximize the probability of correct selection; close competitors with high variance receive the most additional sampling.
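The asymptotic OCBA allocation rule can be sketched as follows, assuming smaller-is-better (matching the selection pseudocode above); `means` and `stds` are the current sample statistics:

```python
import math

def ocba_allocation(means, stds, budget):
    """Split a replication budget across systems using the OCBA
    asymptotic allocation (best = smallest sample mean)."""
    b = min(range(len(means)), key=lambda i: means[i])
    others = [i for i in range(len(means)) if i != b]
    r = others[0]                                    # reference non-best system
    ratio = {r: 1.0}
    for i in others[1:]:
        # N_i / N_r = ((sigma_i / delta_bi) / (sigma_r / delta_br))^2
        ratio[i] = ((stds[i] / (means[i] - means[b])) /
                    (stds[r] / (means[r] - means[b]))) ** 2
    # N_b = sigma_b * sqrt(sum over i != b of (N_i / sigma_i)^2)
    ratio[b] = stds[b] * math.sqrt(sum((ratio[i] / stds[i]) ** 2 for i in others))
    total = sum(ratio.values())
    return [budget * ratio[i] / total for i in range(len(means))]
```

In a sequential implementation this rule is re-evaluated after each batch, so the allocation adapts as the sample statistics sharpen.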
Response Surface Methodology (RSM)
Approximate the simulation response with a surrogate model, then optimize the surrogate.
Workflow
- Screening — Identify influential factors (fractional factorial, Morris method)
- First-order model — Fit y = beta_0 + sum(beta_i·x_i) + epsilon using a 2^k factorial design
- Steepest ascent/descent — Follow the gradient of the fitted model
- Second-order model — Near the optimum, fit y = beta_0 + sum(beta_i·x_i) + sum(beta_ii·x_i²) + sum(beta_ij·x_i·x_j) using central composite or Box-Behnken design
- Optimize — Find the stationary point of the quadratic surface
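The last two steps, fitting the second-order model and locating its stationary point, can be sketched with ordinary least squares. This is illustrative and not tied to a specific design; `X` holds whatever design points were run:

```python
import numpy as np

def quadratic_stationary_point(X, y):
    """Fit y = b0 + b.x + x.B.x by least squares over design points X (n x d)
    and return the stationary point x* = -(1/2) B^{-1} b."""
    n, d = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(d)]                                   # linear
    cols += [X[:, i] * X[:, j] for i in range(d) for j in range(i, d)]    # quadratic
    coef, *_ = np.linalg.lstsq(np.column_stack(cols), y, rcond=None)
    b = coef[1:1 + d]
    B = np.zeros((d, d))
    k = 1 + d
    for i in range(d):
        for j in range(i, d):
            B[i, j] = coef[k] / (1.0 if i == j else 2.0)   # split cross terms
            B[j, i] = B[i, j]
            k += 1
    return np.linalg.solve(-2.0 * B, b)                    # solve b + 2Bx = 0
```

The stationary point is only a candidate optimum: check the eigenvalues of B to distinguish a minimum, maximum, or saddle before committing replications there.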
Metamodel-Based Optimization
Modern alternatives to classical RSM:
- Kriging / Gaussian Process (GP) — Interpolating surrogate with uncertainty estimates. Enables Bayesian optimization.
- Radial Basis Functions — Flexible interpolation for high-dimensional spaces.
- Neural network surrogates — For very complex response surfaces with large training budgets.
Bayesian Optimization
Use a GP surrogate with an acquisition function to balance exploration and exploitation:
- Expected Improvement (EI): E[max(f(x) - f_best, 0)]
- Upper Confidence Bound (UCB): mu(x) + kappa·sigma(x)
- Knowledge Gradient: value of information from one additional sample
Particularly effective when simulation is very expensive (minutes to hours per run) and the parameter space is moderate-dimensional (< ~20).
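EI has a closed form when the surrogate posterior at a point is Gaussian; a minimal sketch for maximization:

```python
import math

def expected_improvement(mu, sigma, f_best):
    """Closed-form EI for maximization, E[max(f(x) - f_best, 0)], when the
    surrogate posterior at x is N(mu, sigma^2)."""
    if sigma <= 0.0:
        return max(mu - f_best, 0.0)
    z = (mu - f_best) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # Phi(z)
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # phi(z)
    return (mu - f_best) * cdf + sigma * pdf
```

The next simulation run goes to the x maximizing this quantity: the mean term rewards exploitation, the sigma term rewards exploration.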
Simulation-Based Optimization
Stochastic Approximation
Robbins-Monro: iteratively update parameters along noisy gradient estimates:
theta_{n+1} = theta_n - a_n · g_hat(theta_n)
where g_hat is a noisy gradient estimate and {a_n} is a diminishing step-size sequence (a_n → 0, Σ a_n = ∞, Σ a_n² < ∞).
Stochastic Gradient Methods
- Finite differences: Estimate gradient by perturbing each parameter. Cost: 2d simulations per iteration for d parameters.
- Simultaneous Perturbation (SPSA): Perturb all parameters at once with random direction. Cost: 2 simulations per iteration regardless of d.
- Likelihood ratio / score function: Differentiate the expectation analytically through the probability distribution.
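A single SPSA iteration can be sketched as follows; in practice the gains a and c follow diminishing schedules as in Robbins-Monro, held fixed here for brevity:

```python
import random

def spsa_step(theta, loss, a, c):
    """One SPSA iteration: two evaluations of a (noisy) loss estimate the
    full d-dimensional gradient via one random simultaneous perturbation."""
    delta = [random.choice([-1.0, 1.0]) for _ in theta]   # Rademacher directions
    plus  = [t + c * d for t, d in zip(theta, delta)]
    minus = [t - c * d for t, d in zip(theta, delta)]
    diff = loss(plus) - loss(minus)
    g_hat = [diff / (2.0 * c * d) for d in delta]         # gradient estimate
    return [t - a * g for t, g in zip(theta, g_hat)]
```

The estimate is biased for any single step but unbiased in expectation over the random directions, which is what makes two simulations per iteration sufficient regardless of dimension.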
Metaheuristics
For combinatorial or non-smooth problems:
- Genetic algorithms — Population of solutions evolved through selection, crossover, mutation
- Simulated annealing — Random perturbations accepted with probability depending on temperature schedule
- Particle swarm — Solutions move through parameter space influenced by personal and global best
- Cross-entropy method — Iteratively update sampling distribution toward elite solutions
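The cross-entropy update is compact enough to sketch directly; this one-dimensional Gaussian version uses illustrative sample and elite sizes:

```python
import random
import statistics

def cross_entropy_minimize(loss, mu, sigma, n_samples=50, n_elite=10, n_iter=40):
    """Cross-entropy method for one continuous parameter: sample candidates
    from N(mu, sigma^2), keep the n_elite with the lowest loss, and refit
    the sampling distribution to those elites."""
    for _ in range(n_iter):
        samples = [random.gauss(mu, sigma) for _ in range(n_samples)]
        elites = sorted(samples, key=loss)[:n_elite]
        mu = statistics.mean(elites)
        sigma = statistics.stdev(elites) + 1e-12   # avoid collapse to zero
    return mu
```

The sampling distribution concentrates geometrically around good regions, so the method needs no gradient information, only loss evaluations.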
Reinforcement Learning for Simulation
RL treats the simulation as an environment. An agent learns a policy mapping states to actions to maximize cumulative reward.
Simulation as RL Environment
┌──────────┐    action     ┌─────────────┐  next state, reward
│ RL Agent │──────────────►│ Simulation  │────────────────────►
│ (Policy) │◄──────────────│ Environment │
└──────────┘  observation  └─────────────┘
Advantages over classical simulation optimization:
- Handles sequential decision-making naturally
- Learns adaptive policies, not just static parameter settings
- Generalizes across states (neural network function approximation)
Challenges:
- Sample efficiency — millions of simulation episodes may be needed
- Sim-to-real gap — policy trained in simulation may fail in reality
- Reward shaping — defining the right reward signal is critical
Applications
- Inventory management — RL agent learns reorder policies
- Traffic signal control — Optimize signal timing based on real-time state
- Manufacturing scheduling — Sequence jobs across machines
- Robot control — Learn locomotion or manipulation in physics simulator
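As a sketch of the inventory application, tabular Q-learning on a toy inventory simulation; the demand distribution, costs, and hyperparameters below are all illustrative:

```python
import random

def q_learning_inventory(episodes=2000, capacity=5, seed=0):
    """Tabular Q-learning on a toy inventory simulation.
    State: on-hand stock. Action: order quantity (delivered immediately).
    Demand ~ uniform{0,1,2}; revenue, holding, and ordering costs are made up."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(capacity + 1) for a in range(capacity + 1)}
    alpha, gamma, eps = 0.1, 0.95, 0.1
    for _ in range(episodes):
        s = rng.randint(0, capacity)
        for _ in range(20):                            # steps per episode
            actions = range(capacity + 1 - s)          # stock may not exceed capacity
            if rng.random() < eps:
                a = rng.choice(list(actions))          # explore
            else:
                a = max(actions, key=lambda x: Q[(s, x)])  # exploit
            demand = rng.randint(0, 2)
            sold = min(s + a, demand)
            s_next = s + a - sold
            reward = 4.0 * sold - 1.0 * s_next - 2.0 * (a > 0)
            best_next = max(Q[(s_next, x)] for x in range(capacity + 1 - s_next))
            Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])
            s = s_next
    return Q
```

The learned greedy policy (argmax over actions per state) is a reorder rule; unlike a static (s, S) parameter search, it adapts the order quantity to the observed state.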
Digital Twins
A digital twin is a live virtual replica of a physical asset, process, or system, continuously synchronized with real-world data.
Architecture
┌─────────────────────────────────────────────┐
│               Physical System               │
│  sensors ──► data stream ──► actuator cmds  │
└──────┬──────────────────────────────┬───────┘
       │ IoT / SCADA                  ▲
       ▼                              │
┌─────────────────────────────────────────────┐
│            Digital Twin Platform            │
│  ┌──────────┐  ┌──────────┐  ┌───────────┐  │
│  │   Data   │  │Simulation│  │ Analytics │  │
│  │Ingestion │─►│  Model   │─►│& Decision │  │
│  └──────────┘  └──────────┘  └───────────┘  │
│  ┌──────────┐  ┌──────────┐  ┌───────────┐  │
│  │  State   │  │ What-If  │  │ Visualiza-│  │
│  │Estimation│  │ Scenarios│  │ tion & UI │  │
│  └──────────┘  └──────────┘  └───────────┘  │
└─────────────────────────────────────────────┘
Synchronization
The twin must reflect current physical state:
- State estimation — Kalman filters, particle filters, or data assimilation techniques fuse sensor data with model predictions
- Parameter updating — Online calibration adjusts model parameters as the physical system drifts
- Update frequency — Ranges from real-time (milliseconds) for control to periodic (hourly/daily) for strategic planning
Predictive Maintenance
A primary digital twin application. The twin tracks degradation state and predicts remaining useful life (RUL):
- Ingest sensor data (vibration, temperature, current)
- Update degradation model (physics-based, data-driven, or hybrid)
- Simulate forward to predict failure time distribution
- Optimize maintenance schedule: minimize cost = f(downtime, repair, spare parts)
// Simple degradation model with digital twin update
STRUCTURE DegradationTwin
    degradation: real        // current estimated degradation level
    drift_rate: real         // degradation rate parameter
    threshold: real          // failure threshold
    process_noise: real
    measurement_noise: real
    estimation_var: real     // Kalman filter variance

// Kalman filter update with new sensor reading
PROCEDURE UPDATE(twin, measurement, dt)
    // Predict
    predicted ← twin.degradation + twin.drift_rate * dt
    predicted_var ← twin.estimation_var + twin.process_noise * dt
    // Update
    kalman_gain ← predicted_var / (predicted_var + twin.measurement_noise)
    twin.degradation ← predicted + kalman_gain * (measurement - predicted)
    twin.estimation_var ← (1.0 - kalman_gain) * predicted_var

// Estimate remaining useful life
PROCEDURE PREDICT_RUL(twin) → real
    remaining ← twin.threshold - twin.degradation
    IF remaining ≤ 0.0 THEN RETURN 0.0
    RETURN remaining / twin.drift_rate
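A runnable Python translation of the degradation-twin pseudocode; the default noise parameters and initial variance are illustrative:

```python
class DegradationTwin:
    """Scalar Kalman-filter digital twin for a linearly drifting
    degradation signal."""

    def __init__(self, drift_rate, threshold,
                 process_noise=0.01, measurement_noise=0.1):
        self.degradation = 0.0
        self.drift_rate = drift_rate
        self.threshold = threshold
        self.process_noise = process_noise
        self.measurement_noise = measurement_noise
        self.estimation_var = 1.0          # initial estimate uncertainty

    def update(self, measurement, dt):
        # Predict forward along the drift, then correct with the sensor reading
        predicted = self.degradation + self.drift_rate * dt
        predicted_var = self.estimation_var + self.process_noise * dt
        gain = predicted_var / (predicted_var + self.measurement_noise)
        self.degradation = predicted + gain * (measurement - predicted)
        self.estimation_var = (1.0 - gain) * predicted_var

    def predict_rul(self):
        remaining = self.threshold - self.degradation
        return remaining / self.drift_rate if remaining > 0.0 else 0.0
```

Feeding each new sensor reading through `update` keeps the twin synchronized; `predict_rul` then extrapolates the current estimate to the failure threshold.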
Parallel and Distributed Simulation
Parallel Discrete-Event Simulation (PDES)
Partition the simulation model into Logical Processes (LPs) executing on separate processors.
Conservative approach (Chandy-Misra-Bryant):
- LPs only process events when safe (no earlier event can arrive from other LPs)
- Uses null messages or lookahead to avoid deadlock
- No rollback but may have idle time waiting for safety guarantees
Optimistic approach (Time Warp):
- LPs process events speculatively
- If a straggler event arrives with earlier timestamp, rollback via state restoration and anti-messages
- Global Virtual Time (GVT) provides a floor below which no rollback can occur — enables fossil collection
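The conservative safety condition reduces to a minimum over input channels; a minimal sketch:

```python
def safe_timestamp(channel_clocks, lookaheads):
    """Conservative synchronization (Chandy-Misra-Bryant): an LP may process
    any local event whose timestamp does not exceed the minimum, over its
    input channels, of (last received timestamp + that channel's lookahead)."""
    return min(t + la for t, la in zip(channel_clocks, lookaheads))

def can_process(event_time, channel_clocks, lookaheads):
    """True if the event is safe: no earlier event can still arrive."""
    return event_time <= safe_timestamp(channel_clocks, lookaheads)
```

Larger lookahead raises the safe bound and directly reduces idle time, which is why conservative PDES performance hinges on extracting lookahead from the model (e.g. minimum service times).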
Scaling Considerations
| Aspect      | Conservative          | Optimistic                  |
|-------------|-----------------------|-----------------------------|
| Correctness | Guaranteed            | Via rollback                |
| Memory      | Low                   | High (state saving)         |
| Performance | Limited by slowest LP | Better for unbalanced loads |
| Complexity  | Moderate              | High (anti-messages, GVT)   |
Cloud and HPC Simulation
- Embarrassingly parallel: Independent replications across nodes (trivial parallelism)
- Parameter sweeps: Each node runs a different parameter combination
- Large-scale PDES: Federated simulations using HLA (High Level Architecture) standard
- GPU acceleration: Thousands of agents or Monte Carlo paths computed in parallel on GPU
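Independent replications parallelize trivially; a sketch using Python's multiprocessing, where the single-server waiting-time model and its parameters are purely illustrative:

```python
import random
from multiprocessing import Pool

def one_replication(seed):
    """One independent replication: average waiting time over 1000 customers
    of a single-server queue, via the Lindley recursion (toy model)."""
    rng = random.Random(seed)                  # each replication gets its own stream
    wait, total = 0.0, 0.0
    for _ in range(1000):
        interarrival = rng.expovariate(1.0)    # arrival rate 1.0
        service = rng.expovariate(1.25)        # service rate 1.25 (utilization 0.8)
        wait = max(0.0, wait + service - interarrival)
        total += wait
    return total / 1000.0

if __name__ == "__main__":
    with Pool() as pool:                       # one replication per worker task
        results = pool.map(one_replication, range(8))
    print(sum(results) / len(results))         # overall mean waiting time
```

Seeding each replication independently keeps the runs statistically independent, so the across-replication sample variance gives valid confidence intervals.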
Distributed Simulation Standards
- HLA (IEEE 1516): Federates communicate through a Runtime Infrastructure (RTI). Used in defense and large-scale training simulations.
- DIS (IEEE 1278): Protocol for real-time military simulations. Simpler than HLA, uses PDUs (Protocol Data Units) over UDP.
Choosing an Approach
| Problem Type                                | Recommended Method         |
|---------------------------------------------|----------------------------|
| Few alternatives, need guarantee            | Ranking and Selection      |
| Continuous parameters, expensive simulation | Bayesian Optimization      |
| High-dimensional continuous                 | SPSA, stochastic gradient  |
| Combinatorial                               | Metaheuristics (GA, SA)    |
| Sequential decisions                        | Reinforcement Learning     |
| Live system monitoring                      | Digital Twin               |
| Many independent runs                       | Parallel replications      |