Advanced Control
Robust Control
Addresses model uncertainty systematically. Uncertainty types:
- Parametric: known structure, uncertain parameters
- Unstructured: bounded frequency-domain error,
|Delta(j*omega)| <= w(omega)
H-infinity Control
Minimize the infinity-norm (peak over frequency) of a transfer function:
||T_{zw}||_inf = sup_omega sigma_max(T_{zw}(j*omega))
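This peak-over-frequency norm can be approximated numerically by gridding omega and taking the largest singular value at each grid point. A minimal sketch, using an assumed stable 2x2 example system (not one from the text):

```python
import numpy as np

# Estimate ||T_zw||_inf = sup_omega sigma_max(T_zw(j*omega)) by frequency
# gridding. This is a practical approximation, not an exact norm computation.
def Tzw(s):
    # Illustrative 2-state stable plant: T_zw(s) = C (sI - A)^{-1} B
    A = np.array([[-1.0, 2.0], [0.0, -3.0]])
    B = np.eye(2)
    C = np.eye(2)
    return C @ np.linalg.inv(s * np.eye(2) - A) @ B

omegas = np.logspace(-2, 2, 400)
gains = [np.linalg.svd(Tzw(1j * w), compute_uv=False)[0] for w in omegas]
hinf_norm = max(gains)
```

Dedicated solvers use bisection on Hamiltonian eigenvalues instead of gridding, but the gridded estimate is often good enough for plotting and sanity checks.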
Mixed sensitivity formulation: minimize:
|| [W1*S; W2*K*S; W3*T] ||_inf < gamma
Where S = sensitivity, T = complementary sensitivity, and W1, W2, W3 are frequency-dependent weighting functions encoding performance specs.
- W1 large at low frequencies: good tracking/disturbance rejection
- W3 large at high frequencies: robustness to unmodeled dynamics, noise rejection
- W2: penalizes control effort
Solution: Riccati-based (two coupled AREs) or LMI-based (convex optimization). The gamma-iteration finds the minimal achievable gamma.
Mu-Synthesis (Structured Singular Value)
H-infinity treats uncertainty as a single unstructured block. For structured uncertainty (e.g., independent parameter variations), the structured singular value mu provides a tighter bound:
mu(M) = 1 / min{sigma_max(Delta) : det(I - M*Delta) = 0, Delta in structure}
Robust stability: mu(M(j*omega)) < 1 for all frequencies omega.
D-K iteration: alternates between:
- K-step: synthesize H-infinity controller for scaled plant
- D-step: fit scaling matrices D(omega) to minimize mu
Not guaranteed to converge but works well in practice.
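The D-step can be illustrated on a single matrix: for diagonal scalings, sigma_max(D*M*D^{-1}) upper-bounds mu, and minimizing over D tightens it. A sketch with an assumed 2x2 matrix M and a scalar scaling scan (real D-K iteration fits D(omega) across frequency):

```python
import numpy as np

# mu upper bound: mu(M) <= min_D sigma_max(D M D^{-1}), here for an
# illustrative matrix M and D = diag(d, 1), scanned over d.
M = np.array([[0.5, 2.0], [0.1, -0.4]])

sigma_unscaled = np.linalg.svd(M, compute_uv=False)[0]
best = min(
    np.linalg.svd(np.diag([d, 1.0]) @ M @ np.diag([1.0 / d, 1.0]),
                  compute_uv=False)[0]
    for d in np.logspace(-2, 2, 200)
)
# Scaling leaves mu unchanged but can only shrink the sigma_max bound.
```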
Adaptive Control
The controller adjusts its parameters online to handle unknown or time-varying plant dynamics.
Model Reference Adaptive Control (MRAC)
The plant output tracks a reference model output:
Reference model: y_m' = A_m * y_m + B_m * r
Plant: y' = A * y + B * u (A, B unknown)
Control: u = theta(t)^T * phi(x, r)
MIT rule (gradient descent on error):
theta' = -gamma * e * (de/d_theta)
Lyapunov-based MRAC guarantees stability:
theta' = -Gamma * phi * e^T * P * B_m
where P satisfies A_m^T * P + P * A_m = -Q.
Challenges: requires persistent excitation for parameter convergence, can exhibit bursting or instability with insufficient excitation.
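A scalar MIT-rule simulation makes the mechanism concrete. The example below is illustrative, not from the text: plant y' = -a_m*y + k*u with unknown gain k, reference model y_m' = -a_m*y_m + r, feedforward control u = theta*r. The sensitivity de/d_theta is proportional to y_m, so the MIT rule reduces to theta' = -gamma*e*y_m; with the sinusoidal (persistently exciting) reference, theta should approach 1/k = 0.5.

```python
import numpy as np

# MIT-rule MRAC on a scalar plant with unknown gain k (Euler integration).
a_m, k, gamma_mit, dt = 1.0, 2.0, 2.0, 1e-3
y = y_m = theta = 0.0
for i in range(100000):                   # 100 s
    r = np.sin(i * dt)                    # persistently exciting reference
    u = theta * r                         # adaptive feedforward control
    e = y - y_m                           # model-following error
    y += dt * (-a_m * y + k * u)          # plant
    y_m += dt * (-a_m * y_m + r)          # reference model
    theta += dt * (-gamma_mit * e * y_m)  # MIT rule: theta' = -gamma*e*(de/d_theta)
```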
Self-Tuning Regulators
- Identify plant model online (e.g., recursive least squares)
- Design controller based on current model estimate
- Repeat at each time step
Certainty equivalence: treat estimated parameters as true. Works under appropriate excitation conditions.
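The identification step can be sketched with recursive least squares on an assumed first-order ARX model (parameters and noise level are illustrative):

```python
import numpy as np

# Recursive least squares for y[t] = a*y[t-1] + b*u[t-1] + noise,
# with a, b treated as unknown.
rng = np.random.default_rng(0)
a_true, b_true = 0.8, 0.5
theta = np.zeros(2)             # estimate of [a, b]
P = 1000.0 * np.eye(2)          # estimate covariance (large = uninformative prior)
lam = 0.99                      # forgetting factor (tracks slow drift)

y_prev = 0.0
for t in range(500):
    u_prev = rng.standard_normal()            # persistently exciting input
    y = a_true * y_prev + b_true * u_prev + 0.01 * rng.standard_normal()
    phi = np.array([y_prev, u_prev])          # regressor
    K = P @ phi / (lam + phi @ P @ phi)       # RLS gain
    theta = theta + K * (y - phi @ theta)     # parameter update
    P = (P - np.outer(K, phi @ P)) / lam      # covariance update
    y_prev = y
```

In a self-tuning regulator, each new theta would feed a controller-design step (e.g., pole placement) under certainty equivalence.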
Nonlinear Control
Feedback Linearization
Transform a nonlinear system into a linear one via nonlinear state feedback and coordinate change.
For x' = f(x) + g(x)*u, y = h(x):
Differentiate y until u appears explicitly (after r differentiations, r = relative degree):
y^(r) = L_f^r(h) + L_g*L_f^{r-1}(h) * u
where L_f is the Lie derivative along f. Choose:
u = (1 / (L_g*L_f^{r-1}(h))) * (-L_f^r(h) + v)
Result: y^(r) = v (chain of integrators), controlled by linear techniques.
Requirements: relative degree must be well-defined, and the internal dynamics (zero dynamics) must be stable.
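The recipe above can be sketched on a damped pendulum (an illustrative system, not from the text): x1' = x2, x2' = -sin(x1) - 0.1*x2 + u, y = x1. Here r = 2, since y'' = -sin(x1) - 0.1*x2 + u contains u, so u = sin(x1) + 0.1*x2 + v yields the double integrator y'' = v:

```python
import numpy as np

# Feedback linearization of a pendulum; v is chosen by linear design
# (both closed-loop poles at -1) for the linearized chain y'' = v.
def step(x, dt):
    x1, x2 = x
    v = -1.0 * x1 - 2.0 * x2           # linear control for y'' = v
    u = np.sin(x1) + 0.1 * x2 + v      # cancel the nonlinearity
    return np.array([x1 + dt * x2,
                     x2 + dt * (-np.sin(x1) - 0.1 * x2 + u)])

x = np.array([1.0, 0.0])               # start one radian from upright... er, from rest
for _ in range(10000):                 # simulate 10 s at dt = 1 ms
    x = step(x, 1e-3)
```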
Sliding Mode Control
Forces the state onto a sliding surface s(x) = 0, then constrains it there.
For system x' = f(x) + g(x)*u + d(t) (d = disturbance):
Sliding surface: s(x) = c^T * e where e = tracking error.
Control law:
u = u_eq - k * sign(s)
- u_eq: equivalent control (keeps the state on the surface)
- k * sign(s): switching term that drives the state to the surface
Reaching condition: s * s' < -eta * |s| guarantees finite-time reaching.
Properties:
- Invariant to matched disturbances and parameter variations once on surface
- Chattering problem: high-frequency switching. Mitigated by boundary layer (replace sign with saturation function)
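A boundary-layer sliding mode controller on an illustrative double integrator (not a system from the text): x1' = x2, x2' = u + d(t) with matched disturbance |d| <= 0.5, surface s = x2 + c*x1, and sign replaced by the saturation sat(s/eps):

```python
import numpy as np

# Sliding mode with boundary layer: u = u_eq - k*sat(s/eps), with the
# switching gain k chosen larger than the disturbance bound (k = 2 > 0.5).
c, k, eps, dt = 1.0, 2.0, 0.01, 1e-3
x = np.array([1.0, 1.0])
for i in range(10000):                           # 10 s
    t = i * dt
    d = 0.5 * np.sin(2.0 * t)                    # bounded matched disturbance
    s = x[1] + c * x[0]                          # sliding variable
    u = -c * x[1] - k * np.clip(s / eps, -1, 1)  # u_eq + smoothed switching term
    x = x + dt * np.array([x[1], u + d])
```

Inside the boundary layer the disturbance is no longer perfectly rejected; the residual error scales with eps, which is the usual chattering-vs-accuracy trade-off.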
Backstepping
Recursive design for systems in strict-feedback form:
x_1' = f_1(x_1) + g_1(x_1) * x_2
x_2' = f_2(x_1, x_2) + g_2(x_1, x_2) * x_3
...
x_n' = f_n(x) + g_n(x) * u
Design proceeds from x_1 to x_n, treating each x_{i+1} as a virtual control for the x_i subsystem. Each step constructs a Lyapunov function, building upon the previous one.
Produces a globally stabilizing controller with a constructive Lyapunov proof.
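A two-state example (illustrative, not from the text) shows the recursion: for x1' = x1^2 + x2, x2' = u, step 1 picks the virtual control alpha = -x1^2 - k1*x1, and step 2 stabilizes z = x2 - alpha, giving V = (x1^2 + z^2)/2 with V' = -k1*x1^2 - k2*z^2:

```python
import numpy as np

# Backstepping for x1' = x1**2 + x2, x2' = u (strict-feedback form).
k1, k2, dt = 1.0, 1.0, 1e-3
x1, x2 = 1.0, 0.0
for _ in range(10000):                            # 10 s
    alpha = -x1**2 - k1 * x1                      # virtual control for x1 subsystem
    z = x2 - alpha                                # error to the virtual control
    alpha_dot = (-2.0 * x1 - k1) * (x1**2 + x2)   # d(alpha)/dt along trajectories
    u = alpha_dot - x1 - k2 * z                   # cancels cross term in V'
    x1, x2 = x1 + dt * (x1**2 + x2), x2 + dt * u
```

In the (x1, z) coordinates the closed loop is x1' = -k1*x1 + z, z' = -x1 - k2*z, which is globally exponentially stable despite the destabilizing x1^2 term.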
Model Predictive Control (MPC)
At each time step, solve a finite-horizon optimal control problem online:
min_{u_0,...,u_{N-1}} sum_{k=0}^{N-1} [x_k^T*Q*x_k + u_k^T*R*u_k] + x_N^T*P_f*x_N
Subject to:
x_{k+1} = A*x_k + B*u_k
x_k in X (state constraints)
u_k in U (input constraints)
Apply only the first control action u_0, then re-solve at the next step (receding horizon).
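A minimal sketch of one horizon of linear MPC for an assumed double-integrator model, using a generic NLP solver for clarity (production MPC would use a dedicated QP solver); system matrices, weights, and the |u| <= 1 bound are illustrative:

```python
import numpy as np
from scipy.optimize import minimize

A = np.array([[1.0, 1.0], [0.0, 1.0]])   # discrete double integrator
B = np.array([[0.5], [1.0]])
Q, R, N = np.eye(2), 0.1, 10

def cost(u_seq, x0):
    x, J = x0.copy(), 0.0
    for u in u_seq:
        J += x @ Q @ x + R * u**2        # stage cost
        x = A @ x + B.flatten() * u      # predicted dynamics
    J += x @ Q @ x                       # terminal cost (P_f = Q here)
    return J

x0 = np.array([5.0, 0.0])
res = minimize(cost, np.zeros(N), args=(x0,),
               bounds=[(-1.0, 1.0)] * N, method="SLSQP")
u0 = res.x[0]        # apply only the first move, then re-solve (receding horizon)
```

With the state far from the origin, the first move saturates at the input bound, which is exactly the constraint-handling behavior that distinguishes MPC.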
Key Features
- Systematic handling of state and input constraints (a distinguishing feature among standard control methods)
- Preview/feedforward of known future references
- Handles multivariable (MIMO) systems naturally
- Computational cost: solves a QP (linear MPC) or NLP (nonlinear MPC) at each step
Stability Guarantees
- Terminal cost P_f: chosen as the solution of the Lyapunov or Riccati equation
- Terminal constraint:
x_N in X_f (an invariant set under a local controller)
- With appropriate terminal ingredients, the optimal cost serves as a Lyapunov function
Practical Variants
- Explicit MPC: precompute the control law offline as a piecewise affine function of the state (feasible for small systems)
- Robust MPC: min-max formulation or tube-based MPC for bounded disturbances
- Economic MPC: optimize economic cost directly rather than tracking error
Optimal Control
Pontryagin's Maximum Principle
For x' = f(x, u), minimize J = phi(x(t_f)) + integral_0^{t_f} L(x, u) dt:
Define Hamiltonian: H = L(x, u) + lambda^T * f(x, u)
Necessary conditions:
x' = dH/d_lambda = f(x, u) (state equation)
lambda' = -dH/dx (costate equation)
dH/du = 0 or u* = argmin_u H (optimality)
Boundary conditions: x(0) = x_0, lambda(t_f) = d_phi/dx(t_f).
Results in a two-point boundary value problem (TPBVP), generally solved numerically.
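A single shooting method illustrates the TPBVP on a scalar example (assumed for illustration, not from the text): x' = u, J = integral_0^1 (x^2 + u^2)/2 dt, x(0) = 1, x(t_f) free. Then H = (x^2 + u^2)/2 + lambda*u, dH/du = 0 gives u = -lambda, and the necessary conditions are x' = -lambda, lambda' = -x with lambda(1) = 0:

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

# Shooting: guess lambda(0), integrate state/costate forward, and adjust
# the guess until the terminal condition lambda(t_f) = 0 is met.
def shoot(lam0):
    sol = solve_ivp(lambda t, z: [-z[1], -z[0]],   # z = [x, lambda]
                    (0.0, 1.0), [1.0, lam0],
                    rtol=1e-9, atol=1e-9)
    return sol.y[1, -1]                            # lambda(1), should be 0

lam0 = brentq(shoot, 0.0, 2.0)                     # root of the shooting function
```

For this problem the exact answer is lambda(0) = tanh(1), which makes the scalar case a convenient check before tackling harder TPBVPs.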
Hamilton-Jacobi-Bellman (HJB) Equation
The dynamic programming approach. Define the value function V(x, t) = optimal cost-to-go from state x at time t:
-dV/dt = min_u [L(x, u) + (dV/dx)^T * f(x, u)]
with boundary condition V(x, t_f) = phi(x(t_f)).
For LTI systems with quadratic cost, reduces to the Riccati equation (LQR).
Curse of dimensionality: HJB requires gridding the state space -- exponential in dimension. Practical only for low-dimensional systems.
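The LQR special case is easy to compute: solving the algebraic Riccati equation gives the quadratic value function V(x) = x^T*P*x and the optimal law u = -R^{-1}*B^T*P*x. A sketch with an assumed double-integrator plant:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Continuous-time LQR via the algebraic Riccati equation (the HJB solution
# for linear dynamics and quadratic cost).
A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)     # value function: V(x) = x' P x
K = np.linalg.solve(R, B.T @ P)          # optimal gain: u = -K x
```

For these particular matrices the closed form is K = [1, sqrt(3)], so the result is easy to verify by hand.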
Reinforcement Learning for Control
RL learns control policies from interaction, without requiring an explicit plant model.
Policy Gradient Methods
Parametrize a (typically stochastic) policy pi_theta(u | x) and optimize the expected return:
J(theta) = E[sum gamma^k * r_k]
Gradient (policy gradient theorem): nabla_theta J = E[sum nabla_theta log pi_theta(u_k | x_k) * Q^pi(x_k, u_k)].
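A minimal REINFORCE sketch on a two-armed bandit (an illustrative problem, not from the text; reward means and learning rate are assumed): a softmax policy over two actions, updated with the score-function gradient nabla_theta log pi times the sampled reward:

```python
import numpy as np

# REINFORCE on a bandit: no states, so Q^pi reduces to the sampled reward.
rng = np.random.default_rng(0)
theta = np.zeros(2)                       # one logit per action
means = np.array([0.0, 1.0])              # action 1 has higher mean reward
lr = 0.1

for _ in range(3000):
    p = np.exp(theta) / np.exp(theta).sum()        # softmax policy
    a = rng.choice(2, p=p)                         # sample an action
    r = means[a] + 0.1 * rng.standard_normal()     # noisy reward
    grad_log = -p
    grad_log[a] += 1.0                             # nabla_theta log pi(a)
    theta += lr * grad_log * r                     # stochastic gradient ascent
```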
Value-Based Methods (Q-Learning)
Learn the action-value function:
Q(x, u) = r(x, u) + gamma * max_{u'} Q(x', u')
Discrete: tabular Q-learning converges to the optimal policy under standard step-size and exploration conditions. Continuous: use function approximation (e.g., neural networks, as in Deep Q-Networks).
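A tabular Q-learning sketch on a toy 5-state chain (an illustrative MDP, not from the text): actions 0 = left and 1 = right, reaching state 4 pays reward 1 and ends the episode. Because Q-learning is off-policy, a uniformly random behavior policy suffices:

```python
import numpy as np

# Tabular Q-learning: Q(x,u) <- Q(x,u) + alpha*(r + gamma*max Q(x',.) - Q(x,u)).
rng = np.random.default_rng(0)
gamma_rl, alpha = 0.9, 0.5
Qtab = np.zeros((5, 2))                  # Q(x, u) table

for _ in range(2000):                    # episodes
    s = 0
    while s != 4:
        a = int(rng.integers(2))         # random exploratory action (off-policy)
        s_next = max(s - 1, 0) if a == 0 else s + 1
        r = 1.0 if s_next == 4 else 0.0
        bootstrap = 0.0 if s_next == 4 else gamma_rl * Qtab[s_next].max()
        Qtab[s, a] += alpha * (r + bootstrap - Qtab[s, a])
        s = s_next
```

The learned table satisfies Q(s, right) = gamma^(3-s), so the greedy policy moves right in every state.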
Actor-Critic
Combines policy (actor) and value function (critic) learning. Examples: DDPG, SAC, PPO.
RL vs Classical Control
| Aspect | Classical/Optimal | RL |
|--------|-------------------|----|
| Model required | Yes | No (model-free) or learned |
| Constraints | Naturally handled (MPC) | Difficult (penalty methods) |
| Stability guarantees | Strong (Lyapunov, Riccati) | Limited (active research) |
| High-dimensional | Curse of dimensionality | Scales via function approximation |
| Sample efficiency | High (model-based) | Low (model-free) |
Bridging the gap: model-based RL, safe RL with Lyapunov constraints, and physics-informed neural network controllers combine strengths of both paradigms.