Advanced Control
Robust Control
Addresses model uncertainty systematically. Uncertainty types:
- Parametric: known structure, uncertain parameters
- Unstructured: bounded frequency-domain error,
|Delta(j*omega)| <= w(omega)
H-infinity Control
Minimize the infinity-norm (peak over frequency) of a transfer function:
||T_{zw}||_inf = sup_omega sigma_max(T_{zw}(j*omega))
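This peak-over-frequency norm can be approximated numerically by gridding omega and taking the largest singular value at each grid point. A minimal sketch, using an assumed stable 2x2 example system (not one from the text):

```python
import numpy as np

# Estimate ||T_zw||_inf = sup_omega sigma_max(T_zw(j*omega)) by frequency
# gridding. This is a practical approximation, not an exact norm computation.
def Tzw(s):
    # Illustrative 2-state stable plant: T_zw(s) = C (sI - A)^{-1} B
    A = np.array([[-1.0, 2.0], [0.0, -3.0]])
    B = np.eye(2)
    C = np.eye(2)
    return C @ np.linalg.inv(s * np.eye(2) - A) @ B

omegas = np.logspace(-2, 2, 400)
gains = [np.linalg.svd(Tzw(1j * w), compute_uv=False)[0] for w in omegas]
hinf_norm = max(gains)
```

Dedicated solvers use bisection on Hamiltonian eigenvalues instead of gridding, but the gridded estimate is often good enough for plotting and sanity checks.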
Mixed sensitivity formulation: minimize:
|| [W1*S; W2*K*S; W3*T] ||_inf < gamma
Where S = sensitivity, T = complementary sensitivity, and W1, W2, W3 are frequency-dependent weighting functions encoding performance specs.
- W1 large at low frequencies: good tracking/disturbance rejection
- W3 large at high frequencies: robustness to unmodeled dynamics, noise rejection
- W2: penalizes control effort
Solution: Riccati-based (two coupled AREs) or LMI-based (convex optimization). The gamma-iteration finds the minimal achievable gamma.
Mu-Synthesis (Structured Singular Value)
H-infinity treats uncertainty as a single unstructured block. For structured uncertainty (e.g., independent parameter variations), the structured singular value mu provides a tighter bound:
mu(M) = 1 / min{sigma_max(Delta) : det(I - M*Delta) = 0, Delta in structure}
Robust stability: mu(M(j*omega)) < 1 for all frequencies omega.
D-K iteration: alternates between:
- K-step: synthesize H-infinity controller for scaled plant
- D-step: fit scaling matrices D(omega) to minimize mu
Not guaranteed to converge but works well in practice.
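The D-step can be illustrated on a single matrix: for diagonal scalings, sigma_max(D*M*D^{-1}) upper-bounds mu, and minimizing over D tightens it. A sketch with an assumed 2x2 matrix M and a scalar scaling scan (real D-K iteration fits D(omega) across frequency):

```python
import numpy as np

# mu upper bound: mu(M) <= min_D sigma_max(D M D^{-1}), here for an
# illustrative matrix M and D = diag(d, 1), scanned over d.
M = np.array([[0.5, 2.0], [0.1, -0.4]])

sigma_unscaled = np.linalg.svd(M, compute_uv=False)[0]
best = min(
    np.linalg.svd(np.diag([d, 1.0]) @ M @ np.diag([1.0 / d, 1.0]),
                  compute_uv=False)[0]
    for d in np.logspace(-2, 2, 200)
)
# Scaling leaves mu unchanged but can only shrink the sigma_max bound.
```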
Adaptive Control
The controller adjusts its parameters online to handle unknown or time-varying plant dynamics.
Model Reference Adaptive Control (MRAC)
The plant output tracks a reference model output:
Reference model: y_m' = A_m * y_m + B_m * r
Plant: y' = A * y + B * u (A, B unknown)
Control: u = theta(t)^T * phi(x, r)
MIT rule (gradient descent on error):
theta' = -gamma * e * (de/d_theta)
Lyapunov-based MRAC guarantees stability:
theta' = -Gamma * phi * e^T * P * B_m
where P satisfies A_m^T * P + P * A_m = -Q.
Challenges: requires persistent excitation for parameter convergence, can exhibit bursting or instability with insufficient excitation.
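A scalar MIT-rule simulation makes the mechanism concrete. The example below is illustrative, not from the text: plant y' = -a_m*y + k*u with unknown gain k, reference model y_m' = -a_m*y_m + r, feedforward control u = theta*r. The sensitivity de/d_theta is proportional to y_m, so the MIT rule reduces to theta' = -gamma*e*y_m; with the sinusoidal (persistently exciting) reference, theta should approach 1/k = 0.5.

```python
import numpy as np

# MIT-rule MRAC on a scalar plant with unknown gain k (Euler integration).
a_m, k, gamma_mit, dt = 1.0, 2.0, 2.0, 1e-3
y = y_m = theta = 0.0
for i in range(100000):                   # 100 s
    r = np.sin(i * dt)                    # persistently exciting reference
    u = theta * r                         # adaptive feedforward control
    e = y - y_m                           # model-following error
    y += dt * (-a_m * y + k * u)          # plant
    y_m += dt * (-a_m * y_m + r)          # reference model
    theta += dt * (-gamma_mit * e * y_m)  # MIT rule: theta' = -gamma*e*(de/d_theta)
```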
Self-Tuning Regulators
- Identify plant model online (e.g., recursive least squares)
- Design controller based on current model estimate
- Repeat at each time step
Certainty equivalence: treat estimated parameters as true. Works under appropriate excitation conditions.
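The identification step can be sketched with recursive least squares on an assumed first-order ARX model (parameters and noise level are illustrative):

```python
import numpy as np

# Recursive least squares for y[t] = a*y[t-1] + b*u[t-1] + noise,
# with a, b treated as unknown.
rng = np.random.default_rng(0)
a_true, b_true = 0.8, 0.5
theta = np.zeros(2)             # estimate of [a, b]
P = 1000.0 * np.eye(2)          # estimate covariance (large = uninformative prior)
lam = 0.99                      # forgetting factor (tracks slow drift)

y_prev = 0.0
for t in range(500):
    u_prev = rng.standard_normal()            # persistently exciting input
    y = a_true * y_prev + b_true * u_prev + 0.01 * rng.standard_normal()
    phi = np.array([y_prev, u_prev])          # regressor
    K = P @ phi / (lam + phi @ P @ phi)       # RLS gain
    theta = theta + K * (y - phi @ theta)     # parameter update
    P = (P - np.outer(K, phi @ P)) / lam      # covariance update
    y_prev = y
```

In a self-tuning regulator, each new theta would feed a controller-design step (e.g., pole placement) under certainty equivalence.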
Nonlinear Control
Feedback Linearization
Transform a nonlinear system into a linear one via nonlinear state feedback and coordinate change.
For x' = f(x) + g(x)*u, y = h(x):
Differentiate y until u appears explicitly (after r differentiations, r = relative degree):
y^(r) = L_f^r(h) + L_g*L_f^{r-1}(h) * u
where L_f is the Lie derivative along f. Choose:
u = (1 / (L_g*L_f^{r-1}(h))) * (-L_f^r(h) + v)
Result: y^(r) = v (chain of integrators), controlled by linear techniques.
Requirements: relative degree must be well-defined, and the internal dynamics (zero dynamics) must be stable.
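The recipe above can be sketched on a damped pendulum (an illustrative system, not from the text): x1' = x2, x2' = -sin(x1) - 0.1*x2 + u, y = x1. Here r = 2, since y'' = -sin(x1) - 0.1*x2 + u contains u, so u = sin(x1) + 0.1*x2 + v yields the double integrator y'' = v:

```python
import numpy as np

# Feedback linearization of a pendulum; v is chosen by linear design
# (both closed-loop poles at -1) for the linearized chain y'' = v.
def step(x, dt):
    x1, x2 = x
    v = -1.0 * x1 - 2.0 * x2           # linear control for y'' = v
    u = np.sin(x1) + 0.1 * x2 + v      # cancel the nonlinearity
    return np.array([x1 + dt * x2,
                     x2 + dt * (-np.sin(x1) - 0.1 * x2 + u)])

x = np.array([1.0, 0.0])               # start one radian from upright... er, from rest
for _ in range(10000):                 # simulate 10 s at dt = 1 ms
    x = step(x, 1e-3)
```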
Sliding Mode Control
Forces the state onto a sliding surface s(x) = 0, then constrains it there.
For system x' = f(x) + g(x)*u + d(t) (d = disturbance):
Sliding surface: s(x) = c^T * e where e = tracking error.
Control law:
u = u_eq - k * sign(s)
- u_eq: equivalent control (keeps the state on the surface)
- k * sign(s): switching term that drives the state to the surface
Reaching condition: s * s' < -eta * |s| guarantees finite-time reaching.
Properties:
- Invariant to matched disturbances and parameter variations once on surface
- Chattering problem: high-frequency switching. Mitigated by boundary layer (replace sign with saturation function)
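A boundary-layer sliding mode controller on an illustrative double integrator (not a system from the text): x1' = x2, x2' = u + d(t) with matched disturbance |d| <= 0.5, surface s = x2 + c*x1, and sign replaced by the saturation sat(s/eps):

```python
import numpy as np

# Sliding mode with boundary layer: u = u_eq - k*sat(s/eps), with the
# switching gain k chosen larger than the disturbance bound (k = 2 > 0.5).
c, k, eps, dt = 1.0, 2.0, 0.01, 1e-3
x = np.array([1.0, 1.0])
for i in range(10000):                           # 10 s
    t = i * dt
    d = 0.5 * np.sin(2.0 * t)                    # bounded matched disturbance
    s = x[1] + c * x[0]                          # sliding variable
    u = -c * x[1] - k * np.clip(s / eps, -1, 1)  # u_eq + smoothed switching term
    x = x + dt * np.array([x[1], u + d])
```

Inside the boundary layer the disturbance is no longer perfectly rejected; the residual error scales with eps, which is the usual chattering-vs-accuracy trade-off.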
Backstepping
Recursive design for systems in strict-feedback form:
x_1' = f_1(x_1) + g_1(x_1) * x_2
x_2' = f_2(x_1, x_2) + g_2(x_1, x_2) * x_3
...
x_n' = f_n(x) + g_n(x) * u
Design proceeds from x_1 to x_n, treating each x_{i+1} as a virtual control for the x_i subsystem. Each step constructs a Lyapunov function, building upon the previous one.
Produces a globally stabilizing controller with a constructive Lyapunov proof.
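A two-state example (illustrative, not from the text) shows the recursion: for x1' = x1^2 + x2, x2' = u, step 1 picks the virtual control alpha = -x1^2 - k1*x1, and step 2 stabilizes z = x2 - alpha, giving V = (x1^2 + z^2)/2 with V' = -k1*x1^2 - k2*z^2:

```python
import numpy as np

# Backstepping for x1' = x1**2 + x2, x2' = u (strict-feedback form).
k1, k2, dt = 1.0, 1.0, 1e-3
x1, x2 = 1.0, 0.0
for _ in range(10000):                            # 10 s
    alpha = -x1**2 - k1 * x1                      # virtual control for x1 subsystem
    z = x2 - alpha                                # error to the virtual control
    alpha_dot = (-2.0 * x1 - k1) * (x1**2 + x2)   # d(alpha)/dt along trajectories
    u = alpha_dot - x1 - k2 * z                   # cancels cross term in V'
    x1, x2 = x1 + dt * (x1**2 + x2), x2 + dt * u
```

In the (x1, z) coordinates the closed loop is x1' = -k1*x1 + z, z' = -x1 - k2*z, which is globally exponentially stable despite the destabilizing x1^2 term.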
Model Predictive Control (MPC)
At each time step, solve a finite-horizon optimal control problem online:
min_{u_0,...,u_{N-1}} sum_{k=0}^{N-1} [x_k^T*Q*x_k + u_k^T*R*u_k] + x_N^T*P_f*x_N
Subject to:
x_{k+1} = A*x_k + B*u_k
x_k in X (state constraints)
u_k in U (input constraints)
Apply only the first control action u_0, then re-solve at the next step (receding horizon).
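A minimal sketch of one horizon of linear MPC for an assumed double-integrator model, using a generic NLP solver for clarity (production MPC would use a dedicated QP solver); system matrices, weights, and the |u| <= 1 bound are illustrative:

```python
import numpy as np
from scipy.optimize import minimize

A = np.array([[1.0, 1.0], [0.0, 1.0]])   # discrete double integrator
B = np.array([[0.5], [1.0]])
Q, R, N = np.eye(2), 0.1, 10

def cost(u_seq, x0):
    x, J = x0.copy(), 0.0
    for u in u_seq:
        J += x @ Q @ x + R * u**2        # stage cost
        x = A @ x + B.flatten() * u      # predicted dynamics
    J += x @ Q @ x                       # terminal cost (P_f = Q here)
    return J

x0 = np.array([5.0, 0.0])
res = minimize(cost, np.zeros(N), args=(x0,),
               bounds=[(-1.0, 1.0)] * N, method="SLSQP")
u0 = res.x[0]        # apply only the first move, then re-solve (receding horizon)
```

With the state far from the origin, the first move saturates at the input bound, which is exactly the constraint-handling behavior that distinguishes MPC.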
Key Features
- Systematic handling of state and input constraints (a distinguishing feature among standard control methods)
- Preview/feedforward of known future references
- Handles multivariable (MIMO) systems naturally
- Computational cost: solves a QP (linear MPC) or NLP (nonlinear MPC) at each step
Stability Guarantees
- Terminal cost P_f: chosen as the solution of the Lyapunov or Riccati equation
- Terminal constraint:
x_N in X_f (an invariant set under a local controller)
- With appropriate terminal ingredients, the optimal cost serves as a Lyapunov function
Practical Variants
- Explicit MPC: precompute the control law offline as a piecewise affine function of the state (feasible for small systems)
- Robust MPC: min-max formulation or tube-based MPC for bounded disturbances
- Economic MPC: optimize economic cost directly rather than tracking error
Optimal Control
Pontryagin's Maximum Principle
For x' = f(x, u), minimize J = phi(x(t_f)) + integral_0^{t_f} L(x, u) dt:
Define Hamiltonian: H = L(x, u) + lambda^T * f(x, u)
Necessary conditions:
x' = dH/d_lambda = f(x, u) (state equation)
lambda' = -dH/dx (costate equation)
dH/du = 0 or u* = argmin_u H (optimality)
Boundary conditions: x(0) = x_0, lambda(t_f) = d_phi/dx(t_f).
Results in a two-point boundary value problem (TPBVP), generally solved numerically.
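A single shooting method illustrates the TPBVP on a scalar example (assumed for illustration, not from the text): x' = u, J = integral_0^1 (x^2 + u^2)/2 dt, x(0) = 1, x(t_f) free. Then H = (x^2 + u^2)/2 + lambda*u, dH/du = 0 gives u = -lambda, and the necessary conditions are x' = -lambda, lambda' = -x with lambda(1) = 0:

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

# Shooting: guess lambda(0), integrate state/costate forward, and adjust
# the guess until the terminal condition lambda(t_f) = 0 is met.
def shoot(lam0):
    sol = solve_ivp(lambda t, z: [-z[1], -z[0]],   # z = [x, lambda]
                    (0.0, 1.0), [1.0, lam0],
                    rtol=1e-9, atol=1e-9)
    return sol.y[1, -1]                            # lambda(1), should be 0

lam0 = brentq(shoot, 0.0, 2.0)                     # root of the shooting function
```

For this problem the exact answer is lambda(0) = tanh(1), which makes the scalar case a convenient check before tackling harder TPBVPs.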
Hamilton-Jacobi-Bellman (HJB) Equation
The dynamic programming approach. Define the value function V(x, t) = optimal cost-to-go from state x at time t:
-dV/dt = min_u [L(x, u) + (dV/dx)^T * f(x, u)]
with boundary condition V(x, t_f) = phi(x(t_f)).
For LTI systems with quadratic cost, reduces to the Riccati equation (LQR).
Curse of dimensionality: HJB requires gridding the state space -- exponential in dimension. Practical only for low-dimensional systems.
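The LQR special case is easy to compute: solving the algebraic Riccati equation gives the quadratic value function V(x) = x^T*P*x and the optimal law u = -R^{-1}*B^T*P*x. A sketch with an assumed double-integrator plant:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Continuous-time LQR via the algebraic Riccati equation (the HJB solution
# for linear dynamics and quadratic cost).
A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)     # value function: V(x) = x' P x
K = np.linalg.solve(R, B.T @ P)          # optimal gain: u = -K x
```

For these particular matrices the closed form is K = [1, sqrt(3)], so the result is easy to verify by hand.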
Reinforcement Learning for Control
RL learns control policies from interaction, without requiring an explicit plant model.
Policy Gradient Methods
Parametrize a (typically stochastic) policy pi_theta(u | x) and optimize the expected return:
J(theta) = E[sum gamma^k * r_k]
Gradient (policy gradient theorem): nabla_theta J = E[sum nabla_theta log pi_theta(u_k | x_k) * Q^pi(x_k, u_k)].
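A minimal REINFORCE sketch on a two-armed bandit (an illustrative problem, not from the text; reward means and learning rate are assumed): a softmax policy over two actions, updated with the score-function gradient nabla_theta log pi times the sampled reward:

```python
import numpy as np

# REINFORCE on a bandit: no states, so Q^pi reduces to the sampled reward.
rng = np.random.default_rng(0)
theta = np.zeros(2)                       # one logit per action
means = np.array([0.0, 1.0])              # action 1 has higher mean reward
lr = 0.1

for _ in range(3000):
    p = np.exp(theta) / np.exp(theta).sum()        # softmax policy
    a = rng.choice(2, p=p)                         # sample an action
    r = means[a] + 0.1 * rng.standard_normal()     # noisy reward
    grad_log = -p
    grad_log[a] += 1.0                             # nabla_theta log pi(a)
    theta += lr * grad_log * r                     # stochastic gradient ascent
```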
Value-Based Methods (Q-Learning)
Learn the action-value function:
Q(x, u) = r(x, u) + gamma * max_{u'} Q(x', u')
Discrete: tabular Q-learning converges to the optimal policy under standard step-size and exploration conditions. Continuous: use function approximation (e.g., neural networks, as in Deep Q-Networks).
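A tabular Q-learning sketch on a toy 5-state chain (an illustrative MDP, not from the text): actions 0 = left and 1 = right, reaching state 4 pays reward 1 and ends the episode. Because Q-learning is off-policy, a uniformly random behavior policy suffices:

```python
import numpy as np

# Tabular Q-learning: Q(x,u) <- Q(x,u) + alpha*(r + gamma*max Q(x',.) - Q(x,u)).
rng = np.random.default_rng(0)
gamma_rl, alpha = 0.9, 0.5
Qtab = np.zeros((5, 2))                  # Q(x, u) table

for _ in range(2000):                    # episodes
    s = 0
    while s != 4:
        a = int(rng.integers(2))         # random exploratory action (off-policy)
        s_next = max(s - 1, 0) if a == 0 else s + 1
        r = 1.0 if s_next == 4 else 0.0
        bootstrap = 0.0 if s_next == 4 else gamma_rl * Qtab[s_next].max()
        Qtab[s, a] += alpha * (r + bootstrap - Qtab[s, a])
        s = s_next
```

The learned table satisfies Q(s, right) = gamma^(3-s), so the greedy policy moves right in every state.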
Actor-Critic
Combines policy (actor) and value function (critic) learning. Examples: DDPG, SAC, PPO.
RL vs Classical Control
| Aspect | Classical/Optimal | RL |
|--------|-------------------|----|
| Model required | Yes | No (model-free) or learned |
| Constraints | Naturally handled (MPC) | Difficult (penalty methods) |
| Stability guarantees | Strong (Lyapunov, Riccati) | Limited (active research) |
| High-dimensional | Curse of dimensionality | Scales via function approximation |
| Sample efficiency | High (model-based) | Low (model-free) |
Bridging the gap: model-based RL, safe RL with Lyapunov constraints, and physics-informed neural network controllers combine strengths of both paradigms.