
Adaptive Signal Processing

Adaptive filters automatically adjust their coefficients based on the input signal — essential when the signal environment is unknown or changing.

Adaptive Filter Framework

x[n] (input) ──→ [Adaptive Filter w[n]] ──→ y[n] (output) ──┐
                          ↑                                 ↓
d[n] (desired) ────────────────────────→ (compare) ─→ e[n] = d[n] - y[n]
                          ↑                                 │
                    [Update w[n+1]] ←───────────────────────┘

Goal: Minimize e[n] by adjusting filter weights w[n] to make y[n] ≈ d[n].

Wiener Filter

The optimal linear filter (minimizes mean squared error) when signal and noise statistics are known.

W_opt(f) = S_xd(f) / S_xx(f)

where S_xd is the cross-spectral density between input and desired signal, S_xx is the input PSD.

In time domain: Solve the Wiener-Hopf equation: R_xx · w = r_xd (autocorrelation matrix × optimal weights = cross-correlation vector).

Limitation: Requires knowledge of signal statistics — rarely available in practice. Adaptive filters estimate the Wiener solution iteratively.
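When the statistics are estimated from data, the Wiener-Hopf system can be solved directly. A minimal NumPy sketch, assuming a hypothetical 4-tap "unknown" system and white input (all values chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
M = 4
h_true = np.array([0.5, -0.3, 0.2, 0.1])        # "unknown" system, illustrative

N = 10_000
x = rng.standard_normal(N)                       # white input
X = np.zeros((N, M))                             # row n = [x[n], x[n-1], ..., x[n-M+1]]
for k in range(M):
    X[k:, k] = x[:N - k]
d = X @ h_true + 0.05 * rng.standard_normal(N)   # desired = system output + noise

R_xx = X.T @ X / N                               # estimated autocorrelation matrix
r_xd = X.T @ d / N                               # estimated cross-correlation vector
w = np.linalg.solve(R_xx, r_xd)                  # Wiener-Hopf: R_xx · w = r_xd
```

With enough data, w recovers h_true; the adaptive algorithms below reach the same solution without ever forming R_xx explicitly.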

LMS (Least Mean Squares)

The most widely used adaptive algorithm. Gradient descent on the MSE cost function using instantaneous estimates of the gradient.

y[n] = wᵀ[n] · x[n]
e[n] = d[n] - y[n]
w[n+1] = w[n] + μ · e[n] · x[n]    ← weight update

μ (step size / learning rate): Controls convergence speed vs stability.

  • Too large: Diverges.
  • Too small: Slow convergence.
  • Stability condition: 0 < μ < 2/(M · σ²_x) where M = filter order, σ²_x = input power.

STRUCTURE LmsFilter
    weights: array of real      // filter coefficients w
    mu: real                    // step size μ

PROCEDURE UPDATE(filter, x, desired)
    // x holds the M most recent input samples, newest first
    output ← 0
    FOR i ← 0 TO LENGTH(filter.weights) - 1 DO
        output ← output + filter.weights[i] * x[i]
    error ← desired - output
    FOR i ← 0 TO LENGTH(filter.weights) - 1 DO
        filter.weights[i] ← filter.weights[i] + filter.mu * error * x[i]
    RETURN output

Complexity: O(M) per sample (M = filter length). Simple enough for real-time use.

Convergence: Converges to the Wiener solution in the mean. Excess MSE (misadjustment) ∝ μ × M.
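The pseudocode above maps directly to a few lines of NumPy. A sketch that identifies a hypothetical 4-tap system (system, step size, and noise level are illustrative choices):

```python
import numpy as np

def lms_step(w, x_buf, d, mu):
    """One LMS iteration: filter, compute error, update weights."""
    y = w @ x_buf                  # y[n] = wT x[n]
    e = d - y                      # e[n] = d[n] - y[n]
    return w + mu * e * x_buf, y, e

rng = np.random.default_rng(1)
h_true = np.array([0.5, -0.3, 0.2, 0.1])   # unknown system to identify
M = len(h_true)
w = np.zeros(M)
x_buf = np.zeros(M)                         # M most recent inputs, newest first
mu = 0.05                                   # well inside 0 < mu < 2/(M·sigma_x^2) = 0.5

for _ in range(5000):
    x_buf = np.concatenate(([rng.standard_normal()], x_buf[:-1]))
    d = h_true @ x_buf + 0.01 * rng.standard_normal()
    w, y, e = lms_step(w, x_buf, d, mu)
```

After a few thousand samples w hovers around h_true, with residual jitter proportional to mu, as the misadjustment formula predicts.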

NLMS (Normalized LMS)

Normalize the step size by the input power:

w[n+1] = w[n] + (μ / (ε + ||x[n]||²)) · e[n] · x[n]

ε is a small constant preventing division by zero.

Advantages over LMS: Step size adapts to signal level. More robust. Faster convergence for non-stationary signals.
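A sketch of the normalized update, reusing the illustrative system-identification setup. The 100x input gain is chosen to show the point: a fixed-step LMS with mu = 0.05 would diverge here (the bound 2/(M·sigma_x^2) shrinks to 5e-5), while NLMS is unaffected:

```python
import numpy as np

def nlms_step(w, x_buf, d, mu=0.5, eps=1e-8):
    """One NLMS iteration: step size normalized by instantaneous input power."""
    e = d - w @ x_buf
    return w + (mu / (eps + x_buf @ x_buf)) * e * x_buf, e

rng = np.random.default_rng(2)
h_true = np.array([0.5, -0.3, 0.2, 0.1])   # illustrative unknown system
w = np.zeros(4)
x_buf = np.zeros(4)
for _ in range(3000):
    # input is 100x louder than unit power; normalization absorbs the scale
    x_buf = np.concatenate(([100.0 * rng.standard_normal()], x_buf[:-1]))
    d = h_true @ x_buf + 0.01 * rng.standard_normal()
    w, e = nlms_step(w, x_buf, d)
```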

RLS (Recursive Least Squares)

Minimizes the weighted sum of all past squared errors (exponentially weighted).

Much faster convergence than LMS.
Optimal in the least-squares sense for stationary environments.
Higher computational cost: O(M²) per sample vs O(M) for LMS.

Forgetting factor λ (0.95 - 1.0): Lower λ → more weight on recent data → faster tracking of changes but noisier estimate.
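A minimal RLS sketch under the same illustrative system-identification setup. It tracks the inverse correlation matrix P recursively; note it converges in a few hundred samples where LMS needed thousands:

```python
import numpy as np

def rls_step(w, P, x, d, lam=0.99):
    """One RLS iteration with forgetting factor lam."""
    Px = P @ x
    k = Px / (lam + x @ Px)           # gain vector
    e = d - w @ x                     # a priori error
    w = w + k * e
    P = (P - np.outer(k, Px)) / lam   # inverse-correlation update
    return w, P, e

rng = np.random.default_rng(3)
h_true = np.array([0.5, -0.3, 0.2, 0.1])   # illustrative unknown system
M = len(h_true)
w, P = np.zeros(M), 100.0 * np.eye(M)      # large initial P = low initial confidence
x_buf = np.zeros(M)
for _ in range(300):                        # far fewer samples than the LMS sketch
    x_buf = np.concatenate(([rng.standard_normal()], x_buf[:-1]))
    d = h_true @ x_buf + 0.01 * rng.standard_normal()
    w, P, e = rls_step(w, P, x_buf, d)
```

Lowering lam below 0.99 would let the same loop track a time-varying h_true, at the cost of noisier weights.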

Affine Projection Algorithm (APA)

Generalization of NLMS using multiple past input vectors. Projects the weight update onto the subspace spanned by the last P input vectors.

P = 1: NLMS. P = M: RLS-like performance. Intermediate P gives a tradeoff between complexity and convergence speed.
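A sketch of one APA update under an illustrative AR(1) (colored) input, the situation where projecting onto several past input vectors pays off. X collects the last P input vectors as columns; the regularizer eps plays the role of the NLMS ε:

```python
import numpy as np

def apa_step(w, X, d, mu=0.2, eps=1e-6):
    """One APA iteration. X: (M, P) matrix of the last P input vectors
    as columns; d: length-P vector of the corresponding desired samples."""
    e = d - X.T @ w                                    # P a priori errors
    step = np.linalg.solve(X.T @ X + eps * np.eye(X.shape[1]), e)
    return w + mu * X @ step, e

rng = np.random.default_rng(4)
h_true = np.array([0.5, -0.3, 0.2, 0.1])   # illustrative unknown system
M, P = len(h_true), 3
w = np.zeros(M)
buf = np.zeros(M + P - 1)                   # newest sample first
x_ar = 0.0
d_hist = np.zeros(P)
for _ in range(4000):
    x_ar = 0.9 * x_ar + rng.standard_normal()          # colored AR(1) input
    buf = np.concatenate(([x_ar], buf[:-1]))
    X = np.column_stack([buf[j:j + M] for j in range(P)])
    d_hist = np.concatenate(([h_true @ buf[:M] + 0.01 * rng.standard_normal()],
                             d_hist[:-1]))
    w, e = apa_step(w, X, d_hist)
```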

Kalman Filter

The optimal adaptive filter for linear systems with Gaussian noise.

State-space model:

State:       x[k+1] = A·x[k] + B·u[k] + w[k]    (process model)
Observation: z[k]   = H·x[k] + v[k]               (measurement model)

w[k] ~ N(0, Q), v[k] ~ N(0, R) — process and measurement noise.

Algorithm

Predict:

x̂[k|k-1] = A·x̂[k-1|k-1] + B·u[k]     (predicted state)
P[k|k-1] = A·P[k-1|k-1]·Aᵀ + Q         (predicted covariance)

Update:

K[k] = P[k|k-1]·Hᵀ·(H·P[k|k-1]·Hᵀ + R)⁻¹      (Kalman gain)
x̂[k|k] = x̂[k|k-1] + K[k]·(z[k] - H·x̂[k|k-1])  (updated state)
P[k|k] = (I - K[k]·H)·P[k|k-1]                  (updated covariance)
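The predict/update recursion, applied to an illustrative constant-velocity tracking problem (the model matrices and noise levels below are assumptions chosen for the example, and the control term B·u is omitted):

```python
import numpy as np

# Track position and velocity from noisy position measurements.
dt = 1.0
A = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity process model
H = np.array([[1.0, 0.0]])              # measure position only
Q = 0.001 * np.eye(2)                   # process noise covariance
R = np.array([[1.0]])                   # measurement noise covariance

x_hat, P = np.zeros(2), np.eye(2)
true_x = np.array([0.0, 1.0])           # starts at 0, velocity 1 unit/step
rng = np.random.default_rng(5)
for _ in range(100):
    true_x = A @ true_x
    z = H @ true_x + rng.standard_normal(1)   # noisy position measurement

    # Predict
    x_hat = A @ x_hat
    P = A @ P @ A.T + Q
    # Update
    S = H @ P @ H.T + R                       # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)            # Kalman gain
    x_hat = x_hat + K @ (z - H @ x_hat)
    P = (np.eye(2) - K @ H) @ P
```

Although only position is measured, the filter infers velocity through the process model, which is exactly the "hidden state" estimation used in the applications below.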

Variants

Extended Kalman Filter (EKF): For nonlinear systems. Linearize around the current estimate. First-order Taylor approximation.

Unscented Kalman Filter (UKF): For nonlinear systems. Uses sigma points to propagate mean and covariance through the nonlinearity. Better accuracy than EKF, no Jacobian computation needed.

Applications

Echo Cancellation

Problem: In teleconferencing, the microphone picks up the loudspeaker output, which returns to the far end as echo. The adaptive filter models the echo path and subtracts the predicted echo.

Far-end signal → [Speaker] → [Room acoustics] → Microphone
                                    ↓
           [Adaptive Filter] ──→ subtract echo from microphone signal

NLMS or APA with 128-1024 taps. Must handle double-talk (both parties speaking simultaneously).

Noise Cancellation

Reference microphone picks up noise. Adaptive filter predicts the noise component at the primary microphone. Subtract to get clean signal.

[Primary mic] = signal + noise
[Reference mic] = noise (different path)
[Adaptive filter] models noise path from reference to primary
Output = primary - filter(reference) ≈ signal

Active noise cancellation (ANC) in headphones uses this principle with a feedback path.
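A two-microphone sketch of this structure, with an assumed 3-tap noise path and a sinusoid standing in for the signal; NLMS drives the update:

```python
import numpy as np

# Two-microphone noise canceller: NLMS models the noise path from the
# reference mic to the primary mic, then subtracts the predicted noise.
rng = np.random.default_rng(6)
N = 20_000
speech = np.sin(2 * np.pi * 0.01 * np.arange(N))   # stand-in for the clean signal
noise = rng.standard_normal(N)
g = np.array([0.8, 0.4, -0.2])                      # unknown noise path (assumed)
primary = speech + np.convolve(noise, g)[:N]        # primary mic = signal + noise

M, mu, eps = 8, 0.1, 1e-8
w = np.zeros(M)
ref_buf = np.zeros(M)
out = np.zeros(N)
for n in range(N):
    ref_buf = np.concatenate(([noise[n]], ref_buf[:-1]))
    y = w @ ref_buf                                 # predicted noise at primary mic
    out[n] = primary[n] - y                         # cleaned output doubles as the error
    w += (mu / (eps + ref_buf @ ref_buf)) * out[n] * ref_buf
```

Note the classic caveat: the "error" driving adaptation contains the signal itself, so a small step size is needed to keep the signal from perturbing the weights.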

Equalization

Compensate for channel distortion. Adaptive equalizer adjusts to the unknown/changing channel.

Decision-directed: Use the detected symbol as the desired signal after initial training.
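A sketch of decision-directed LMS equalization for BPSK, with an assumed short minimum-phase channel; the equalizer trains on known symbols, then switches to its own decisions:

```python
import numpy as np

# Channel taps, lengths, and step size below are illustrative choices.
rng = np.random.default_rng(7)
h = np.array([1.0, 0.3, -0.1])                     # unknown channel
symbols = rng.choice([-1.0, 1.0], size=6000)       # BPSK symbols
received = np.convolve(symbols, h)[:6000] + 0.02 * rng.standard_normal(6000)

M, mu, train, delay = 11, 0.01, 1000, 5
w = np.zeros(M)
w[delay] = 1.0                                     # center-spike initialization
errors = 0
for n in range(M, len(symbols)):
    x = received[n - M + 1:n + 1][::-1]            # newest sample first
    y = w @ x
    # known symbol during training, own decision afterwards
    ref = symbols[n - delay] if n < train else np.sign(y)
    w += mu * (ref - y) * x
    if n >= train and np.sign(y) != symbols[n - delay]:
        errors += 1
```

The center-spike start keeps the eye open enough that early decisions are mostly correct, which is what makes the decision-directed phase viable.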

System Identification

Identify an unknown system by feeding a known input and observing the output. The adaptive filter converges to the system's impulse response.

Beamforming

Array of microphones/antennas. Adaptive weights steer the beam toward the desired signal and null toward interferers.

y[n] = Σ wₖ* · xₖ[n]    (weighted sum of sensor signals)

MVDR (Minimum Variance Distortionless Response): Minimize output power subject to unity gain in the desired direction.
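The MVDR weights have the closed form w = R⁻¹a / (aᴴR⁻¹a), where a is the steering vector of the desired direction and R the array covariance. A sketch for an illustrative 4-element half-wavelength array with one strong interferer:

```python
import numpy as np

M = 4
spacing = 0.5                                   # element spacing in wavelengths

def steering(theta_deg):
    """Array response to a plane wave from angle theta (broadside = 0)."""
    theta = np.deg2rad(theta_deg)
    return np.exp(-2j * np.pi * spacing * np.arange(M) * np.sin(theta))

rng = np.random.default_rng(8)
N = 2000
a_des = steering(0.0)                           # look direction
a_int = steering(40.0)                          # interferer direction
# Snapshots: strong interferer plus sensor noise (values are illustrative)
X = (np.outer(a_int, 3.0 * rng.standard_normal(N))
     + 0.1 * (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))))
R = X @ X.conj().T / N                          # sample covariance

Ri_a = np.linalg.solve(R, a_des)
w = Ri_a / (a_des.conj() @ Ri_a)                # w = R^-1 a / (a^H R^-1 a)
```

By construction the beamformer has unity gain toward a_des (the distortionless constraint), and minimizing output power forces a deep null on the interferer.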

GSC (Generalized Sidelobe Canceller): Decompose into desired signal path and blocking matrix. Adaptively cancel interference.

Applications in CS

  • Telecommunications: Echo cancellation, channel equalization, interference cancellation.
  • Audio: Noise cancellation headphones, hearing aids, voice enhancement.
  • Robotics: Kalman filter for state estimation (position, velocity, orientation). Sensor fusion.
  • Finance: Kalman filter for tracking hidden states (volatility, trend). Adaptive trading strategies.
  • Control systems: Adaptive controllers, online parameter estimation.
  • Machine learning: LMS is essentially stochastic gradient descent on a linear model. Conceptual foundation for deep learning optimization.