Image Filtering

Convolution

1D Convolution

The discrete convolution of signal f with kernel h:

(f * h)[n] = sum_{k=-inf}^{inf} f[k] * h[n - k]

Key properties:

Commutative: f * h = h * f
Associative: f * (g * h) = (f * g) * h -- chain filters by convolving kernels
Distributive: f * (g + h) = f * g + f * h

2D Convolution

For image I and kernel K of size (2a+1) x (2b+1):

(I * K)[x, y] = sum_{i=-a}^{a} sum_{j=-b}^{b} I[x-i, y-j] * K[i, j]

Correlation differs from convolution by not flipping the kernel. For symmetric kernels they are identical.

Boundary Handling

Method	Description
Zero padding	Extend with zeros (darkens borders)
Replicate	Extend with nearest edge pixel
Reflect	Mirror pixels at boundary
Wrap	Periodic extension (assumes tiled image)

Separable Filters

A 2D kernel K is separable if K = k_col * k_row^T. Reduces complexity from O(m*n*k^2) to O(m*n*2k) where k is kernel width. Gaussian kernels are separable.

Linear Filters

Box (Mean) Filter

K_box = (1/k^2) * ones(k, k)

Simple averaging. Fast with integral images (summed area table) in O(1) per pixel regardless of kernel size. Produces boxy artifacts.

Gaussian Filter

G(x, y) = (1 / (2*pi*sigma^2)) * exp(-(x^2 + y^2) / (2*sigma^2))

Kernel size typically 6*sigma + 1 (captures 99.7% of distribution)
Separable: apply 1D Gaussian along rows then columns
Removes high-frequency noise while preserving low-frequency structure
Cascade property: G(sigma_1) * G(sigma_2) = G(sqrt(sigma_1^2 + sigma_2^2))

Sharpening

Unsharp mask: enhance edges by subtracting a blurred version:

I_sharp = I + alpha * (I - G * I) = (1 + alpha) * I - alpha * G * I

Laplacian sharpening: add the Laplacian (second derivative) to the original:

I_sharp = I - alpha * nabla^2(I)

Laplacian kernel (4-connected):

[ 0  1  0]
[ 1 -4  1]
[ 0  1  0]

Non-Linear Filters

Median Filter

Replaces each pixel with the median of its neighborhood. Excellent for salt-and-pepper noise removal while preserving edges. Not separable in general but fast approximations exist.

Bilateral Filter

Smooths while preserving edges by weighting based on both spatial and intensity distance:

BF[I](x) = (1/W) * sum_{y in N(x)} G_s(||x-y||) * G_r(|I(x)-I(y)|) * I(y)

G_s: spatial Gaussian (sigma_s controls neighborhood size)
G_r: range Gaussian (sigma_r controls edge sensitivity)
W: normalization factor

Complexity: O(n * k^2) per pixel. Approximations: grid-based (Paris & Durand), permutohedral lattice.

Guided Filter

Uses a guidance image G to filter input I. Assumes a local linear model:

q_i = a_k * G_i + b_k,  for all i in window w_k

Coefficients solved by minimizing sum(q_i - I_i)^2 + epsilon * a_k^2. Properties:

O(n) complexity regardless of window size (using integral images)
Edge-preserving when guidance = input
Transfer structure from guidance to input (e.g., flash/no-flash photography)

Non-Local Means (NLM)

Averages pixels with similar patches anywhere in the image:

NLM(x) = (1/Z(x)) * sum_y w(x, y) * I(y)
w(x, y) = exp(-||P(x) - P(y)||^2 / h^2)

where P(x) is the patch centered at x. Superior denoising at high computational cost. BM3D extends this idea with 3D collaborative filtering.

Edge Detection

Image Gradients

First derivatives approximate edges. For discrete images:

G_x = dI/dx ~ I(x+1,y) - I(x-1,y)  (central difference)
G_y = dI/dy ~ I(x,y+1) - I(x,y-1)

Gradient magnitude: |G| = sqrt(G_x^2 + G_y^2) Gradient direction: theta = atan2(G_y, G_x)

Sobel Operator

Combines smoothing with differentiation:

S_x = [-1 0 1]     S_y = [-1 -2 -1]
      [-2 0 2]           [ 0  0  0]
      [-1 0 1]           [ 1  2  1]

Separable: S_x = [1; 2; 1] * [-1, 0, 1]. Scharr operator provides better rotational symmetry.

Canny Edge Detector

The gold standard multi-stage edge detector:

Gaussian smoothing: reduce noise
Gradient computation: magnitude and direction (Sobel)
Non-maximum suppression (NMS): thin edges to 1-pixel width by suppressing pixels that are not local maxima along the gradient direction
Double thresholding: classify pixels as strong (> T_high), weak (T_low < . < T_high), or suppressed (< T_low)
Hysteresis: weak pixels connected to strong pixels are kept; isolated weak pixels are removed

Typical ratio: T_high / T_low ~ 2:1 or 3:1.

Laplacian of Gaussian (LoG)

Second-derivative operator that detects edges at zero crossings:

LoG(x,y) = -(1/(pi*sigma^4)) * (1 - (x^2+y^2)/(2*sigma^2)) * exp(-(x^2+y^2)/(2*sigma^2))

Approximated by Difference of Gaussians (DoG): LoG ~ G(k*sigma) - G(sigma).

Morphological Operations

Operations on binary (or grayscale) images using a structuring element B.

Binary Morphology

Operation	Definition	Effect
Erosion	`(I ⊖ B)(x) = 1 iff B ⊆ I at x`	Shrinks foreground, removes small objects
Dilation	`(I ⊕ B)(x) = 1 iff B ∩ I ≠ ∅ at x`	Expands foreground, fills small holes
Opening	`I ∘ B = (I ⊖ B) ⊕ B`	Removes small protrusions
Closing	`I • B = (I ⊕ B) ⊖ B`	Fills small gaps

Derived Operations

Morphological gradient: (I ⊕ B) - (I ⊖ B) -- edge detection
Top-hat: I - (I ∘ B) -- extracts bright details on dark background
Black-hat: (I • B) - I -- extracts dark details on bright background
Hit-or-miss: matches specific patterns using two structuring elements

Grayscale Morphology

Extends to grayscale by replacing set operations with min/max:

Erosion: local minimum (darkens)
Dilation: local maximum (brightens)

Structuring Elements

Common shapes: disk, square, cross, diamond. Size controls the scale of features affected. Decomposable structuring elements (e.g., large disks approximated by lines) reduce computation.

Frequency Domain Filtering

Convolution Theorem

F{f * g} = F{f} . F{g}

Convolution in spatial domain equals element-wise multiplication in frequency domain. For large kernels, FFT-based filtering is faster: O(n log n) vs O(n * k^2).

Low-Pass and High-Pass

Ideal low-pass: sharp cutoff at frequency D_0 -- causes ringing (Gibbs phenomenon)
Butterworth: smooth rolloff, parameterized by order n
```
H(u,v) = 1 / (1 + (D(u,v)/D_0)^(2n))
```
Gaussian low-pass: H = exp(-D^2 / (2*D_0^2)) -- no ringing
High-pass: H_hp = 1 - H_lp

Band-Pass and Notch Filters

Used for periodic noise removal (e.g., stripe patterns from sensors). Identify noise spikes in the Fourier spectrum and suppress them with notch reject filters.

Practical Considerations

Gaussian blur before edge detection to suppress noise (Canny does this internally)
Bilateral filter is the default choice for edge-preserving smoothing in photography apps
Morphological operations require careful structuring element selection -- too large destroys features
For real-time applications, separable and integral image methods are essential
Non-local means and BM3D remain competitive with deep learning denoisers on many benchmarks

Key Takeaways

Concept	Core Idea
Convolution	Foundation of filtering; spatial domain sliding window
Gaussian filter	Optimal smoothing; separable, cascade property
Bilateral filter	Edge-preserving smoothing via joint spatial-range weighting
Canny	Multi-stage edge detection with hysteresis
Morphology	Set-theoretic operations for shape analysis
Frequency domain	Convolution theorem enables fast large-kernel filtering