4 min read
On this page

Image Filtering

Convolution

1D Convolution

The discrete convolution of signal f with kernel h:

(f * h)[n] = sum_{k=-inf}^{inf} f[k] * h[n - k]

Key properties:

  • Commutative: f * h = h * f
  • Associative: f * (g * h) = (f * g) * h -- chain filters by convolving kernels
  • Distributive: f * (g + h) = f * g + f * h

2D Convolution

For image I and kernel K of size (2a+1) x (2b+1):

(I * K)[x, y] = sum_{i=-a}^{a} sum_{j=-b}^{b} I[x-i, y-j] * K[i, j]

Correlation differs from convolution by not flipping the kernel. For symmetric kernels they are identical.

Boundary Handling

| Method | Description | |--------|-------------| | Zero padding | Extend with zeros (darkens borders) | | Replicate | Extend with nearest edge pixel | | Reflect | Mirror pixels at boundary | | Wrap | Periodic extension (assumes tiled image) |

Separable Filters

A 2D kernel K is separable if K = k_col * k_row^T. Reduces complexity from O(m*n*k^2) to O(m*n*2k) where k is kernel width. Gaussian kernels are separable.


Linear Filters

Box (Mean) Filter

K_box = (1/k^2) * ones(k, k)

Simple averaging. Fast with integral images (summed area table) in O(1) per pixel regardless of kernel size. Produces boxy artifacts.

Gaussian Filter

G(x, y) = (1 / (2*pi*sigma^2)) * exp(-(x^2 + y^2) / (2*sigma^2))
  • Kernel size typically 6*sigma + 1 (captures 99.7% of distribution)
  • Separable: apply 1D Gaussian along rows then columns
  • Removes high-frequency noise while preserving low-frequency structure
  • Cascade property: G(sigma_1) * G(sigma_2) = G(sqrt(sigma_1^2 + sigma_2^2))

Sharpening

Unsharp mask: enhance edges by subtracting a blurred version:

I_sharp = I + alpha * (I - G * I) = (1 + alpha) * I - alpha * G * I

Laplacian sharpening: add the Laplacian (second derivative) to the original:

I_sharp = I - alpha * nabla^2(I)

Laplacian kernel (4-connected):

[ 0  1  0]
[ 1 -4  1]
[ 0  1  0]

Non-Linear Filters

Median Filter

Replaces each pixel with the median of its neighborhood. Excellent for salt-and-pepper noise removal while preserving edges. Not separable in general but fast approximations exist.

Bilateral Filter

Smooths while preserving edges by weighting based on both spatial and intensity distance:

BF[I](x) = (1/W) * sum_{y in N(x)} G_s(||x-y||) * G_r(|I(x)-I(y)|) * I(y)
  • G_s: spatial Gaussian (sigma_s controls neighborhood size)
  • G_r: range Gaussian (sigma_r controls edge sensitivity)
  • W: normalization factor

Complexity: O(n * k^2) per pixel. Approximations: grid-based (Paris & Durand), permutohedral lattice.

Guided Filter

Uses a guidance image G to filter input I. Assumes a local linear model:

q_i = a_k * G_i + b_k,  for all i in window w_k

Coefficients solved by minimizing sum(q_i - I_i)^2 + epsilon * a_k^2. Properties:

  • O(n) complexity regardless of window size (using integral images)
  • Edge-preserving when guidance = input
  • Transfer structure from guidance to input (e.g., flash/no-flash photography)

Non-Local Means (NLM)

Averages pixels with similar patches anywhere in the image:

NLM(x) = (1/Z(x)) * sum_y w(x, y) * I(y)
w(x, y) = exp(-||P(x) - P(y)||^2 / h^2)

where P(x) is the patch centered at x. Superior denoising at high computational cost. BM3D extends this idea with 3D collaborative filtering.


Edge Detection

Image Gradients

First derivatives approximate edges. For discrete images:

G_x = dI/dx ~ I(x+1,y) - I(x-1,y)  (central difference)
G_y = dI/dy ~ I(x,y+1) - I(x,y-1)

Gradient magnitude: |G| = sqrt(G_x^2 + G_y^2) Gradient direction: theta = atan2(G_y, G_x)

Sobel Operator

Combines smoothing with differentiation:

S_x = [-1 0 1]     S_y = [-1 -2 -1]
      [-2 0 2]           [ 0  0  0]
      [-1 0 1]           [ 1  2  1]

Separable: S_x = [1; 2; 1] * [-1, 0, 1]. Scharr operator provides better rotational symmetry.

Canny Edge Detector

The gold standard multi-stage edge detector:

  1. Gaussian smoothing: reduce noise
  2. Gradient computation: magnitude and direction (Sobel)
  3. Non-maximum suppression (NMS): thin edges to 1-pixel width by suppressing pixels that are not local maxima along the gradient direction
  4. Double thresholding: classify pixels as strong (> T_high), weak (T_low < . < T_high), or suppressed (< T_low)
  5. Hysteresis: weak pixels connected to strong pixels are kept; isolated weak pixels are removed

Typical ratio: T_high / T_low ~ 2:1 or 3:1.

Laplacian of Gaussian (LoG)

Second-derivative operator that detects edges at zero crossings:

LoG(x,y) = -(1/(pi*sigma^4)) * (1 - (x^2+y^2)/(2*sigma^2)) * exp(-(x^2+y^2)/(2*sigma^2))

Approximated by Difference of Gaussians (DoG): LoG ~ G(k*sigma) - G(sigma).


Morphological Operations

Operations on binary (or grayscale) images using a structuring element B.

Binary Morphology

| Operation | Definition | Effect | |-----------|------------|--------| | Erosion | (I ⊖ B)(x) = 1 iff B ⊆ I at x | Shrinks foreground, removes small objects | | Dilation | (I ⊕ B)(x) = 1 iff B ∩ I ≠ ∅ at x | Expands foreground, fills small holes | | Opening | I ∘ B = (I ⊖ B) ⊕ B | Removes small protrusions | | Closing | I • B = (I ⊕ B) ⊖ B | Fills small gaps |

Derived Operations

  • Morphological gradient: (I ⊕ B) - (I ⊖ B) -- edge detection
  • Top-hat: I - (I ∘ B) -- extracts bright details on dark background
  • Black-hat: (I • B) - I -- extracts dark details on bright background
  • Hit-or-miss: matches specific patterns using two structuring elements

Grayscale Morphology

Extends to grayscale by replacing set operations with min/max:

  • Erosion: local minimum (darkens)
  • Dilation: local maximum (brightens)

Structuring Elements

Common shapes: disk, square, cross, diamond. Size controls the scale of features affected. Decomposable structuring elements (e.g., large disks approximated by lines) reduce computation.


Frequency Domain Filtering

Convolution Theorem

F{f * g} = F{f} . F{g}

Convolution in spatial domain equals element-wise multiplication in frequency domain. For large kernels, FFT-based filtering is faster: O(n log n) vs O(n * k^2).

Low-Pass and High-Pass

  • Ideal low-pass: sharp cutoff at frequency D_0 -- causes ringing (Gibbs phenomenon)
  • Butterworth: smooth rolloff, parameterized by order n
    H(u,v) = 1 / (1 + (D(u,v)/D_0)^(2n))
    
  • Gaussian low-pass: H = exp(-D^2 / (2*D_0^2)) -- no ringing
  • High-pass: H_hp = 1 - H_lp

Band-Pass and Notch Filters

Used for periodic noise removal (e.g., stripe patterns from sensors). Identify noise spikes in the Fourier spectrum and suppress them with notch reject filters.


Practical Considerations

  • Gaussian blur before edge detection to suppress noise (Canny does this internally)
  • Bilateral filter is the default choice for edge-preserving smoothing in photography apps
  • Morphological operations require careful structuring element selection -- too large destroys features
  • For real-time applications, separable and integral image methods are essential
  • Non-local means and BM3D remain competitive with deep learning denoisers on many benchmarks

Key Takeaways

| Concept | Core Idea | |---------|-----------| | Convolution | Foundation of filtering; spatial domain sliding window | | Gaussian filter | Optimal smoothing; separable, cascade property | | Bilateral filter | Edge-preserving smoothing via joint spatial-range weighting | | Canny | Multi-stage edge detection with hysteresis | | Morphology | Set-theoretic operations for shape analysis | | Frequency domain | Convolution theorem enables fast large-kernel filtering |