Eigenvalues and Eigenvectors
Eigenvalues and eigenvectors reveal the intrinsic behavior of linear transformations — the directions that are merely scaled (not rotated) by the transformation.
Definition
For a square matrix A, a non-zero vector v is an eigenvector and scalar λ is the corresponding eigenvalue if:
Av = λv
The transformation A acts on v by simply scaling it by λ. The direction of v is preserved (or reversed if λ < 0).
Example:
A = [2 1]    v = [1]    Av = [3] = 3 [1]
    [0 3]        [1]         [3]     [1]
So v = (1, 1) is an eigenvector with eigenvalue λ = 3.
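This check is easy to reproduce numerically; a minimal sketch with NumPy (assuming it is available):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
v = np.array([1.0, 1.0])

Av = A @ v
print(Av)                      # [3. 3.], i.e. 3 * v
assert np.allclose(Av, 3 * v)  # Av = λv with λ = 3
```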
Finding Eigenvalues
Rearrange Av = λv:
(A - λI)v = 0
This has a non-zero solution v iff A - λI is singular:
det(A - λI) = 0
Characteristic Polynomial
The characteristic polynomial of A is:
p(λ) = det(A - λI)
This is a degree-n polynomial in λ. Its roots are the eigenvalues.
Example:
A = [4 2]
    [1 3]

p(λ) = det [4-λ  2 ] = (4-λ)(3-λ) - 2 = λ² - 7λ + 10 = (λ-5)(λ-2)
           [ 1  3-λ]
Eigenvalues: λ₁ = 5, λ₂ = 2.
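For a tiny example like this, the root-finding route is easy to sketch in NumPy (`np.poly` returns the characteristic polynomial's coefficients; as noted later, this is not how eigenvalues are computed in practice):

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

coeffs = np.poly(A)       # [1., -7., 10.]: coefficients of λ² - 7λ + 10
roots = np.roots(coeffs)  # its roots are the eigenvalues
print(np.sort(roots))     # [2. 5.]
```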
Properties of Eigenvalues
- Sum of eigenvalues = trace(A) = Σ Aᵢᵢ
- Product of eigenvalues = det(A)
- A is invertible iff all eigenvalues are non-zero
- Eigenvalues of A⁻¹ are 1/λᵢ
- Eigenvalues of Aᵏ are λᵢᵏ
- Eigenvalues of A + cI are λᵢ + c
- Eigenvalues of cA are cλᵢ
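These properties can be spot-checked numerically on the 2×2 example above; a small sketch with NumPy:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])
lam = np.linalg.eigvals(A)                       # {5, 2} (order not guaranteed)

assert np.isclose(lam.sum(), np.trace(A))        # sum of eigenvalues = trace = 7
assert np.isclose(lam.prod(), np.linalg.det(A))  # product = det = 10
assert np.allclose(np.sort(np.linalg.eigvals(np.linalg.inv(A))),
                   np.sort(1 / lam))             # A⁻¹ has eigenvalues 1/λᵢ
assert np.allclose(np.sort(np.linalg.eigvals(A + 3 * np.eye(2))),
                   np.sort(lam + 3))             # A + cI shifts each λᵢ by c
```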
Finding Eigenvectors
For each eigenvalue λ, solve (A - λI)v = 0.
The solution space is the eigenspace E_λ = ker(A - λI).
Example (continuing):
For λ₁ = 5:
A - 5I = [-1  2]  →  v = t [2]  →  eigenvector: (2, 1)
         [ 1 -2]           [1]
For λ₂ = 2:
A - 2I = [2 2]  →  v = t [-1]  →  eigenvector: (-1, 1)
         [1 1]           [ 1]
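The same eigenpairs come out of `np.linalg.eig` (which returns unit-norm eigenvector columns, so (2, 1) and (-1, 1) appear scaled to length 1):

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])
lam, V = np.linalg.eig(A)    # columns of V are unit-norm eigenvectors

# each column satisfies Av = λv
for l, v in zip(lam, V.T):
    assert np.allclose(A @ v, l * v)
```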
Algebraic vs Geometric Multiplicity
Algebraic multiplicity of λ: Its multiplicity as a root of the characteristic polynomial.
Geometric multiplicity of λ: dim(E_λ) = dim(ker(A - λI)) = number of independent eigenvectors.
Key relationship:
1 ≤ geometric multiplicity ≤ algebraic multiplicity
When the geometric multiplicity is strictly less than the algebraic multiplicity, the eigenvalue is called defective, and the matrix is not diagonalizable.
Example: A = [[2, 1], [0, 2]]. Eigenvalue λ = 2 with algebraic multiplicity 2 but geometric multiplicity 1 (only one independent eigenvector: (1, 0)).
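Both multiplicities for this example can be computed directly; a sketch using the rank-nullity identity dim ker(A - λI) = n - rank(A - λI):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 2.0]])

# algebraic multiplicity: λ = 2 appears twice in the spectrum
print(np.linalg.eigvals(A))                      # [2. 2.]

# geometric multiplicity: dim ker(A - 2I) = n - rank(A - 2I)
geo = 2 - np.linalg.matrix_rank(A - 2 * np.eye(2))
print(geo)                                       # 1, so λ = 2 is defective
```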
Diagonalization
A matrix A is diagonalizable if there exists an invertible P and diagonal D such that:
A = PDP⁻¹
where D = diag(λ₁, λ₂, ..., λₙ) and P = [v₁ | v₂ | ... | vₙ] (eigenvectors as columns).
When is A Diagonalizable?
A is diagonalizable iff it has n linearly independent eigenvectors, which happens iff geometric multiplicity = algebraic multiplicity for every eigenvalue.
Sufficient conditions:
- A has n distinct eigenvalues (always diagonalizable)
- A is symmetric (real eigenvalues, orthogonal eigenvectors)
- A is normal (AA* = A*A)
Power of Diagonalization
If A = PDP⁻¹, then:
Aᵏ = PDᵏP⁻¹
Computing Aᵏ becomes trivial: Dᵏ = diag(λ₁ᵏ, λ₂ᵏ, ..., λₙᵏ).
Application: Solve linear recurrences. If xₙ₊₁ = Axₙ, then xₙ = Aⁿx₀ = PDⁿP⁻¹x₀.
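A quick numerical check that Aᵏ = PDᵏP⁻¹, reusing the 2×2 example from earlier:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])
lam, P = np.linalg.eig(A)                      # D = diag(lam), P = eigenvectors

k = 10
Ak = P @ np.diag(lam**k) @ np.linalg.inv(P)    # A^k = P D^k P⁻¹
assert np.allclose(Ak, np.linalg.matrix_power(A, k))
```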
Example: Fibonacci numbers. F(n+1) = F(n) + F(n-1).
[F(n+1)] = [1 1] [F(n)  ]
[F(n)  ]   [1 0] [F(n-1)]
Diagonalizing the matrix gives eigenvalues φ = (1+√5)/2 and ψ = (1-√5)/2, yielding Binet's formula: F(n) = (φⁿ - ψⁿ)/√5.
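Binet's formula is straightforward to evaluate directly (rounding absorbs floating-point error for small n):

```python
import math

phi = (1 + math.sqrt(5)) / 2   # dominant eigenvalue
psi = (1 - math.sqrt(5)) / 2

def fib(n):
    """Binet's formula: F(n) = (φⁿ - ψⁿ)/√5."""
    return round((phi**n - psi**n) / math.sqrt(5))

print([fib(n) for n in range(10)])   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```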
Cayley-Hamilton Theorem
Every square matrix satisfies its own characteristic polynomial:
p(A) = 0
If p(λ) = λⁿ + cₙ₋₁λⁿ⁻¹ + ... + c₁λ + c₀ (taking the monic form p(λ) = det(λI - A)), then:
Aⁿ + cₙ₋₁Aⁿ⁻¹ + ... + c₁A + c₀I = 0
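The theorem is easy to verify on the earlier 2×2 example, whose characteristic polynomial is λ² - 7λ + 10:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

# p(A) = A² - 7A + 10I should be the zero matrix
pA = A @ A - 7 * A + 10 * np.eye(2)
assert np.allclose(pA, np.zeros((2, 2)))
```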
Applications
Computing A⁻¹: From p(A) = 0, isolate the constant term:
A(Aⁿ⁻¹ + cₙ₋₁Aⁿ⁻² + ... + c₁I) = -c₀I
A⁻¹ = -(1/c₀)(Aⁿ⁻¹ + cₙ₋₁Aⁿ⁻² + ... + c₁I)
(Only works when c₀ = (-1)ⁿ det(A) ≠ 0.)
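For the running 2×2 example, p(λ) = λ² - 7λ + 10 gives A(A - 7I) = -10I, so the formula reduces to A⁻¹ = -(1/10)(A - 7I):

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

# A⁻¹ from Cayley-Hamilton: A⁻¹ = -(1/c₀)(A + c₁I) with c₁ = -7, c₀ = 10
A_inv = -(A - 7 * np.eye(2)) / 10
assert np.allclose(A_inv @ A, np.eye(2))
```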
Reducing matrix powers: Any power Aᵏ for k ≥ n can be expressed as a polynomial in A of degree < n.
Minimal polynomial: The minimal polynomial divides the characteristic polynomial and is the lowest-degree monic polynomial m(λ) such that m(A) = 0.
Spectral Theorem
Real Symmetric Matrices
If A = Aᵀ (real symmetric), then:
- All eigenvalues are real.
- Eigenvectors for distinct eigenvalues are orthogonal.
- A is orthogonally diagonalizable: A = QΛQᵀ where Q is orthogonal (Qᵀ = Q⁻¹).
This is the spectral theorem for real symmetric matrices.
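All three claims can be confirmed numerically with `np.linalg.eigh`, which exploits symmetry and returns real eigenvalues with orthonormal eigenvector columns (the matrix here is an arbitrary symmetric example):

```python
import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 2.0]])             # real symmetric
w, Q = np.linalg.eigh(S)               # w is real; Q has orthonormal columns

assert np.allclose(Q.T @ Q, np.eye(2))       # Q is orthogonal: Qᵀ = Q⁻¹
assert np.allclose(Q @ np.diag(w) @ Q.T, S)  # S = QΛQᵀ
```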
Normal Matrices
A matrix A is normal if AA* = A*A (where A* = Āᵀ is the conjugate transpose).
Normal matrices include: symmetric, Hermitian, orthogonal, unitary, skew-symmetric.
The spectral theorem generalizes: normal matrices are unitarily diagonalizable: A = UΛU* where U is unitary.
Gershgorin Circle Theorem
Every eigenvalue of A lies in at least one Gershgorin disc:
Dᵢ = {z ∈ ℂ : |z - Aᵢᵢ| ≤ Σⱼ≠ᵢ |Aᵢⱼ|}
The disc is centered at the diagonal entry with radius equal to the sum of absolute values of off-diagonal entries in that row.
The union of all discs contains every eigenvalue, which makes the theorem useful for quick eigenvalue estimates without computing them.
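A sketch that checks the theorem on an arbitrary 3×3 example (the matrix is illustrative, not from the text):

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [0.5, 3.0, 0.5],
              [0.0, 1.0, 1.0]])

centers = np.diag(A)
radii = np.abs(A).sum(axis=1) - np.abs(centers)   # off-diagonal row sums

for lam in np.linalg.eigvals(A):
    # every eigenvalue falls inside at least one Gershgorin disc
    assert any(abs(lam - c) <= r + 1e-12 for c, r in zip(centers, radii))
```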
Eigenvalues of Special Matrices
| Matrix Type | Eigenvalue Property |
|---|---|
| Symmetric (A = Aᵀ) | All real |
| Skew-symmetric (A = -Aᵀ) | Purely imaginary (or zero) |
| Orthogonal (QᵀQ = I) | \|λ\| = 1 |
| Positive definite | All positive |
| Positive semidefinite | All non-negative |
| Nilpotent (Aᵏ = 0) | All zero |
| Idempotent (A² = A) | 0 or 1 |
| Involutory (A² = I) | +1 or -1 |
| Stochastic | 1 is an eigenvalue; all \|λ\| ≤ 1 |
Numerical Computation
In practice, eigenvalues are not computed by finding roots of the characteristic polynomial (numerically unstable for large n).
Methods:
- Power iteration: Finds the dominant eigenvalue. O(n²) per iteration. Converges linearly.
- Inverse iteration: Finds eigenvalue closest to a shift μ. Uses (A - μI)⁻¹.
- QR algorithm: The workhorse. Iteratively computes A = QR, then A ← RQ. Converges to upper triangular (Schur form). O(n³) total.
- Lanczos algorithm: For large sparse symmetric matrices. Finds extreme eigenvalues efficiently.
- Arnoldi algorithm: Generalization of Lanczos for non-symmetric matrices.
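Power iteration is simple enough to sketch in a few lines (a minimal illustration, not a production implementation; real codes add convergence tests and deflation):

```python
import numpy as np

def power_iteration(A, iters=100, seed=0):
    """Dominant eigenpair via repeated multiplication and normalization."""
    v = np.random.default_rng(seed).normal(size=A.shape[0])
    for _ in range(iters):
        v = A @ v
        v /= np.linalg.norm(v)      # renormalize to avoid overflow
    lam = v @ A @ v                 # Rayleigh quotient estimate of λ
    return lam, v

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])
lam, v = power_iteration(A)
print(lam)                          # converges to 5.0, the dominant eigenvalue
```

Convergence is linear with rate |λ₂/λ₁| (here 2/5 per step), which is why the method stalls when the two largest eigenvalues are close in magnitude.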
Real-World Applications
- PageRank: The dominant eigenvector of the web graph's transition matrix.
- PCA: Eigenvectors of the covariance matrix give principal components.
- Vibrational analysis: Natural frequencies of a structure are eigenvalues of the stiffness/mass matrix.
- Quantum mechanics: Energy levels are eigenvalues of the Hamiltonian operator.
- Stability analysis: A dynamical system x' = Ax is asymptotically stable iff all eigenvalues of A have negative real part.
- Graph spectral theory: The spectrum of the adjacency/Laplacian matrix reveals graph properties (connectivity, expansion, chromatic number bounds).
- Markov chains: The stationary distribution is the eigenvector for eigenvalue 1. The second-largest eigenvalue determines mixing time.
- Image compression: Truncated eigendecomposition approximates data.
- Control theory: Controllability and observability depend on eigenvalue placement.
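The Markov-chain application above is easy to demonstrate; a sketch with a hypothetical 3-state column-stochastic transition matrix (each column sums to 1):

```python
import numpy as np

# hypothetical 3-state chain; columns of P sum to 1 (column-stochastic)
P = np.array([[0.5, 0.2, 0.3],
              [0.3, 0.6, 0.1],
              [0.2, 0.2, 0.6]])

w, V = np.linalg.eig(P)
i = np.argmin(np.abs(w - 1))       # locate the eigenvalue 1
pi = np.real(V[:, i])
pi /= pi.sum()                     # normalize into a probability vector

assert np.allclose(P @ pi, pi)     # stationary distribution: Pπ = π
print(pi)
```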