Eigenvalues and Eigenvectors
Eigenvalues and eigenvectors reveal the intrinsic behavior of linear transformations — the directions that are merely scaled (not rotated) by the transformation.
Definition
For a square matrix A, a non-zero vector v is an eigenvector and scalar λ is the corresponding eigenvalue if:
Av = λv
The transformation A acts on v by simply scaling it by λ. The direction of v is preserved (or reversed if λ < 0).
Example:
A = [2 1]    v = [1]    Av = [3] = 3 [1]
    [0 3]        [1]         [3]     [1]
So v = (1, 1) is an eigenvector with eigenvalue λ = 3.
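This check is easy to reproduce numerically; a minimal sketch with NumPy (assuming it is available):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
v = np.array([1.0, 1.0])

Av = A @ v
print(Av)                      # [3. 3.], i.e. 3 * v
assert np.allclose(Av, 3 * v)  # Av = λv with λ = 3
```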
Finding Eigenvalues
Rearrange Av = λv:
(A - λI)v = 0
This has a non-zero solution v iff A - λI is singular:
det(A - λI) = 0
Characteristic Polynomial
The characteristic polynomial of A is:
p(λ) = det(A - λI)
This is a degree-n polynomial in λ. Its roots are the eigenvalues.
Example:
A = [4 2]
    [1 3]

p(λ) = det [4-λ  2 ] = (4-λ)(3-λ) - 2 = λ² - 7λ + 10 = (λ-5)(λ-2)
           [ 1  3-λ]
Eigenvalues: λ₁ = 5, λ₂ = 2.
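For a tiny example like this, the root-finding route is easy to sketch in NumPy (`np.poly` returns the characteristic polynomial's coefficients; as noted later, this is not how eigenvalues are computed in practice):

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

coeffs = np.poly(A)       # [1., -7., 10.]: coefficients of λ² - 7λ + 10
roots = np.roots(coeffs)  # its roots are the eigenvalues
print(np.sort(roots))     # [2. 5.]
```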
Properties of Eigenvalues
- Sum of eigenvalues = trace(A) = Σ Aᵢᵢ
- Product of eigenvalues = det(A)
- A is invertible iff all eigenvalues are non-zero
- Eigenvalues of A⁻¹ are 1/λᵢ
- Eigenvalues of Aᵏ are λᵢᵏ
- Eigenvalues of A + cI are λᵢ + c
- Eigenvalues of cA are cλᵢ
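These properties can be spot-checked numerically on the 2×2 example above; a small sketch with NumPy:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])
lam = np.linalg.eigvals(A)                       # {5, 2} (order not guaranteed)

assert np.isclose(lam.sum(), np.trace(A))        # sum of eigenvalues = trace = 7
assert np.isclose(lam.prod(), np.linalg.det(A))  # product = det = 10
assert np.allclose(np.sort(np.linalg.eigvals(np.linalg.inv(A))),
                   np.sort(1 / lam))             # A⁻¹ has eigenvalues 1/λᵢ
assert np.allclose(np.sort(np.linalg.eigvals(A + 3 * np.eye(2))),
                   np.sort(lam + 3))             # A + cI shifts each λᵢ by c
```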
Finding Eigenvectors
For each eigenvalue λ, solve (A - λI)v = 0.
The solution space is the eigenspace E_λ = ker(A - λI).
Example (continuing):
For λ₁ = 5:
A - 5I = [-1  2]  →  v = t [2]  →  eigenvector: (2, 1)
         [ 1 -2]           [1]
For λ₂ = 2:
A - 2I = [2 2]  →  v = t [-1]  →  eigenvector: (-1, 1)
         [1 1]           [ 1]
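The same eigenpairs come out of `np.linalg.eig` (which returns unit-norm eigenvector columns, so (2, 1) and (-1, 1) appear scaled to length 1):

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])
lam, V = np.linalg.eig(A)    # columns of V are unit-norm eigenvectors

# each column satisfies Av = λv
for l, v in zip(lam, V.T):
    assert np.allclose(A @ v, l * v)
```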
Algebraic vs Geometric Multiplicity
Algebraic multiplicity of λ: Its multiplicity as a root of the characteristic polynomial.
Geometric multiplicity of λ: dim(E_λ) = dim(ker(A - λI)) = number of independent eigenvectors.
Key relationship:
1 ≤ geometric multiplicity ≤ algebraic multiplicity
When the geometric multiplicity is strictly less than the algebraic multiplicity, the eigenvalue is called defective, and the matrix is not diagonalizable.
Example: A = [[2, 1], [0, 2]]. Eigenvalue λ = 2 with algebraic multiplicity 2 but geometric multiplicity 1 (only one independent eigenvector: (1, 0)).
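Both multiplicities for this example can be computed directly; a sketch using the rank-nullity identity dim ker(A - λI) = n - rank(A - λI):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 2.0]])

# algebraic multiplicity: λ = 2 appears twice in the spectrum
print(np.linalg.eigvals(A))                      # [2. 2.]

# geometric multiplicity: dim ker(A - 2I) = n - rank(A - 2I)
geo = 2 - np.linalg.matrix_rank(A - 2 * np.eye(2))
print(geo)                                       # 1, so λ = 2 is defective
```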
Diagonalization
A matrix A is diagonalizable if there exists an invertible P and diagonal D such that:
A = PDP⁻¹
where D = diag(λ₁, λ₂, ..., λₙ) and P = [v₁ | v₂ | ... | vₙ] (eigenvectors as columns).
When is A Diagonalizable?
A is diagonalizable iff it has n linearly independent eigenvectors, which happens iff geometric multiplicity = algebraic multiplicity for every eigenvalue.
Sufficient conditions:
- A has n distinct eigenvalues (always diagonalizable)
- A is symmetric (real eigenvalues, orthogonal eigenvectors)
- A is normal (AA* = A*A)
Power of Diagonalization
If A = PDP⁻¹, then:
Aᵏ = PDᵏP⁻¹
Computing Aᵏ becomes trivial: Dᵏ = diag(λ₁ᵏ, λ₂ᵏ, ..., λₙᵏ).
Application: Solve linear recurrences. If xₙ₊₁ = Axₙ, then xₙ = Aⁿx₀ = PDⁿP⁻¹x₀.
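A quick numerical check that Aᵏ = PDᵏP⁻¹, reusing the 2×2 example from earlier:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])
lam, P = np.linalg.eig(A)                      # D = diag(lam), P = eigenvectors

k = 10
Ak = P @ np.diag(lam**k) @ np.linalg.inv(P)    # A^k = P D^k P⁻¹
assert np.allclose(Ak, np.linalg.matrix_power(A, k))
```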
Example: Fibonacci numbers. F(n+1) = F(n) + F(n-1).
[F(n+1)] = [1 1] [F(n)  ]
[F(n)  ]   [1 0] [F(n-1)]
Diagonalizing the matrix gives eigenvalues φ = (1+√5)/2 and ψ = (1-√5)/2, yielding Binet's formula: F(n) = (φⁿ - ψⁿ)/√5.
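Binet's formula is straightforward to evaluate directly (rounding absorbs floating-point error for small n):

```python
import math

phi = (1 + math.sqrt(5)) / 2   # dominant eigenvalue
psi = (1 - math.sqrt(5)) / 2

def fib(n):
    """Binet's formula: F(n) = (φⁿ - ψⁿ)/√5."""
    return round((phi**n - psi**n) / math.sqrt(5))

print([fib(n) for n in range(10)])   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```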
Cayley-Hamilton Theorem
Every square matrix satisfies its own characteristic polynomial:
p(A) = 0
If p(λ) = λⁿ + cₙ₋₁λⁿ⁻¹ + ... + c₁λ + c₀ (taking the monic form p(λ) = det(λI - A)), then:
Aⁿ + cₙ₋₁Aⁿ⁻¹ + ... + c₁A + c₀I = 0
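The theorem is easy to verify on the earlier 2×2 example, whose characteristic polynomial is λ² - 7λ + 10:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

# p(A) = A² - 7A + 10I should be the zero matrix
pA = A @ A - 7 * A + 10 * np.eye(2)
assert np.allclose(pA, np.zeros((2, 2)))
```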
Applications
Computing A⁻¹: From p(A) = 0, isolate the constant term:
A(Aⁿ⁻¹ + cₙ₋₁Aⁿ⁻² + ... + c₁I) = -c₀I
A⁻¹ = -(1/c₀)(Aⁿ⁻¹ + cₙ₋₁Aⁿ⁻² + ... + c₁I)
(Only works when c₀ = (-1)ⁿ det(A) ≠ 0.)
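For the running 2×2 example, p(λ) = λ² - 7λ + 10 gives A(A - 7I) = -10I, so the formula reduces to A⁻¹ = -(1/10)(A - 7I):

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

# A⁻¹ from Cayley-Hamilton: A⁻¹ = -(1/c₀)(A + c₁I) with c₁ = -7, c₀ = 10
A_inv = -(A - 7 * np.eye(2)) / 10
assert np.allclose(A_inv @ A, np.eye(2))
```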
Reducing matrix powers: Any power Aᵏ for k ≥ n can be expressed as a polynomial in A of degree < n.
Minimal polynomial: The minimal polynomial divides the characteristic polynomial and is the lowest-degree monic polynomial m(λ) such that m(A) = 0.
Spectral Theorem
Real Symmetric Matrices
If A = Aᵀ (real symmetric), then:
- All eigenvalues are real.
- Eigenvectors for distinct eigenvalues are orthogonal.
- A is orthogonally diagonalizable: A = QΛQᵀ where Q is orthogonal (Qᵀ = Q⁻¹).
This is the spectral theorem for real symmetric matrices.
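All three claims can be confirmed numerically with `np.linalg.eigh`, which exploits symmetry and returns real eigenvalues with orthonormal eigenvector columns (the matrix here is an arbitrary symmetric example):

```python
import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 2.0]])             # real symmetric
w, Q = np.linalg.eigh(S)               # w is real; Q has orthonormal columns

assert np.allclose(Q.T @ Q, np.eye(2))       # Q is orthogonal: Qᵀ = Q⁻¹
assert np.allclose(Q @ np.diag(w) @ Q.T, S)  # S = QΛQᵀ
```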
Normal Matrices
A matrix A is normal if AA* = A*A (where A* = Āᵀ is the conjugate transpose).
Normal matrices include: symmetric, Hermitian, orthogonal, unitary, skew-symmetric.
The spectral theorem generalizes: normal matrices are unitarily diagonalizable: A = UΛU* where U is unitary.
Gershgorin Circle Theorem
Every eigenvalue of A lies in at least one Gershgorin disc:
Dᵢ = {z ∈ ℂ : |z - Aᵢᵢ| ≤ Σⱼ≠ᵢ |Aᵢⱼ|}
The disc is centered at the diagonal entry with radius equal to the sum of absolute values of off-diagonal entries in that row.
The union of all discs contains every eigenvalue, which makes the theorem useful for quick eigenvalue estimates without computing them.
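A sketch that checks the theorem on an arbitrary 3×3 example (the matrix is illustrative, not from the text):

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [0.5, 3.0, 0.5],
              [0.0, 1.0, 1.0]])

centers = np.diag(A)
radii = np.abs(A).sum(axis=1) - np.abs(centers)   # off-diagonal row sums

for lam in np.linalg.eigvals(A):
    # every eigenvalue falls inside at least one Gershgorin disc
    assert any(abs(lam - c) <= r + 1e-12 for c, r in zip(centers, radii))
```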
Eigenvalues of Special Matrices
| Matrix Type | Eigenvalue Property |
|---|---|
| Symmetric (A = Aᵀ) | All real |
| Skew-symmetric (A = -Aᵀ) | Purely imaginary (or zero) |
| Orthogonal (QᵀQ = I) | \|λ\| = 1 |
| Positive definite | All positive |
| Positive semidefinite | All non-negative |
| Nilpotent (Aᵏ = 0) | All zero |
| Idempotent (A² = A) | 0 or 1 |
| Involutory (A² = I) | +1 or -1 |
| Stochastic | 1 is an eigenvalue; all \|λ\| ≤ 1 |
Numerical Computation
In practice, eigenvalues are not computed by finding roots of the characteristic polynomial (numerically unstable for large n).
Methods:
- Power iteration: Finds the dominant eigenvalue. O(n²) per iteration. Converges linearly.
- Inverse iteration: Finds eigenvalue closest to a shift μ. Uses (A - μI)⁻¹.
- QR algorithm: The workhorse. Iteratively computes A = QR, then A ← RQ. Converges to upper triangular (Schur form). O(n³) total.
- Lanczos algorithm: For large sparse symmetric matrices. Finds extreme eigenvalues efficiently.
- Arnoldi algorithm: Generalization of Lanczos for non-symmetric matrices.
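Power iteration is simple enough to sketch in a few lines (a minimal illustration, not a production implementation; real codes add convergence tests and deflation):

```python
import numpy as np

def power_iteration(A, iters=100, seed=0):
    """Dominant eigenpair via repeated multiplication and normalization."""
    v = np.random.default_rng(seed).normal(size=A.shape[0])
    for _ in range(iters):
        v = A @ v
        v /= np.linalg.norm(v)      # renormalize to avoid overflow
    lam = v @ A @ v                 # Rayleigh quotient estimate of λ
    return lam, v

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])
lam, v = power_iteration(A)
print(lam)                          # converges to 5.0, the dominant eigenvalue
```

Convergence is linear with rate |λ₂/λ₁| (here 2/5 per step), which is why the method stalls when the two largest eigenvalues are close in magnitude.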
Real-World Applications
- PageRank: The dominant eigenvector of the web graph's transition matrix.
- PCA: Eigenvectors of the covariance matrix give principal components.
- Vibrational analysis: Natural frequencies of a structure are eigenvalues of the stiffness/mass matrix.
- Quantum mechanics: Energy levels are eigenvalues of the Hamiltonian operator.
- Stability analysis: A dynamical system x' = Ax is asymptotically stable iff all eigenvalues of A have negative real part.
- Graph spectral theory: The spectrum of the adjacency/Laplacian matrix reveals graph properties (connectivity, expansion, chromatic number bounds).
- Markov chains: The stationary distribution is the eigenvector for eigenvalue 1. The second-largest eigenvalue determines mixing time.
- Image compression: Truncated eigendecomposition approximates data.
- Control theory: Controllability and observability depend on eigenvalue placement.
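The Markov-chain application above is easy to demonstrate; a sketch with a hypothetical 3-state column-stochastic transition matrix (each column sums to 1):

```python
import numpy as np

# hypothetical 3-state chain; columns of P sum to 1 (column-stochastic)
P = np.array([[0.5, 0.2, 0.3],
              [0.3, 0.6, 0.1],
              [0.2, 0.2, 0.6]])

w, V = np.linalg.eig(P)
i = np.argmin(np.abs(w - 1))       # locate the eigenvalue 1
pi = np.real(V[:, i])
pi /= pi.sum()                     # normalize into a probability vector

assert np.allclose(P @ pi, pi)     # stationary distribution: Pπ = π
print(pi)
```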