Matrices
A matrix is a rectangular array of numbers arranged in rows and columns. Matrices represent linear transformations, systems of equations, graphs, and data.
Basic Definitions
An m × n matrix A has m rows and n columns:
A = [a₁₁ a₁₂ ... a₁ₙ]
    [a₂₁ a₂₂ ... a₂ₙ]
    [ ⋮   ⋮   ⋱   ⋮ ]
    [aₘ₁ aₘ₂ ... aₘₙ]
We write A ∈ ℝᵐˣⁿ, and Aᵢⱼ denotes the entry in row i, column j.
A square matrix has m = n.
Matrix Operations
Addition and Scalar Multiplication
Element-wise (matrices must have the same dimensions):
(A + B)ᵢⱼ = Aᵢⱼ + Bᵢⱼ
(cA)ᵢⱼ = c · Aᵢⱼ
Matrix Multiplication
For A ∈ ℝᵐˣⁿ and B ∈ ℝⁿˣᵖ, the product C = AB ∈ ℝᵐˣᵖ:
Cᵢⱼ = Σₖ₌₁ⁿ Aᵢₖ Bₖⱼ
Row i of A dotted with column j of B.
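The triple sum translates directly into code. A minimal pure-Python sketch (lists of rows, no libraries; the function name is illustrative):

```python
def matmul(A, B):
    """Multiply an m x n matrix A by an n x p matrix B (lists of rows)."""
    m, n, p = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "inner dimensions must match"
    # C[i][j] = sum over k of A[i][k] * B[k][j]: row i of A dot column j of B
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))  # [[19, 22], [43, 50]]
print(matmul(B, A))  # [[23, 34], [31, 46]] — AB ≠ BA in general
```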
Properties:
- Associative: (AB)C = A(BC)
- Distributive: A(B + C) = AB + AC
- NOT commutative: AB ≠ BA in general
- Identity: AI = IA = A where I is the identity matrix
- (AB)ᵀ = BᵀAᵀ (transpose reverses order)
Complexity: Naive multiplication of two n × n matrices is O(n³). Strassen's algorithm: O(n^2.807). Best known bound: roughly O(n^2.371), though these asymptotically fastest algorithms have constants too large to be practical.
Transpose
(Aᵀ)ᵢⱼ = Aⱼᵢ
Flips rows and columns.
Properties:
- (Aᵀ)ᵀ = A
- (A + B)ᵀ = Aᵀ + Bᵀ
- (cA)ᵀ = cAᵀ
- (AB)ᵀ = BᵀAᵀ
Special Matrix Types
Diagonal Matrix
Non-zero entries only on the main diagonal: Aᵢⱼ = 0 for i ≠ j.
D = diag(d₁, d₂, ..., dₙ) = [d₁ 0  ... 0 ]
                            [0  d₂ ... 0 ]
                            [⋮  ⋮  ⋱  ⋮ ]
                            [0  0  ... dₙ]
Diagonal matrices multiply by scaling each coordinate: Dv = (d₁v₁, d₂v₂, ..., dₙvₙ).
Triangular Matrices
Upper triangular: Aᵢⱼ = 0 for i > j (zeros below diagonal). Lower triangular: Aᵢⱼ = 0 for i < j (zeros above diagonal).
Properties:
- Product of upper triangular matrices is upper triangular.
- Determinant = product of diagonal entries.
- Eigenvalues = diagonal entries.
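Triangular systems are also cheap to solve: an upper triangular system Ux = b yields to back substitution in O(n²). A pure-Python sketch (assumes non-zero diagonal):

```python
def back_substitute(U, b):
    """Solve U x = b for upper triangular U (list of rows, non-zero diagonal)."""
    n = len(U)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):           # last equation first
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / U[i][i]          # divide by the diagonal pivot
    return x

U = [[2, 1, 1],
     [0, 3, 2],
     [0, 0, 4]]
print(back_substitute(U, [9, 13, 8]))  # [2.0, 3.0, 2.0]
```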
Symmetric Matrix
Aᵀ = A (Aᵢⱼ = Aⱼᵢ).
Properties:
- All eigenvalues are real.
- Eigenvectors for distinct eigenvalues are orthogonal.
- Always diagonalizable: A = QΛQᵀ where Q is orthogonal.
- Positive definite if all eigenvalues > 0.
Applications: Covariance matrices, graph Laplacians, kernel matrices.
Orthogonal Matrix
QᵀQ = QQᵀ = I (columns and rows are orthonormal).
Properties:
- Q⁻¹ = Qᵀ (inverse = transpose)
- |det(Q)| = 1
- Preserves lengths: ‖Qv‖ = ‖v‖
- Preserves angles: (Qu) · (Qv) = u · v
- Represents rotations (det = +1) and reflections (det = -1)
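A 2D rotation matrix is the standard example. A quick numerical check of QᵀQ = I and length preservation (pure Python):

```python
import math

def rotation(theta):
    """2x2 rotation matrix by angle theta (det = +1)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

Q = rotation(math.pi / 3)

# Q^T Q should be the identity: the columns of Q are orthonormal
QtQ = [[sum(Q[k][i] * Q[k][j] for k in range(2)) for j in range(2)]
       for i in range(2)]
print(all(abs(QtQ[i][j] - (1 if i == j else 0)) < 1e-12
          for i in range(2) for j in range(2)))  # True

# lengths are preserved: ||Qv|| = ||v||
v = [3.0, 4.0]
Qv = [Q[0][0] * v[0] + Q[0][1] * v[1],
      Q[1][0] * v[0] + Q[1][1] * v[1]]
print(math.hypot(*Qv))  # ≈ 5.0, same as ||v||
```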
Unitary Matrix (Complex)
U*U = UU* = I where U* = Ūᵀ (conjugate transpose).
The complex analog of orthogonal matrices. Preserves the complex inner product.
Hermitian Matrix (Complex)
A* = A. Complex analog of symmetric matrices. All eigenvalues are real.
Row Echelon Form
Row Operations
Three elementary row operations (don't change the solution set of a linear system):
- Swap two rows: Rᵢ ↔ Rⱼ
- Scale a row by a non-zero constant: Rᵢ → cRᵢ
- Add a multiple of one row to another: Rᵢ → Rᵢ + cRⱼ
Row Echelon Form (REF)
A matrix is in REF if:
- All zero rows are at the bottom.
- The leading entry (pivot) of each non-zero row is to the right of the pivot above it.
- All entries below a pivot are zero.
[1 2 3 4]
[0 0 5 6]   (REF — pivots in columns 1, 3, 4)
[0 0 0 7]
[0 0 0 0]
Reduced Row Echelon Form (RREF)
REF plus two further conditions:
- Each pivot is 1.
- Each pivot is the only non-zero entry in its column.
[1 2 0 0]
[0 0 1 0] (RREF)
[0 0 0 1]
[0 0 0 0]
RREF is unique for a given matrix.
Gaussian Elimination
Algorithm to convert a matrix to REF (or RREF):
- Find the leftmost non-zero column (pivot column).
- Swap rows to bring a non-zero entry to the pivot position.
- Eliminate entries below the pivot using row operations.
- Move to the next row and column; repeat.
- (For RREF) Back-substitute to eliminate entries above pivots and scale pivots to 1.
Time: O(mn · min(m,n)) for an m × n matrix. O(n³) for square.
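The steps above, including the back pass for RREF, can be sketched in pure Python with partial pivoting (a teaching sketch; production code would use LAPACK-backed routines):

```python
def rref(M, eps=1e-12):
    """Return the reduced row echelon form of a list-of-rows matrix of floats."""
    M = [row[:] for row in M]                # work on a copy
    rows, cols = len(M), len(M[0])
    r = 0                                    # current pivot row
    for c in range(cols):
        # partial pivoting: largest entry in column c at or below row r
        pivot = max(range(r, rows), key=lambda i: abs(M[i][c]))
        if abs(M[pivot][c]) < eps:
            continue                         # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]      # row swap
        M[r] = [x / M[r][c] for x in M[r]]   # scale pivot to 1
        for i in range(rows):                # eliminate above AND below the pivot
            if i != r and abs(M[i][c]) > eps:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return M

aug = [[1, 2, 1, 5],
       [2, 3, 1, 7],
       [1, 1, 2, 6]]                         # augmented matrix [A | b]
print(rref(aug))                             # last column gives x = (1, 1, 2)
```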
Solving Linear Systems
Augmented matrix [A | b]:
[1 2 1 | 5] → [1 0 0 | 1]
[2 3 1 | 7] → [0 1 0 | 1] (RREF)
[1 1 2 | 6] → [0 0 1 | 2]
Solution: x₁ = 1, x₂ = 1, x₃ = 2.
Cases:
- Unique solution: n pivots for n unknowns.
- Infinitely many: Free variables (columns without pivots).
- No solution: Pivot in the augmented column (inconsistent: 0 = non-zero).
Matrix Rank
The rank of A, written rank(A), is:
- The number of pivots in REF.
- The dimension of the column space.
- The dimension of the row space.
- The maximum number of linearly independent rows (or columns).
Properties:
- rank(A) ≤ min(m, n)
- rank(A) = rank(Aᵀ) (row rank = column rank)
- rank(AB) ≤ min(rank(A), rank(B))
- rank(A + B) ≤ rank(A) + rank(B)
- Full rank: rank(A) = min(m, n)
Rank and Systems of Equations
For Ax = b:
- Consistent iff rank(A) = rank([A|b]).
- Unique solution iff rank(A) = n (number of unknowns).
- The solution space of Ax = 0 has dimension n - rank(A) (nullity).
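Rank can be computed by forward elimination alone: reduce to REF and count the pivots. A floating-point sketch with a tolerance (an illustrative helper, not a library API):

```python
def rank(M, eps=1e-10):
    """Rank of a list-of-rows matrix: number of pivots after forward elimination."""
    M = [row[:] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0                                        # next pivot row
    for c in range(cols):
        pivot = max(range(r, rows), key=lambda i: abs(M[i][c]))
        if abs(M[pivot][c]) < eps:
            continue                             # column has no pivot
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):             # eliminate below the pivot only
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r

A = [[1, 2, 1], [2, 3, 1], [3, 5, 2]]            # row 3 = row 1 + row 2
print(rank(A))  # 2
```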
Matrix Inverse
For a square matrix A, the inverse A⁻¹ satisfies:
AA⁻¹ = A⁻¹A = I
A is invertible (non-singular) iff:
- det(A) ≠ 0
- rank(A) = n
- Ax = 0 has only the trivial solution
- Columns are linearly independent
- 0 is not an eigenvalue
Computing the Inverse
Method 1: Row reduce [A | I] to [I | A⁻¹].
Method 2: A⁻¹ = (1/det(A)) · adj(A) where adj(A) is the adjugate (matrix of cofactors, transposed). Useful for 2×2 and 3×3, impractical for larger.
2×2 inverse:
[a b]⁻¹               [ d -b]
[c d]   = (1/(ad-bc)) [-c  a]
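The 2×2 formula in code, guarding the singular case (function name is illustrative):

```python
def inverse_2x2(M):
    """Invert [[a, b], [c, d]] via the adjugate formula; None if singular."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        return None                      # singular: ad - bc = 0
    return [[ d / det, -b / det],
            [-c / det,  a / det]]

print(inverse_2x2([[4, 7], [2, 6]]))  # [[0.6, -0.7], [-0.2, 0.4]]
print(inverse_2x2([[1, 2], [2, 4]]))  # None (det = 0)
```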
Properties
- (A⁻¹)⁻¹ = A
- (AB)⁻¹ = B⁻¹A⁻¹ (reverse order)
- (Aᵀ)⁻¹ = (A⁻¹)ᵀ
- (cA)⁻¹ = (1/c)A⁻¹ (for c ≠ 0)
- det(A⁻¹) = 1/det(A)
Warning: In numerical computation, explicitly computing A⁻¹ is almost always a bad idea. Use LU decomposition or other factorizations instead.
Block Matrices
Matrices can be partitioned into blocks (sub-matrices):
M = [A B]
    [C D]
Block multiplication works if the blocks are compatible:
[A B][E F]   [AE+BG AF+BH]
[C D][G H] = [CE+DG CF+DH]
Schur complement: If D is invertible, the Schur complement of D in M is A - BD⁻¹C. Used in block LU decomposition, Gaussian elimination on block matrices, and control theory.
Sparse Matrices
A matrix where most entries are zero. Common in practice (graphs, PDEs, NLP).
Storage formats:
- COO (Coordinate): Store (row, col, value) triples.
- CSR (Compressed Sparse Row): Row pointers + column indices + values.
- CSC (Compressed Sparse Column): Column pointers + row indices + values.
CSR is efficient for row access and matrix-vector multiplication. CSC for column access.
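A minimal CSR build and matrix-vector product, in pure Python (a sketch of the format; real code would use scipy.sparse):

```python
def to_csr(dense):
    """Build CSR arrays (row pointers, column indices, values) from a dense matrix."""
    indptr, indices, data = [0], [], []
    for row in dense:
        for j, x in enumerate(row):
            if x != 0:
                indices.append(j)
                data.append(x)
        indptr.append(len(data))         # one pointer per row boundary
    return indptr, indices, data

def csr_matvec(indptr, indices, data, v):
    """y = A v, touching only the stored non-zeros of each row."""
    return [sum(data[k] * v[indices[k]]
                for k in range(indptr[i], indptr[i + 1]))
            for i in range(len(indptr) - 1)]

A = [[0, 2, 0],
     [1, 0, 3],
     [0, 0, 4]]
indptr, indices, data = to_csr(A)
print(indptr, indices, data)                         # [0, 1, 3, 4] [1, 0, 2, 2] [2, 1, 3, 4]
print(csr_matvec(indptr, indices, data, [1, 1, 1]))  # [2, 4, 4]
```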
Real-World Applications
- Systems of equations: Every linear system Ax = b is a matrix equation. Gaussian elimination is the workhorse.
- Computer graphics: 4×4 matrices represent 3D transformations (rotation, translation, scaling, perspective projection).
- Machine learning: Data stored as matrices (rows = samples, columns = features). Weight matrices in neural networks.
- Graph theory: Adjacency matrix, Laplacian matrix, incidence matrix.
- Image processing: Images are matrices of pixel values. Convolutions are matrix operations.
- Cryptography: Hill cipher uses matrix multiplication mod 26. Lattice-based crypto uses matrix operations.
- Economics: Input-output models (Leontief), transition matrices for economic models.
- Quantum computing: Quantum gates are unitary matrices. State evolution is matrix multiplication.