An eigenvector $v$ of a square matrix $A$ is a non-zero vector that satisfies $Av = \lambda v$ for some scalar $\lambda$, called the corresponding eigenvalue. An $n \times n$ matrix with $n$ linearly independent eigenvectors can be diagonalised:
$$A = Q \Lambda Q^{-1}$$
where $Q$ has the eigenvectors as columns and $\Lambda$ is the diagonal matrix of the corresponding eigenvalues. For real symmetric matrices ($A = A^\top$), the eigenvectors can be chosen orthonormal, giving the spectral decomposition $A = Q \Lambda Q^\top$.
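As a quick numerical check of these identities, here is a minimal NumPy sketch (the $2 \times 2$ matrix is an arbitrary illustrative example):

```python
import numpy as np

# A symmetric matrix, so the spectral decomposition A = Q Λ Q^T applies.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# eigh is specialised for symmetric/Hermitian matrices and returns
# eigenvalues in ascending order with orthonormal eigenvectors as columns.
eigenvalues, Q = np.linalg.eigh(A)
Lambda = np.diag(eigenvalues)

# Reconstruct A from its decomposition; for orthonormal Q, Q^{-1} = Q^T.
assert np.allclose(A, Q @ Lambda @ Q.T)

# Each column of Q satisfies A v = λ v.
for lam, v in zip(eigenvalues, Q.T):
    assert np.allclose(A @ v, lam * v)
```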
Eigenvalues are the roots of the characteristic polynomial $\det(A - \lambda I) = 0$. For a symmetric positive semi-definite matrix, all eigenvalues are non-negative; for a positive definite matrix, they are strictly positive.
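Both facts can be checked numerically; a short sketch, again with illustrative matrices, recovering eigenvalues as polynomial roots and confirming non-negativity for a Gram matrix (any $B^\top B$ is symmetric positive semi-definite):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# np.poly returns the coefficients of det(A - λI); its roots are the eigenvalues.
coeffs = np.poly(A)
roots = np.roots(coeffs)
assert np.allclose(np.sort(roots), np.sort(np.linalg.eigvals(A)))

# A Gram matrix B^T B is symmetric positive semi-definite,
# so all of its eigenvalues are non-negative.
B = np.random.default_rng(0).standard_normal((5, 3))
gram = B.T @ B
assert np.all(np.linalg.eigvalsh(gram) >= -1e-10)  # tolerance for rounding noise
```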
In machine learning, eigendecomposition appears in:
- Principal component analysis: eigendecomposition of the data covariance matrix gives principal components ordered by eigenvalue (variance explained); see the sketch after this list.
- Spectral clustering: eigenvectors of a graph Laplacian reveal cluster structure.
- PageRank: the dominant eigenvector of the link matrix gives node importance.
- Quadratic optimisation: Hessian eigendecomposition characterises saddle points and curvature.
- Stability analysis of neural networks: eigenvalues of the Jacobian determine the local linear stability of recurrent networks.
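A minimal PCA sketch along these lines, assuming toy synthetic data (all shapes and the scaling matrix are illustrative):

```python
import numpy as np

# Hypothetical toy data: 200 samples, 3 features with very different variances.
rng = np.random.default_rng(42)
X = rng.standard_normal((200, 3)) @ np.diag([3.0, 1.0, 0.2])

# Centre the data and form the sample covariance matrix.
X_centred = X - X.mean(axis=0)
cov = X_centred.T @ X_centred / (len(X) - 1)

# eigh returns eigenvalues in ascending order; reverse for descending variance.
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]
components = eigenvectors[:, order]          # principal axes, one per column
explained_variance = eigenvalues[order]

# Project onto the top two principal components.
X_reduced = X_centred @ components[:, :2]
print(explained_variance / explained_variance.sum())  # variance-explained ratios
```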
Power iteration is the simplest algorithm for computing the dominant eigenvalue and its eigenvector: starting from a random $v_0$, iterate $v_{k+1} = A v_k / \|A v_k\|$. Convergence is geometric at rate $|\lambda_2 / \lambda_1|$, so it slows down when the two largest eigenvalues are close in magnitude. The QR algorithm computes the full spectrum in $O(n^3)$; the Lanczos method is the method of choice for large sparse symmetric matrices.
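A minimal sketch of power iteration with a Rayleigh-quotient eigenvalue estimate (the tolerance, iteration cap, and test matrix are illustrative choices):

```python
import numpy as np

def power_iteration(A, num_iters=1000, tol=1e-10, seed=0):
    """Estimate the dominant eigenvalue/eigenvector of A by power iteration."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(num_iters):
        w = A @ v
        v_new = w / np.linalg.norm(w)       # v_{k+1} = A v_k / ||A v_k||
        lam_new = v_new @ A @ v_new         # Rayleigh quotient estimate
        if abs(lam_new - lam) < tol:
            break
        v, lam = v_new, lam_new
    return lam_new, v_new

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
lam, v = power_iteration(A)
assert np.isclose(lam, np.max(np.linalg.eigvalsh(A)))
```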
Singular value decomposition generalises eigendecomposition to non-square matrices: the non-zero singular values of $A$ are the square roots of the non-zero eigenvalues of $A^\top A$ (equivalently, of $A A^\top$).
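A quick numerical confirmation of this relationship on a random non-square matrix:

```python
import numpy as np

# A non-square matrix: SVD applies where eigendecomposition does not.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))

singular_values = np.linalg.svd(A, compute_uv=False)   # descending order
eigs_AtA = np.linalg.eigvalsh(A.T @ A)[::-1]           # descending order

# Singular values are square roots of the eigenvalues of A^T A
# (clip guards against tiny negative values from rounding).
assert np.allclose(singular_values, np.sqrt(np.clip(eigs_AtA, 0, None)))
```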
Discussed in:
- Chapter 2: Linear Algebra