โ† back to abstract linear algebra ii

eigenvalues and eigenvectors

unit 3 review

Eigenvalues and eigenvectors are fundamental concepts in linear algebra: an eigenvector is a nonzero vector whose direction is preserved (up to scaling) by a matrix, and its eigenvalue is the corresponding scale factor. They provide insight into a matrix's behavior, allowing us to understand linear transformations, solve differential equations, and analyze complex systems. This unit covers the calculation of eigenvalues and eigenvectors, their geometric interpretation, and their applications in diagonalization. We explore the characteristic polynomial, properties of eigenvalues and eigenvectors, and advanced topics such as the spectral theorem and computational methods for eigenvalue problems.

Key Concepts and Definitions

  • Eigenvalues are scalar values $\lambda$ that satisfy the equation $Av = \lambda v$ for a square matrix $A$ and a non-zero vector $v$
  • Eigenvectors are non-zero vectors $v$ that, when multiplied by a square matrix $A$, result in a scalar multiple of themselves $Av = \lambda v$
    • The scalar multiple $\lambda$ is the corresponding eigenvalue
  • The set of all eigenvalues of a matrix is called its spectrum
  • Algebraic multiplicity of an eigenvalue is the number of times it appears as a root of the characteristic polynomial
  • Geometric multiplicity of an eigenvalue is the dimension of its corresponding eigenspace (the space spanned by its eigenvectors)
  • A matrix is diagonalizable if it is similar to a diagonal matrix, meaning it can be written as $A = PDP^{-1}$, where $D$ is a diagonal matrix and $P$ is an invertible matrix
  • The eigenspace of an eigenvalue $\lambda$ is the nullspace of the matrix $A - \lambda I$, where $I$ is the identity matrix
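
The following short numpy sketch illustrates these definitions on an arbitrary $2 \times 2$ example matrix (hypothetical, not taken from the text): it computes the spectrum with `np.linalg.eig` and checks that each eigenpair satisfies $Av = \lambda v$.

```python
import numpy as np

# Arbitrary 2x2 example matrix (for illustration only)
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# np.linalg.eig returns the spectrum and a matrix whose columns are eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

for i, lam in enumerate(eigenvalues):
    v = eigenvectors[:, i]
    # Verify the eigenvalue equation A v = lambda v
    assert np.allclose(A @ v, lam * v)
    print(f"lambda = {lam}, eigenvector = {v}")
```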

Eigenvalue Equation and Characteristic Polynomial

  • The eigenvalue equation is $Av = \lambda v$, where $A$ is a square matrix, $v$ is a non-zero vector, and $\lambda$ is a scalar
  • This equation can be rewritten as $(A - \lambda I)v = 0$, where $I$ is the identity matrix
  • For a non-zero solution to exist, the determinant of $(A - \lambda I)$ must be zero
  • The characteristic polynomial of a matrix $A$ is defined as $p(\lambda) = \det(A - \lambda I)$
    • It is a polynomial in $\lambda$ whose roots are the eigenvalues of $A$
  • The degree of the characteristic polynomial is equal to the size of the matrix $A$
  • The coefficients of the characteristic polynomial are determined by the entries of the matrix $A$
  • The characteristic equation is the equation obtained by setting the characteristic polynomial equal to zero: $p(\lambda) = 0$
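
For the same arbitrary example matrix, `np.poly` builds the coefficients of the characteristic polynomial directly from $A$, and its roots recover the eigenvalues (a minimal sketch, assuming numpy is available):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# np.poly returns the coefficients of the monic polynomial det(lambda*I - A),
# which has the same roots as det(A - lambda*I); here it is
# lambda^2 - 7*lambda + 10, i.e. coefficients [1, -7, 10]
coeffs = np.poly(A)
print(coeffs)

# The roots of the characteristic polynomial are the eigenvalues
print(np.roots(coeffs))          # approximately [5, 2]
print(np.linalg.eigvals(A))      # same values (possibly in a different order)
```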

Geometric Interpretation of Eigenvectors

  • Eigenvectors represent the directions in which a linear transformation (represented by a matrix) acts as a scaling operation
  • When a matrix $A$ is applied to one of its eigenvectors $v$, the result is a scalar multiple of $v$, with the scalar being the corresponding eigenvalue $\lambda$
    • Geometrically, this means that the line spanned by the eigenvector is mapped to itself: the vector's length is scaled by $|\lambda|$, and its direction is reversed when $\lambda < 0$
  • Eigenvectors corresponding to distinct eigenvalues are linearly independent
  • The eigenspace of an eigenvalue $\lambda$ is the set of all vectors $v$ (including the zero vector) that satisfy $Av = \lambda v$
    • Geometrically, the eigenspace is a subspace of the vector space on which the linear transformation acts as a scaling by $\lambda$ in all directions
  • If a matrix has a full set of linearly independent eigenvectors, it can be diagonalized, meaning the linear transformation acts as a pure scaling along the eigenvector directions; these directions are mutually orthogonal only when the matrix is symmetric (more generally, normal), as the sketch after this list illustrates
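
A quick numerical illustration of the points above, using the same arbitrary (non-symmetric) example matrix: an eigenvector's direction is preserved, a generic vector's direction changes, and the eigenvector directions are not orthogonal.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])       # not symmetric
lams, V = np.linalg.eig(A)

v = V[:, 0]                      # an eigenvector: direction kept, length scaled
print(A @ v, lams[0] * v)        # identical up to floating-point error

w = np.array([1.0, 0.0])         # a generic vector: its direction changes
print(A @ w)                     # [4, 2], not a scalar multiple of [1, 0]

# The eigenvector directions of this non-symmetric matrix are not orthogonal
print(V[:, 0] @ V[:, 1])         # noticeably nonzero
```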

Properties of Eigenvalues and Eigenvectors

  • The sum of the eigenvalues of a matrix $A$ is equal to the trace of $A$ (the sum of its diagonal entries)
  • The product of the eigenvalues of a matrix $A$ is equal to the determinant of $A$
  • If $\lambda$ is an eigenvalue of $A$, then $\lambda^k$ is an eigenvalue of $A^k$ for any positive integer $k$
  • If $A$ and $B$ are similar matrices ($B = P^{-1}AP$ for some invertible matrix $P$), then they have the same eigenvalues
    • Their eigenvectors are related by the similarity transformation: if $v$ is an eigenvector of $A$, then $P^{-1}v$ is an eigenvector of $B$
  • If $A$ is a real symmetric matrix, then all its eigenvalues are real, and its eigenvectors corresponding to distinct eigenvalues are orthogonal
  • If $A$ is a Hermitian matrix (complex analogue of a real symmetric matrix), then all its eigenvalues are real, and its eigenvectors corresponding to distinct eigenvalues are orthogonal
  • The algebraic and geometric multiplicities of an eigenvalue are always positive integers, with the geometric multiplicity never exceeding the algebraic multiplicity
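
These properties are easy to check numerically; the sketch below uses a small symmetric example matrix (chosen arbitrarily so its eigenvalues are real) and an arbitrary invertible matrix $P$.

```python
import numpy as np

# Arbitrary symmetric example matrix (eigenvalues are therefore real)
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
lams = np.linalg.eigvals(A)

# Sum of the eigenvalues equals the trace; their product equals the determinant
print(np.isclose(lams.sum(), np.trace(A)))            # True
print(np.isclose(lams.prod(), np.linalg.det(A)))      # True

# The eigenvalues of A^3 are the cubes of the eigenvalues of A
lams_cubed = np.linalg.eigvals(np.linalg.matrix_power(A, 3))
print(np.allclose(np.sort(lams_cubed), np.sort(lams ** 3)))   # True

# Similar matrices B = P^{-1} A P have the same eigenvalues
P = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])                        # any invertible matrix works
B = np.linalg.inv(P) @ A @ P
print(np.allclose(np.sort(np.linalg.eigvals(B).real), np.sort(lams)))  # True
```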

Diagonalization and Its Applications

  • A matrix $A$ is diagonalizable if it can be written as $A = PDP^{-1}$, where $D$ is a diagonal matrix and $P$ is an invertible matrix
    • The columns of $P$ are the eigenvectors of $A$, and the diagonal entries of $D$ are the corresponding eigenvalues
  • Diagonalization allows for the simplification of matrix powers: if $A = PDP^{-1}$, then $A^k = PD^kP^{-1}$ for any positive integer $k$
    • This is because $D^k$ is easy to compute (just raise each diagonal entry to the power $k$)
  • Diagonalization is useful in solving systems of linear differential equations
    • If the coefficient matrix of the system is diagonalizable, the system can be decoupled into independent equations that are easier to solve
  • Diagonalization is also used in principal component analysis (PCA), a technique for dimensionality reduction and data compression
    • The eigenvectors of the covariance matrix of a dataset represent the principal components, and the corresponding eigenvalues indicate the amount of variance captured by each component
  • Markov chains, which model systems that transition between states according to certain probabilities, can be analyzed using diagonalization of the transition matrix
    • The long-term behavior of the system is determined by the eigenvalues and eigenvectors of the transition matrix
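
A minimal sketch of diagonalization in numpy: it reconstructs $A = PDP^{-1}$, uses $A^k = PD^kP^{-1}$ to compute a matrix power, and reads off the stationary distribution of a hypothetical 2-state Markov chain from the eigenvector with eigenvalue 1 (all matrices below are illustrative, not from the text).

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
lams, P = np.linalg.eig(A)                 # columns of P are eigenvectors of A
D = np.diag(lams)
P_inv = np.linalg.inv(P)

# A = P D P^{-1}
print(np.allclose(A, P @ D @ P_inv))                        # True

# Matrix powers via diagonalization: A^k = P D^k P^{-1}
k = 10
Ak = P @ np.diag(lams ** k) @ P_inv
print(np.allclose(Ak, np.linalg.matrix_power(A, k)))        # True

# Long-run behavior of a hypothetical 2-state Markov chain: the eigenvector
# of the (column-stochastic) transition matrix with eigenvalue 1 gives the
# stationary distribution
T = np.array([[0.9, 0.2],
              [0.1, 0.8]])
w, V = np.linalg.eig(T)
pi = V[:, np.argmax(np.isclose(w, 1.0))]
print(pi / pi.sum())                       # approximately [2/3, 1/3]
```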

Spectral Theorem and Symmetric Matrices

  • The spectral theorem states that if $A$ is a real symmetric matrix, then $A$ is diagonalizable by an orthogonal matrix $P$
    • This means $A = PDP^{-1}$, where $D$ is a diagonal matrix and $P$ is an orthogonal matrix ($P^{-1} = P^T$)
    • The columns of $P$ are orthonormal eigenvectors of $A$, and the diagonal entries of $D$ are the corresponding eigenvalues
  • The spectral theorem guarantees that a real symmetric matrix has a full set of orthonormal eigenvectors and real eigenvalues
  • The eigenvalues of a real symmetric matrix are always real: if $Av = \lambda v$ with $v \neq 0$, then $\bar{v}^T A v$ equals both $\lambda\,\bar{v}^T v$ and $\bar{\lambda}\,\bar{v}^T v$ (using $A^T = A$ and the fact that $A$ is real), so $\lambda = \bar{\lambda}$
  • The eigenvectors corresponding to distinct eigenvalues of a real symmetric matrix are orthogonal
    • This can be seen directly: if $Au = \lambda_1 u$ and $Av = \lambda_2 v$ with $\lambda_1 \neq \lambda_2$, then $\lambda_1 (u^T v) = (Au)^T v = u^T A v = \lambda_2 (u^T v)$, which forces $u^T v = 0$
  • The spectral theorem has important applications in physics and engineering, such as in the analysis of vibrations and the study of quantum mechanics
    • In quantum mechanics, observables are represented by Hermitian operators, and their eigenvalues correspond to the possible measurement outcomes, while the eigenvectors represent the corresponding quantum states
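
In numpy, `np.linalg.eigh` is the routine intended for symmetric or Hermitian matrices; the sketch below (arbitrary symmetric example) confirms the conclusions of the spectral theorem: real eigenvalues and an orthogonal eigenvector matrix with $A = PDP^T$.

```python
import numpy as np

# Arbitrary real symmetric example matrix
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])

# np.linalg.eigh is intended for symmetric/Hermitian matrices: it returns real
# eigenvalues (in ascending order) and an orthogonal matrix of eigenvectors
lams, P = np.linalg.eigh(A)

print(np.allclose(P.T @ P, np.eye(3)))           # True: P is orthogonal
print(np.allclose(A, P @ np.diag(lams) @ P.T))   # True: A = P D P^T
print(lams)                                      # all real
```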

Computational Methods and Algorithms

  • The power method is an iterative algorithm for finding the dominant eigenvalue (eigenvalue with the largest absolute value) and its corresponding eigenvector
    • It starts with an initial vector $v_0$ and iteratively computes $v_{k+1} = Av_k / ||Av_k||$ until convergence
    • The dominant eigenvalue is approximated by the Rayleigh quotient $\lambda \approx (v_k^T A v_k) / (v_k^T v_k)$
  • The QR algorithm is a more sophisticated method for computing all eigenvalues and eigenvectors of a matrix
    • It involves iteratively decomposing the matrix into a product of an orthogonal matrix $Q$ and an upper triangular matrix $R$ (QR decomposition)
    • The iterates converge to an upper triangular (Schur) form whose diagonal entries are the eigenvalues; for symmetric matrices the limit is diagonal, and the accumulated product of the $Q$ factors gives the eigenvectors
  • The Jacobi method is an iterative algorithm for diagonalizing a real symmetric matrix
    • It works by performing a series of orthogonal similarity transformations to reduce the off-diagonal elements to zero
    • Each transformation is a rotation that eliminates one off-diagonal element and modifies the others
    • The algorithm converges to a diagonal matrix containing the eigenvalues, and the product of the rotation matrices converges to the matrix of eigenvectors
  • Krylov subspace methods, such as the Arnoldi iteration and the Lanczos algorithm, are used for computing a subset of eigenvalues and eigenvectors of large sparse matrices
    • These methods work by iteratively building an orthonormal basis for a Krylov subspace (a subspace spanned by powers of the matrix applied to a starting vector) and extracting approximate eigenvalues and eigenvectors from a projected matrix of smaller size
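
The sketches below implement a bare-bones power method and an unshifted QR iteration for illustration only; practical eigensolvers add shifts, deflation, and a reduction to Hessenberg or tridiagonal form. The example matrix is an arbitrary symmetric one so the results can be checked against `np.linalg.eigvalsh`.

```python
import numpy as np

def power_method(A, num_iters=200, seed=0):
    """Bare-bones power iteration: estimates the dominant eigenpair of A.

    Assumes A has a unique eigenvalue of largest absolute value and that the
    random starting vector has a component along its eigenvector.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        w = A @ v
        v = w / np.linalg.norm(w)          # v_{k+1} = A v_k / ||A v_k||
    lam = v @ A @ v                        # Rayleigh quotient (v has unit length)
    return lam, v

def qr_iteration(A, num_iters=500):
    """Naive unshifted QR iteration (illustration only). For a symmetric A with
    distinct eigenvalues the iterates converge to a diagonal matrix."""
    Ak = A.copy()
    for _ in range(num_iters):
        Q, R = np.linalg.qr(Ak)            # QR decomposition of the current iterate
        Ak = R @ Q                         # reverse the factors and repeat
    return np.diag(Ak)

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])

lam, v = power_method(A)
print(lam, np.linalg.eigvalsh(A)[-1])      # both approximately the largest eigenvalue
print(np.sort(qr_iteration(A)))            # approximately all the eigenvalues
print(np.linalg.eigvalsh(A))               # reference values
```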

Advanced Topics and Extensions

  • Generalized eigenvalue problems involve finding scalar values $\lambda$ and non-zero vectors $v$ that satisfy $Av = \lambda Bv$, where $A$ and $B$ are matrices
    • This arises in problems such as vibration analysis of structures with mass and stiffness matrices
  • Singular value decomposition (SVD) is a generalization of eigendecomposition to rectangular matrices
    • It decomposes a matrix $A$ into $A = U\Sigma V^T$, where $U$ and $V$ are orthogonal matrices and $\Sigma$ is a diagonal matrix containing the singular values
    • The columns of $U$ and $V$ are called left and right singular vectors, respectively
  • Perturbation theory studies how eigenvalues and eigenvectors change when the matrix is subjected to small perturbations
    • It provides bounds on the sensitivity of eigenvalues and eigenvectors to errors or uncertainties in the matrix entries
  • Pseudospectra are sets in the complex plane that provide insight into the behavior of non-normal matrices (matrices that do not admit an orthonormal basis of eigenvectors, i.e. $AA^* \neq A^*A$)
    • They are defined as the sets of complex numbers $z$ for which the norm of the resolvent $(zI - A)^{-1}$ is greater than a specified threshold
    • Pseudospectra can reveal the sensitivity of eigenvalues to perturbations and the transient behavior of the dynamical system represented by the matrix
  • Eigenvalue optimization problems seek to find matrices with eigenvalues that satisfy certain constraints or optimize certain objectives
    • Examples include finding the matrix with the largest or smallest eigenvalue subject to structural constraints (e.g., sparsity or symmetry) or finding the matrix that best approximates a given set of desired eigenvalues
  • Eigenvalue problems arise in various applications beyond linear algebra, such as in graph theory (spectral graph theory), quantum mechanics (Schrödinger equation), and dynamical systems (stability analysis)
    • In these contexts, eigenvalues and eigenvectors often have specific physical or practical interpretations that guide the analysis and understanding of the underlying systems
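
As a closing sketch (assuming scipy is available), `scipy.linalg.eigh(A, B)` solves a small generalized eigenvalue problem $Av = \lambda Bv$ with hypothetical stiffness and mass matrices, and `np.linalg.svd` computes the singular value decomposition of a rectangular matrix.

```python
import numpy as np
from scipy.linalg import eigh

# Generalized eigenvalue problem A v = lambda B v with hypothetical
# "stiffness" (A) and "mass" (B) matrices; B must be symmetric positive definite
A = np.array([[6.0, -2.0],
              [-2.0, 4.0]])
B = np.array([[2.0, 0.0],
              [0.0, 1.0]])
lams, V = eigh(A, B)                           # scipy solves A v = lambda B v directly
for lam, v in zip(lams, V.T):
    print(np.allclose(A @ v, lam * (B @ v)))   # True

# Singular value decomposition of a rectangular matrix: M = U Sigma V^T
M = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 0.0]])
U, s, Vt = np.linalg.svd(M, full_matrices=False)
print(s)                                       # singular values, in descending order
print(np.allclose(M, U @ np.diag(s) @ Vt))     # True
```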