Linear algebra forms the backbone of control theory, providing the tools to represent and analyze linear systems. Matrices, vectors, and linear transformations let you model system dynamics and design controllers in a compact, efficient way.
Key concepts like eigenvalues, inner products, and least squares approximation enable stability analysis, optimization, and system identification. These fundamentals are essential for understanding and applying control theory principles.
Fundamentals of Linear Algebra
Linear algebra deals with linear equations, matrices, and vector spaces. It shows up everywhere in control theory because control systems are often modeled as linear systems, and linear algebra gives you the language and machinery to work with them.
Scalars, Vectors, and Matrices
Scalars are single numbers (e.g., 3 or -0.5). Vectors are ordered lists of numbers, typically written as columns, that can represent quantities with both magnitude and direction. Matrices are rectangular arrays of numbers arranged in rows and columns, used to represent linear transformations and systems of linear equations.
Basic operations you need to know:
- Scalar multiplication: multiply every entry of a vector or matrix by a single number
- Vector/matrix addition: add corresponding elements entry by entry
- Matrix multiplication: the entry in row i, column j of the product AB is the dot product of row i of A with column j of B
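These operations can be sketched with NumPy; the matrices and vector below are made-up examples:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

scaled = 3 * A     # scalar multiplication: every entry times 3
summed = A + B     # addition: corresponding entries added
product = A @ B    # matrix multiplication: row-by-column dot products

# Each entry of the product is a row-of-A dot column-of-B
assert product[0, 1] == A[0, :] @ B[:, 1]
```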
Linear Equations and Systems
A linear equation is one where variables appear only to the first power and aren't multiplied together, such as 2x + 3y = 7. A system of linear equations consists of two or more such equations sharing the same variables. The solution is the set of values that satisfies all equations simultaneously.
In matrix form, a system can be written as Ax = b, where A is the coefficient matrix, x is the vector of unknowns, and b is the vector of constants. This compact notation is exactly how control engineers represent the relationship between inputs and outputs of a linear system.
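As a small worked example (the system values are invented), the equations x + 2y = 5 and 3x + 4y = 11 can be put in this matrix form and solved with NumPy:

```python
import numpy as np

# The system  x + 2y = 5,  3x + 4y = 11  in the form A x = b
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])   # coefficient matrix
b = np.array([5.0, 11.0])    # vector of constants

x = np.linalg.solve(A, b)    # vector of unknowns
# x recovers the solution (x, y) = (1, 2), and A @ x reproduces b
```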
Row Reduction and Echelon Forms
Row reduction is a systematic process for simplifying a matrix using three elementary row operations:
- Swap two rows
- Multiply a row by a nonzero scalar
- Add a multiple of one row to another row
The goal is to reach one of two standard forms:
- Row echelon form (REF): each leading entry (first nonzero entry from the left) is to the right of the leading entry in the row above, and all entries below each leading entry are zero
- Reduced row echelon form (RREF): same as REF, but every leading entry is 1, and it's the only nonzero entry in its column
RREF is unique for any given matrix, which makes it especially useful. Row reduction is the workhorse method for solving systems of linear equations, finding the rank of a matrix, and computing matrix inverses.
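Row reduction is tedious by hand but easy to check with SymPy's `Matrix.rref`. Here is a toy matrix, chosen so that one row is a multiple of another:

```python
import sympy as sp

A = sp.Matrix([[1, 2, 3],
               [2, 4, 6],    # twice the first row, so it reduces away
               [1, 0, 1]])

R, pivots = A.rref()   # R is the (unique) RREF, pivots are the pivot columns
# R == [[1, 0, 1], [0, 1, 1], [0, 0, 0]] and pivots == (0, 1),
# so rank(A) = number of pivots = 2
```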
Vector Spaces and Subspaces
A vector space is a set of vectors together with addition and scalar multiplication operations that satisfy certain axioms. In control theory, the state space and input/output spaces of a system are modeled as vector spaces, so understanding their structure is critical.
Vector Space Axioms and Properties
A vector space V over a field F must satisfy these axioms, for all u, v, w in V and all scalars c, d in F:
- Closure under addition: u + v is in V
- Associativity of addition: (u + v) + w = u + (v + w)
- Commutativity of addition: u + v = v + u
- Additive identity: a zero vector 0 exists such that v + 0 = v
- Additive inverses: for every v, there exists -v such that v + (-v) = 0
- Closure under scalar multiplication: cv is in V
- Distributivity over vector addition: c(u + v) = cu + cv
- Distributivity over scalar addition: (c + d)v = cv + dv
- Compatibility of scalar multiplication: c(dv) = (cd)v
- Scalar multiplicative identity: 1v = v
Common examples include R^n (real n-dimensional vectors), C^n (complex n-dimensional vectors), and the space of polynomials of degree at most n.
Null Space, Column Space, and Row Space
These three subspaces associated with a matrix A reveal a lot about the linear system it represents:
- Null space (kernel): the set of all vectors x satisfying Ax = 0. This tells you about the "freedom" in the system, i.e., which inputs produce zero output.
- Column space (range): the set of all linear combinations of the columns of A. This represents all possible outputs of the transformation x ↦ Ax. A system Ax = b has a solution if and only if b lies in the column space of A.
- Row space: the set of all linear combinations of the rows of A, which equals the column space of A^T.
These subspaces are connected by the rank-nullity theorem: for an m × n matrix A,

rank(A) + nullity(A) = n

where rank is the dimension of the column space and nullity is the dimension of the null space.
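The theorem is easy to verify numerically. A quick NumPy check on a made-up rank-deficient matrix:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])    # 2 x 3 matrix; row 2 = 2 * row 1

m, n = A.shape
rank = np.linalg.matrix_rank(A)    # dimension of the column space
nullity = n - rank                 # dimension of the null space, by rank-nullity

# Here rank == 1 and nullity == 2, and rank + nullity == n == 3
```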
Basis and Dimension of Vector Spaces
A basis of a vector space V is a set of vectors that is both linearly independent and spans V. Every vector in V can be written as a unique linear combination of the basis vectors.
The dimension of V is the number of vectors in any basis. You can think of it as the number of degrees of freedom in the space. For example, the standard basis for R^n is {e_1, ..., e_n}, where e_i has a 1 in position i and 0s elsewhere, so R^n has dimension n.
A vector space with a finite basis is finite-dimensional; otherwise it's infinite-dimensional (like the space of all polynomials with no degree bound).

Linear Transformations
A linear transformation is a function between vector spaces that preserves addition and scalar multiplication. In control theory, these describe how a system's state and output change in response to inputs.
Definition and Properties of Linear Transformations
A function T: V → W is a linear transformation if it satisfies two properties for all u, v in V and all scalars c:
- Additivity: T(u + v) = T(u) + T(v)
- Homogeneity: T(cv) = cT(v)
These two properties together guarantee that T(0) = 0 (the zero vector maps to the zero vector). The composition of two linear transformations is also linear: if T and S are both linear, then S ∘ T is linear too.
Familiar examples include matrix multiplication (T(x) = Ax), differentiation of polynomials, and integration over a fixed interval.
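For the matrix-multiplication example, both defining properties can be spot-checked numerically (the matrix, vectors, and scalar below are arbitrary choices):

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [1.0, 3.0]])        # T(x) = A x is a linear transformation
u = np.array([1.0, 2.0])
v = np.array([-3.0, 0.5])
c = 4.0

T = lambda x: A @ x

assert np.allclose(T(u + v), T(u) + T(v))   # additivity
assert np.allclose(T(c * u), c * T(u))      # homogeneity
assert np.allclose(T(np.zeros(2)), 0.0)     # zero maps to zero
```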
Kernel and Range of Linear Transformations
- The kernel of T is ker(T) = {v in V : T(v) = 0}. It's a subspace of V.
- The range of T is range(T) = {T(v) : v in V}. It's a subspace of W.
The dimension of the kernel is called the nullity, and the dimension of the range is called the rank. These are linked by the rank-nullity theorem:

rank(T) + nullity(T) = dim(V)

A transformation is injective (one-to-one) if and only if its kernel is {0}, and surjective (onto) if and only if its range equals the entire codomain W.
Matrices of Linear Transformations
Every linear transformation between finite-dimensional vector spaces can be represented by a matrix once you choose bases for the domain and codomain.
If {v_1, ..., v_n} is a basis for V and {w_1, ..., w_m} is a basis for W, the matrix of T: V → W is the m × n matrix whose j-th column is the coordinate vector of T(v_j) expressed in the basis {w_1, ..., w_m}.
A key consequence: matrix multiplication corresponds to composition of transformations. If T has matrix A and S has matrix B, then S ∘ T has matrix BA (note the reversed order).
Eigenvalues and Eigenvectors
Eigenvalues and eigenvectors describe directions along which a linear transformation acts by simple scaling. In control theory, they're central to stability analysis: the eigenvalues of a system matrix determine whether the system's response grows, decays, or oscillates over time.
Characteristic Equation and Eigenvalues
An eigenvector of a square matrix A is a nonzero vector v satisfying:

Av = λv

The scalar λ is the corresponding eigenvalue. To find eigenvalues, you solve the characteristic equation:

det(A − λI) = 0

The roots of this polynomial are the eigenvalues of A. For an n × n matrix, the characteristic polynomial has degree n, so there are at most n distinct eigenvalues (though some may be complex or repeated).
Eigenvalues reveal key properties: A is invertible if and only if none of its eigenvalues are zero. For a control system with state matrix A, eigenvalues with negative real parts indicate stability, while positive real parts indicate instability.
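With NumPy, the eigenvalues of a state matrix can be computed and checked for stability. The matrix below is a made-up stable example:

```python
import numpy as np

# A toy state matrix (values invented); its characteristic polynomial is
# λ^2 + 3λ + 2 = (λ + 1)(λ + 2), so the eigenvalues are -1 and -2.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])

eigvals = np.linalg.eigvals(A)
stable = np.all(eigvals.real < 0)   # all real parts negative -> stable system
```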
Eigenvectors and Eigenspaces
For each eigenvalue λ, the set of all eigenvectors corresponding to λ, together with the zero vector, forms a subspace called the eigenspace of λ. You find it by solving (A − λI)v = 0.
Two types of multiplicity matter here:
- Algebraic multiplicity: how many times λ appears as a root of the characteristic polynomial
- Geometric multiplicity: the dimension of the eigenspace (i.e., the number of linearly independent eigenvectors for λ)
The geometric multiplicity is always less than or equal to the algebraic multiplicity. When they differ, the matrix cannot be diagonalized using that eigenvalue alone.

Diagonalization of Matrices
A square matrix A is diagonalizable if there exists an invertible matrix P such that:

A = PDP^{-1}

where D is a diagonal matrix. The diagonal entries of D are the eigenvalues, and the columns of P are the corresponding eigenvectors.
An n × n matrix A is diagonalizable if and only if it has n linearly independent eigenvectors (equivalently, the geometric multiplicity equals the algebraic multiplicity for every eigenvalue).
Why this matters for control theory: diagonalization makes computing matrix powers and matrix exponentials straightforward. If A = PDP^{-1}, then A^k = P D^k P^{-1} and e^{At} = P e^{Dt} P^{-1}, where e^{Dt} is just the exponential applied to each diagonal entry. This directly gives you the solution to the state equation ẋ = Ax.
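A minimal sketch of this recipe in NumPy, reusing a toy stable matrix with eigenvalues -1 and -2 (values invented):

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])       # diagonalizable: eigenvalues -1 and -2

lam, P = np.linalg.eig(A)          # columns of P are eigenvectors
D = np.diag(lam)

# A = P D P^{-1}, so e^{At} = P e^{Dt} P^{-1} with e^{Dt} diagonal
t = 0.5
expAt = P @ np.diag(np.exp(lam * t)) @ np.linalg.inv(P)

assert np.allclose(P @ D @ np.linalg.inv(P), A)   # diagonalization holds
```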
Inner Product Spaces
An inner product space is a vector space equipped with an inner product, which generalizes the dot product. It lets you measure lengths, angles, and distances between vectors. In control theory, inner products underpin optimization problems, stability analysis via Lyapunov methods, and optimal controller design.
Dot Product and Inner Product
The dot product of two vectors u and v in R^n is:

u · v = u_1 v_1 + u_2 v_2 + ... + u_n v_n

More generally, an inner product on a vector space V over R or C is a function ⟨·, ·⟩ satisfying:
- Conjugate symmetry: ⟨u, v⟩ is the complex conjugate of ⟨v, u⟩
- Linearity in the second argument: ⟨u, av + bw⟩ = a⟨u, v⟩ + b⟨u, w⟩
- Positive definiteness: ⟨v, v⟩ ≥ 0, with equality if and only if v = 0
The dot product is the standard inner product on R^n. On C^n, the standard inner product conjugates one argument: ⟨u, v⟩ = conj(u_1) v_1 + ... + conj(u_n) v_n. The norm (length) of a vector is ‖v‖ = √⟨v, v⟩.
Orthogonality and Orthonormal Bases
Two vectors are orthogonal if ⟨u, v⟩ = 0. A set of vectors is orthogonal if every pair is orthogonal, and orthonormal if additionally each vector has unit length (‖v_i‖ = 1).
An orthonormal basis is both a basis and an orthonormal set. Orthonormal bases are especially convenient because coordinates are easy to compute: the coefficient of basis vector e_i for any vector v is simply ⟨e_i, v⟩. No matrix inversion needed.
Gram-Schmidt Orthogonalization Process
The Gram-Schmidt process converts any set of linearly independent vectors into an orthonormal set spanning the same subspace. Here's the procedure:
Given linearly independent vectors v_1, ..., v_k:
- Set q_1 = v_1 / ‖v_1‖
- For each subsequent vector v_j (j = 2, ..., k):
  - Subtract off the projections onto all previous basis vectors: w_j = v_j − Σ_{i<j} ⟨q_i, v_j⟩ q_i
  - Normalize: q_j = w_j / ‖w_j‖
Each step removes the component of v_j that lies along the already-constructed orthonormal vectors, leaving only the "new" direction. The result {q_1, ..., q_k} is orthonormal.
This process is the foundation of the QR decomposition (A = QR, where Q has orthonormal columns and R is upper triangular), which is widely used in numerical least squares and controller design.
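The steps above can be sketched as a short NumPy function (the input vectors are invented, and the function assumes they are linearly independent):

```python
import numpy as np

def gram_schmidt(V):
    """Orthonormalize the columns of V, assumed linearly independent."""
    Q = np.zeros_like(V, dtype=float)
    for j in range(V.shape[1]):
        w = V[:, j].astype(float)
        for i in range(j):                    # subtract projections onto q_1..q_{j-1}
            w = w - (Q[:, i] @ V[:, j]) * Q[:, i]
        Q[:, j] = w / np.linalg.norm(w)       # normalize the remaining direction
    return Q

V = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])    # two independent vectors in R^3
Q = gram_schmidt(V)

assert np.allclose(Q.T @ Q, np.eye(2))   # columns are orthonormal
```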
Least Squares Approximation
Least squares finds the "best fit" when an exact solution doesn't exist. Given an overdetermined system Ax = b (more equations than unknowns, typically no exact solution), least squares minimizes the sum of squared residuals ‖Ax − b‖^2. In control theory, this is the go-to method for system identification, parameter estimation, and fitting models to measured data.
Orthogonal Projections and Least Squares
The geometric idea is clean: you're projecting the vector b onto the column space of A. The projection Ax is the closest point in the column space to b, and the residual b − Ax is orthogonal to the column space.
That orthogonality condition gives you the normal equations:

A^T A x = A^T b

Any solution to the normal equations minimizes ‖Ax − b‖^2.
Normal Equations and Pseudoinverse
The normal equations come directly from requiring the residual b − Ax to be orthogonal to every column of A.
If A has full column rank (its columns are linearly independent), then A^T A is invertible and the unique least squares solution is:

x = (A^T A)^{-1} A^T b

The matrix (A^T A)^{-1} A^T is called the left pseudoinverse (it coincides with the Moore-Penrose pseudoinverse for full-column-rank matrices), often denoted A^+. It generalizes the concept of a matrix inverse to non-square or rank-deficient matrices.
When A does not have full column rank, A^T A is singular, and the least squares solution is not unique. In that case, the pseudoinverse still selects the minimum-norm solution among all minimizers. This situation arises in control when a system is overparameterized or has redundant inputs.
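A minimal sketch in NumPy, fitting a line y = a + b·t to three invented data points; the normal equations, the pseudoinverse, and NumPy's built-in `lstsq` all agree here because A has full column rank:

```python
import numpy as np

# Overdetermined system: fit y = a + b*t to three data points (data invented)
t = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 2.0, 2.0])
A = np.column_stack([np.ones_like(t), t])   # columns: [1, t]

# Normal-equation solution: solve (A^T A) x = A^T y
x_normal = np.linalg.solve(A.T @ A, A.T @ y)

# Pseudoinverse and library least squares give the same answer
x_pinv = np.linalg.pinv(A) @ y
x_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)

assert np.allclose(x_normal, x_pinv) and np.allclose(x_normal, x_lstsq)
```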