Fiveable

🎛️Control Theory Unit 1 Review


1.1 Linear algebra


Written by the Fiveable Content Team • Last updated August 2025

Linear algebra forms the backbone of control theory, providing the tools to represent and analyze linear systems. Matrices, vectors, and linear transformations let you model system dynamics and design controllers in a compact, efficient way.

Key concepts like eigenvalues, inner products, and least squares approximation enable stability analysis, optimization, and system identification. These fundamentals are essential for understanding and applying control theory principles.

Fundamentals of Linear Algebra

Linear algebra deals with linear equations, matrices, and vector spaces. It shows up everywhere in control theory because control systems are often modeled as linear systems, and linear algebra gives you the language and machinery to work with them.

Scalars, Vectors, and Matrices

Scalars are single numbers (e.g., $5$ or $-3.2$). Vectors are ordered lists of numbers, typically written as columns, that can represent quantities with both magnitude and direction. Matrices are rectangular arrays of numbers arranged in rows and columns, used to represent linear transformations and systems of linear equations.

Basic operations you need to know:

  • Scalar multiplication: multiply every entry of a vector or matrix by a single number
  • Vector/matrix addition: add corresponding elements entry by entry
  • Matrix multiplication: the entry in row $i$, column $j$ of the product $AB$ is the dot product of row $i$ of $A$ with column $j$ of $B$
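As a quick sketch, here's how these three operations look in Python with NumPy (the matrices and vector below are illustrative values, not from the guide):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])
v = np.array([1, -1])

# Scalar multiplication: every entry scaled by 2
print(2 * v)          # [ 2 -2]
# Addition: corresponding entries added element by element
print(A + B)          # [[1 3]
                      #  [4 4]]
# Matrix multiplication: (A @ B)[i, j] = row i of A dotted with column j of B
print(A @ B)          # [[2 1]
                      #  [4 3]]
```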

Linear Equations and Systems

A linear equation is one where variables appear only to the first power and aren't multiplied together, such as $ax + by = c$. A system of linear equations consists of two or more such equations sharing the same variables. The solution is the set of values that satisfies all equations simultaneously.

In matrix form, a system can be written as $Ax = b$, where $A$ is the coefficient matrix, $x$ is the vector of unknowns, and $b$ is the vector of constants. This compact notation is exactly how control engineers represent the relationship between inputs and outputs of a linear system.

Row Reduction and Echelon Forms

Row reduction is a systematic process for simplifying a matrix using three elementary row operations:

  1. Swap two rows
  2. Multiply a row by a nonzero scalar
  3. Add a multiple of one row to another row

The goal is to reach one of two standard forms:

  • Row echelon form (REF): each leading entry (first nonzero entry from the left) is to the right of the leading entry in the row above, and all entries below each leading entry are zero
  • Reduced row echelon form (RREF): same as REF, but every leading entry is 1, and it's the only nonzero entry in its column

RREF is unique for any given matrix, which makes it especially useful. Row reduction is the workhorse method for solving systems of linear equations, finding the rank of a matrix, and computing matrix inverses.
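A minimal sketch of row reduction in code, assuming SymPy is available: its `rref` method returns the reduced row echelon form of a matrix along with the pivot columns. The small system below is an illustrative example:

```python
from sympy import Matrix

# Solve x + 2y = 5, 3x + 4y = 11 by row-reducing the augmented matrix
aug = Matrix([[1, 2, 5],
              [3, 4, 11]])

rref_form, pivot_cols = aug.rref()
print(rref_form)   # Matrix([[1, 0, 1], [0, 1, 2]])  -> x = 1, y = 2
print(pivot_cols)  # (0, 1)
```

Reading off the last column of the RREF gives the unique solution x = 1, y = 2.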

Vector Spaces and Subspaces

A vector space is a set of vectors together with addition and scalar multiplication operations that satisfy certain axioms. In control theory, the state space and input/output spaces of a system are modeled as vector spaces, so understanding their structure is critical.

Vector Space Axioms and Properties

A vector space $V$ over a field $F$ must satisfy these axioms:

  1. Closure under addition: $u + v \in V$ for any $u, v \in V$
  2. Associativity of addition: $(u + v) + w = u + (v + w)$
  3. Commutativity of addition: $u + v = v + u$
  4. Additive identity: a zero vector $0$ exists such that $v + 0 = v$
  5. Additive inverses: for every $v$, there exists $-v$ such that $v + (-v) = 0$
  6. Closure under scalar multiplication: $av \in V$ for any $a \in F$, $v \in V$
  7. Distributivity over vector addition: $a(u + v) = au + av$
  8. Distributivity over scalar addition: $(a + b)v = av + bv$
  9. Compatibility of scalar multiplication: $(ab)v = a(bv)$
  10. Scalar multiplicative identity: $1v = v$

Common examples include $\mathbb{R}^n$ (real $n$-dimensional vectors), $\mathbb{C}^n$ (complex $n$-dimensional vectors), and the space of polynomials of degree at most $n$.

Null Space, Column Space, and Row Space

These three subspaces associated with a matrix $A$ reveal a lot about the linear system it represents:

  • Null space (kernel): the set of all vectors $x$ satisfying $Ax = 0$. This tells you about the "freedom" in the system, i.e., which inputs produce zero output.
  • Column space (range): the set of all linear combinations of the columns of $A$. This represents all possible outputs of the transformation $x \mapsto Ax$. A system $Ax = b$ has a solution if and only if $b$ lies in the column space of $A$.
  • Row space: the set of all linear combinations of the rows of $A$, which equals the column space of $A^T$.

These subspaces are connected by the rank-nullity theorem: for an $m \times n$ matrix $A$,

$$\text{rank}(A) + \text{nullity}(A) = n$$

where rank is the dimension of the column space and nullity is the dimension of the null space.
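The theorem can be checked numerically. Here's a sketch assuming NumPy and SciPy (`scipy.linalg.null_space` returns an orthonormal basis for the null space as columns); the matrix is an illustrative example:

```python
import numpy as np
from scipy.linalg import null_space

# A 3x4 matrix whose third row is the sum of the first two, so rank = 2
A = np.array([[1., 0., 2., 1.],
              [0., 1., 1., 3.],
              [1., 1., 3., 4.]])

rank = np.linalg.matrix_rank(A)       # dimension of the column space
nullity = null_space(A).shape[1]      # dimension of the null space
print(rank, nullity)                  # 2 2  -> rank + nullity = n = 4
```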

Basis and Dimension of Vector Spaces

A basis of a vector space $V$ is a set of vectors that is both linearly independent and spans $V$. Every vector in $V$ can be written as a unique linear combination of the basis vectors.

The dimension of $V$ is the number of vectors in any basis. You can think of it as the number of degrees of freedom in the space. For example, the standard basis for $\mathbb{R}^n$ is $\{e_1, e_2, \ldots, e_n\}$, where $e_i$ has a 1 in position $i$ and 0s elsewhere, so $\mathbb{R}^n$ has dimension $n$.

A vector space with a finite basis is finite-dimensional; otherwise it's infinite-dimensional (like the space of all polynomials with no degree bound).


Linear Transformations

A linear transformation is a function between vector spaces that preserves addition and scalar multiplication. In control theory, these describe how a system's state and output change in response to inputs.

Definition and Properties of Linear Transformations

A function $T: V \to W$ is a linear transformation if it satisfies two properties for all $u, v \in V$ and $a \in F$:

  1. Additivity: $T(u + v) = T(u) + T(v)$
  2. Homogeneity: $T(av) = aT(v)$

These two properties together guarantee that $T(0_V) = 0_W$ (the zero vector maps to the zero vector). The composition of two linear transformations is also linear: if $T: V \to W$ and $S: W \to U$ are both linear, then $S \circ T: V \to U$ is linear too.

Familiar examples include matrix multiplication ($x \mapsto Ax$), differentiation of polynomials, and integration over a fixed interval.

Kernel and Range of Linear Transformations

  • The kernel of $T: V \to W$ is $\ker(T) = \{v \in V : T(v) = 0_W\}$. It's a subspace of $V$.
  • The range of $T$ is $\text{range}(T) = \{T(v) : v \in V\}$. It's a subspace of $W$.

The dimension of the kernel is called the nullity, and the dimension of the range is called the rank. These are linked by the rank-nullity theorem:

$$\text{rank}(T) + \text{nullity}(T) = \dim(V)$$

A transformation is injective (one-to-one) if and only if its kernel is $\{0\}$, and surjective (onto) if and only if its range equals the entire codomain $W$.

Matrices of Linear Transformations

Every linear transformation between finite-dimensional vector spaces can be represented by a matrix once you choose bases for the domain and codomain.

If $\{v_1, \ldots, v_n\}$ is a basis for $V$ and $\{w_1, \ldots, w_m\}$ is a basis for $W$, the matrix $A$ of $T$ is the $m \times n$ matrix whose $j$-th column is the coordinate vector of $T(v_j)$ expressed in the basis $\{w_1, \ldots, w_m\}$.

A key consequence: matrix multiplication corresponds to composition of transformations. If $T$ has matrix $A$ and $S$ has matrix $B$, then $S \circ T$ has matrix $BA$ (note the reversed order).
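The reversed-order rule is easy to verify numerically. In the NumPy sketch below, the rotation and scaling matrices are illustrative choices:

```python
import numpy as np

# T: rotate 90 degrees counterclockwise in R^2
A = np.array([[0., -1.],
              [1.,  0.]])
# S: scale the x-coordinate by 2
B = np.array([[2., 0.],
              [0., 1.]])

v = np.array([1., 0.])
# Applying T first, then S, one step at a time...
step_by_step = B @ (A @ v)
# ...matches multiplying by the single matrix BA (reversed order)
composed = (B @ A) @ v
print(step_by_step, composed)   # both [0. 1.]
```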

Eigenvalues and Eigenvectors

Eigenvalues and eigenvectors describe directions along which a linear transformation acts by simple scaling. In control theory, they're central to stability analysis: the eigenvalues of a system matrix determine whether the system's response grows, decays, or oscillates over time.

Characteristic Equation and Eigenvalues

An eigenvector of a square matrix $A$ is a nonzero vector $v$ satisfying:

$$Av = \lambda v$$

The scalar $\lambda$ is the corresponding eigenvalue. To find eigenvalues, you solve the characteristic equation:

$$\det(A - \lambda I) = 0$$

The roots of this polynomial are the eigenvalues of $A$. For an $n \times n$ matrix, the characteristic polynomial has degree $n$, so there are at most $n$ distinct eigenvalues (though some may be complex or repeated).

Eigenvalues reveal key properties: $A$ is invertible if and only if none of its eigenvalues are zero. For a control system with state matrix $A$, the system is stable when all eigenvalues have negative real parts, while any eigenvalue with a positive real part makes it unstable.
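A hedged sketch of this stability check in NumPy (the state matrix below is an illustrative example, not from the guide):

```python
import numpy as np

# State matrix of a damped second-order system (illustrative values)
A = np.array([[0., 1.],
              [-2., -3.]])

eigvals = np.linalg.eigvals(A)
print(np.sort(eigvals.real))               # [-2. -1.]
# All eigenvalues have negative real parts -> x' = Ax is stable
print(bool(np.all(eigvals.real < 0)))      # True
# No zero eigenvalue -> A is invertible
print(not np.any(np.isclose(eigvals, 0)))  # True
```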

Eigenvectors and Eigenspaces

For each eigenvalue $\lambda$, the set of all eigenvectors corresponding to $\lambda$, together with the zero vector, forms a subspace called the eigenspace of $\lambda$. You find it by solving $(A - \lambda I)x = 0$.

Two types of multiplicity matter here:

  • Algebraic multiplicity: how many times $\lambda$ appears as a root of the characteristic polynomial
  • Geometric multiplicity: the dimension of the eigenspace (i.e., the number of linearly independent eigenvectors for $\lambda$)

The geometric multiplicity is always less than or equal to the algebraic multiplicity. When they differ, the matrix cannot be diagonalized using that eigenvalue alone.


Diagonalization of Matrices

A square matrix $A$ is diagonalizable if there exists an invertible matrix $P$ such that:

$$P^{-1}AP = D$$

where $D$ is a diagonal matrix. The diagonal entries of $D$ are the eigenvalues, and the columns of $P$ are the corresponding eigenvectors.

An $n \times n$ matrix $A$ is diagonalizable if and only if it has $n$ linearly independent eigenvectors (equivalently, the geometric multiplicity equals the algebraic multiplicity for every eigenvalue).

Why this matters for control theory: diagonalization makes computing matrix powers and matrix exponentials straightforward. If $A = PDP^{-1}$, then $A^k = PD^kP^{-1}$ and $e^{At} = Pe^{Dt}P^{-1}$, where $e^{Dt}$ is just the exponential applied to each diagonal entry. This directly gives you the solution to the state equation $\dot{x} = Ax$.
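The identity can be sketched in NumPy/SciPy, using an illustrative diagonalizable matrix and comparing against SciPy's general-purpose `scipy.linalg.expm`:

```python
import numpy as np
from scipy.linalg import expm

# Illustrative state matrix with distinct eigenvalues -1 and -2
A = np.array([[0., 1.],
              [-2., -3.]])

# Diagonalize: columns of P are eigenvectors, eigvals fill the diagonal of D
eigvals, P = np.linalg.eig(A)

t = 0.5
# e^{At} = P e^{Dt} P^{-1}, with the exponential applied entrywise on D's diagonal
eAt_diag = P @ np.diag(np.exp(eigvals * t)) @ np.linalg.inv(P)

# Compare with scipy's matrix exponential
print(np.allclose(eAt_diag, expm(A * t)))   # True
```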

Inner Product Spaces

An inner product space is a vector space equipped with an inner product, which generalizes the dot product. It lets you measure lengths, angles, and distances between vectors. In control theory, inner products underpin optimization problems, stability analysis via Lyapunov methods, and optimal controller design.

Dot Product and Inner Product

The dot product of two vectors $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ in $\mathbb{R}^n$ is:

$$x \cdot y = x_1y_1 + x_2y_2 + \cdots + x_ny_n$$

More generally, an inner product on a vector space $V$ over $\mathbb{R}$ or $\mathbb{C}$ is a function $\langle \cdot, \cdot \rangle: V \times V \to F$ satisfying:

  1. Conjugate symmetry: $\langle x, y \rangle = \overline{\langle y, x \rangle}$
  2. Linearity in the second argument: $\langle x, ay + z \rangle = a\langle x, y \rangle + \langle x, z \rangle$
  3. Positive definiteness: $\langle x, x \rangle \geq 0$, with equality if and only if $x = 0$

The dot product is the standard inner product on $\mathbb{R}^n$. On $\mathbb{C}^n$, the standard inner product is $\langle x, y \rangle = \bar{x}_1y_1 + \cdots + \bar{x}_ny_n$. The norm (length) of a vector is $\|x\| = \sqrt{\langle x, x \rangle}$.

Orthogonality and Orthonormal Bases

Two vectors are orthogonal if $\langle x, y \rangle = 0$. A set of vectors is orthogonal if every pair is orthogonal, and orthonormal if additionally each vector has unit length ($\langle v_i, v_i \rangle = 1$).

An orthonormal basis is both a basis and an orthonormal set. Orthonormal bases are especially convenient because coordinates are easy to compute: the coefficient of basis vector $e_i$ for any vector $v$ is simply $\langle v, e_i \rangle$. No matrix inversion needed.
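A quick numerical illustration of this, using an assumed rotated orthonormal basis for R^2 (the rotation angle 0.3 is an arbitrary choice):

```python
import numpy as np

# An orthonormal basis for R^2: the standard basis rotated by 0.3 radians
e1 = np.array([np.cos(0.3), np.sin(0.3)])
e2 = np.array([-np.sin(0.3), np.cos(0.3)])

v = np.array([2.0, -1.0])
# Coordinates are just inner products -- no matrix inversion needed
c1, c2 = v @ e1, v @ e2
reconstructed = c1 * e1 + c2 * e2
print(np.allclose(reconstructed, v))   # True
```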

Gram-Schmidt Orthogonalization Process

The Gram-Schmidt process converts any set of linearly independent vectors into an orthonormal set spanning the same subspace. Here's the procedure:

Given linearly independent vectors $\{v_1, \ldots, v_n\}$:

  1. Set $e_1 = \dfrac{v_1}{\|v_1\|}$
  2. For each subsequent vector $v_i$ ($i = 2, \ldots, n$):
    • Subtract off the projections onto all previous basis vectors: $u_i = v_i - \sum_{j=1}^{i-1} \langle v_i, e_j \rangle \, e_j$
    • Normalize: $e_i = \dfrac{u_i}{\|u_i\|}$

Each step removes the component of $v_i$ that lies along the already-constructed orthonormal vectors, leaving only the "new" direction. The result $\{e_1, \ldots, e_n\}$ is orthonormal.

This process is the foundation of the QR decomposition ($A = QR$, where $Q$ is orthogonal and $R$ is upper triangular), which is widely used in numerical least squares and controller design.
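The procedure translates almost line by line into code. Here's a minimal NumPy sketch (the helper name `gram_schmidt` and the input vectors are illustrative, not from the guide):

```python
import numpy as np

def gram_schmidt(V):
    """Orthonormalize the columns of V (assumed linearly independent)."""
    basis = []
    for v in V.T:
        # Subtract projections onto the already-built orthonormal vectors
        u = v - sum((v @ e) * e for e in basis)
        # Normalize the remaining "new" direction
        basis.append(u / np.linalg.norm(u))
    return np.column_stack(basis)

V = np.array([[1., 1., 0.],
              [1., 0., 1.],
              [0., 1., 1.]])
Q = gram_schmidt(V)
# Columns are orthonormal: Q^T Q = I
print(np.allclose(Q.T @ Q, np.eye(3)))   # True
```

Note that classical Gram-Schmidt as written here can lose orthogonality in floating point; production code typically uses a library QR routine such as `np.linalg.qr` instead.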

Least Squares Approximation

Least squares finds the "best fit" when an exact solution doesn't exist. Given an overdetermined system $Ax = b$ (more equations than unknowns, typically no exact solution), least squares minimizes the sum of squared residuals $\|Ax - b\|^2$. In control theory, this is the go-to method for system identification, parameter estimation, and fitting models to measured data.

Orthogonal Projections and Least Squares

The geometric idea is clean: you're projecting the vector $b$ onto the column space of $A$. The projection $\hat{b} = A\hat{x}$ is the closest point in the column space to $b$, and the residual $b - A\hat{x}$ is orthogonal to the column space.

That orthogonality condition gives you the normal equations:

$$A^TA\hat{x} = A^Tb$$

Any solution $\hat{x}$ to the normal equations minimizes $\|Ax - b\|^2$.

Normal Equations and Pseudoinverse

The normal equations $A^TAx = A^Tb$ come directly from requiring the residual to be orthogonal to every column of $A$.

If $A$ has full column rank (its columns are linearly independent), then $A^TA$ is invertible and the unique least squares solution is:

$$\hat{x} = (A^TA)^{-1}A^Tb$$

The matrix $(A^TA)^{-1}A^T$ is called the left pseudoinverse (it equals the Moore-Penrose pseudoinverse when $A$ has full column rank), often denoted $A^+$. It generalizes the concept of a matrix inverse to non-square or rank-deficient matrices.
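A sketch comparing the normal equations, the pseudoinverse, and NumPy's built-in solver on an illustrative line-fitting problem (the sample data is made up and happens to fit exactly):

```python
import numpy as np

# Overdetermined system: fit y = c0 + c1*t to four data points
t = np.array([0., 1., 2., 3.])
b = np.array([1., 3., 5., 7.])            # exactly y = 1 + 2t here
A = np.column_stack([np.ones_like(t), t]) # full column rank, 4x2

# Normal equations: (A^T A) x = A^T b
x_normal = np.linalg.solve(A.T @ A, A.T @ b)
# Pseudoinverse route: x = A^+ b
x_pinv = np.linalg.pinv(A) @ b
# Library least squares solver
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(x_normal)   # approximately [1. 2.]
print(np.allclose(x_normal, x_pinv) and np.allclose(x_normal, x_lstsq))  # True
```

All three routes agree here because $A$ has full column rank; for ill-conditioned problems, the pseudoinverse or `lstsq` (both SVD-based) are numerically safer than forming $A^TA$ explicitly.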

When $A$ does not have full column rank, $A^TA$ is singular, and the least squares solution is not unique. In that case, the pseudoinverse still selects the minimum-norm solution among all minimizers. This situation arises in control when a system is overparameterized or has redundant inputs.