🎛️Control Theory Unit 10 Review

10.2 Linearization techniques


Written by the Fiveable Content Team • Last updated August 2025

Linearization techniques let you approximate complex nonlinear systems with simpler linear models. This makes it possible to use the full toolkit of linear control theory (frequency-domain analysis, pole placement, LQR, etc.) for systems that are inherently nonlinear.

The tradeoff is that these linear approximations are only accurate near a chosen operating point. Move too far from that point and the model breaks down. Understanding both the power and the limits of linearization is essential for applying it effectively.

Linearization of nonlinear systems

Linearization approximates a nonlinear system with a linear model so you can apply standard linear control techniques. The linearized model is valid in the vicinity of an operating point, which means it captures the system's local behavior rather than its full nonlinear dynamics.

Importance of linearization

Most real-world systems are nonlinear, but decades of powerful tools exist for linear systems. Linearization bridges that gap. Once you have a linear model, you can use frequency-domain analysis, root locus, state-space methods, and other well-established techniques to analyze stability and design controllers.

Limitations of linear models

  • Linear models are approximations. Their accuracy degrades as the system moves away from the operating point.
  • Nonlinear phenomena like multiple equilibrium points, limit cycles, and chaos simply cannot be captured by a linear model.
  • If your system operates over a wide range of conditions, a single linearized model may not be sufficient. You might need multiple linearizations at different operating points (gain scheduling) or fully nonlinear techniques.

Taylor series expansion

Taylor series expansion is the mathematical foundation of linearization. It approximates a nonlinear function by expanding it as a sum of terms involving the function's derivatives at a specific point.

First-order approximation

The first-order Taylor approximation is what we typically mean by "linearization." It keeps only the first derivative term and discards everything else:

$$f(x) \approx f(a) + f'(a)(x - a)$$

where $a$ is the operating point and $f'(a)$ is the derivative evaluated at that point. Geometrically, this replaces the nonlinear curve with its tangent line at $a$.

For a function of multiple variables $f(x_1, x_2, \ldots, x_n)$, the first-order approximation uses partial derivatives:

$$f(\mathbf{x}) \approx f(\mathbf{a}) + \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\bigg|_{\mathbf{a}} (x_i - a_i)$$
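As a quick numerical check, the tangent-line approximation can be compared against the true function. The function ($\sin$) and the operating point $a = 0.5$ are arbitrary illustrative choices:

```python
import math

# A quick numerical check of the tangent-line approximation. The function
# (sin) and the operating point a = 0.5 are arbitrary illustrative choices.
def linearize(f, df, a):
    """Return the first-order Taylor approximation of f around a."""
    return lambda x: f(a) + df(a) * (x - a)

f, df = math.sin, math.cos
approx = linearize(f, df, a=0.5)

print(abs(f(0.52) - approx(0.52)))  # tiny error close to the operating point
print(abs(f(1.5) - approx(1.5)))    # much larger error far from it
```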

Higher-order terms

Higher-order terms (second derivative, third derivative, etc.) capture curvature and other nonlinear effects. Including them improves accuracy but increases model complexity. For example, the second-order term adds $\frac{1}{2} f''(a)(x-a)^2$, which accounts for the curvature of the function. In most control applications, we stop at first order because the goal is to obtain a linear model.

Truncation error

When you drop the higher-order terms, you introduce truncation error. Two factors determine how large this error is:

  • Distance from the operating point: The farther $x$ is from $a$, the larger the error, since the neglected terms involve powers of $(x - a)$.
  • Degree of nonlinearity: Functions with large higher-order derivatives produce bigger truncation errors for the same deviation.

This is why linearized models are described as "local" approximations. They're reliable near the operating point and increasingly unreliable as you move away.
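Both effects are easy to see numerically. In this sketch (the function $e^x$ and the operating point $a = 0$ are illustrative choices), the truncation error is measured at increasing distances from the operating point:

```python
import math

# Truncation error of the tangent line to f(x) = e^x at a = 0, measured at
# increasing distances from the operating point (illustrative function/point).
f = math.exp
tangent = lambda x: 1.0 + x  # f(0) + f'(0) * (x - 0)

errors = {dx: abs(f(dx) - tangent(dx)) for dx in (0.01, 0.1, 0.5, 1.0)}
for dx, err in errors.items():
    print(f"|x - a| = {dx}: truncation error = {err:.5f}")
# The error grows roughly like (x - a)^2 / 2, the leading neglected term.
```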

Jacobian matrix

The Jacobian matrix generalizes the derivative to vector-valued functions. It's the central tool for linearizing multivariable nonlinear systems.

Definition and properties

For a vector-valued function $\mathbf{f}(\mathbf{x})$ where $\mathbf{x} = [x_1, x_2, \ldots, x_n]^T$, the Jacobian matrix $\mathbf{J}(\mathbf{x})$ is:

$$\mathbf{J}(\mathbf{x}) = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \cdots & \frac{\partial f_1}{\partial x_n} \\ \frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} & \cdots & \frac{\partial f_2}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1} & \frac{\partial f_m}{\partial x_2} & \cdots & \frac{\partial f_m}{\partial x_n} \end{bmatrix}$$

Each row corresponds to one component of $\mathbf{f}$, and each column corresponds to one state variable. The matrix is square when $m = n$ (same number of outputs as inputs). Each entry tells you how sensitive a particular output is to a small change in a particular input.

Evaluating the Jacobian

To evaluate the Jacobian at a specific point $\mathbf{x}_0$:

  1. Compute the partial derivatives symbolically for each entry of the matrix.
  2. Substitute the values of $\mathbf{x}_0$ into each expression.

When analytical expressions for the partial derivatives aren't available (or are too complex), you can use numerical differentiation techniques like finite differences to approximate the Jacobian.

Relationship to linearization

The Jacobian is the key ingredient in linearizing a nonlinear state-space model.
For a system $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}, \mathbf{u})$, the linearized model around an operating point $(\mathbf{x}_0, \mathbf{u}_0)$ is:

$$\dot{\mathbf{x}} \approx \mathbf{f}(\mathbf{x}_0, \mathbf{u}_0) + \mathbf{A}(\mathbf{x} - \mathbf{x}_0) + \mathbf{B}(\mathbf{u} - \mathbf{u}_0)$$

where:

  • $\mathbf{A} = \frac{\partial \mathbf{f}}{\partial \mathbf{x}}\big|_{(\mathbf{x}_0, \mathbf{u}_0)}$ is the Jacobian with respect to the state variables
  • $\mathbf{B} = \frac{\partial \mathbf{f}}{\partial \mathbf{u}}\big|_{(\mathbf{x}_0, \mathbf{u}_0)}$ is the Jacobian with respect to the input variables

If the operating point is an equilibrium, then $\mathbf{f}(\mathbf{x}_0, \mathbf{u}_0) = \mathbf{0}$, and the linearized model simplifies to $\dot{\delta\mathbf{x}} = \mathbf{A}\,\delta\mathbf{x} + \mathbf{B}\,\delta\mathbf{u}$, where $\delta\mathbf{x} = \mathbf{x} - \mathbf{x}_0$ and $\delta\mathbf{u} = \mathbf{u} - \mathbf{u}_0$ are the deviation variables.

Equilibrium points

An equilibrium point is a steady-state condition where the system's state variables don't change over time. Formally, $\mathbf{x}_e$ is an equilibrium point if:

$$\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}_e, \mathbf{u}_e) = \mathbf{0}$$

Equilibrium points determine the long-term behavior of a system and serve as the natural operating points around which to linearize.

Definition and types

Equilibrium points are classified by their stability properties:

  • Stable equilibrium: Small perturbations cause the system to return to the equilibrium. Think of a pendulum hanging straight down.
  • Unstable equilibrium: Small perturbations cause the system to diverge away. Think of a pendulum balanced straight up.
  • Saddle point: The equilibrium is stable in some directions and unstable in others. A ball sitting on a saddle-shaped surface rolls back toward the center along one axis but away along the other.
A nonlinear system can have multiple equilibrium points, each with different stability properties. This is one of the key differences from linear systems, which have at most one equilibrium (the origin, for autonomous systems).
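A minimal sketch of multiple equilibria, using the scalar system $\dot{x} = x - x^3$ (a standard textbook example, not taken from this guide). Solving $f(x) = 0$ gives three equilibria, and the sign of $f'(x)$ at each one classifies its local stability:

```python
# The scalar system dx/dt = x - x^3 (a standard textbook example, not taken
# from this guide) has three equilibria. The sign of df/dx at each one gives
# the local stability via linearization.
def f(x):
    return x - x**3

def df(x):
    return 1.0 - 3.0 * x**2

labels = {}
for xe in (-1.0, 0.0, 1.0):
    assert abs(f(xe)) < 1e-12           # confirm it really is an equilibrium
    labels[xe] = "stable" if df(xe) < 0 else "unstable"
    print(f"x_e = {xe:+.0f}: f'(x_e) = {df(xe):+.0f} -> {labels[xe]}")
```

Two stable equilibria separated by an unstable one is exactly the kind of structure a single linear model can never represent.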

Stability of equilibrium points

Lyapunov stability theory provides a way to assess stability without solving the differential equations directly. The approach works as follows:

  1. Construct a Lyapunov function $V(\mathbf{x})$, which is a positive definite scalar function ($V > 0$ everywhere except at the equilibrium, where $V = 0$).
  2. Compute its time derivative $\dot{V}(\mathbf{x})$ along the system trajectories.
  3. If $\dot{V} \leq 0$ in a neighborhood of the equilibrium, the equilibrium is stable.
  4. If $\dot{V} < 0$ (strictly negative), the equilibrium is asymptotically stable, meaning trajectories actually converge to it.

Finding a suitable Lyapunov function can be challenging, but for mechanical and electrical systems, energy-based functions are often a natural choice.
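A sketch of the energy-based approach for a damped pendulum, with unit constants and an illustrative damping coefficient $c = 0.5$ (these values are assumptions for the example). The total energy $V = (1 - \cos\theta) + \omega^2/2$ serves as the Lyapunov candidate, and its derivative along trajectories reduces to $\dot{V} = -c\,\omega^2 \leq 0$:

```python
import math, random

# Energy-based Lyapunov check for a damped pendulum with unit constants and
# illustrative damping c = 0.5: theta_dot = omega, omega_dot = -sin(theta) - c*omega.
# Candidate V = (1 - cos(theta)) + omega**2 / 2 (potential plus kinetic energy).
c = 0.5

def V_dot(theta, omega):
    # dV/dt = sin(theta)*theta_dot + omega*omega_dot, which reduces to -c*omega**2
    return math.sin(theta) * omega + omega * (-math.sin(theta) - c * omega)

random.seed(0)
samples = [(random.uniform(-3, 3), random.uniform(-3, 3)) for _ in range(1000)]
assert all(V_dot(th, om) <= 1e-12 for th, om in samples)
print("V_dot <= 0 at every sampled state: the hanging equilibrium is stable")
```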

Linearization around equilibrium

Linearizing around an equilibrium point is the most common application of linearization. Since $\mathbf{f}(\mathbf{x}_e, \mathbf{u}_e) = \mathbf{0}$, the constant term drops out, and you get a clean linear model in the deviation variables.

The stability of the equilibrium can then be determined from the eigenvalues of the $\mathbf{A}$ matrix (the Jacobian evaluated at the equilibrium):

  • All eigenvalues have negative real parts → asymptotically stable
  • At least one eigenvalue has a positive real part → unstable
  • Eigenvalues on the imaginary axis (zero real part) → the linearization is inconclusive, and you need nonlinear methods (like Lyapunov analysis) to determine stability

That third case is important and often overlooked. Linearization can't always give you a definitive answer.
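The eigenvalue test can be sketched for the linearized pendulum at its two equilibria (unit $g/l$ and damping $c = 0.2$ are illustrative values, not from this guide):

```python
import numpy as np

# Eigenvalue test for the linearized pendulum (unit g/l, illustrative damping
# c = 0.2). At theta = 0 the lower-left Jacobian entry is -cos(0) = -1;
# at theta = pi the sign flips to +1.
c = 0.2
A_down = np.array([[0.0, 1.0], [-1.0, -c]])   # hanging position
A_up   = np.array([[0.0, 1.0], [+1.0, -c]])   # inverted position

def is_stable(A):
    return bool(np.all(np.linalg.eigvals(A).real < 0))

print("down:", np.linalg.eigvals(A_down), "->", is_stable(A_down))  # stable
print("up:  ", np.linalg.eigvals(A_up),   "->", is_stable(A_up))    # unstable
```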

State-space representation

State-space representation describes a system using first-order differential equations. It consists of a state equation (how the states evolve) and an output equation (how the states map to measurable outputs).

Nonlinear state-space models

The general nonlinear state-space model is:

$$\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}, \mathbf{u})$$
$$\mathbf{y} = \mathbf{h}(\mathbf{x}, \mathbf{u})$$

where $\mathbf{x}$ is the state vector, $\mathbf{u}$ is the input vector, $\mathbf{y}$ is the output vector, $\mathbf{f}(\cdot)$ is the nonlinear state transition function, and $\mathbf{h}(\cdot)$ is the nonlinear output function.

These models can capture complex behaviors (multiple equilibria, limit cycles, chaos) but generally require specialized analysis techniques like Lyapunov methods, feedback linearization, or sliding mode control.

Linearized state-space models

Linearizing around an operating point $(\mathbf{x}_0, \mathbf{u}_0)$ produces the standard linear state-space form:

$$\dot{\delta\mathbf{x}} = \mathbf{A}\,\delta\mathbf{x} + \mathbf{B}\,\delta\mathbf{u}$$
$$\delta\mathbf{y} = \mathbf{C}\,\delta\mathbf{x} + \mathbf{D}\,\delta\mathbf{u}$$

where the matrices are obtained from the Jacobians of $\mathbf{f}$ and $\mathbf{h}$:

  • $\mathbf{A} = \frac{\partial \mathbf{f}}{\partial \mathbf{x}}\big|_{(\mathbf{x}_0, \mathbf{u}_0)}$, $\mathbf{B} = \frac{\partial \mathbf{f}}{\partial \mathbf{u}}\big|_{(\mathbf{x}_0, \mathbf{u}_0)}$
  • $\mathbf{C} = \frac{\partial \mathbf{h}}{\partial \mathbf{x}}\big|_{(\mathbf{x}_0, \mathbf{u}_0)}$, $\mathbf{D} = \frac{\partial \mathbf{h}}{\partial \mathbf{u}}\big|_{(\mathbf{x}_0, \mathbf{u}_0)}$

Once in this form, you can apply pole placement, LQR, Kalman filtering, eigenvalue analysis, and frequency-domain techniques.
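The whole pipeline can be sketched numerically using finite-difference Jacobians. The plant here is a torque-driven pendulum with unit constants and damping 0.2 (illustrative values, not from this guide):

```python
import numpy as np

# Sketch: build A and B for a torque-driven pendulum by finite differences
# (unit constants, damping 0.2 -- illustrative values, not from this guide).
def numeric_jacobian(func, x0, eps=1e-6):
    """Forward-difference Jacobian of func evaluated at x0."""
    f0 = func(x0)
    J = np.zeros((f0.size, x0.size))
    for i in range(x0.size):
        x = x0.copy()
        x[i] += eps
        J[:, i] = (func(x) - f0) / eps
    return J

def f(x, u):
    # x = [theta, omega], u = [torque]
    return np.array([x[1], -np.sin(x[0]) - 0.2 * x[1] + u[0]])

x0 = np.array([0.0, 0.0])    # hanging-down equilibrium
u0 = np.array([0.0])

A = numeric_jacobian(lambda x: f(x, u0), x0)
B = numeric_jacobian(lambda u: f(x0, u), u0)
print("A =\n", A)   # close to [[0, 1], [-1, -0.2]]
print("B =\n", B)   # close to [[0], [1]]
```

The same helper applied to an output function $\mathbf{h}$ would produce $\mathbf{C}$ and $\mathbf{D}$.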

Linearization of input-output models

Input-output models like transfer functions describe the relationship between inputs and outputs without explicitly tracking state variables. To linearize an input-output model:

  1. Identify the operating point of the nonlinear input-output relationship.
  2. Apply the Taylor series expansion around that operating point.
  3. Retain only the first-order terms to obtain a linear transfer function.

This linearized transfer function can then be used with classical control techniques like PID tuning, root locus, and lead-lag compensation.
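As a minimal worked example of the three steps (the square-law element is an illustrative choice, not from this guide), consider a static nonlinearity $y = u^2$ around an operating point $u_0$:

$$\delta y \approx \frac{dy}{du}\bigg|_{u_0}\,\delta u = 2u_0\,\delta u$$

Around $u_0 = 3$ the small-signal model is just the constant gain $\delta y \approx 6\,\delta u$. The same procedure applies to dynamic input-output relationships, where the result is a transfer function rather than a static gain.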

Validity of linearization

A linearized model is only as useful as it is accurate. Understanding where and when the approximation holds is critical for safe controller design.

Local vs global validity

Linearization is inherently a local approximation. The linearized model is accurate in a neighborhood around the operating point, but that neighborhood can be quite small for highly nonlinear systems.

Global validity means a model works well across a wide range of operating conditions. Linear models generally don't provide this. If you need global accuracy, consider:

  • Gain scheduling: Use multiple linearized models at different operating points and switch between them.
  • Nonlinear models: Hammerstein-Wiener models, neural networks, or other nonlinear identification techniques can provide better global approximations.
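A toy gain-scheduling sketch, in which the operating points and the gains designed at each of them are made-up illustrative numbers:

```python
import numpy as np

# Toy gain-scheduling sketch: the operating points and the gains designed at
# each of them are made-up illustrative numbers.
design_points = np.array([0.0, 1.0, 2.0])   # values of the scheduled variable
design_gains  = np.array([5.0, 3.2, 1.8])   # controller gain tuned at each point

def scheduled_gain(x):
    """Switch to the gain tuned at the nearest operating point."""
    return float(design_gains[np.argmin(np.abs(design_points - x))])

print(scheduled_gain(0.2))   # 5.0 -- closest to the 0.0 design point
print(scheduled_gain(1.7))   # 1.8 -- closest to the 2.0 design point
```

Practical schedulers usually interpolate between neighboring designs rather than hard-switching, which avoids gain discontinuities as the operating condition drifts.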

Regions of attraction

The region of attraction (ROA) of a stable equilibrium point is the set of all initial conditions from which trajectories converge to that equilibrium. The ROA matters for linearization because:

  • If your system stays within the ROA and close to the equilibrium, the linearized model is a good approximation.
  • If perturbations push the system outside the ROA, the linearized model may predict stability when the actual system diverges.

Estimating the ROA for nonlinear systems is nontrivial. Techniques include Lyapunov sublevel sets and sum-of-squares (SOS) programming, both of which produce conservative (inner) estimates.

Checking model accuracy

You should always validate a linearized model before relying on it for controller design. Practical approaches include:

  1. Simulation comparison: Simulate both the nonlinear and linearized models for the same inputs and initial conditions. Compare the responses.
  2. Perturbation testing: Apply perturbations of increasing size to see where the linearized model's predictions start to diverge significantly from the nonlinear system.
  3. Residual analysis: Examine the difference between the linearized model's output and the actual (or simulated nonlinear) output. Look for systematic patterns that indicate the linear model is missing important dynamics.
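The simulation comparison in step 1 can be sketched with a simple Euler integration of the undamped pendulum against its linearization (step size, horizon, and initial conditions are illustrative choices):

```python
import math

# Step-1 sketch: Euler-simulate the undamped pendulum (theta_ddot = -sin theta)
# and its linearization (theta_ddot = -theta) from the same initial condition,
# then compare. Step size and horizon are illustrative.
def simulate(accel, theta0, dt=0.001, steps=2000):
    theta, omega = theta0, 0.0
    for _ in range(steps):
        theta, omega = theta + dt * omega, omega + dt * accel(theta)
    return theta

errors = {}
for theta0 in (0.1, 2.0):
    nonlinear = simulate(lambda th: -math.sin(th), theta0)
    linear = simulate(lambda th: -th, theta0)
    errors[theta0] = abs(nonlinear - linear)
    print(f"theta0 = {theta0}: |nonlinear - linearized| = {errors[theta0]:.4f}")
```

The two trajectories agree closely for the small initial angle and diverge for the large one, which is the perturbation-testing idea of step 2 in miniature.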

If accuracy is insufficient, you can include higher-order Taylor terms (quadratic, cubic) or switch to a nonlinear modeling approach entirely.

Feedback linearization

Feedback linearization takes a fundamentally different approach from Taylor-based linearization. Instead of approximating a nonlinear system as linear near an operating point, it uses a nonlinear control law to exactly cancel the nonlinearities, transforming the system into a truly linear one.

This gives you an exact linear system (not an approximation), but it requires precise knowledge of the system model and is sensitive to modeling errors.

Input-state linearization

Input-state linearization (also called full-state feedback linearization) transforms the entire state equation into linear form. For a system of the form:

$$\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}) + \mathbf{g}(\mathbf{x})\mathbf{u}$$

the goal is to find:

  1. A coordinate transformation $\mathbf{z} = \mathbf{T}(\mathbf{x})$ that maps the original states to new coordinates.
  2. A feedback control law $\mathbf{u} = \boldsymbol{\alpha}(\mathbf{x}) + \boldsymbol{\beta}(\mathbf{x})\mathbf{v}$, where $\boldsymbol{\alpha}$ cancels the nonlinearities and $\boldsymbol{\beta}$ scales the new input $\mathbf{v}$.

The result is a linear system in the new coordinates:

$$\dot{\mathbf{z}} = \mathbf{A}\mathbf{z} + \mathbf{B}\mathbf{v}$$

Not every nonlinear system can be input-state linearized. The system must satisfy specific conditions related to its structure and controllability (involutivity of certain distributions, in differential geometric terms).
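A minimal cancelation sketch for the system $\dot{x}_1 = x_2$, $\dot{x}_2 = -\sin x_1 + u$ (an illustrative model where the states are already valid linearizing coordinates, so $\mathbf{T}$ is the identity). Choosing $u = \sin x_1 + v$ makes the closed loop a double integrator at every state, with no Taylor truncation involved:

```python
import math

# Exact-cancelation sketch for x1' = x2, x2' = -sin(x1) + u (an illustrative
# model). With u = sin(x1) + v the closed loop is exactly the double
# integrator at EVERY state -- no Taylor truncation involved.
def plant(x, u):
    return [x[1], -math.sin(x[0]) + u]

def feedback(x, v):
    return math.sin(x[0]) + v   # alpha(x) = sin(x1), beta(x) = 1

x = [1.3, -0.7]                 # arbitrary state, far from any equilibrium
v = 0.25
dx = plant(x, feedback(x, v))
print(dx)   # approximately [-0.7, 0.25], i.e. exactly [x2, v]: linear dynamics
```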

Input-output linearization

Input-output linearization focuses only on making the relationship between the input and a chosen output linear, rather than linearizing the entire state equation. For a system with output $y = h(\mathbf{x})$:

  1. Differentiate the output $y$ repeatedly until the input $\mathbf{u}$ appears explicitly. The number of differentiations required is the relative degree $r$ of the system.
  2. Design a feedback law $\mathbf{u} = \boldsymbol{\alpha}(\mathbf{x}) + \boldsymbol{\beta}(\mathbf{x})v$ that cancels the nonlinear terms, yielding:

$$y^{(r)} = v$$

This is a chain of $r$ integrators, which is a simple linear system that you can control with standard linear techniques.
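A short worked example of the procedure (the model is an illustrative choice): take $\dot{x}_1 = x_2$, $\dot{x}_2 = -x_1^3 + u$ with output $y = x_1$. Differentiating the output:

$$\dot{y} = x_2 \;\;(\text{no } u), \qquad \ddot{y} = -x_1^3 + u \;\;(u \text{ appears}) \;\Rightarrow\; r = 2$$

Choosing $u = x_1^3 + v$ cancels the nonlinear term and leaves $\ddot{y} = v$, a chain of two integrators.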

A key concern with input-output linearization is the internal dynamics: the part of the state dynamics that doesn't appear in the input-output relationship. These internal dynamics must be stable (the system must be minimum phase) for the approach to work safely. Unstable internal dynamics (non-minimum phase systems) can lead to unbounded states even when the output behaves well.