Fiveable

📡Advanced Signal Processing Unit 4 Review


4.4 Kalman filtering


Written by the Fiveable Content Team • Last updated August 2025

Kalman filtering estimates a system's state from noisy measurements by combining a mathematical model of the system with observed data. It produces statistically optimal estimates under certain conditions, which makes it central to applications like speech enhancement, image restoration, navigation, and target tracking. The algorithm operates recursively through prediction and update steps, balancing trust between what the model predicts and what the sensors measure.

Overview of Kalman filtering

Kalman filtering is a recursive algorithm for estimating the state of a dynamic system from noisy measurements. Rudolf E. Kalman introduced it in 1960, and it has since become a cornerstone of signal processing, control systems, and navigation.

What makes the Kalman filter so useful is that it handles uncertainty in both the system model and the measurements simultaneously. At each time step, it produces a minimum mean-square error (MMSE) estimate of the state, along with a covariance matrix that quantifies how confident you should be in that estimate. This dual output of "best guess plus uncertainty" is what separates it from simpler estimation approaches.

Applications in signal processing

  • Speech processing: Kalman filters can model the speech production system as a state-space process, then estimate clean speech parameters from noisy observations. This leads to noise reduction algorithms that adapt to changing signal conditions rather than relying on fixed spectral subtraction.
  • Image processing: Applications include image denoising, motion estimation, and object tracking in video sequences. For tracking, the Kalman filter predicts where an object will appear in the next frame and corrects that prediction with the actual detection.
  • Navigation (GPS): GPS receivers use Kalman filters to fuse noisy satellite measurements with a kinematic model of the vehicle, producing smooth position and velocity estimates even when individual measurements are noisy or temporarily unavailable.
  • Robotics and control: In robotic systems, Kalman filters estimate the robot's pose (position and orientation) from noisy sensor readings like encoders, IMUs, and range finders.

State-space models

State-space models provide the mathematical framework that Kalman filters operate on. The idea is to describe a dynamic system using two sets of equations: one governing how the internal state evolves over time, and another governing how measurements relate to that state.

State transition equations

The state transition equation describes how the system's state at one time step produces the state at the next:

x_{k+1} = A_k x_k + B_k u_k + w_k

  • x_k is the state vector at time step k (the quantities you're trying to estimate)
  • A_k is the state transition matrix (encodes the system dynamics)
  • B_k is the input matrix and u_k is a known control input
  • w_k is process noise, assumed to be zero-mean Gaussian with covariance Q_k

The process noise w_k accounts for modeling inaccuracies and disturbances that the state transition equation doesn't capture explicitly.

Observation equations

The observation equation relates the hidden state to the measurements you actually receive from sensors:

z_k = H_k x_k + v_k

  • z_k is the measurement vector at time step k
  • H_k is the observation matrix (maps the state space to the measurement space)
  • v_k is measurement noise, assumed to be zero-mean Gaussian with covariance R_k

Together, these two equations fully specify the linear Gaussian state-space model that the standard Kalman filter requires.
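
To make the model concrete, here is a minimal simulation of a linear Gaussian state-space model in Python. The constant-velocity matrices A, H, Q, and R are illustrative choices for this sketch, not values from this guide.

```python
import numpy as np

# Hypothetical 1-D constant-velocity model: state x_k = [position, velocity].
rng = np.random.default_rng(0)
dt = 1.0
A = np.array([[1.0, dt], [0.0, 1.0]])  # state transition matrix A_k
H = np.array([[1.0, 0.0]])             # observation matrix H_k: position only
Q = 0.01 * np.eye(2)                   # process noise covariance Q_k
R = np.array([[0.25]])                 # measurement noise covariance R_k

x = np.array([0.0, 1.0])               # true initial state
states, measurements = [], []
for k in range(50):
    w = rng.multivariate_normal(np.zeros(2), Q)  # process noise w_k
    v = rng.multivariate_normal(np.zeros(1), R)  # measurement noise v_k
    x = A @ x + w                                # x_{k+1} = A_k x_k + w_k
    z = H @ x + v                                # z_k = H_k x_k + v_k
    states.append(x)
    measurements.append(z)
```

The measurements list is exactly what a Kalman filter would consume; the states list is the hidden truth it tries to recover.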

Kalman filter algorithm

The algorithm alternates between two steps at each time step: predict the state forward using the model, then correct that prediction using the new measurement. This predict-update cycle is what makes the filter recursive.

Prediction step

The prediction step propagates the state estimate and its uncertainty forward in time using the system model.

  1. Predict the state: Compute the a priori state estimate using the transition model: \hat{x}_{k|k-1} = A_k \hat{x}_{k-1|k-1} + B_k u_k

  2. Predict the error covariance: Propagate the uncertainty forward, adding process noise: P_{k|k-1} = A_k P_{k-1|k-1} A_k^T + Q_k

After this step, \hat{x}_{k|k-1} is your best guess of the state before seeing the measurement at time k, and P_{k|k-1} quantifies how uncertain that guess is.

Update step

The update step incorporates the new measurement z_k to refine the prediction.

  1. Compute the innovation (measurement residual): \tilde{y}_k = z_k - H_k \hat{x}_{k|k-1}

This is the difference between what you actually measured and what the model predicted you'd measure.

  2. Compute the innovation covariance: S_k = H_k P_{k|k-1} H_k^T + R_k

  3. Compute the Kalman gain: K_k = P_{k|k-1} H_k^T S_k^{-1}

  4. Update the state estimate: \hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k \tilde{y}_k

  5. Update the error covariance: P_{k|k} = (I - K_k H_k) P_{k|k-1}

The updated covariance P_{k|k} is never larger (in the positive-semidefinite sense) than P_{k|k-1}, reflecting the reduction in uncertainty gained from the measurement.
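
The predict-update cycle above translates almost line for line into code. The sketch below assumes the linear model of this section; kalman_step is a hypothetical helper name.

```python
import numpy as np

def kalman_step(x_est, P, z, A, H, Q, R, B=None, u=None):
    """One predict-update cycle of the standard Kalman filter."""
    # --- Prediction: propagate state and covariance through the model ---
    x_pred = A @ x_est
    if B is not None and u is not None:
        x_pred = x_pred + B @ u            # add known control input
    P_pred = A @ P @ A.T + Q

    # --- Update: fold in the measurement z_k ---
    y = z - H @ x_pred                     # innovation
    S = H @ P_pred @ H.T + R               # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)    # Kalman gain
    x_new = x_pred + K @ y                 # a posteriori state estimate
    P_new = (np.eye(len(x_est)) - K @ H) @ P_pred
    return x_new, P_new
```

In production code the explicit inverse is usually replaced by a linear solve (np.linalg.solve) for numerical robustness.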

Kalman gain matrix

The Kalman gain K_k is the core mechanism that balances trust between the model prediction and the measurement. Its behavior is intuitive:

  • When measurement noise is small (R_k is small relative to P_{k|k-1}): K_k is large, so the filter trusts the measurement more and pulls the estimate heavily toward z_k.
  • When prediction uncertainty is small (P_{k|k-1} is small relative to R_k): K_k is small, so the filter trusts the model prediction more and the measurement has less influence.

In the limiting case where R_k \to 0, the gain approaches H_k^{-1} (if it exists), meaning the filter essentially ignores the prediction and uses the measurement directly. When P_{k|k-1} \to 0, the gain approaches zero, meaning the filter ignores the measurement entirely.
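
The two regimes can be seen in a line of scalar arithmetic. With H = 1, the gain formula reduces to K = P / (P + R):

```python
# Scalar illustration: how the gain divides trust between model and sensor.
P_pred = 1.0                       # prediction uncertainty
gains = {R: P_pred / (P_pred + R) for R in (0.01, 1.0, 100.0)}
# Accurate sensor (R = 0.01) -> K close to 1: trust the measurement.
# Noisy sensor (R = 100)     -> K close to 0: trust the model prediction.
```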

Assumptions and limitations

[Figure: Basic concept of Kalman filtering (Wikimedia Commons)]

Linearity and Gaussian noise

The standard Kalman filter is optimal (in the MMSE sense) only under specific conditions:

  • The state transition and observation equations must be linear.
  • The process noise w_k and measurement noise v_k must be additive, white, and Gaussian with known covariance matrices Q_k and R_k.
  • The noises w_k and v_k must be mutually uncorrelated.

When any of these assumptions are violated, the standard Kalman filter loses its optimality guarantee. For nonlinear systems, the extended Kalman filter (EKF) or unscented Kalman filter (UKF) are common alternatives. For non-Gaussian noise, particle filters are typically used.

Model accuracy and robustness

The filter's performance depends heavily on how well A_k, H_k, Q_k, and R_k represent the true system. Poorly specified models can cause the filter to produce biased estimates or even diverge, where the estimated covariance shrinks (suggesting high confidence) while the actual errors grow.

Adaptive Kalman filtering addresses this by estimating noise covariances online from the innovation sequence. Robust Kalman filtering incorporates techniques that bound the influence of model mismatch, providing performance guarantees even when the model is approximate.

Extensions and variations

Extended Kalman filter for nonlinear systems

The extended Kalman filter (EKF) handles nonlinear state transition and observation functions by linearizing them around the current state estimate using first-order Taylor series expansion (i.e., computing Jacobians). The prediction and update equations then use these Jacobians in place of A_k and H_k.

The EKF works well when the nonlinearities are mild and the state estimate is close to the true state. However, the linearization can introduce significant errors for highly nonlinear systems, and the EKF provides no guarantee of optimality. It's widely used in robot localization, satellite navigation, and computer vision.

Unscented Kalman filter

The unscented Kalman filter (UKF) avoids linearization entirely. Instead, it uses the unscented transform: a deterministic sampling technique that selects a small set of sigma points capturing the mean and covariance of the current state distribution.

The process works as follows:

  1. Generate 2n+1 sigma points (where n is the state dimension) around the current mean.
  2. Propagate each sigma point through the nonlinear function.
  3. Reconstruct the mean and covariance of the output from the transformed sigma points using weighted averages.

The UKF captures second-order statistics more accurately than the EKF's first-order linearization, and it doesn't require you to compute Jacobians. It tends to outperform the EKF when nonlinearities are significant.
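
The three steps above can be sketched as a standalone unscented transform, assuming the standard scaled sigma-point parameterization (alpha, beta, kappa); unscented_transform is a hypothetical helper name.

```python
import numpy as np

def unscented_transform(mean, cov, f, alpha=1e-3, beta=2.0, kappa=0.0):
    """Propagate (mean, cov) through a nonlinear f via 2n+1 sigma points."""
    n = len(mean)
    lam = alpha ** 2 * (n + kappa) - n          # scaling parameter
    # Sigma points: the mean, plus/minus columns of the scaled square root.
    L = np.linalg.cholesky((n + lam) * cov)
    sigma = np.vstack([mean, mean + L.T, mean - L.T])
    # Weights for reconstructing the mean and covariance.
    wm = np.full(2 * n + 1, 0.5 / (n + lam))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1.0 - alpha ** 2 + beta)
    y = np.array([f(s) for s in sigma])         # propagate each sigma point
    y_mean = wm @ y
    diff = y - y_mean
    y_cov = (wc[:, None] * diff).T @ diff       # weighted outer products
    return y_mean, y_cov
```

A UKF applies this transform twice per step: once to push the state through the dynamics (prediction) and once through the observation function (update).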

Particle filters

Particle filters (sequential Monte Carlo methods) represent the state distribution using a large set of weighted random samples (particles). They can handle arbitrary nonlinearities and non-Gaussian distributions, which makes them the most general approach.

Each particle is propagated through the system dynamics, and its weight is updated based on how well it explains the observed measurement. Resampling steps prevent weight degeneracy by concentrating particles in high-probability regions.

The trade-off is computational cost: particle filters require many particles (often thousands) to represent the distribution adequately, especially in high-dimensional state spaces. They're most practical for low-dimensional tracking problems with multi-modal distributions or non-additive noise.
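
The propagate-reweight-resample loop can be sketched as follows, under the simplifying assumption of a scalar state with Gaussian noise; particle_filter_step and its arguments are illustrative names.

```python
import numpy as np

def particle_filter_step(particles, weights, z, f, h, q_std, r_std, rng):
    """One bootstrap particle filter step for a scalar state."""
    n = len(particles)
    # Propagate every particle through the dynamics with process noise.
    particles = f(particles) + rng.normal(0.0, q_std, size=n)
    # Reweight by the Gaussian measurement likelihood p(z | particle).
    weights = weights * np.exp(-0.5 * ((z - h(particles)) / r_std) ** 2)
    weights = weights / weights.sum()
    # Resample when the effective sample size collapses (weight degeneracy).
    if 1.0 / np.sum(weights ** 2) < n / 2:
        idx = rng.choice(n, size=n, p=weights)
        particles, weights = particles[idx], np.full(n, 1.0 / n)
    return particles, weights
```

The state estimate at each step is the weighted mean of the particles.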

Kalman smoothing

Standard Kalman filtering produces estimates using only measurements up to the current time (causal filtering). Kalman smoothing uses measurements from both past and future time steps, yielding more accurate state trajectory estimates at the cost of introducing latency.

Forward-backward smoothing

The Rauch-Tung-Striebel (RTS) smoother is the most common forward-backward approach:

  1. Forward pass: Run a standard Kalman filter over the entire data sequence, storing the filtered estimates \hat{x}_{k|k} and covariances P_{k|k} at each step.
  2. Backward pass: Starting from the last time step, recursively combine the filtered estimates with information propagated backward from future measurements to produce smoothed estimates \hat{x}_{k|N} (where N is the final time step).

The smoothed estimates always have equal or lower error covariance than the filtered estimates.
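
The backward pass can be sketched as follows, assuming the forward pass has already stored both the filtered and the one-step-predicted quantities; rts_smoother is a hypothetical helper name.

```python
import numpy as np

def rts_smoother(x_filt, P_filt, x_pred, P_pred, A):
    """RTS backward pass. x_pred[k], P_pred[k] are the one-step predictions
    x_{k|k-1}, P_{k|k-1} stored during the forward Kalman filter."""
    N = len(x_filt)
    x_s, P_s = [None] * N, [None] * N
    x_s[-1], P_s[-1] = x_filt[-1], P_filt[-1]   # no future data at the end
    for k in range(N - 2, -1, -1):
        G = P_filt[k] @ A.T @ np.linalg.inv(P_pred[k + 1])     # smoother gain
        x_s[k] = x_filt[k] + G @ (x_s[k + 1] - x_pred[k + 1])
        P_s[k] = P_filt[k] + G @ (P_s[k + 1] - P_pred[k + 1]) @ G.T
    return x_s, P_s
```

Note that the backward pass needs no access to the raw measurements; all the information has been compressed into the stored filter outputs.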

Fixed-interval vs fixed-lag smoothing

  • Fixed-interval smoothing estimates the entire state trajectory over a known time interval using all measurements in that interval. This gives the best possible estimates but requires the full dataset, making it an offline technique.
  • Fixed-lag smoothing estimates the state at time k - L using measurements up to time k, where L is a fixed lag. This provides better estimates than filtering alone while remaining suitable for near-real-time applications, at the cost of introducing a delay of L steps.

Practical considerations

Initialization and convergence

Proper initialization affects how quickly the filter converges to accurate estimates. Set the initial state estimate \hat{x}_{0|0} to the best available prior knowledge. If you have little prior information, set P_{0|0} to a large diagonal matrix, which tells the filter to trust early measurements heavily. The filter will typically converge within a few time steps as measurements accumulate.

Be cautious with overly large P_{0|0} values in finite-precision arithmetic, as they can cause numerical issues.

Tuning process and measurement covariance matrices

Getting Q_k and R_k right is often the most challenging part of deploying a Kalman filter.

  • R_k (measurement noise covariance): Usually easier to determine. You can often characterize sensor noise directly from datasheets or by collecting stationary measurements.
  • Q_k (process noise covariance): Harder to specify because it represents unmodeled dynamics. Setting Q_k too small makes the filter overconfident in its model and slow to respond to real changes. Setting it too large makes the filter noisy and overly reactive.

Common tuning approaches include autocovariance least-squares (ALS), maximum likelihood estimation, and innovation-based methods that monitor the consistency of the innovation sequence.

Computational complexity and real-time implementation

The dominant cost in the Kalman filter is the matrix inversion (or equivalent solve) in the Kalman gain computation, which scales as O(m^3) where m is the measurement dimension. The overall per-step complexity is O(n^2 m + m^3) where n is the state dimension.

For high-dimensional problems or high sampling rates, consider:

  • Matrix factorization methods (square-root filters, UD factorization) that improve numerical stability and can reduce computation
  • Steady-state Kalman filter: If the system is time-invariant, the Kalman gain converges to a constant value. Pre-computing this gain eliminates the covariance update entirely.
  • Reduced-order filters that approximate the full state with a lower-dimensional representation
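
For a time-invariant system, the steady-state gain can be found offline by iterating the covariance recursion to a fixed point; the matrices below are made-up examples for this sketch.

```python
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])  # time-invariant dynamics
H = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[1.0]])

P = np.eye(2)                            # a priori covariance P_{k|k-1}
for _ in range(500):                     # iterate P to its fixed point
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # gain for current P
    P_upd = (np.eye(2) - K @ H) @ P                # a posteriori covariance
    P = A @ P_upd @ A.T + Q                        # next a priori covariance
K_ss = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)    # steady-state Kalman gain
```

Once K_ss is fixed, the real-time loop reduces to the state prediction and the update x = x_pred + K_ss (z - H x_pred), with no covariance arithmetic at all.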

Comparison with other estimation techniques

Kalman filters vs least squares

Least squares estimation processes an entire batch of measurements at once to minimize the sum of squared errors. The Kalman filter processes measurements sequentially and is mathematically equivalent to recursive least squares when the system is static (A_k = I, Q_k = 0).

The Kalman filter's advantages over batch least squares are its ability to handle dynamic systems with time-varying states, its recursive structure (constant memory and computation per step), and its natural suitability for real-time processing.

Kalman filters vs Bayesian estimation

The Kalman filter is a special case of Bayesian estimation. In the Bayesian framework, you maintain a posterior distribution over the state given all measurements. For linear Gaussian systems, this posterior is Gaussian, and the Kalman filter computes its mean and covariance exactly.

For nonlinear or non-Gaussian problems, the posterior is no longer Gaussian, and you need more general Bayesian methods: the EKF and UKF approximate the posterior as Gaussian, while particle filters and MCMC methods represent it more flexibly at higher computational cost.

Advanced topics and research areas

Multi-sensor data fusion with Kalman filters

When multiple sensors observe the same system, Kalman filters can fuse their measurements to produce estimates that are more accurate than any single sensor alone. Challenges include handling asynchronous measurement arrivals, different sensor update rates, and unknown cross-correlations between sensor errors.

The federated Kalman filter runs independent local filters for each sensor and combines their outputs at a master filter. Covariance intersection provides a conservative fusion rule that remains consistent even when cross-correlations between estimates are unknown.

Adaptive Kalman filtering

Adaptive techniques estimate the noise covariances Q_k and R_k online, which is critical when these quantities are unknown or change over time.

  • Innovation-based adaptive estimation (IAE): Monitors the innovation sequence \tilde{y}_k. If the filter is well-tuned, the innovations should be white and consistent with their predicted covariance S_k. Deviations indicate model mismatch, and the covariances are adjusted accordingly.
  • Multiple model adaptive estimation (MMAE): Runs a bank of Kalman filters, each with different model parameters. The probability of each model is updated based on how well it explains the observed data, and the final estimate is a probability-weighted combination.

Distributed and decentralized Kalman filtering

In large-scale systems (sensor networks, multi-robot teams, smart grids), centralizing all measurements at a single processor is often impractical due to communication bandwidth, latency, or reliability constraints.

  • Distributed Kalman filtering: Each node maintains a local state estimate and exchanges information with neighbors. Through iterative communication rounds, the local estimates converge toward the centralized optimal estimate.
  • Decentralized Kalman filtering: Achieves a global estimate without any central coordinator, using consensus algorithms or gossip protocols to propagate information across the network.

These approaches trade some estimation accuracy for scalability, robustness to node failures, and reduced communication overhead.
