🔢 Numerical Analysis II Unit 9 Review

9.5 Broyden's method

Written by the Fiveable Content Team • Last updated August 2025

Fundamentals of Broyden's Method

Broyden's method solves nonlinear systems of equations without computing the full Jacobian at every step. Instead, it builds and updates an approximation to the Jacobian using information gathered from previous iterations. This makes it far cheaper per iteration than Newton's method, especially when the Jacobian is expensive to evaluate or the system is large.

The core idea: rather than recomputing all partial derivatives each time, you start with some initial Jacobian estimate and then apply a rank-one correction after each step. The result is superlinear convergence (faster than linear, slower than quadratic) with only one function evaluation per iteration.

Quasi-Newton Methods Overview

Quasi-Newton methods form a family of iterative techniques that approximate the Jacobian (or its inverse) rather than computing it exactly. The key properties:

  • They avoid explicit derivative calculations, using only function values to build successive approximations
  • They retain superlinear convergence in many practical cases, which is a significant step up from fixed-point iteration
  • Each iteration updates the current Jacobian approximation using a low-rank correction, keeping the cost per iteration low
  • Newton's method needs n^2 partial derivative evaluations per step for an n-dimensional system; quasi-Newton methods need zero

Connection to the Secant Method

In one dimension, the secant method replaces the derivative f'(x_k) with the finite difference (f(x_k) - f(x_{k-1})) / (x_k - x_{k-1}). Broyden's method generalizes this idea to n dimensions.
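The 1-D case is easy to state in code. Here is a minimal secant-method sketch; the function name `secant` and the test problem x^2 - 2 = 0 are illustrative, not from the text:

```python
def secant(f, x0, x1, tol=1e-12, max_iter=50):
    """1-D secant method: f'(x_k) is replaced by the finite difference
    (f(x_k) - f(x_{k-1})) / (x_k - x_{k-1})."""
    for _ in range(max_iter):
        fx0, fx1 = f(x0), f(x1)
        # Secant step: x_{k+1} = x_k - f(x_k) * (x_k - x_{k-1}) / (f(x_k) - f(x_{k-1}))
        x2 = x1 - fx1 * (x1 - x0) / (fx1 - fx0)
        if abs(x2 - x1) < tol:
            return x2
        x0, x1 = x1, x2
    return x1

root = secant(lambda x: x**2 - 2.0, 1.0, 2.0)  # converges toward sqrt(2)
```

Each iteration needs only one new function value once the previous one is reused, the same economy Broyden's method achieves in n dimensions.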

The challenge in multiple dimensions is that a single step gives you one piece of information (one function difference and one step vector), but the Jacobian has n^2 entries. You can't determine the full Jacobian from one step alone. Broyden's solution: change the Jacobian approximation only in the direction of the most recent step, and leave it unchanged in all other directions. This is the minimum-change principle.

Broyden's Update Formula

The update satisfies the secant equation (also called the quasi-Newton condition):

B_{k+1} s_k = y_k

where:

  • s_k = x_{k+1} - x_k is the step vector
  • y_k = f(x_{k+1}) - f(x_k) is the change in function values
  • B_k is the current Jacobian approximation

The rank-one update that satisfies this condition with the smallest change to B_k (in the Frobenius norm) is:

B_{k+1} = B_k + \frac{(y_k - B_k s_k) s_k^T}{s_k^T s_k}

Notice that the numerator (y_k - B_k s_k) measures how much the current approximation B_k fails to satisfy the secant equation. If B_k s_k = y_k already, no update occurs.
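Both properties are easy to check numerically. A minimal sketch, with hypothetical values for B_k, s_k, and y_k chosen only for illustration:

```python
import numpy as np

def broyden_update(B, s, y):
    """Rank-one Broyden update: the minimal Frobenius-norm change to B
    that makes the new matrix satisfy the secant equation B_new @ s = y."""
    return B + np.outer(y - B @ s, s) / (s @ s)

# Hypothetical data, chosen only to illustrate the update
B = np.eye(2)
s = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

B_new = broyden_update(B, s, y)
assert np.allclose(B_new @ s, y)                  # secant equation holds
assert np.allclose(broyden_update(B, s, B @ s), B)  # no update if it already held
```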

Algorithm Implementation

Step-by-Step Iteration

  1. Choose an initial guess x_0 and an initial Jacobian approximation B_0

  2. Solve the linear system B_k s_k = -f(x_k) for the step s_k

  3. Update the iterate: x_{k+1} = x_k + s_k

  4. Evaluate f(x_{k+1})

  5. Compute y_k = f(x_{k+1}) - f(x_k)

  6. Update the Jacobian approximation: B_{k+1} = B_k + \frac{(y_k - B_k s_k) s_k^T}{s_k^T s_k}

  7. Check stopping criteria; if not met, return to step 2
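The steps above can be sketched as follows. The test system (a circle of radius 2 intersected with the line x_0 = x_1) and the choice of B_0 as the exact Jacobian at the starting point are illustrative assumptions:

```python
import numpy as np

def broyden(f, x0, B0, tol=1e-10, max_iter=50):
    """Sketch of Broyden's method for f(x) = 0, following steps 1-7."""
    x = np.asarray(x0, dtype=float)
    B = np.array(B0, dtype=float)
    fx = f(x)
    for _ in range(max_iter):
        s = np.linalg.solve(B, -fx)            # step 2: solve B_k s_k = -f(x_k)
        x = x + s                              # step 3: x_{k+1} = x_k + s_k
        fx_new = f(x)                          # step 4: one evaluation per iteration
        if np.linalg.norm(fx_new) < tol:       # step 7: residual stopping criterion
            break
        y = fx_new - fx                        # step 5
        B += np.outer(y - B @ s, s) / (s @ s)  # step 6: rank-one update
        fx = fx_new
    return x

# Hypothetical test system: x0^2 + x1^2 = 4 together with x0 = x1
f = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, x[0] - x[1]])
x0 = [1.3, 1.5]
B0 = np.array([[2 * x0[0], 2 * x0[1]], [1.0, -1.0]])  # exact Jacobian at x0
root = broyden(f, x0, B0)
```

From this starting point the iterates converge quickly toward (sqrt(2), sqrt(2)), with only one evaluation of f per iteration after the first.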

Choosing the Initial Jacobian

The initial approximation B_0 matters a lot for convergence. Common strategies:

  • Finite differences: Compute the true Jacobian J(x_0) at the starting point using forward or central differences. This costs n extra function evaluations (forward) or 2n (central), but you only pay this cost once.
  • Identity matrix: Set B_0 = I. This is cheap but can lead to slow convergence if the true Jacobian is far from the identity.
  • Problem-specific knowledge: If you know the structure of the Jacobian (e.g., it's diagonally dominant or banded), use that to build a better B_0.

A good B_0 can mean the difference between convergence in 10 iterations and convergence in 100 (or not at all).
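A forward-difference B_0 might look like this; the test function and the step size h are illustrative assumptions:

```python
import numpy as np

def fd_jacobian(f, x, h=1e-7):
    """Forward-difference Jacobian: n extra function evaluations beyond
    f(x), paid once to build the initial approximation B_0."""
    x = np.asarray(x, dtype=float)
    fx = f(x)
    J = np.empty((fx.size, x.size))
    for j in range(x.size):
        xp = x.copy()
        xp[j] += h                  # perturb one coordinate at a time
        J[:, j] = (f(xp) - fx) / h  # j-th column of the Jacobian
    return J

# Hypothetical system with known Jacobian [[2*x0, -1], [1, 1]]
f = lambda x: np.array([x[0]**2 - x[1], x[0] + x[1]])
B0 = fd_jacobian(f, np.array([1.0, 2.0]))
```

The forward-difference error is O(h), so B0 here matches the analytic Jacobian to roughly seven digits.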

Convergence Criteria

You typically check two conditions and stop when either is satisfied:

  • Residual norm: ‖f(x_k)‖ < ε_f (the function is close enough to zero)
  • Relative step size: ‖x_{k+1} - x_k‖ / ‖x_k‖ < ε_x (the iterates have essentially stopped moving)

Always set a maximum iteration count as a safeguard. If neither criterion is met within that limit, the method may not be converging for your problem.

Advantages and Limitations

Computational Efficiency

The main selling point of Broyden's method is cost per iteration:

  • Only one function evaluation per iteration (Newton's method needs one function evaluation plus one full Jacobian evaluation)
  • The linear system solve in step 2 costs O(n^3) in general, but if you maintain the inverse approximation directly (see the inverse Broyden update below), you can reduce this to O(n^2)
  • For problems where a single function evaluation is expensive (e.g., it requires solving a PDE), the savings are substantial

Memory Requirements

Storing B_k requires O(n^2) memory since it's a dense n × n matrix. For moderate n this is fine, but for very large systems (n in the tens of thousands or more), this becomes a bottleneck. Limited-memory variants exist that store only a few vectors instead of the full matrix.

Convergence Rate

  • Under standard assumptions (f is continuously differentiable, the Jacobian at the root is nonsingular, x_0 is close enough to the solution), Broyden's method converges superlinearly
  • The convergence rate sits between linear and quadratic. You won't match Newton's quadratic rate, but you'll beat simple fixed-point iteration
  • If the initial Jacobian approximation is poor or the system is highly nonlinear, convergence can degrade or fail entirely

Variations and Extensions

Good Broyden Method (Broyden's First Method)

This is the standard version described above, which updates B_k directly. The name "good" is somewhat historical and doesn't mean it's always better.


Inverse Broyden Method (Broyden's Second Method)

Instead of updating B_k and solving a linear system each iteration, you can maintain an approximation to the inverse, H_k ≈ J^{-1}, directly. Applying the Sherman-Morrison formula to the standard update gives the inverse form of Broyden's first method:

H_{k+1} = H_k + \frac{(s_k - H_k y_k) s_k^T H_k}{s_k^T H_k y_k}

Broyden's second method ("bad Broyden") updates H_k with an even simpler rank-one correction, H_{k+1} = H_k + \frac{(s_k - H_k y_k) y_k^T}{y_k^T y_k}, which also satisfies the inverse secant equation H_{k+1} y_k = s_k. Either way, the iteration replaces the linear solve with a simple matrix-vector multiply: s_k = -H_k f(x_k). This reduces the per-iteration cost from O(n^3) to O(n^2), which is a significant gain for larger systems. The trade-off is that maintaining the inverse can be less numerically stable.

Terminology note: The literature sometimes calls the direct update "Broyden's first method" or "good Broyden," and the inverse update "Broyden's second method" or "bad Broyden." The names are misleading; neither is universally better.
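The inverse update can be sanity-checked numerically against the inverse secant equation H_{k+1} y_k = s_k. A minimal sketch with hypothetical H, s, and y:

```python
import numpy as np

def inverse_broyden_update(H, s, y):
    """Sherman-Morrison (inverse) form of the Broyden update.
    Maintains H ≈ J^{-1}, so each step is just s = -H @ f(x)."""
    Hy = H @ y
    return H + np.outer(s - Hy, s @ H) / (s @ Hy)

# Hypothetical data for the check
H = np.eye(2)
s = np.array([1.0, 0.5])
y = np.array([2.0, -1.0])

H_new = inverse_broyden_update(H, s, y)
assert np.allclose(H_new @ y, s)  # inverse secant equation holds
```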

Symmetric Broyden Updates

When the Jacobian is known to be symmetric (as in optimization, where the Hessian of a scalar objective is symmetric), you can enforce symmetry in the update. The symmetric rank-one (SR1) update preserves symmetry and satisfies the secant equation:

B_{k+1} = B_k + \frac{(y_k - B_k s_k)(y_k - B_k s_k)^T}{(y_k - B_k s_k)^T s_k}

This is useful in unconstrained optimization where you're approximating the Hessian.
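A sketch of the SR1 update; the safeguard that skips the update when the denominator is tiny is a standard practical addition, not part of the formula above:

```python
import numpy as np

def sr1_update(B, s, y, skip_tol=1e-8):
    """Symmetric rank-one (SR1) update. If the denominator (y - B s)^T s
    is too small, skip the update to avoid a huge, unstable correction."""
    r = y - B @ s
    denom = r @ s
    if abs(denom) < skip_tol * np.linalg.norm(r) * np.linalg.norm(s):
        return B  # standard safeguard: keep the old approximation
    return B + np.outer(r, r) / denom

# Hypothetical data: the result stays symmetric and satisfies B_new @ s = y
B = np.eye(2)
s = np.array([1.0, 0.0])
y = np.array([2.0, 1.0])
B_new = sr1_update(B, s, y)
```

Unlike BFGS, SR1 does not guarantee positive definiteness, which is why the skip safeguard matters in practice.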

Numerical Stability Considerations

Ill-Conditioning

Over many iterations, the Jacobian approximation B_k can drift and become poorly conditioned. When this happens:

  • The linear system B_k s_k = -f(x_k) produces inaccurate steps
  • Convergence slows down or the method diverges
  • Monitoring the condition number of B_k (or at least its diagonal entries) helps you detect this early

A practical fix is to restart the method periodically: recompute the Jacobian from scratch using finite differences, then continue with Broyden updates from the fresh approximation.

Scaling Techniques

If the variables in your system have very different magnitudes (e.g., one variable is on the order of 10^6 and another is 10^{-3}), the Jacobian approximation will be poorly scaled from the start. Normalize your variables to similar magnitudes before applying Broyden's method, or apply diagonal scaling to B_k.

Regularization

When B_k is near-singular, you can add a small multiple of the identity: solve (B_k + \lambda I) s_k = -f(x_k) instead. This is analogous to Tikhonov regularization and prevents catastrophically large steps. Trust-region methods provide a more principled way to achieve the same effect.
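A sketch of the regularized solve on a deliberately near-singular matrix; the matrix, right-hand side, and λ below are hypothetical values chosen to show the effect:

```python
import numpy as np

def regularized_step(B, fx, lam=1e-6):
    """Solve (B + lam*I) s = -f(x): a Tikhonov-style shift that bounds
    the step size by roughly ||f|| / lam when B is near-singular."""
    return np.linalg.solve(B + lam * np.eye(B.shape[0]), -fx)

# Nearly singular B with an inconsistent right-hand side:
# the plain solve produces an enormous, useless step
B = np.array([[1.0, 1.0], [1.0, 1.0 + 1e-12]])
fx = np.array([1.0, -1.0])

s_raw = np.linalg.solve(B, -fx)  # blows up (order 1e12)
s_reg = regularized_step(B, fx)  # capped at roughly ||fx|| / lam
```

The regularized step is still large here (λ only bounds it, it does not make it good), which is why trust-region control is the more principled alternative.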

Comparison with Other Methods

| Feature | Broyden | Newton | BFGS |
| --- | --- | --- | --- |
| Jacobian/Hessian computation | Once (at B_0) | Every iteration | Never (builds from gradients) |
| Function evaluations per iteration | 1 | 1 + Jacobian cost | 1 (gradient) |
| Convergence rate | Superlinear | Quadratic | Superlinear |
| Primary use case | Nonlinear systems | Nonlinear systems | Unconstrained optimization |
| Symmetry assumption | No | No | Yes (positive definite Hessian) |

  • vs. Newton: Use Broyden when the Jacobian is expensive to compute or unavailable analytically. Use Newton when you can cheaply compute exact derivatives and want the fastest convergence.
  • vs. BFGS: Both are quasi-Newton methods, but BFGS is designed for optimization (minimizing a scalar function) and maintains positive definiteness of the Hessian approximation. Broyden's method is more general and applies to non-symmetric systems of equations.
  • vs. DFP: The Davidon-Fletcher-Powell method is an older quasi-Newton approach for optimization. BFGS has largely replaced it in practice due to better numerical behavior, but DFP remains historically important.

Applications

Broyden's method is widely used in situations where function evaluations dominate the computational cost:

  • Chemical engineering: Solving equilibrium equations for multicomponent systems where each function evaluation involves thermodynamic calculations
  • Circuit simulation: Finding steady-state operating points of nonlinear circuits (SPICE-type simulators)
  • Nonlinear least squares: Parameter estimation and curve fitting, where it can complement Gauss-Newton approaches
  • PDE-constrained problems: When the "function evaluation" involves solving a partial differential equation, avoiding repeated Jacobian computation is critical

Implementation Notes

Most scientific computing environments provide Broyden's method out of the box:

  • Python: scipy.optimize.broyden1 (good Broyden) and scipy.optimize.broyden2 (inverse Broyden) in SciPy; also available through the general scipy.optimize.root interface with method='broyden1'
  • MATLAB: The fsolve function in the Optimization Toolbox supports quasi-Newton methods including Broyden-type updates
  • C++: Libraries like Eigen (for matrix operations), the GNU Scientific Library (GSL), and Ceres Solver provide the building blocks or direct implementations
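For example, with SciPy's `broyden1` (the small linear test system below is an illustrative assumption, not from the text):

```python
import numpy as np
from scipy.optimize import broyden1

# Hypothetical test system: x0 + 0.5*x1 = 1,  0.5*x0 + x1 = 0
f = lambda x: np.array([x[0] + 0.5 * x[1] - 1.0, 0.5 * x[0] + x[1]])

sol = broyden1(f, [0.0, 0.0], f_tol=1e-10)  # good Broyden via SciPy
```

The same system can be passed to `scipy.optimize.root(f, [0.0, 0.0], method='broyden1')` if you prefer the unified interface.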

For custom implementations, pay attention to the linear solve in step 2. Using the Sherman-Morrison formula, you can update B_k^{-1} directly from B_{k-1}^{-1} in O(n^2) operations, avoiding the O(n^3) cost of solving a fresh linear system each iteration.

Advanced Topics

Inexact Broyden Methods

In large-scale problems, even solving the linear system B_k s_k = -f(x_k) exactly can be too expensive. Inexact variants allow an approximate solve (e.g., using a few steps of GMRES) and still maintain convergence, provided the solve accuracy improves as you approach the solution.

Limited-Memory Variants

For very high-dimensional problems where storing the full n × n matrix is impractical, limited-memory Broyden methods store only the last m update vectors (typically m ≪ n) and reconstruct the matrix-vector product implicitly. This mirrors the idea behind L-BFGS in optimization.

Parallel Implementations

Function evaluations can sometimes be parallelized (e.g., evaluating components of f independently). Asynchronous variants of Broyden's method allow updates to proceed before all function evaluations complete, improving throughput on multi-core or distributed systems.