Fundamentals of Broyden's Method
Broyden's method solves nonlinear systems of equations without computing the full Jacobian at every step. Instead, it builds and updates an approximation to the Jacobian using information gathered from previous iterations. This makes it far cheaper per iteration than Newton's method, especially when the Jacobian is expensive to evaluate or the system is large.
The core idea: rather than recomputing all partial derivatives each time, you start with some initial Jacobian estimate and then apply a rank-one correction after each step. The result is superlinear convergence (faster than linear, slower than quadratic) with only one function evaluation per iteration.
Quasi-Newton Methods Overview
Quasi-Newton methods form a family of iterative techniques that approximate the Jacobian (or its inverse) rather than computing it exactly. The key properties:
- They avoid explicit derivative calculations, using only function values to build successive approximations
- They retain superlinear convergence in many practical cases, which is a significant step up from fixed-point iteration
- Each iteration updates the current Jacobian approximation using a low-rank correction, keeping the cost per iteration low
- Newton's method needs $n^2$ partial derivative evaluations per step for an $n$-dimensional system; quasi-Newton methods need zero
Connection to the Secant Method
In one dimension, the secant method replaces the derivative $f'(x_k)$ with the finite difference $\frac{f(x_k) - f(x_{k-1})}{x_k - x_{k-1}}$. Broyden's method generalizes this idea to $n$ dimensions.
The challenge in multiple dimensions is that a single step gives you one piece of information (one function difference and one step vector), but the Jacobian has $n^2$ entries. You can't determine the full Jacobian from one step alone. Broyden's solution: change the Jacobian approximation only in the direction of the most recent step, and leave it unchanged in all other directions. This is the minimum-change principle.
Broyden's Update Formula
The update satisfies the secant equation (also called the quasi-Newton condition):

$$B_{k+1} s_k = y_k$$

where:
- $s_k = x_{k+1} - x_k$ is the step vector
- $y_k = F(x_{k+1}) - F(x_k)$ is the change in function values
- $B_k$ is the current Jacobian approximation
The rank-one update that satisfies this condition with the smallest change to $B_k$ (in the Frobenius norm) is:

$$B_{k+1} = B_k + \frac{(y_k - B_k s_k)\, s_k^T}{s_k^T s_k}$$
Notice that the numerator $y_k - B_k s_k$ measures how much the current approximation fails to satisfy the secant equation. If $B_k s_k = y_k$ already, no update occurs.
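The two defining properties of the update, the secant equation and the minimum-change principle, can be checked numerically. A minimal sketch with NumPy, using arbitrary test data:

```python
import numpy as np

def broyden_update(B, s, y):
    """Rank-one Broyden update: B_new = B + (y - B s) s^T / (s^T s)."""
    return B + np.outer(y - B @ s, s) / (s @ s)

# Arbitrary test data (not taken from the text)
B = np.array([[2.0, 1.0], [0.5, 3.0]])
s = np.array([1.0, -2.0])
y = np.array([0.3, 4.1])

B_new = broyden_update(B, s, y)

# Secant equation holds after the update
print(np.allclose(B_new @ s, y))       # True
# Minimum-change: directions orthogonal to s are untouched
v = np.array([2.0, 1.0])               # v is orthogonal to s
print(np.allclose(B_new @ v, B @ v))   # True
```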
Algorithm Implementation
Step-by-Step Iteration
1. Choose an initial guess $x_0$ and an initial Jacobian approximation $B_0$.
2. Solve the linear system $B_k s_k = -F(x_k)$ for the step $s_k$.
3. Update the iterate: $x_{k+1} = x_k + s_k$.
4. Evaluate $F(x_{k+1})$.
5. Compute $y_k = F(x_{k+1}) - F(x_k)$.
6. Update the Jacobian approximation: $B_{k+1} = B_k + \frac{(y_k - B_k s_k)\, s_k^T}{s_k^T s_k}$.
7. Check stopping criteria; if not met, return to step 2.
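The step-by-step iteration can be sketched in NumPy as follows. This is a minimal illustration, not a production solver; the test system and tolerances are arbitrary choices, and $B_0$ here is the true Jacobian at the starting point:

```python
import numpy as np

def broyden_solve(F, x0, B0, tol=1e-10, max_iter=100):
    """Solve F(x) = 0 with Broyden's (good) method."""
    x = np.asarray(x0, dtype=float)
    B = np.asarray(B0, dtype=float)
    Fx = F(x)
    for k in range(1, max_iter + 1):
        s = np.linalg.solve(B, -Fx)               # step 2: solve B_k s_k = -F(x_k)
        x = x + s                                 # step 3: x_{k+1} = x_k + s_k
        Fx_new = F(x)                             # step 4: evaluate F(x_{k+1})
        y = Fx_new - Fx                           # step 5: y_k = F(x_{k+1}) - F(x_k)
        B = B + np.outer(y - B @ s, s) / (s @ s)  # step 6: rank-one update
        Fx = Fx_new
        if np.linalg.norm(Fx) < tol or np.linalg.norm(s) < tol:
            break                                 # step 7: stopping criteria
    return x, k

# Hypothetical test system: x0^2 + x1^2 = 4 and x0 * x1 = 1
def F(x):
    return np.array([x[0]**2 + x[1]**2 - 4.0, x[0] * x[1] - 1.0])

B0 = np.array([[4.0, 1.0], [0.5, 2.0]])  # true Jacobian at the starting point
root, iters = broyden_solve(F, [2.0, 0.5], B0)
print(np.allclose(F(root), 0.0, atol=1e-8))  # True
```

Note that only one evaluation of `F` occurs per loop iteration; the Jacobian is never recomputed after $B_0$.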
Choosing the Initial Jacobian
The initial approximation matters a lot for convergence. Common strategies:
- Finite differences: Compute the true Jacobian at the starting point using forward or central differences. This costs $n$ extra function evaluations (forward) or $2n$ (central), but you only pay this cost once.
- Identity matrix: Set $B_0 = I$. This is cheap but can lead to slow convergence if the true Jacobian is far from the identity.
- Problem-specific knowledge: If you know the structure of the Jacobian (e.g., it's diagonally dominant or banded), use that to build a better $B_0$.
A good $B_0$ can mean the difference between convergence in 10 iterations and convergence in 100 (or not at all).
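A forward-difference $B_0$ can be built with $n$ extra function evaluations. A sketch, where the step size `h` is a typical but arbitrary choice:

```python
import numpy as np

def fd_jacobian(F, x, h=1e-7):
    """Forward-difference Jacobian approximation: n extra evaluations of F."""
    x = np.asarray(x, dtype=float)
    Fx = F(x)
    n = x.size
    J = np.empty((Fx.size, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (F(x + e) - Fx) / h  # column j: dF/dx_j
    return J

# For F(x) = [x0^2 - x1, x0 + x1], the true Jacobian at (1, 2) is [[2, -1], [1, 1]]
F = lambda x: np.array([x[0]**2 - x[1], x[0] + x[1]])
B0 = fd_jacobian(F, [1.0, 2.0])
print(np.allclose(B0, [[2.0, -1.0], [1.0, 1.0]], atol=1e-5))  # True
```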
Convergence Criteria
You typically check two conditions and stop when either is satisfied:
- Residual norm: $\|F(x_k)\| \le \varepsilon_F$ (the function is close enough to zero)
- Step size: $\|x_{k+1} - x_k\| \le \varepsilon_x$ (the iterates have essentially stopped moving)
Always set a maximum iteration count as a safeguard. If neither criterion is met within that limit, the method may not be converging for your problem.
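The two checks can be combined in a small helper; this is a sketch, and the default tolerance values are arbitrary choices:

```python
import numpy as np

def converged(Fx, s, tol_f=1e-10, tol_x=1e-12):
    """True when the residual is small enough or the iterates have stopped moving."""
    return np.linalg.norm(Fx) <= tol_f or np.linalg.norm(s) <= tol_x

print(converged(np.array([1e-12, 0.0]), np.array([0.1, 0.1])))  # True (residual test fires)
print(converged(np.array([1.0, 1.0]), np.array([0.5, 0.5])))    # False (neither fires)
```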
Advantages and Limitations
Computational Efficiency
The main selling point of Broyden's method is cost per iteration:
- Only one function evaluation per iteration (Newton's method needs one function evaluation plus one full Jacobian evaluation)
- The linear system solve in step 2 costs $O(n^3)$ in general, but if you maintain the inverse approximation directly (see "Inverse Broyden" below), you can reduce this to $O(n^2)$
- For problems where a single function evaluation is expensive (e.g., it requires solving a PDE), this savings is substantial
Memory Requirements
Storing $B_k$ requires $O(n^2)$ memory since it's a dense matrix. For moderate $n$ this is fine, but for very large systems ($n$ in the tens of thousands or more), this becomes a bottleneck. Limited-memory variants exist that store only a few vectors instead of the full matrix.
Convergence Rate
- Under standard assumptions ($F$ is continuously differentiable, the Jacobian at the root is nonsingular, and $x_0$ is close enough to the solution), Broyden's method converges superlinearly
- The convergence rate sits between linear and quadratic. You won't match Newton's quadratic rate, but you'll beat simple fixed-point iteration
- If the initial Jacobian approximation is poor or the system is highly nonlinear, convergence can degrade or fail entirely
Variations and Extensions
Good Broyden Method (Broyden's First Method)
This is the standard version described above, which updates $B_k$ directly. The name "good" is somewhat historical and doesn't mean it's always better.

Inverse Broyden Method (Broyden's Second Method)
Instead of updating $B_k$ and solving a linear system each iteration, you can update the inverse approximation $H_k \approx B_k^{-1}$ directly:

$$H_{k+1} = H_k + \frac{(s_k - H_k y_k)\, y_k^T}{y_k^T y_k}$$

The iteration then replaces the linear solve with a simple matrix-vector multiply: $x_{k+1} = x_k - H_k F(x_k)$. This reduces the per-iteration cost from $O(n^3)$ to $O(n^2)$, which is a significant gain for larger systems. The trade-off is that maintaining the inverse can be less numerically stable.
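A sketch of the inverse update on arbitrary test data; by construction it enforces the inverse secant condition $H_{k+1} y_k = s_k$:

```python
import numpy as np

def inverse_broyden_update(H, s, y):
    """Broyden's second method: rank-one update of the inverse approximation."""
    return H + np.outer(s - H @ y, y) / (y @ y)

# Arbitrary test data
H = np.linalg.inv(np.array([[4.0, 1.0], [0.5, 2.0]]))
s = np.array([0.2, -0.1])
y = np.array([0.9, 0.1])

H_new = inverse_broyden_update(H, s, y)
print(np.allclose(H_new @ y, s))  # True -- the inverse secant condition holds
```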
Terminology note: The literature sometimes calls the direct update "Broyden's first method" or "good Broyden," and the inverse update "Broyden's second method" or "bad Broyden." The names are misleading; neither is universally better.
Symmetric Broyden Updates
When the Jacobian is known to be symmetric (as in optimization, where the Hessian of a scalar objective is symmetric), you can enforce symmetry in the update. The symmetric rank-one (SR1) update preserves symmetry and satisfies the secant equation:

$$B_{k+1} = B_k + \frac{(y_k - B_k s_k)(y_k - B_k s_k)^T}{(y_k - B_k s_k)^T s_k}$$
This is useful in unconstrained optimization where you're approximating the Hessian.
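A sketch of the SR1 update on arbitrary symmetric test data. The denominator safeguard is a standard practical addition, not part of the bare formula:

```python
import numpy as np

def sr1_update(B, s, y):
    """Symmetric rank-one (SR1) update; preserves symmetry, satisfies B_new s = y."""
    r = y - B @ s
    denom = r @ s
    # Standard safeguard: skip the update when the denominator is negligible
    if abs(denom) < 1e-12 * np.linalg.norm(r) * np.linalg.norm(s):
        return B
    return B + np.outer(r, r) / denom

B = np.array([[3.0, 1.0], [1.0, 2.0]])  # symmetric test matrix
s = np.array([1.0, 0.5])
y = np.array([3.2, 2.3])

B_new = sr1_update(B, s, y)
print(np.allclose(B_new, B_new.T))  # True -- symmetry preserved
print(np.allclose(B_new @ s, y))    # True -- secant equation holds
```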
Numerical Stability Considerations
Ill-Conditioning
Over many iterations, the Jacobian approximation can drift and become poorly conditioned. When this happens:
- The linear solve $B_k s_k = -F(x_k)$ produces inaccurate steps
- Convergence slows down or the method diverges
- Monitoring the condition number of $B_k$ (or at least its diagonal entries) helps you detect this early
A practical fix is to restart the method periodically: recompute the Jacobian from scratch using finite differences, then continue with Broyden updates from the fresh approximation.
Scaling Techniques
If the variables in your system have very different magnitudes (e.g., one variable is many orders of magnitude larger than another), the Jacobian approximation will be poorly scaled from the start. Normalize your variables to similar magnitudes before applying Broyden's method, or apply diagonal scaling to $B_0$.
Regularization
When $B_k$ is near-singular, you can add a small multiple of the identity: solve $(B_k + \lambda I)\, s_k = -F(x_k)$ instead. This is analogous to Tikhonov regularization and prevents catastrophically large steps. Trust-region methods provide a more principled way to achieve the same effect.
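A sketch of the regularized solve on a deliberately near-singular test matrix; the value of $\lambda$ here is an arbitrary illustration:

```python
import numpy as np

def regularized_step(B, Fx, lam=1e-6):
    """Solve (B + lam*I) s = -F(x) to damp the step when B is near-singular."""
    n = B.shape[0]
    return np.linalg.solve(B + lam * np.eye(n), -Fx)

# Near-singular approximation: the plain solve produces an enormous step
B = np.array([[1.0, 1.0], [1.0, 1.0 + 1e-12]])
Fx = np.array([1.0, 2.0])

s_plain = np.linalg.solve(B, -Fx)
s_reg = regularized_step(B, Fx, lam=1e-6)
print(np.linalg.norm(s_reg) < np.linalg.norm(s_plain))  # True -- the step is damped
```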
Comparison with Other Methods
| Feature | Broyden | Newton | BFGS |
|---|---|---|---|
| Jacobian/Hessian computation | Once (at $x_0$) | Every iteration | Never (builds from gradients) |
| Function evaluations per iteration | 1 | 1 + Jacobian cost | 1 (gradient) |
| Convergence rate | Superlinear | Quadratic | Superlinear |
| Primary use case | Nonlinear systems | Nonlinear systems | Unconstrained optimization |
| Symmetry assumption | No | No | Yes (positive definite Hessian) |
- vs. Newton: Use Broyden when the Jacobian is expensive to compute or unavailable analytically. Use Newton when you can cheaply compute exact derivatives and want the fastest convergence.
- vs. BFGS: Both are quasi-Newton methods, but BFGS is designed for optimization (minimizing a scalar function) and maintains positive definiteness of the Hessian approximation. Broyden's method is more general and applies to non-symmetric systems of equations.
- vs. DFP: The Davidon-Fletcher-Powell method is an older quasi-Newton approach for optimization. BFGS has largely replaced it in practice due to better numerical behavior, but DFP remains historically important.
Applications
Broyden's method is widely used in situations where function evaluations dominate the computational cost:
- Chemical engineering: Solving equilibrium equations for multicomponent systems where each function evaluation involves thermodynamic calculations
- Circuit simulation: Finding steady-state operating points of nonlinear circuits (SPICE-type simulators)
- Nonlinear least squares: Parameter estimation and curve fitting, where it can complement Gauss-Newton approaches
- PDE-constrained problems: When the "function evaluation" involves solving a partial differential equation, avoiding repeated Jacobian computation is critical
Implementation Notes
Most scientific computing environments provide Broyden's method out of the box:
- Python: `scipy.optimize.broyden1` (good Broyden) and `scipy.optimize.broyden2` (inverse Broyden) in SciPy; also available through the general `scipy.optimize.root` interface with `method='broyden1'`
- MATLAB: The `fsolve` function in the Optimization Toolbox supports quasi-Newton methods including Broyden-type updates
- C++: Libraries like Eigen (for matrix operations), the GNU Scientific Library (GSL), and Ceres Solver provide the building blocks or direct implementations
For custom implementations, pay attention to the linear solve in step 2. Using the Sherman-Morrison formula, you can update $B_{k+1}^{-1}$ directly from $B_k^{-1}$ in $O(n^2)$ operations, avoiding the $O(n^3)$ cost of solving a fresh linear system each iteration.
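Applying Sherman-Morrison to the good-Broyden rank-one update gives a closed-form $O(n^2)$ update of the inverse. A sketch, checked against the explicit update on arbitrary test data:

```python
import numpy as np

def inverse_update_good_broyden(H, s, y):
    """Sherman-Morrison update of H = B^{-1} for the good-Broyden B-update."""
    Hy = H @ y
    return H + np.outer(s - Hy, s @ H) / (s @ Hy)  # O(n^2) work

# Consistency check against the explicit B-update (arbitrary data)
B = np.array([[4.0, 1.0], [0.5, 2.0]])
s = np.array([0.3, -0.2])
y = np.array([1.1, -0.3])

B_new = B + np.outer(y - B @ s, s) / (s @ s)       # explicit good-Broyden update
H_new = inverse_update_good_broyden(np.linalg.inv(B), s, y)
print(np.allclose(H_new, np.linalg.inv(B_new)))    # True
```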
Advanced Topics
Inexact Broyden Methods
In large-scale problems, even solving the linear system exactly can be too expensive. Inexact variants allow an approximate solve (e.g., using a few steps of GMRES) and still maintain convergence, provided the solve accuracy improves as you approach the solution.
Limited-Memory Variants
For very high-dimensional problems where storing the full $n \times n$ matrix is impractical, limited-memory Broyden methods store only the last $m$ update vectors (typically $m \ll n$) and reconstruct the matrix-vector product implicitly. This mirrors the idea behind L-BFGS in optimization.
Parallel Implementations
Function evaluations can sometimes be parallelized (e.g., evaluating components of independently). Asynchronous variants of Broyden's method allow updates to proceed before all function evaluations complete, improving throughput on multi-core or distributed systems.