
🔢Numerical Analysis I

Curve Fitting Methods


Why This Matters

Curve fitting is the bridge between messy real-world data and the clean mathematical models you need for analysis, prediction, and computation. In Numerical Analysis I, you're being tested on your ability to choose the right method for a given situation—whether that's minimizing overall error, passing exactly through data points, or avoiding numerical instabilities like oscillation. These concepts connect directly to error analysis, convergence behavior, computational efficiency, and numerical stability.

The methods below aren't just recipes to memorize—they represent fundamentally different philosophies about how to approximate data. Some prioritize exactness at known points (interpolation), others prioritize minimizing total error (regression), and still others prioritize smoothness or special mathematical properties. Don't just memorize formulas—know when each method shines and why it might fail.


Interpolation: Passing Exactly Through Points

These methods construct functions that pass exactly through every data point. The tradeoff? You gain precision at known locations but risk instability between them, especially as the number of points grows.

Lagrange Interpolation

  • Constructs the unique interpolating polynomial directly—uses basis polynomials $L_i(x)$ that equal 1 at point $i$ and 0 at all others
  • Conceptually elegant but computationally expensive—requires $O(n^2)$ operations and complete recalculation if any point changes
  • Best for theoretical analysis—frequently appears in proofs and derivations, less practical for large datasets
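
A minimal sketch of the construction in Python (the function name and data points are illustrative, not from any library):

```python
def lagrange_eval(x_nodes, y_nodes, x):
    """Evaluate the Lagrange interpolating polynomial at x.

    Each basis polynomial L_i equals 1 at x_nodes[i] and 0 at every
    other node, so the weighted sum passes exactly through each point.
    """
    n = len(x_nodes)
    total = 0.0
    for i in range(n):
        L_i = 1.0
        for j in range(n):
            if j != i:
                L_i *= (x - x_nodes[j]) / (x_nodes[i] - x_nodes[j])
        total += y_nodes[i] * L_i
    return total

# Interpolates the quadratic through (0,1), (1,2), (2,5); note the O(n^2) work per evaluation
print(lagrange_eval([0.0, 1.0, 2.0], [1.0, 2.0, 5.0], 1.5))  # 3.25
```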

Newton's Divided Difference Method

  • Builds the interpolating polynomial recursively—uses a divided difference table that reveals the structure of the approximation
  • Efficient for adding new points—only requires computing new divided differences rather than starting from scratch
  • Handles unevenly spaced data naturally—the divided difference formulation doesn't assume uniform spacing
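
A sketch of the divided difference table and Newton-form evaluation, assuming NumPy is available; the names are illustrative:

```python
import numpy as np

def divided_differences(x, y):
    """Return the Newton coefficients f[x0], f[x0,x1], ... by building the table in place."""
    x = np.asarray(x, dtype=float)
    c = np.array(y, dtype=float)
    for k in range(1, len(x)):
        # c[i] becomes the k-th order divided difference ending at x[i]
        c[k:] = (c[k:] - c[k-1:-1]) / (x[k:] - x[:-k])
    return c

def newton_eval(x_nodes, coeffs, t):
    """Evaluate the Newton form with nested multiplication (Horner-style)."""
    result = coeffs[-1]
    for i in range(len(coeffs) - 2, -1, -1):
        result = result * (t - x_nodes[i]) + coeffs[i]
    return result

x, y = [0.0, 1.0, 2.0], [1.0, 2.0, 5.0]
c = divided_differences(x, y)
print(newton_eval(x, c, 1.5))   # 3.25, the same polynomial Lagrange would give
# Appending a 4th data point only adds one new coefficient; the first three are unchanged
```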

Compare: Lagrange vs. Newton's Divided Difference—both produce the same interpolating polynomial, but Newton's method is computationally superior when data points are added incrementally. If an FRQ asks about updating an interpolation with new data, Newton is your answer.

Polynomial Interpolation

  • A unique polynomial of degree at most $n-1$ passes through $n$ distinct points—this uniqueness theorem is fundamental to interpolation theory
  • Runge's phenomenon causes oscillation—high-degree polynomials can wildly oscillate between points, especially near boundaries
  • Works best for small, well-behaved datasets—avoid when you have many points or data with high variability at edges
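
A quick numerical illustration of the oscillation, assuming NumPy, using the standard Runge example 1/(1 + 25x^2):

```python
import numpy as np

runge = lambda x: 1.0 / (1.0 + 25.0 * x**2)   # classic example function

# Degree-10 interpolation at 11 equally spaced nodes on [-1, 1]
nodes = np.linspace(-1.0, 1.0, 11)
coeffs = np.polyfit(nodes, runge(nodes), deg=10)

# The fit is exact at the nodes but oscillates badly between them near the endpoints
xs = np.linspace(-1.0, 1.0, 2001)
print(np.max(np.abs(np.polyval(coeffs, xs) - runge(xs))))   # roughly 1.9, even though |f| never exceeds 1
```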

Piecewise Methods: Controlling Local Behavior

Rather than forcing one polynomial through all points, these methods use different polynomials in different regions. This local control dramatically improves stability and smoothness.

Spline Interpolation

  • Piecewise polynomials joined with continuity conditions—typically requires matching function values and derivatives at connection points (knots)
  • Cubic splines are the standard choice—they ensure $C^2$ continuity (continuous second derivatives) with minimal oscillation
  • Avoids Runge's phenomenon entirely—low-degree pieces can't oscillate wildly, making splines ideal for large datasets
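
A short sketch using SciPy's CubicSpline on the same Runge example (assuming SciPy is available):

```python
import numpy as np
from scipy.interpolate import CubicSpline

runge = lambda x: 1.0 / (1.0 + 25.0 * x**2)
x = np.linspace(-1.0, 1.0, 11)

# Natural cubic spline: piecewise cubics with matching first and second derivatives at the knots
spline = CubicSpline(x, runge(x), bc_type="natural")

xs = np.linspace(-1.0, 1.0, 2001)
print(np.max(np.abs(spline(xs) - runge(xs))))   # small error everywhere, no Runge-style blow-up near the boundary
print(spline(x) - runge(x))                     # zero at the knots: the spline still interpolates exactly
```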

Bezier Curves

  • Defined by control points, not data points—the curve is influenced by control points but only passes through the first and last
  • Bernstein polynomial basis provides intuitive control—moving a control point creates predictable, local changes to the curve shape
  • Degree equals number of control points minus one—quadratic (3 points) and cubic (4 points) Bezier curves dominate practical applications
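
A minimal sketch of de Casteljau's algorithm, one common way to evaluate a Bezier curve (the control points are illustrative):

```python
import numpy as np

def de_casteljau(control_points, t):
    """Evaluate a Bezier curve at parameter t in [0, 1] by repeated linear interpolation."""
    pts = np.asarray(control_points, dtype=float)
    while len(pts) > 1:
        pts = (1.0 - t) * pts[:-1] + t * pts[1:]   # blend each pair of neighboring points
    return pts[0]

# Cubic Bezier (4 control points): the curve touches only the first and last control points
ctrl = [(0.0, 0.0), (1.0, 2.0), (2.0, 2.0), (3.0, 0.0)]
print(de_casteljau(ctrl, 0.0))   # [0. 0.], the first control point
print(de_casteljau(ctrl, 1.0))   # [3. 0.], the last control point
print(de_casteljau(ctrl, 0.5))   # [1.5 1.5], pulled toward but not through the middle control points
```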

Compare: Splines vs. Bezier Curves—splines pass through all data points while Bezier curves use control points to shape the curve. Splines excel at data fitting; Bezier curves excel at design applications where intuitive manipulation matters more than exact interpolation.


Regression: Minimizing Overall Error

When data contains noise or you don't need to pass exactly through points, regression methods find the "best fit" by minimizing error across all observations.

Least Squares Method

  • Minimizes the sum of squared residuals $\sum (y_i - \hat{y}_i)^2$—squaring penalizes large errors more heavily and ensures a unique solution
  • Normal equations provide the solution—for linear problems, solving $A^T A \mathbf{x} = A^T \mathbf{b}$ gives optimal coefficients
  • Foundation for all regression techniques—understanding least squares is essential for linear algebra applications throughout numerical analysis
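
A small sketch of the normal equations for a straight-line fit, assuming NumPy; the data values are made up for illustration:

```python
import numpy as np

# Noisy data roughly following y = 1 + x
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

# Design matrix with one column per coefficient: y ≈ c0*1 + c1*x
A = np.column_stack([np.ones_like(x), x])

# Solve the normal equations A^T A c = A^T y directly
c = np.linalg.solve(A.T @ A, A.T @ y)
print(c)   # [intercept, slope], close to [1, 1]

# In practice np.linalg.lstsq(A, y, rcond=None) avoids forming A^T A and is better conditioned
```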

Linear Regression

  • Fits the model $y = mx + b$ by finding slope and intercept that minimize squared error
  • Assumes linear relationship and independent errors—violations of these assumptions degrade prediction quality
  • Closed-form solution exists—coefficients can be computed directly without iteration, making it computationally efficient
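
The same fit written with the textbook closed-form formulas (a sketch reusing the illustrative data from above):

```python
import numpy as np

def fit_line(x, y):
    """Closed-form least-squares slope and intercept for y = m*x + b (no iteration required)."""
    m = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b = y.mean() - m * x.mean()
    return m, b

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])
print(fit_line(x, y))   # slope ≈ 0.99, intercept ≈ 1.04, matching the normal-equations result
```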

Nonlinear Regression

  • Fits models like $y = ae^{bx}$ or $y = \frac{a}{1 + e^{-bx}}$—handles models that are nonlinear in their parameters and can't be fit directly by linear least squares
  • Requires iterative optimization—methods like Gauss-Newton or Levenberg-Marquardt search for parameter values since no closed-form solution exists
  • Initial guesses matter significantly—poor starting values can lead to convergence to local minima or failure to converge at all
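
A sketch using SciPy's curve_fit with a synthetic exponential model (assuming SciPy; the data and starting guess are illustrative):

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    """Exponential growth model y = a * exp(b * x); nonlinear in the parameter b."""
    return a * np.exp(b * x)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0, 20)
y = 1.5 * np.exp(0.8 * x) + rng.normal(scale=0.1, size=x.size)   # synthetic noisy data

# p0 is the initial guess; a poor choice can stall the iteration or land in a local minimum
params, covariance = curve_fit(model, x, y, p0=[1.0, 1.0])
print(params)   # close to the true values [1.5, 0.8]
```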

Compare: Linear vs. Nonlinear Regression—linear regression has guaranteed unique solutions computed directly, while nonlinear regression requires iterative methods with potential convergence issues. Always try linearizing your model (e.g., taking logs) before resorting to full nonlinear fitting.


Special Approximation Techniques

These methods optimize for specific mathematical properties beyond just fitting data—minimizing maximum error, capturing frequency content, or achieving optimal convergence.

Chebyshev Approximation

  • Minimizes the maximum error (minimax criterion)—distributes error evenly rather than minimizing total squared error
  • Chebyshev polynomials $T_n(x)$ have optimal properties—their roots (the Chebyshev nodes) are near-optimal interpolation nodes that avoid Runge's phenomenon
  • Equioscillation theorem characterizes best approximations—the error of the optimal degree-$n$ polynomial attains its maximum magnitude with alternating sign at least $n+2$ times
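
A sketch of interpolation at the Chebyshev nodes, assuming NumPy; note this tames Runge's phenomenon but is not the full Remez exchange algorithm for the exact minimax polynomial:

```python
import numpy as np

def chebyshev_nodes(n, a=-1.0, b=1.0):
    """Roots of T_n mapped to [a, b]; the nodes cluster near the endpoints."""
    k = np.arange(n)
    t = np.cos((2 * k + 1) * np.pi / (2 * n))
    return 0.5 * (a + b) + 0.5 * (b - a) * t

runge = lambda x: 1.0 / (1.0 + 25.0 * x**2)
nodes = chebyshev_nodes(11)
coeffs = np.polyfit(nodes, runge(nodes), deg=10)

xs = np.linspace(-1.0, 1.0, 2001)
print(np.max(np.abs(np.polyval(coeffs, xs) - runge(xs))))   # about 0.1, versus roughly 1.9 for equispaced nodes
```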

Fourier Series Approximation

  • Represents functions as sums of sines and cosines—the coefficients $a_n, b_n$ capture the frequency content of the original function
  • Optimal for periodic functions—for piecewise-smooth functions it converges at every point of continuity; exhibits Gibbs phenomenon at jump discontinuities
  • Foundation for spectral methods—connects curve fitting to signal processing and solving differential equations
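
A sketch that approximates the coefficients with simple Riemann sums for a square wave, assuming NumPy:

```python
import numpy as np

def fourier_coeffs(f, n_terms, n_samples=2000):
    """Approximate a_n and b_n for f on [-pi, pi] using a simple Riemann sum."""
    t = np.linspace(-np.pi, np.pi, n_samples, endpoint=False)
    dt = 2.0 * np.pi / n_samples
    a = np.array([np.sum(f(t) * np.cos(n * t)) * dt / np.pi for n in range(n_terms + 1)])
    b = np.array([np.sum(f(t) * np.sin(n * t)) * dt / np.pi for n in range(n_terms + 1)])
    return a, b   # series: a[0]/2 + sum over n of a[n]*cos(n*t) + b[n]*sin(n*t)

square = lambda t: np.sign(np.sin(t))   # periodic square wave with jump discontinuities
a, b = fourier_coeffs(square, 5)
print(np.round(b, 3))   # only odd sine terms survive, roughly 4/(n*pi) for odd n
# Truncated sums overshoot near the jumps by about 9% of the jump no matter how many terms are kept (Gibbs phenomenon)
```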

Compare: Chebyshev vs. Fourier Approximation—Chebyshev minimizes the worst-case error for general functions on an interval, while Fourier is optimal for periodic functions and frequency analysis. Choose Chebyshev for polynomial approximation problems; choose Fourier when periodicity or frequency content matters.


Quick Reference Table

Concept | Best Examples
Exact interpolation through points | Lagrange, Newton's Divided Difference, Polynomial Interpolation
Avoiding oscillation/Runge's phenomenon | Spline Interpolation, Chebyshev Approximation
Piecewise construction | Spline Interpolation, Bezier Curves
Minimizing total error | Least Squares, Linear Regression, Nonlinear Regression
Efficient updates with new data | Newton's Divided Difference
Periodic function approximation | Fourier Series
Minimax (minimize maximum error) | Chebyshev Approximation
Design/graphics applications | Bezier Curves

Self-Check Questions

  1. Both Lagrange interpolation and Newton's divided difference produce the same polynomial. What computational advantage does Newton's method offer, and in what scenario does this matter most?

  2. You have 20 noisy data points and want to find a trend line. Would you choose polynomial interpolation or least squares regression? Explain your reasoning in terms of error behavior.

  3. Compare and contrast spline interpolation with high-degree polynomial interpolation. What phenomenon does spline interpolation avoid, and what mathematical property makes this possible?

  4. A function exhibits periodic behavior with multiple frequency components. Which approximation method would best capture this structure, and what mathematical objects form its basis?

  5. (FRQ-style) Given a dataset where new points are frequently added, recommend an interpolation method and justify your choice. Then explain what would change if the data contained significant measurement noise.