Study smarter with Fiveable
Curve fitting connects messy real-world data to clean mathematical models you can use for analysis, prediction, and computation. In Numerical Analysis I, you need to choose the right method for a given situation: minimizing overall error, passing exactly through data points, or avoiding numerical instabilities like oscillation. These concepts tie directly into error analysis, convergence behavior, computational efficiency, and numerical stability.
The methods below represent fundamentally different philosophies about how to approximate data. Some prioritize exactness at known points (interpolation), others prioritize minimizing total error (regression), and still others prioritize smoothness or special mathematical properties. Knowing when each method shines and why it might fail matters just as much as knowing the formulas.
These methods construct functions that pass exactly through every data point. The tradeoff: you gain precision at known locations but risk instability between them, especially as the number of points grows.
Given $n+1$ data points $(x_0, y_0), \dots, (x_n, y_n)$ with distinct $x$-values, Lagrange interpolation constructs the unique degree-$n$ interpolating polynomial directly using basis polynomials $L_i(x) = \prod_{j \neq i} \frac{x - x_j}{x_i - x_j}$. Each basis polynomial equals 1 at point $x_i$ and 0 at every other data point, so the interpolant is:

$$P_n(x) = \sum_{i=0}^{n} y_i \, L_i(x)$$
This formulation is conceptually elegant but computationally expensive: it requires $O(n^2)$ operations per evaluation and a complete recalculation if any data point changes. For that reason, Lagrange interpolation is most useful in theoretical analysis and proofs rather than large-scale computation.
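The formula translates directly into code. Here is a minimal sketch in plain Python (the function name is illustrative), assuming all $x$-values are distinct:

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x.
    Assumes all entries of xs are distinct."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis = 1.0  # L_i(x): equals 1 at x_i and 0 at every other node
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total

# Three points taken from y = x^2 + x + 1; the degree-2 interpolant recovers it.
print(lagrange_interpolate([0.0, 1.0, 2.0], [1.0, 3.0, 7.0], 3.0))  # -> 13.0
```

The nested loop makes the $O(n^2)$ cost per evaluation visible: every basis polynomial touches every node.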
Newton's method builds the same interpolating polynomial but does so recursively through a divided difference table. The polynomial takes the form:

$$P_n(x) = f[x_0] + f[x_0, x_1](x - x_0) + f[x_0, x_1, x_2](x - x_0)(x - x_1) + \cdots + f[x_0, \dots, x_n] \prod_{i=0}^{n-1} (x - x_i)$$

where $f[x_0, \dots, x_k]$ denotes the $k$th-order divided difference.
The key advantage: when a new data point arrives, you only need to compute $O(n)$ new divided differences (one new diagonal of the table) and append one more term. You don't start from scratch. This method also handles unevenly spaced data naturally, since the divided difference formulation makes no assumption about uniform spacing.
Compare: Lagrange vs. Newton's Divided Difference: both produce the same interpolating polynomial, but Newton's method is computationally superior when data points are added incrementally. If a problem asks about updating an interpolation with new data, Newton is your answer.
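The table construction is compact in code. The sketch below (hypothetical helper names) computes the coefficients $f[x_0], f[x_0, x_1], \dots$ in place and evaluates the Newton form with nested multiplication:

```python
def divided_differences(xs, ys):
    """Return the Newton coefficients f[x_0], f[x_0,x_1], ..., f[x_0,...,x_n]."""
    coeffs = list(ys)
    n = len(xs)
    for order in range(1, n):
        # Update from the bottom so lower-order differences are still available.
        for i in range(n - 1, order - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - order])
    return coeffs

def newton_eval(xs, coeffs, x):
    """Evaluate the Newton form with Horner-like nesting."""
    result = coeffs[-1]
    for i in range(len(coeffs) - 2, -1, -1):
        result = result * (x - xs[i]) + coeffs[i]
    return result

# Three points taken from y = x^2 + x + 1.
xs, ys = [0.0, 1.0, 2.0], [1.0, 3.0, 7.0]
coeffs = divided_differences(xs, ys)
print(newton_eval(xs, coeffs, 3.0))  # -> 13.0, the same polynomial as Lagrange
```

Appending a fourth data point would only extend `coeffs` by one entry rather than rebuilding the whole table, which is the incremental-update advantage in practice.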
A few foundational facts tie the interpolation methods together:

- **Uniqueness**: through $n+1$ points with distinct $x$-values there is exactly one polynomial of degree at most $n$. Lagrange and Newton are different constructions of the same polynomial.
- **Error term**: for a smooth function $f$, the interpolation error is $f(x) - P_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!} \prod_{i=0}^{n} (x - x_i)$ for some $\xi$ in the interval containing the nodes.
- **Runge's phenomenon**: high-degree interpolation on equally spaced nodes can oscillate wildly near the endpoints, which motivates splines and Chebyshev nodes.
Rather than forcing one polynomial through all points, these methods use different polynomials in different regions. This local control dramatically improves stability and smoothness.
A spline is a piecewise polynomial where the pieces are joined at connection points called knots, with continuity conditions enforced at each knot.
Cubic splines are the standard choice. Between each pair of consecutive data points, you fit a cubic polynomial, then require:

- each piece to pass through the data points at both ends of its interval,
- matching first derivatives at each interior knot ($C^1$ continuity), and
- matching second derivatives at each interior knot ($C^2$ continuity).

Two additional boundary conditions are needed to close the system. Common choices include natural splines ($S''(x_0) = S''(x_n) = 0$) and clamped splines (first derivatives specified at the endpoints).
Because each piece is only cubic, splines avoid Runge's phenomenon entirely. This makes them ideal for interpolating large datasets where a single high-degree polynomial would oscillate.
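The construction above reduces to a tridiagonal linear system for the second derivatives at the knots. A compact pure-Python sketch of a natural cubic spline (function names are illustrative; assumes strictly increasing knots and at least 3 points):

```python
import bisect

def natural_cubic_spline(xs, ys):
    """Return a callable natural cubic spline (S'' = 0 at both endpoints)."""
    n = len(xs) - 1                                   # number of intervals
    h = [xs[i + 1] - xs[i] for i in range(n)]

    # Tridiagonal system for the interior second derivatives M_1..M_{n-1};
    # the natural boundary conditions fix M_0 = M_n = 0.
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
    rhs = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
           for i in range(1, n)]
    for i in range(1, n - 1):                         # Thomas algorithm: forward sweep
        w = h[i] / diag[i - 1]
        diag[i] -= w * h[i]
        rhs[i] -= w * rhs[i - 1]
    M = [0.0] * (n + 1)
    for i in range(n - 2, -1, -1):                    # back substitution
        M[i + 1] = (rhs[i] - h[i + 1] * M[i + 2]) / diag[i]

    def spline(x):
        # Locate the interval containing x (clamped for extrapolation).
        i = max(0, min(n - 1, bisect.bisect_right(xs, x) - 1))
        t = x - xs[i]
        b = (ys[i + 1] - ys[i]) / h[i] - h[i] * (2.0 * M[i] + M[i + 1]) / 6.0
        c = M[i] / 2.0
        d = (M[i + 1] - M[i]) / (6.0 * h[i])
        return ys[i] + t * (b + t * (c + t * d))

    return spline

# Collinear data is reproduced exactly: all second derivatives come out zero.
s = natural_cubic_spline([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(s(1.5))  # -> 4.0 (the line y = 2x + 1)
```

Because the system is tridiagonal, the Thomas algorithm solves it in $O(n)$ time, which is part of why splines scale so well to large datasets.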
Bezier curves take a different approach: they're defined by control points rather than data points. The curve is influenced by all control points but only passes through the first and last. The mathematical basis is the set of Bernstein polynomials:

$$B_{i,n}(t) = \binom{n}{i} t^i (1 - t)^{n-i}, \qquad C(t) = \sum_{i=0}^{n} B_{i,n}(t) \, \mathbf{P}_i, \quad t \in [0, 1]$$
A Bezier curve with $n+1$ control points has degree $n$. Quadratic (3 control points) and cubic (4 control points) Bezier curves dominate practical applications. Moving a single control point creates a predictable, localized change to the curve shape, which is why Bezier curves are the standard in computer graphics and CAD.
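In practice, Bezier curves are usually evaluated with de Casteljau's algorithm: repeated linear interpolation between control points, which is numerically stable. A minimal 2D sketch:

```python
def de_casteljau(points, t):
    """Evaluate a Bezier curve at parameter t by repeated linear interpolation."""
    pts = list(points)
    while len(pts) > 1:
        # Interpolate each consecutive pair; one level reduces the count by one.
        pts = [((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:])]
    return pts[0]

ctrl = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0)]  # quadratic: 3 control points
print(de_casteljau(ctrl, 0.0))  # (0.0, 0.0): passes through the first point
print(de_casteljau(ctrl, 1.0))  # (2.0, 0.0): passes through the last point
print(de_casteljau(ctrl, 0.5))  # (1.0, 1.0): pulled toward, but not through, (1, 2)
```

Note how the middle control point shapes the curve without lying on it, which is exactly the interpolation-vs.-design distinction between splines and Bezier curves.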
Compare: Splines vs. Bezier Curves: splines pass through all data points while Bezier curves use control points to shape the curve. Splines excel at data fitting; Bezier curves excel at design applications where intuitive manipulation matters more than exact interpolation.
When data contains noise or you don't need to pass exactly through points, regression methods find the "best fit" by minimizing error across all observations.
The least squares method minimizes the sum of squared residuals:

$$S = \sum_{i=1}^{m} \left( y_i - f(x_i) \right)^2$$
Why squared? Squaring penalizes large errors more heavily than small ones, prevents positive and negative errors from canceling, and yields a smooth objective function with a unique minimum.
For a linear model (linear in the parameters, not necessarily in $x$), the optimal coefficients satisfy the normal equations:

$$A^{\top} A \, \mathbf{c} = A^{\top} \mathbf{y}$$

where $A$ is the design matrix, $\mathbf{c}$ is the vector of unknown coefficients, and $\mathbf{y}$ is the vector of observed $y$-values. Understanding this system is essential since it connects curve fitting directly to linear algebra.
Linear regression fits the model $y = \beta_0 + \beta_1 x$ by finding the slope and intercept that minimize squared error. The key assumptions are:

- the underlying relationship is linear,
- the errors are independent with constant variance, and
- (for inference) the errors are approximately normally distributed.
A closed-form solution exists (directly from the normal equations), so no iteration is needed. This makes linear regression fast and reliable. Violations of the assumptions, such as a clearly curved relationship or errors that grow with $x$, degrade prediction quality and should prompt you to consider a different model.
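For a straight line, the normal equations are just a $2 \times 2$ system that can be solved by hand. A sketch (function name is illustrative):

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = a*x + b from the 2x2 normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # slope
    b = (sy - a * sx) / n                           # intercept
    return a, b

# Noisy samples scattered around y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
a, b = fit_line(xs, ys)
print(a, b)  # slope near 2, intercept near 1
```

No iteration, no initial guess, no convergence criterion: the fit is one arithmetic pass over the data, which is exactly what "closed-form" buys you.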
When the relationship can't be captured by a model that is linear in the parameters, you need nonlinear regression. Models like $y = a e^{bx}$ or $y = a x^b$ fall into this category.
No closed-form solution exists, so you must use iterative optimization methods such as Gauss-Newton or Levenberg-Marquardt. These algorithms refine parameter estimates step by step, which introduces two practical concerns:

- **Initial guesses matter**: a poor starting point can send the iteration toward a local minimum or cause divergence.
- **Convergence is not guaranteed**: you must monitor the iteration, and damping (as in Levenberg-Marquardt) is often needed to stabilize it.
Compare: Linear vs. Nonlinear Regression: linear regression has a guaranteed unique solution computed directly, while nonlinear regression requires iterative methods with potential convergence issues. A useful strategy: try linearizing your model first (e.g., taking logs to turn $y = a e^{bx}$ into $\ln y = \ln a + bx$) before resorting to full nonlinear fitting.
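The log-linearization trick can be sketched in a few lines (function name is illustrative; assumes every $y$-value is positive):

```python
import math

def fit_exponential(xs, ys):
    """Fit y = a * exp(b*x) by linear least squares on ln y = ln a + b*x."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    sx, sy = sum(xs), sum(logs)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * ly for x, ly in zip(xs, logs))
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = math.exp((sy - b * sx) / n)
    return a, b

# Noise-free data from y = 3 * exp(0.5 x) is recovered up to rounding.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [3.0 * math.exp(0.5 * x) for x in xs]
a, b = fit_exponential(xs, ys)
print(a, b)  # close to (3.0, 0.5)
```

One caveat worth remembering: the transformed fit minimizes squared error in $\ln y$, not in $y$, so with noisy data it weights small $y$-values more heavily than a true nonlinear fit would.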
These methods optimize for specific mathematical properties beyond just fitting data: minimizing maximum error, capturing frequency content, or achieving optimal convergence.
Most methods minimize total error. Chebyshev approximation instead minimizes the maximum error across the entire interval (the minimax criterion). This distributes error evenly rather than allowing it to concentrate in certain regions.
The Chebyshev polynomials $T_n(x) = \cos(n \arccos x)$ on $[-1, 1]$ have remarkable properties:

- they oscillate between $-1$ and $1$, attaining those extreme values $n + 1$ times (equioscillation),
- among all monic polynomials of degree $n$, the scaled polynomial $2^{1-n} T_n$ has the smallest maximum absolute value on $[-1, 1]$, and
- interpolating at their roots (the Chebyshev nodes) suppresses Runge's phenomenon.
Use Chebyshev approximation when you need to control worst-case error, such as in function approximation for numerical libraries.
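The Chebyshev nodes follow directly from $T_n(x) = \cos(n \arccos x)$: the roots occur where the cosine argument is an odd multiple of $\pi/2$. A small sketch (function name is illustrative):

```python
import math

def chebyshev_nodes(n, a=-1.0, b=1.0):
    """Roots of the degree-n Chebyshev polynomial, mapped from [-1, 1] to [a, b]."""
    return [0.5 * (a + b) + 0.5 * (b - a) * math.cos((2 * k + 1) * math.pi / (2 * n))
            for k in range(n)]

# Since T_n(x) = cos(n * arccos x), T_n vanishes at every node.
for x in chebyshev_nodes(5):
    assert abs(math.cos(5 * math.acos(x))) < 1e-12
print(chebyshev_nodes(2))  # roots of T_2: +/- sqrt(2)/2
```

Notice that the nodes cluster near the interval's endpoints; that clustering is what tames the endpoint oscillation of equally spaced interpolation.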
Fourier series represent functions as sums of sines and cosines:

$$f(x) \approx \frac{a_0}{2} + \sum_{n=1}^{N} \left( a_n \cos nx + b_n \sin nx \right), \qquad a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos nx \, dx, \quad b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin nx \, dx$$

The coefficients $a_n$ and $b_n$ capture the frequency content of the original function. This makes Fourier approximation optimal for periodic functions: it converges to the function at every point of continuity.
At discontinuities, Fourier series exhibit the Gibbs phenomenon: the partial sums overshoot by roughly 9% of the jump size, and this overshoot persists no matter how many terms you include. Fourier methods also form the foundation for spectral methods in solving differential equations and for signal processing applications.
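The coefficient integrals can be approximated numerically. The sketch below (midpoint rule, hypothetical function name) recovers the classic square-wave result $b_n \approx 4/(n\pi)$ for odd $n$ and $b_n \approx 0$ for even $n$:

```python
import math

def fourier_coeffs(f, n_terms, n_samples=2000):
    """Approximate Fourier coefficients a_0..a_N and b_0..b_N of a
    2*pi-periodic function f using the midpoint rule on [-pi, pi]."""
    h = 2 * math.pi / n_samples
    xs = [-math.pi + (j + 0.5) * h for j in range(n_samples)]
    a = [h / math.pi * sum(f(x) * math.cos(n * x) for x in xs)
         for n in range(n_terms + 1)]
    b = [h / math.pi * sum(f(x) * math.sin(n * x) for x in xs)
         for n in range(n_terms + 1)]
    return a, b

# Odd square wave: a_n ~ 0 for all n; b_n ~ 4/(n*pi) for odd n, 0 for even n.
square = lambda x: 1.0 if x >= 0 else -1.0
a, b = fourier_coeffs(square, 5)
print(b[1], 4 / math.pi)  # the two values agree to a few decimal places
```

The slow $1/n$ decay of these coefficients is the frequency-domain signature of the jump discontinuity, and it is the root cause of the Gibbs overshoot.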
Compare: Chebyshev vs. Fourier Approximation: Chebyshev minimizes the worst-case error for general (non-periodic) functions on an interval, while Fourier is optimal for periodic functions and frequency analysis. Choose Chebyshev for polynomial approximation problems; choose Fourier when periodicity or frequency content matters.
| Concept | Best Methods |
|---|---|
| Exact interpolation through points | Lagrange, Newton's Divided Difference, Polynomial Interpolation |
| Avoiding oscillation / Runge's phenomenon | Spline Interpolation, Chebyshev Approximation |
| Piecewise construction | Spline Interpolation, Bezier Curves |
| Minimizing total error | Least Squares, Linear Regression, Nonlinear Regression |
| Efficient updates with new data | Newton's Divided Difference |
| Periodic function approximation | Fourier Series |
| Minimax (minimize maximum error) | Chebyshev Approximation |
| Design / graphics applications | Bezier Curves |
Both Lagrange interpolation and Newton's divided difference produce the same polynomial. What computational advantage does Newton's method offer, and in what scenario does this matter most?
You have 20 noisy data points and want to find a trend line. Would you choose polynomial interpolation or least squares regression? Explain your reasoning in terms of error behavior.
Compare and contrast spline interpolation with high-degree polynomial interpolation. What phenomenon does spline interpolation avoid, and what mathematical property makes this possible?
A function exhibits periodic behavior with multiple frequency components. Which approximation method would best capture this structure, and what mathematical objects form its basis?
Given a dataset where new points are frequently added, recommend an interpolation method and justify your choice. Then explain what would change if the data contained significant measurement noise.