🎲 Data, Inference, and Decisions

Regression Analysis Types


Why This Matters

Regression analysis is the backbone of statistical inference and predictive modeling—and it's everywhere on the exam. You're being tested on your ability to choose the right type of regression for a given data scenario, interpret coefficients correctly, and recognize when model assumptions are violated. Whether you're predicting continuous outcomes, classifying binary events, or handling messy real-world data with multicollinearity, understanding regression types demonstrates your grasp of model selection, assumption checking, and bias-variance tradeoffs.

Don't just memorize the names of these regression methods. Know when each type applies, what assumptions it requires, and how it handles problems like overfitting, non-linearity, and correlated predictors. An FRQ might give you a scenario and ask you to justify your model choice—that's where conceptual understanding beats rote recall every time.


Modeling Linear Relationships

These foundational methods assume that the relationship between predictors and outcomes can be captured with a straight line (or a hyperplane in higher dimensions). The key mechanism is minimizing the sum of squared residuals to find the best-fit line.

Simple Linear Regression

  • One predictor, one outcome—models the relationship between two variables using the equation $\hat{y} = b_0 + b_1x$
  • Assumes linearity and constant variance—residuals should be randomly scattered with no pattern (homoscedasticity)
  • Interpretation is straightforward—$b_1$ represents the expected change in $y$ for a one-unit increase in $x$
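
A minimal sketch of fitting and interpreting a simple linear regression on invented data (hours studied vs. exam score are hypothetical here); NumPy's polyfit is just one convenient least-squares routine:

```python
import numpy as np

# Hypothetical data: hours studied (x) and exam score (y)
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([52, 55, 61, 64, 70, 73, 79, 83], dtype=float)

# np.polyfit with degree 1 minimizes the sum of squared residuals
b1, b0 = np.polyfit(x, y, deg=1)
print(f"y_hat = {b0:.2f} + {b1:.2f} * x")  # b1: expected change in y per extra hour

# Inspect residuals for patterns (a rough homoscedasticity check)
residuals = y - (b0 + b1 * x)
print("residuals:", np.round(residuals, 2))
```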

Multiple Linear Regression

  • Multiple predictors, one outcome—extends simple regression to $\hat{y} = b_0 + b_1x_1 + b_2x_2 + \cdots + b_kx_k$
  • Controls for confounding variables—allows you to isolate the effect of one predictor while holding others constant
  • Assumes independence among residuals—also requires no perfect multicollinearity between predictors
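
A small sketch of the "holding others constant" interpretation, using scikit-learn's LinearRegression on synthetic data (the study-hours and sleep variables are invented for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 200
hours = rng.uniform(0, 10, n)            # hypothetical study hours
sleep = rng.uniform(4, 9, n)             # hypothetical hours of sleep
score = 40 + 3.0 * hours + 2.0 * sleep + rng.normal(0, 5, n)

X = np.column_stack([hours, sleep])
model = LinearRegression().fit(X, score)

# Each coefficient is the expected change in score for a one-unit increase
# in that predictor, holding the other predictor fixed.
print("intercept:", round(model.intercept_, 2))
print("coefficients (hours, sleep):", np.round(model.coef_, 2))
```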

Compare: Simple Linear Regression vs. Multiple Linear Regression—both minimize squared residuals and assume linearity, but multiple regression lets you analyze several factors simultaneously. If an FRQ asks about controlling for confounding variables, multiple regression is your answer.


Handling Non-Linear Patterns

When data curves, bends, or follows exponential/logarithmic patterns, linear models fail. These methods capture relationships where the rate of change itself changes across the range of predictors.

Polynomial Regression

  • Adds polynomial terms—models curves using $\hat{y} = b_0 + b_1x + b_2x^2 + \cdots + b_nx^n$
  • Captures curvature that linear models miss—useful when scatter plots show U-shapes, S-curves, or other non-linear patterns
  • Overfitting risk increases with degree—higher-order polynomials fit training data perfectly but generalize poorly (bias-variance tradeoff)
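
One way to sketch this (assuming scikit-learn; the U-shaped data is synthetic) is to expand $x$ into polynomial features and then fit an ordinary linear model on the expanded design:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 80).reshape(-1, 1)
y = 2 + 0.5 * x.ravel() + 1.5 * x.ravel() ** 2 + rng.normal(0, 1, 80)  # U-shaped pattern

# Degree-2 expansion: columns [x, x^2], then ordinary least squares on them
quad = make_pipeline(PolynomialFeatures(degree=2, include_bias=False),
                     LinearRegression()).fit(x, y)
print("R^2 (degree 2):", round(quad.score(x, y), 3))

# A much higher degree chases noise in the training data (overfitting risk)
wiggly = make_pipeline(PolynomialFeatures(degree=10, include_bias=False),
                       LinearRegression()).fit(x, y)
print("R^2 (degree 10):", round(wiggly.score(x, y), 3))  # higher in-sample, likely worse out of sample
```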

Nonlinear Regression

  • Flexible functional forms—can fit exponential ($y = ae^{bx}$), logarithmic ($y = a + b\ln(x)$), or power relationships
  • Requires theory-driven model selection—you must specify the functional form before estimation, unlike polynomial regression
  • Iterative estimation methods—uses algorithms like Gauss-Newton rather than closed-form solutions
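
A sketch of theory-driven nonlinear fitting, assuming SciPy is available: the exponential form $y = ae^{bx}$ is specified up front, and curve_fit estimates $a$ and $b$ iteratively from a starting guess (the data is simulated):

```python
import numpy as np
from scipy.optimize import curve_fit

def exponential(x, a, b):
    """Functional form chosen before estimation: y = a * exp(b * x)."""
    return a * np.exp(b * x)

rng = np.random.default_rng(2)
x = np.linspace(0, 4, 50)
y = 2.0 * np.exp(0.8 * x) + rng.normal(0, 1.0, 50)  # synthetic exponential growth

# curve_fit uses iterative least squares (Levenberg-Marquardt by default for
# unbounded problems), starting from p0, rather than a closed-form solution.
(a_hat, b_hat), cov = curve_fit(exponential, x, y, p0=[1.0, 0.5])
print(f"estimated: y = {a_hat:.2f} * exp({b_hat:.2f} * x)")
```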

Compare: Polynomial vs. Nonlinear Regression—polynomial regression adds powers of xx to a linear framework, while nonlinear regression fits inherently curved functions. Choose polynomial for exploratory analysis; choose nonlinear when theory suggests a specific functional form.


Predicting Categorical Outcomes

Not all dependent variables are continuous. When your outcome is binary (yes/no, success/failure), you need methods that predict probabilities bounded between 0 and 1.

Logistic Regression

  • Predicts probability of a binary outcome—models the log-odds as $\ln\left(\frac{p}{1-p}\right) = b_0 + b_1x_1 + \cdots + b_kx_k$
  • Coefficients represent log-odds ratios—$e^{b_1}$ gives the multiplicative change in odds for a one-unit increase in $x_1$
  • No normality assumption required—but assumes observations are independent and the relationship between predictors and log-odds is linear
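
A brief sketch with scikit-learn on synthetic churn-style data (the usage variable is made up); note how predictions come out as probabilities and how exponentiating a coefficient gives an odds ratio:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 500
usage = rng.normal(10, 3, n)                 # hypothetical monthly usage
log_odds = 2.0 - 0.4 * usage                 # true relationship on the log-odds scale
p = 1 / (1 + np.exp(-log_odds))
churn = rng.binomial(1, p)                   # binary outcome: 1 = churned

X = usage.reshape(-1, 1)
clf = LogisticRegression().fit(X, churn)

b1 = clf.coef_[0][0]
print("coefficient (log-odds per unit of usage):", round(b1, 3))
print("odds ratio e^b1:", round(np.exp(b1), 3))          # multiplicative change in odds
print("P(churn) at usage=5:", round(clf.predict_proba([[5.0]])[0, 1], 3))
```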

Compare: Linear vs. Logistic Regression—linear regression predicts continuous values (potentially outside 0-1), while logistic regression predicts bounded probabilities. If the outcome is binary, logistic regression is the standard choice.


Regularization for Complex Models

When you have many predictors or correlated variables, standard regression can overfit or produce unstable estimates. Regularization adds a penalty term to the loss function, shrinking coefficients toward zero.

Ridge Regression

  • L2 penalty shrinks coefficients—minimizes $\sum(y_i - \hat{y}_i)^2 + \lambda\sum b_j^2$, where $\lambda$ controls penalty strength
  • Handles multicollinearity—when predictors are highly correlated, ridge stabilizes coefficient estimates by shrinking them toward each other
  • Never eliminates variables entirely—coefficients approach but never reach zero, so all predictors remain in the model
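
A sketch of the multicollinearity point, assuming scikit-learn: two nearly identical predictors make plain OLS coefficients unstable, while ridge shrinks them toward similar, smaller values (all data here is synthetic):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(4)
n = 100
x1 = rng.normal(0, 1, n)
x2 = x1 + rng.normal(0, 0.01, n)         # nearly a copy of x1 -> severe multicollinearity
y = 3 * x1 + 3 * x2 + rng.normal(0, 1, n)
X = np.column_stack([x1, x2])

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)       # alpha plays the role of lambda

print("OLS coefficients:  ", np.round(ols.coef_, 2))    # often large and unstable
print("Ridge coefficients:", np.round(ridge.coef_, 2))  # shrunk toward each other
```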

Lasso Regression

  • L1 penalty enables variable selection—minimizes $\sum(y_i - \hat{y}_i)^2 + \lambda\sum |b_j|$, which can shrink coefficients exactly to zero
  • Produces sparse, interpretable models—automatically identifies and retains only the most important predictors
  • Better for high-dimensional data—when you have many potential predictors, lasso helps you find the essential few
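
A sketch of lasso's sparsity, again with scikit-learn on synthetic data: only a few of many predictors truly matter, and the L1 penalty zeroes out most of the rest:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(5)
n, p = 200, 30
X = rng.normal(0, 1, (n, p))
# Only the first three predictors truly affect y; the other 27 are noise
y = 4 * X[:, 0] - 2 * X[:, 1] + 3 * X[:, 2] + rng.normal(0, 1, n)

lasso = Lasso(alpha=0.1).fit(X, y)      # alpha is the lambda penalty strength
kept = np.flatnonzero(lasso.coef_)      # indices of predictors with nonzero coefficients
print("predictors kept by lasso:", kept)
print("their coefficients:", np.round(lasso.coef_[kept], 2))
```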

Compare: Ridge vs. Lasso Regression—both prevent overfitting through regularization, but ridge keeps all variables (shrunk) while lasso performs automatic variable selection. Use lasso when you suspect many predictors are irrelevant; use ridge when all predictors likely matter but are correlated.


Model Selection Strategies

Sometimes the challenge isn't fitting a model—it's deciding which predictors to include. These methods systematically search for the best subset of variables.

Stepwise Regression

  • Iteratively adds or removes predictors—forward selection starts empty and adds variables; backward elimination starts full and removes them
  • Uses statistical criteria for decisions—typically based on p-values, AIC, or BIC at each step
  • Useful but imperfect—can miss the globally optimal model and may capitalize on chance; best used for exploratory analysis rather than confirmatory testing
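
Forward selection is easy to sketch by hand. Assuming statsmodels and pandas, this toy loop adds, at each step, whichever remaining predictor lowers AIC the most, and stops when no candidate improves the fit (the data and column names are invented):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 300
df = pd.DataFrame(rng.normal(0, 1, (n, 5)), columns=["x1", "x2", "x3", "x4", "x5"])
y = 2 * df["x1"] - 3 * df["x3"] + rng.normal(0, 1, n)   # only x1 and x3 matter

def aic_of(cols):
    """AIC of an OLS model of y on an intercept plus the given columns."""
    X = sm.add_constant(df[cols]) if cols else np.ones((n, 1))
    return sm.OLS(y, X).fit().aic

selected, remaining = [], list(df.columns)
best_aic = aic_of(selected)
while remaining:
    # Try adding each remaining predictor; keep the one that lowers AIC the most
    scores = {c: aic_of(selected + [c]) for c in remaining}
    best_col = min(scores, key=scores.get)
    if scores[best_col] >= best_aic:
        break                           # no candidate improves AIC -> stop
    selected.append(best_col)
    remaining.remove(best_col)
    best_aic = scores[best_col]

print("selected predictors:", selected)
```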

Compare: Stepwise vs. Lasso Regression—both simplify models by reducing predictors, but stepwise uses discrete add/remove decisions while lasso uses continuous shrinkage. Lasso is generally preferred in modern practice because it's less prone to overfitting.


Specialized Applications

Some data structures require tailored regression approaches. These methods address specific challenges like time dependence or distributional asymmetry.

Time Series Regression

  • Accounts for temporal structure—incorporates trends, seasonality, and lagged values of the dependent variable
  • Addresses autocorrelation—standard regression assumes independent errors, but time series data often has correlated residuals
  • Essential for forecasting—predicts future values based on historical patterns and predictor trajectories
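
A sketch of one common formulation, assuming pandas and statsmodels: regress the series on a time trend plus its own lagged value, which absorbs much of the autocorrelation that plain cross-sectional OLS would ignore (the series is simulated):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 120
t = np.arange(n)
# Simulated monthly series: trend + persistence (autocorrelation) + noise
y = np.zeros(n)
for i in range(1, n):
    y[i] = 0.5 * t[i] + 0.6 * y[i - 1] + rng.normal(0, 2)

df = pd.DataFrame({"y": y, "trend": t})
df["y_lag1"] = df["y"].shift(1)          # lagged dependent variable
df = df.dropna()

X = sm.add_constant(df[["trend", "y_lag1"]])
fit = sm.OLS(df["y"], X).fit()
print(fit.params.round(3))               # intercept, trend, and lag coefficients
```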

Quantile Regression

  • Estimates conditional quantiles, not means—predicts the median, 25th percentile, 90th percentile, or any other quantile of $y$
  • Robust to outliers—unlike OLS, which is heavily influenced by extreme values, quantile regression provides stable estimates across the distribution
  • Reveals heterogeneous effects—shows how predictors affect different parts of the outcome distribution differently (e.g., does education boost income more at the top or bottom?)
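
A sketch using statsmodels' quantile regression (smf.quantreg) on synthetic heteroscedastic data: fitting the 10th, 50th, and 90th percentiles lets the slope differ across the outcome distribution (the education/income variables are invented for illustration):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(8)
n = 1000
educ = rng.uniform(8, 20, n)                          # hypothetical years of education
# Heteroscedastic "income": spread grows with education, so slopes differ by quantile
income = 10 + 2 * educ + rng.normal(0, 1 + 0.5 * educ, n)
df = pd.DataFrame({"income": income, "educ": educ})

for q in (0.10, 0.50, 0.90):
    fit = smf.quantreg("income ~ educ", df).fit(q=q)
    print(f"q={q:.2f}: slope on educ = {fit.params['educ']:.2f}")
```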

Compare: OLS vs. Quantile Regression—OLS estimates the conditional mean and assumes homoscedasticity, while quantile regression estimates any conditional quantile and handles heterogeneous variance. Use quantile regression when you care about effects beyond the average or when outliers are a concern.


Quick Reference Table

| Concept | Best Examples |
|---|---|
| Continuous outcome, linear relationship | Simple Linear Regression, Multiple Linear Regression |
| Non-linear patterns | Polynomial Regression, Nonlinear Regression |
| Binary/categorical outcome | Logistic Regression |
| Preventing overfitting | Ridge Regression, Lasso Regression |
| Variable selection | Lasso Regression, Stepwise Regression |
| Correlated predictors (multicollinearity) | Ridge Regression |
| Time-dependent data | Time Series Regression |
| Robust to outliers / distributional analysis | Quantile Regression |

Self-Check Questions

  1. You have a dataset with 50 predictors, many of which are likely irrelevant. Which regression method would simultaneously fit the model and identify the most important variables?

  2. Compare and contrast Ridge and Lasso regression: What do they share in common, and when would you choose one over the other?

  3. A researcher wants to predict whether a customer will churn (yes/no) based on usage patterns. Why would linear regression be inappropriate, and what method should they use instead?

  4. Your scatter plot shows a clear U-shaped relationship between study hours and test anxiety. Which two regression types could capture this pattern, and how do they differ in approach?

  5. An economist studying income inequality wants to understand how education affects earnings at the 10th, 50th, and 90th percentiles of the income distribution. Which regression method is designed for this purpose, and why is it preferable to standard OLS here?