Why This Matters
Linear regression coefficients are the building blocks of every model you'll construct and interpret in this course. You're not just being tested on definitions: exams expect you to understand how coefficients work together to describe relationships, quantify uncertainty, and evaluate model quality. These concepts connect directly to hypothesis testing, model comparison, diagnostics, and prediction, making them foundational for everything from simple bivariate analysis to complex multiple regression.
When you encounter a regression output, read it like a story: the coefficients tell you what's happening, the standard errors and confidence intervals tell you how certain you can be, and the fit statistics tell you how well the model captures reality. Don't just memorize formulas. Know what each coefficient reveals about the underlying data and when to use each metric to answer different analytical questions.
Model Parameters: The Core Relationship
These coefficients define the actual regression line and tell you what the model predicts. They're the heart of your equation: ŷ=β₀+β₁x.
Intercept (β₀)
- Baseline value: the expected value of y when all independent variables equal zero
- Anchors the regression line by setting its vertical position on the coordinate plane
- Interpretation caveat: only meaningful if x=0 falls within the realistic range of your data. If your predictor is something like "years of experience" and nobody in your sample has zero years, the intercept is just a mathematical anchor, not a substantive finding.
Slope (β₁)
- Rate of change: quantifies how much y changes for each one-unit increase in x, holding all other predictors constant in multiple regression
- Sign indicates direction: positive slopes show direct relationships; negative slopes show inverse relationships
- Magnitude matters for comparing effect sizes, but only across standardized variables. Comparing raw slopes across predictors measured on different scales is misleading.
Compare: Intercept (β₀) vs. Slope (β₁): both define the regression equation, but the intercept sets the starting point while the slope determines the trajectory. If asked to "interpret the regression equation," address both coefficients separately with context (units and variable names).
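To make the interpretation concrete, here is a minimal sketch (with made-up experience/salary data; the variable names are illustrative) that computes both coefficients from the least-squares formulas:

```python
import numpy as np

# Made-up data: years of experience (x) vs. salary in $1,000s (y)
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([32, 35, 41, 44, 50, 53, 59, 62], dtype=float)

# Least-squares estimates: beta1 = Sxy / Sxx, beta0 = ybar - beta1 * xbar
x_bar, y_bar = x.mean(), y.mean()
beta1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
beta0 = y_bar - beta1 * x_bar

print(f"y_hat = {beta0:.2f} + {beta1:.2f} * x")
# Slope: each extra unit of x is associated with about beta1 more units of y.
# Intercept: predicted y at x = 0 -- only meaningful if x = 0 is realistic for your data.
```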
Uncertainty Quantification: How Precise Are Your Estimates?
These metrics tell you how much your coefficient estimates might vary from sample to sample. They're essential for distinguishing real effects from statistical noise.
Standard Error of Coefficients
- Precision measure: quantifies the variability of coefficient estimates across repeated sampling
- Smaller is better: low standard errors indicate your estimates are stable and reliable
- Foundation for inference: used to construct confidence intervals and calculate t-statistics
The standard error depends on several things: sample size, the variance of the residuals, and the spread of your predictor variable. More data and less noise both shrink it.
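A short sketch of that dependence, again on made-up data, computing the slope's standard error directly from the residuals:

```python
import numpy as np

# Hypothetical data (any paired x, y arrays would do)
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([32, 35, 41, 44, 50, 53, 59, 62], dtype=float)
n = len(x)

# Fit by least squares
x_bar = x.mean()
beta1 = np.sum((x - x_bar) * (y - y.mean())) / np.sum((x - x_bar) ** 2)
beta0 = y.mean() - beta1 * x_bar
residuals = y - (beta0 + beta1 * x)

# Residual standard error s, then the slope's standard error:
# SE(beta1) = s / sqrt(sum((x - x_bar)^2)), with s^2 = SSE / (n - 2)
s = np.sqrt(np.sum(residuals ** 2) / (n - 2))
se_beta1 = s / np.sqrt(np.sum((x - x_bar) ** 2))
print(f"SE(beta1) = {se_beta1:.3f}")
# More data (larger n), less noise (smaller s), and a wider spread in x all shrink this.
```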
Confidence Intervals
A confidence interval gives you a range of plausible values for the true population parameter. A 95% CI means that if you repeated the sampling process many times, about 95% of the intervals constructed this way would contain the true parameter.
- Width indicates precision: narrow intervals suggest reliable estimates; wide intervals signal uncertainty
- Zero-exclusion test: if a 95% CI for β₁ doesn't contain zero, the coefficient is statistically significant at α=0.05
- Formula: β̂ ± t*×SE(β̂), where t* is the critical value for your chosen confidence level and degrees of freedom
Compare: Standard Error vs. Confidence Interval: the standard error is a single number measuring variability, while the confidence interval uses that standard error (plus a critical value) to create a range. Both assess precision, but CIs are more interpretable for communicating uncertainty.
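As a worked sketch of the CI formula above, with placeholder values standing in for your own regression output:

```python
from scipy import stats

# Placeholder estimates -- substitute the numbers from your own output
beta1_hat = 3.2      # estimated slope
se_beta1 = 0.9       # its standard error
n = 30               # sample size; df = n - 2 in simple linear regression

# Critical value t* for a 95% CI, then the interval itself
t_star = stats.t.ppf(0.975, df=n - 2)
lower = beta1_hat - t_star * se_beta1
upper = beta1_hat + t_star * se_beta1
print(f"95% CI for beta1: [{lower:.2f}, {upper:.2f}]")
# If this interval excludes zero, beta1 is significant at alpha = 0.05.
```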
Hypothesis Testing: Is the Effect Real?
These statistics help you determine whether your coefficients reflect genuine relationships or could have occurred by chance. The logic follows: estimate → standardize → evaluate probability.
t-Statistic
- Standardized coefficient: calculated as t=β̂/SE(β̂), measuring how many standard errors the coefficient is from zero
- Larger absolute values indicate stronger evidence against the null hypothesis (H₀: β=0)
- Degrees of freedom matter: in simple linear regression, df=n−2. Critical values depend on this, especially in small samples where the t-distribution has heavier tails than the normal distribution.
p-Value
- Probability of extremity: the probability of observing a test statistic as extreme as (or more extreme than) yours, assuming the null hypothesis is true
- Decision threshold: typically reject H₀ when p<0.05, indicating statistical significance
- Not effect size: a tiny p-value doesn't mean a large or important effect. With a huge sample, even trivially small effects become "significant." Always pair p-values with effect size measures.
Compare: t-Statistic vs. p-Value: the t-statistic measures how far your estimate is from zero in standard error units, while the p-value converts that distance into a probability. They contain the same information, just expressed differently. Always report both: t tells the magnitude of evidence, p frames the decision.
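A quick sketch tying the two together; the slope, standard error, and sample size below are placeholder values:

```python
from scipy import stats

# Placeholder regression output -- substitute your own values
beta1_hat = 2.5
se_beta1 = 0.5
n = 25                      # df = n - 2 = 23 in simple linear regression

# t-statistic: how many standard errors the estimate is from zero
t = beta1_hat / se_beta1    # = 5.0

# Two-sided p-value under H0: beta1 = 0
p = 2 * stats.t.sf(abs(t), df=n - 2)
print(f"t = {t:.2f}, p = {p:.4g}")
# A large |t| (here 5 SEs from zero) gives a tiny p, but says nothing about effect size.
```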
Model Fit: How Well Does the Model Work?
These statistics evaluate whether your model captures meaningful variation in the data. They answer: "Is this model actually useful?"
R-Squared (R²)
- Proportion of variance explained: ranges from 0 to 1, with higher values indicating better explanatory power
- Interpretation: an R²=0.75 means 75% of the variation in y is accounted for by the model
- Limitation: R² always increases (or stays the same) when you add predictors, even useless ones. This makes it unreliable for comparing models of different sizes.
In simple linear regression, R² equals the square of the Pearson correlation coefficient r. That connection breaks down in multiple regression, where R² reflects the joint explanatory power of all predictors together.
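A quick numerical check of that identity, using made-up data and a hand-computed fit:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([32, 35, 41, 44, 50, 53, 59, 62], dtype=float)

# R^2 = 1 - SSE / SST from a simple least-squares fit
beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0 = y.mean() - beta1 * x.mean()
sse = np.sum((y - (beta0 + beta1 * x)) ** 2)
sst = np.sum((y - y.mean()) ** 2)
r_squared = 1 - sse / sst

# Pearson correlation r, squared
r = np.corrcoef(x, y)[0, 1]
print(np.isclose(r_squared, r ** 2))  # True in simple linear regression
```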
Adjusted R-Squared
- Penalized fit measure: adjusts R² downward based on the number of predictors relative to sample size
- Model comparison tool: use this instead of R² when comparing models with different numbers of variables
- Can decrease if a new predictor doesn't improve fit enough to justify its inclusion, which is exactly the behavior you want for model selection
F-Statistic
- Overall model test: evaluates whether the regression model explains significantly more variance than a model with no predictors (the intercept-only model)
- Calculated as the ratio of mean explained variance to mean unexplained variance: F=MSR/MSE, where MSR is the mean square due to regression and MSE is the mean square error
- Complements individual t-tests: the F-test evaluates the model as a whole, while t-statistics test each coefficient separately. In simple linear regression with one predictor, F=t².
Compare: R² vs. Adjusted R²: both measure fit, but R² is optimistic (never decreases with more predictors) while adjusted R² penalizes complexity. For model selection, always prefer adjusted R².
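A sketch that computes all three fit statistics from their definitions for a single-predictor fit on made-up data, and confirms the F=t² identity:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([32, 35, 41, 44, 50, 53, 59, 62], dtype=float)
n, p = len(x), 1            # one predictor

# Simple least-squares fit
beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0 = y.mean() - beta1 * x.mean()
y_hat = beta0 + beta1 * x

sse = np.sum((y - y_hat) ** 2)          # unexplained variation
ssr = np.sum((y_hat - y.mean()) ** 2)   # explained variation
sst = sse + ssr

r2 = ssr / sst
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)   # penalizes extra predictors

msr = ssr / p                # mean square due to regression
mse = sse / (n - p - 1)      # mean square error
f_stat = msr / mse

t_slope = beta1 / np.sqrt(mse / np.sum((x - x.mean()) ** 2))
print(f"R2={r2:.3f}, adj R2={adj_r2:.3f}, F={f_stat:.1f}, t^2={t_slope**2:.1f}")
# With one predictor, F equals t^2 (up to floating-point rounding).
```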
Diagnostics: Is Something Wrong?
Diagnostic statistics help you identify problems that could invalidate your model's assumptions or distort your results.
Variance Inflation Factor (VIF)
Multicollinearity occurs when predictor variables are highly correlated with each other. The VIF measures how much the variance of a coefficient estimate is inflated because of this correlation.
- Rule of thumb: VIF > 10 signals problematic multicollinearity; some use VIF > 5 as a warning threshold. A VIF of 1 means no collinearity at all for that predictor.
- Consequences of ignoring: inflated standard errors, unstable coefficients that swing wildly between samples, and unreliable hypothesis tests
- Calculated for each predictor by regressing that predictor on all the others: VIF_j=1/(1−R_j²), where R_j² is the R-squared from regressing predictor j on the remaining predictors
Compare: VIF vs. Standard Error: both increase when multicollinearity is present, but VIF specifically isolates the multicollinearity problem while standard errors can be inflated for other reasons (small sample size, high residual variance).
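A minimal sketch of the VIF calculation straight from that definition, using simulated predictors where one is nearly a linear combination of the others:

```python
import numpy as np

# Made-up predictor matrix: x3 is almost exactly x1 + x2, so all three are collinear
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + x2 + rng.normal(scale=0.1, size=n)
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    predictor j on all the other predictors (with an intercept)."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2_j = 1 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    return 1 / (1 - r2_j)

for j in range(X.shape[1]):
    print(f"VIF for predictor {j + 1}: {vif(X, j):.1f}")
# Values far above 10 for all three predictors flag severe multicollinearity.
```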
Quick Reference Table
| Purpose | Key statistics |
| --- | --- |
| Model parameters | Intercept (β₀), Slope (β₁) |
| Precision of estimates | Standard Error, Confidence Intervals |
| Significance testing | t-Statistic, p-Value |
| Overall model fit | R², Adjusted R², F-Statistic |
| Multicollinearity diagnosis | VIF |
| Coefficient interpretation | Slope (direction/magnitude), Intercept (baseline) |
| Model comparison | Adjusted R², F-Statistic |
Self-Check Questions
- If a 95% confidence interval for β₁ is [0.23, 0.89], what can you conclude about the coefficient's statistical significance at α=0.05? Why?
- Compare and contrast R² and adjusted R²: when would these two statistics lead you to different conclusions about model quality?
- A regression output shows a slope of 2.5 with a standard error of 0.5. Calculate the t-statistic and explain what it tells you about the relationship.
- Which two statistics would you examine first if you suspected multicollinearity was inflating your standard errors? What values would concern you?
- You're asked to "interpret the regression equation ŷ=12.4+3.2x in context." What specific information must you include for both the intercept and slope to earn full credit?