Hypothesis Testing for Regression Coefficients
Hypothesis testing for regression coefficients lets you determine whether a predictor variable has a statistically significant linear relationship with the response variable, or whether the observed association could plausibly be due to chance. This is the foundation of inference in simple linear regression: without it, you have estimates but no way to judge whether those estimates reflect real patterns in the population.
Formulating Null and Alternative Hypotheses
The null hypothesis claims that the true population coefficient is zero, meaning the predictor has no linear effect on the response:

H₀: β₁ = 0

where β₁ is the population coefficient for the predictor. If this is true, knowing the predictor's value tells you nothing (linearly) about the response.
The alternative hypothesis claims the coefficient is not zero, meaning a linear relationship does exist:

H₁: β₁ ≠ 0

This is a two-sided test. Sometimes you have a directional prediction, which calls for a one-sided alternative:
- H₁: β₁ > 0 if you expect a positive relationship
- H₁: β₁ < 0 if you expect a negative relationship
For example, if a researcher believes that increased advertising expenditure leads to higher sales, the appropriate alternative would be H₁: β₁ > 0. One-sided tests are more powerful in the hypothesized direction but can't detect effects in the opposite direction, so only use them when you have a strong prior reason.
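The practical difference between the two-sided and one-sided alternatives shows up in how the p-value is computed. A minimal sketch using SciPy's t-distribution (the t-statistic and degrees of freedom here are made-up illustrative values):

```python
from scipy import stats

t_stat = 2.2   # hypothetical t-statistic for the estimated slope
df = 23        # degrees of freedom (n - 2 in simple linear regression)

# Two-sided test (H1: beta1 != 0): probability mass in both tails
p_two_sided = 2 * stats.t.sf(abs(t_stat), df)

# One-sided test (H1: beta1 > 0): probability in the upper tail only
p_upper = stats.t.sf(t_stat, df)

print(f"two-sided p = {p_two_sided:.4f}, one-sided p = {p_upper:.4f}")
```

For a positive t-statistic, the one-sided p-value is half the two-sided one, which is exactly why the one-sided test is more powerful in the hypothesized direction.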
Conducting Hypothesis Tests Using t-Tests
The t-test for a regression coefficient compares the estimated coefficient to the variability you'd expect from sampling alone. Here's the process:
- Compute the test statistic. The t-statistic measures how many standard errors the estimate is from zero:
  t = β̂₁ / SE(β̂₁)
  where β̂₁ is the estimated coefficient and SE(β̂₁) is its standard error.
- Understand the standard error. The standard error of β̂₁ captures how much the estimate would vary across repeated samples. It depends on the variance of the residuals and the spread of the predictor values. Larger residual variance inflates the standard error; greater spread in the predictor shrinks it.
- Determine the degrees of freedom. For the t-distribution, degrees of freedom equal n − k − 1, where n is the number of observations and k is the number of predictors. In simple linear regression, k = 1, so degrees of freedom are n − 2.
- Compare to the critical value or compute a p-value. If the absolute value of the t-statistic exceeds the critical value for your chosen α and degrees of freedom, you reject H₀. For instance, with α = 0.05, n = 25, and k = 1, the degrees of freedom are 25 − 1 − 1 = 23, and the two-tailed critical value is approximately 2.069.
Note: Since this is a simple linear regression course, you'll typically have k = 1. The formula generalizes to multiple regression, but for now, think of it as n − 2.
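The full process above can be sketched end to end in Python. The data below are invented for illustration; everything is computed from the formulas in the steps, then the result can be cross-checked against a library routine such as `scipy.stats.linregress`:

```python
import numpy as np
from scipy import stats

# Hypothetical data: advertising spend (x) vs. sales (y)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.1, 2.9, 3.6, 4.8, 5.1, 6.3, 6.8, 8.2])
n = len(x)

# Least-squares estimates for the slope and intercept
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# Residual variance, using n - 2 degrees of freedom (k = 1 predictor)
residuals = y - (b0 + b1 * x)
s2 = np.sum(residuals ** 2) / (n - 2)

# Standard error of the slope: larger residual variance inflates it,
# greater spread in x shrinks it
se_b1 = np.sqrt(s2 / np.sum((x - x.mean()) ** 2))

# t-statistic and two-sided p-value against H0: beta1 = 0
t_stat = b1 / se_b1
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)

print(f"b1 = {b1:.3f}, SE = {se_b1:.3f}, t = {t_stat:.2f}, p = {p_value:.6f}")
```

In practice you would rarely compute these by hand, but seeing the pieces laid out makes it clear where the t-statistic, the degrees of freedom, and the p-value each come from.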

Interpreting Hypothesis Test Results
Rejecting or Failing to Reject the Null Hypothesis
If you reject H₀: There is sufficient evidence that β₁ ≠ 0. The predictor has a statistically significant linear relationship with the response. Changes in the predictor are associated with changes in the response beyond what you'd expect from random variation alone.
If you fail to reject H₀: The data don't provide strong enough evidence to conclude that β₁ ≠ 0. This does not mean the predictor is unrelated to the response. It means the signal wasn't strong enough relative to the noise in your data. A small sample size, high variability, or a genuinely weak effect could all produce this result.

Understanding the Coefficient's Sign and Magnitude
The sign of β̂₁ tells you the direction of the linear relationship:
- Positive coefficient: as the predictor increases, the response tends to increase (direct relationship)
- Negative coefficient: as the predictor increases, the response tends to decrease (inverse relationship)
The magnitude tells you the size of the effect. Specifically, β̂₁ represents the expected change in the response for a one-unit increase in the predictor, holding all else constant. If the coefficient for "age" is 0.5, then each additional year of age is associated with a 0.5-unit increase in the response.
Keep in mind that a coefficient can be large in magnitude but not statistically significant (if the standard error is also large), or small in magnitude but highly significant (if estimated very precisely). Statistical significance and practical importance are separate questions.
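The separation between magnitude and significance falls directly out of the t-statistic formula t = β̂₁ / SE(β̂₁). A quick sketch with invented coefficients and standard errors:

```python
from scipy import stats

df = 23  # hypothetical degrees of freedom

# Large coefficient but large standard error: imprecise, not significant
t_big = 4.0 / 2.5            # t = 1.6
p_big = 2 * stats.t.sf(abs(t_big), df)

# Small coefficient but tiny standard error: precise, highly significant
t_small = 0.05 / 0.01        # t = 5.0
p_small = 2 * stats.t.sf(abs(t_small), df)

print(f"large effect:  t = {t_big:.2f}, p = {p_big:.3f}")
print(f"small effect:  t = {t_small:.2f}, p = {p_small:.6f}")
```

The coefficient of 4.0 fails to reach significance while the coefficient of 0.05 is overwhelmingly significant, because the test responds to the ratio of estimate to standard error, not to the estimate alone.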
Significance of Regression Coefficients
Using P-Values to Determine Significance
The p-value is the probability of observing a t-statistic as extreme as (or more extreme than) the one you calculated, assuming the null hypothesis is true. It quantifies how surprising your data are under H₀.
- A small p-value (less than α) means the observed result would be unlikely if β₁ were truly zero. You reject H₀ and conclude the coefficient is statistically significant. For example, if α = 0.05 and the p-value is 0.02, you reject H₀.
- A large p-value (greater than α) means the observed result is reasonably consistent with H₀. You fail to reject H₀. For example, if α = 0.05 and the p-value is 0.15, you do not reject H₀.
The p-value does not tell you the probability that H₀ is true. It tells you how likely the observed data (or something more extreme) would be if H₀ were true. That distinction matters.
Choosing an Appropriate Significance Level
The significance level α is the threshold you set before testing. It represents the maximum probability of a Type I error, which is rejecting H₀ when it's actually true (a false positive).
Common choices:
- α = 0.01: Stringent. Requires very strong evidence to reject H₀. Used when false positives carry serious consequences, such as in medical research or regulatory decisions.
- α = 0.05: The most common default across many fields. Balances the risk of false positives and false negatives.
- α = 0.10: More lenient. Useful in exploratory analyses or when missing a real effect (Type II error) is more costly than a false alarm.
There's a tradeoff: lowering α reduces your Type I error rate but increases your Type II error rate (failing to detect a real effect). The right choice depends on the stakes involved and the goals of your analysis.
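One way to see what α actually controls is to simulate data where H₀ is true (x and y are generated independently, so the true slope is zero) and count how often a test at α = 0.05 rejects anyway. A toy sketch with an arbitrary seed:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.05
n, n_sims = 30, 2000
rejections = 0

for _ in range(n_sims):
    x = rng.normal(size=n)
    y = rng.normal(size=n)  # y is unrelated to x, so H0 (beta1 = 0) holds
    if stats.linregress(x, y).pvalue < alpha:
        rejections += 1

# Every rejection here is a false positive; the rate should hover near alpha
print(f"Type I error rate: {rejections / n_sims:.3f}")
```

The observed false positive rate lands near 0.05, which is exactly the guarantee α provides: over many tests of a true null, about an α fraction will reject by chance.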