TLDR
A confidence interval for a regression slope gives you a range of plausible values for the true population slope, built from your sample data as b ± t*(SE_b). You use a t-interval with df = n - 2, check the linear regression conditions first, and interpret the interval in context. On the AP exam, a graphing calculator (LinRegTInt) handles the arithmetic so you can focus on conditions and interpretation.

Why This Matters for the AP Statistics Exam
Inference for slopes shows up in Unit 9, which carries roughly 2-5% of the multiple-choice section, but the thinking connects to every inference unit in the course. When you build a slope interval, you practice the same workflow you use for proportions and means: identify the right procedure, check conditions, calculate, then interpret. That workflow appears on both multiple-choice and free-response questions.
You also need to read computer (regression) output, because the exam often hands you a software table instead of raw data. Being able to pull the slope, the standard error of the slope, s, and df from that output is a skill that pays off across the unit. A slope interval can also serve as an entry point into a larger free-response question before the harder parts.
Key Takeaways
- The point estimate is the sample slope b, and the interval is b ± t*(SE_b).
- Use a t-interval for the slope with degrees of freedom df = n - 2.
- The standard error of the slope is SE_b = s / (s_x √(n - 1)), where s is the standard deviation of the residuals.
- The residual standard deviation is s = √(Σ(y_i - ŷ_i)² / (n - 2)); it uses n - 2 because two parameters (α and β) are estimated.
- Check four conditions before building the interval: linearity, equal standard deviation of y across x, independence, and approximate normality.
- Larger sample sizes tend to produce narrower intervals when everything else stays the same.
The Setup: From Sample Line to Population Slope
In linear regression, the sample regression line ŷ = a + bx is an estimate of the population regression line μ_y = α + βx. The value you actually care about for inference is β, the true slope for the whole population.
Your sample slope b is just one estimate. If you collected a new sample, you would likely get a slightly different slope. Across many samples, those slopes form an approximately normal sampling distribution centered at the true slope β. That variability is exactly why a single number is not enough, and why a confidence interval is useful: it gives a range of plausible values for β instead of one point.
Point Estimate
The center of your interval is the point estimate, which is the slope of your sample's least-squares line, b. You calculate it from your data (the same regression skills from Unit 2). From there you add and subtract a margin of error to create a buffer around that estimate.
Margin of Error
The margin of error is the critical value times the standard error of the slope:
ME = t* × SE_b
Your critical value t* depends on the confidence level and the degrees of freedom, df = n - 2. The standard error of the slope is:
SE_b = s / (s_x √(n - 1))
where s is the standard deviation of the residuals and s_x is the sample standard deviation of the x-values.
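The two formulas above can be sketched in a few lines of Python. This is only an illustration: the summary values are hypothetical, the function names are made up for this example, and t* is assumed to be read from a t-table rather than computed.

```python
import math

def slope_standard_error(s, s_x, n):
    """SE_b = s / (s_x * sqrt(n - 1))."""
    return s / (s_x * math.sqrt(n - 1))

def margin_of_error(t_star, se_b):
    """ME = t* x SE_b."""
    return t_star * se_b

# Hypothetical summary values (not from a real data set).
s = 2.0         # standard deviation of the residuals
s_x = 4.0       # sample standard deviation of the x-values
n = 10          # sample size, so df = n - 2 = 8
t_star = 2.306  # t* for 95% confidence with df = 8, read from a t-table

se_b = slope_standard_error(s, s_x, n)
me = margin_of_error(t_star, se_b)
print(f"SE_b = {se_b:.4f}, ME = {me:.4f}")
# prints: SE_b = 0.1667, ME = 0.3843
```

Notice that n sits under a square root in the denominator, so increasing n shrinks SE_b and the margin of error when s and s_x stay about the same.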
Putting It Together
The full interval is:
b ± t*(SE_b)
This formula is on the formula sheet, but it is cumbersome to compute by hand. On the AP exam, using a graphing calculator with LinRegTInt (entering your data in L1 and L2) is faster and reduces arithmetic errors, so you can spend your time on conditions and interpretation.
Standard Deviation of the Residuals
The residuals from the sample line are the differences between observed and predicted y-values, y_i - ŷ_i. These residuals estimate how far response values fall from the population regression line.
The standard deviation of the residuals, s, measures how spread out those residuals are:
s = √(Σ(y_i - ŷ_i)² / (n - 2))
It uses n - 2 in the denominator because two parameters, α and β, must be estimated to get the predicted values. This s is your estimate of σ, the standard deviation of the responses about the population regression line, and it feeds directly into the standard error.
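With s defined, every piece of the interval can be computed from raw data. The sketch below does by hand roughly what a calculator's slope interval does. The five data points are hypothetical, the function name is made up for this example, and t* = 3.182 (95% confidence, df = 3) is assumed to come from a t-table.

```python
import math

def slope_interval(x, y, t_star):
    """Return (b, s, se_b, lower, upper) for the interval b ± t*(SE_b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx          # sample slope
    a = y_bar - b * x_bar  # sample intercept
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Standard deviation of the residuals: divide by n - 2, not n - 1.
    s = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))
    s_x = math.sqrt(sxx / (n - 1))  # sample standard deviation of x
    se_b = s / (s_x * math.sqrt(n - 1))
    me = t_star * se_b
    return b, s, se_b, b - me, b + me

# Hypothetical data, chosen so the arithmetic is easy to follow.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b, s, se_b, lower, upper = slope_interval(x, y, t_star=3.182)
print(f"b = {b:.3f}, s = {s:.3f}, SE_b = {se_b:.3f}")
print(f"95% interval: ({lower:.3f}, {upper:.3f})")
# prints: b = 0.600, s = 0.894, SE_b = 0.283
# prints: 95% interval: (-0.300, 1.500)
```

With only five points the interval is wide and includes 0, which is what you would expect from such a small sample.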
Conditions to Check
The appropriate interval for a population slope is a t-interval, based on the t-distribution. Before you build it, confirm these conditions.
1. Linear
The true relationship between x and y should be linear. Check this with a residual plot: if there is no clear pattern in the residuals, linearity is reasonable.
2. Constant Standard Deviation of y
The standard deviation of y should not change as x changes. On the residual plot, the points should not fan out or shrink as you move across the x-axis. You are looking for roughly even spread.
3. Independence
Check independence two ways:
- Data came from a random sample or a randomized experiment.
- The 10% condition: when sampling without replacement, n ≤ 0.10N.
4. Normality
For a particular value of x, the responses (y-values) should be approximately normal. You can check this with graphical representations of the residuals. If the distribution is skewed, the sample size should be greater than 30.
Once all four conditions check out, you can construct and interpret your interval.
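Two of these checks involve simple arithmetic, so they are easy to sketch in code: computing the residuals you would put on a residual plot, and testing the 10% condition. The data, the population size of 500, and the function names below are all hypothetical. The sketch does not replace looking at the plot itself.

```python
def residuals(x, y):
    """Residuals y_i - ŷ_i from the least-squares line fitted to (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    a = y_bar - b * x_bar
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def ten_percent_condition(n, population_size):
    """True when n <= 0.10 * N (sampling without replacement)."""
    return n <= 0.10 * population_size

# Hypothetical sample of 8 points from a population of 500.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

res = residuals(x, y)
for xi, ri in zip(x, res):
    # Plot these pairs: look for no pattern and roughly even spread.
    print(f"x = {xi}: residual = {ri:+.3f}")
print("10% condition met:", ten_percent_condition(n=len(x), population_size=500))
```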
How to Use This on the AP Statistics Exam
Free Response
- State the procedure: a t-interval for the slope of a regression model.
- Name all four conditions and check each one against the data or residual plots given, rather than just listing them generically.
- Show the values you use: b, SE_b, df = n - 2, and t*. If a question gives computer output, pull these directly from the table.
- Report the interval using b ± t*(SE_b), then interpret it in context with reference to the sample and the population it represents.
Reading Computer Output
Regression software tables usually list the slope (b) in the "Coef" column and SE_b in the "SE Coef" column for the predictor row. The "S" value near the bottom is s, the standard deviation of the residuals. Remember df = n - 2 for the interval.
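Once you have pulled the values from the table, the interval takes one line of arithmetic. Here is a minimal sketch, assuming a hypothetical output with Coef = 1.25 and SE Coef = 0.20 from a sample of n = 20, and t* = 2.101 (95% confidence, df = 18) read from a t-table.

```python
def interval_from_output(coef, se_coef, t_star):
    """Build b ± t*(SE_b) from values read off regression output."""
    me = t_star * se_coef
    return coef - me, coef + me

# Hypothetical values read from a software table (not real output).
coef = 1.25     # slope b, from the "Coef" column of the predictor row
se_coef = 0.20  # SE_b, from the "SE Coef" column of the predictor row
n = 20
df = n - 2      # 18 degrees of freedom
t_star = 2.101  # t* for 95% confidence with df = 18, from a t-table

lower, upper = interval_from_output(coef, se_coef, t_star)
print(f"df = {df}, 95% interval: ({lower:.3f}, {upper:.3f})")
# prints: df = 18, 95% interval: (0.830, 1.670)
```

Be careful to use the predictor row, not the "Constant" row, which describes the intercept.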
Common Trap
When you describe the slope, use predicted language. Say "a predicted increase of ___ in y for each additional unit of x," not "a 1-unit increase in x causes an increase in y." The interval estimates an association, not a guaranteed cause-and-effect change.
Common Misconceptions
- The interval estimates the slope, not an individual y-value. It tells you plausible values for β, the population slope, not a prediction for one point.
- Degrees of freedom for a slope interval are n - 2, not n - 1. The n - 2 comes from estimating two parameters, α and β.
- The residual standard deviation s also uses n - 2, for the same reason. Do not divide by n or n - 1 here.
- A confidence interval is not a probability statement about one specific interval. The confidence level describes the long-run capture rate: in repeated sampling, about C% of intervals built this way capture the true slope.
- Checking conditions is not optional filler. If linearity or constant spread clearly fails, the t-interval is not appropriate.
- A wider interval is not "wrong." Smaller samples and higher confidence levels naturally produce wider intervals.
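The long-run capture rate can be illustrated with a small simulation. The sketch below assumes a made-up population line μ_y = 3 + 0.5x with normal errors (σ = 2), draws many samples of n = 10, and counts how often the 95% interval captures the true slope. Every number and function name in it is hypothetical, and t* = 2.306 (df = 8) is taken from a t-table.

```python
import math
import random

def slope_interval(x, y, t_star):
    """Return (lower, upper) for the slope interval b ± t*(SE_b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    s_x = math.sqrt(sxx / (n - 1))
    se_b = s / (s_x * math.sqrt(n - 1))
    return b - t_star * se_b, b + t_star * se_b

def capture_rate(beta, n_reps, seed=1):
    """Fraction of simulated 95% intervals that contain the true slope."""
    rng = random.Random(seed)
    x = list(range(1, 11))  # n = 10, so df = 8
    t_star = 2.306          # 95% confidence, df = 8, from a t-table
    hits = 0
    for _ in range(n_reps):
        # Hypothetical population: mean response 3 + beta*x, sigma = 2.
        y = [3.0 + beta * xi + rng.gauss(0, 2.0) for xi in x]
        lower, upper = slope_interval(x, y, t_star)
        if lower <= beta <= upper:
            hits += 1
    return hits / n_reps

rate = capture_rate(beta=0.5, n_reps=2000)
print(f"Capture rate over 2000 samples: {rate:.3f}")
```

The printed rate should land close to 0.95. Any single interval either contains β or it does not; the 95% describes the method across many samples.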
Related AP Statistics Guides
- Unit 9 Overview: Slopes
- 9.1 Introducing Statistics: Do Those Points Align?
- 9.3 Justifying a Claim About the Slope of a Regression Model Based on a Confidence Interval
- 9.4 Setting Up a Test for the Slope of a Regression Model
- 9.5 Carrying Out a Test for the Slope of a Regression Model
- 9.6 Skills Focus: Selecting an Appropriate Inference Procedure
Vocabulary
The following words are mentioned explicitly in the College Board Course and Exam Description for this topic.

Term | Definition |
|---|---|
confidence interval | A range of values, calculated from sample data, that is likely to contain the true population parameter with a specified level of confidence. |
critical value | A value from the reference distribution (for a slope interval, the t-distribution with df = n - 2) used to determine the margin of error for a given confidence level. |
explanatory variable | A variable whose values are used to explain or predict corresponding values for the response variable. |
independence | The condition that observations in a sample are not influenced by each other, typically ensured through random sampling or randomized experiments. |
least-squares regression line | A linear model that minimizes the sum of squared residuals to find the best-fitting line through a set of data points. |
linearity | The condition that the true relationship between two variables follows a straight line. |
margin of error | The amount by which a sample statistic is likely to vary from the corresponding population parameter, calculated as the critical value times the standard error. |
normality | The condition that data follows an approximately normal (bell-shaped) distribution. |
population regression line | The true linear relationship μy = α + βx between the response and explanatory variables in the entire population. |
random sample | A sample selected from a population in such a way that every member has an equal chance of being chosen, reducing bias and allowing for valid statistical inference. |
randomized experiment | A study design where subjects are randomly assigned to treatment groups to establish cause-and-effect relationships. |
regression model | A statistical model that describes the relationship between a response variable (y) and one or more explanatory variables (x). |
residual | The difference between the actual observed value and the predicted value in a regression model, calculated as residual = y - ŷ. |
response variable | A variable whose values are being explained or predicted based on the explanatory variable. |
sample regression line | The line ŷ = a + bx calculated from sample data that estimates the population regression line. |
sample statistic | A numerical value calculated from sample data that is used to estimate the corresponding population parameter. |
sampling distribution | The probability distribution of a sample statistic (such as a sample proportion) obtained from repeated sampling of a population. |
sampling without replacement | A sampling method in which an item selected from a population cannot be selected again in subsequent draws. |
simple random sample | A sample selected from a population such that every possible sample of the same size has an equal chance of being chosen. |
skewed | A distribution that is not symmetric, with one tail longer or more pronounced than the other. |
slope | The value b in the regression equation ŷ = a + bx, representing the rate of change in the predicted response for each unit increase in the explanatory variable. |
slope of a regression model | The coefficient that represents the rate of change in the predicted response variable for each unit increase in the explanatory variable in a linear regression equation. |
standard deviation | A measure of how spread out data values are from the mean, represented by σ in the context of a population. |
standard deviation of residuals | A measure of the spread of residuals around the regression line, estimated by s = √(Σ(yi - ŷi)²/(n-2)). |
standard deviation of x values | A measure of the spread of the x-variable values in the sample, denoted as sx in the standard error formula. |
standard error of the slope | A measure of the variability of the slope estimate across different samples, calculated as s divided by (sx times the square root of n-1). |
t-interval | A confidence interval procedure that uses the t-distribution, appropriate for estimating the slope of a regression model. |
t* | The critical value from the t-distribution used to construct a confidence interval for the slope of a regression model. |
Frequently Asked Questions
What is a confidence interval for slope?
A confidence interval for slope is a range of plausible values for the true population slope, β, in a linear regression model. It is built from the sample slope b, a critical value t*, and the standard error of the slope, SE_b.
When do you use a confidence interval for the slope of a regression model?
Use it when you want to estimate the true slope of a linear relationship between two quantitative variables, after checking that linear regression inference conditions are reasonable.
What conditions are needed for regression inference in AP Statistics?
Check linearity, roughly constant standard deviation of responses, independence, and approximate normality of residuals. You should connect each condition to the data, residual plot, or study design given in the question.
What is the standard error of the slope?
The standard error of the slope, SE_b, measures how much the sample slope is expected to vary from sample to sample. In AP Statistics, it is often given in regression output or produced by calculator inference.
How do you interpret a confidence interval for slope?
Interpret it in context as plausible values for the average predicted change in y for each one-unit increase in x. Avoid causal wording unless the data came from a randomized experiment.
How is a slope confidence interval tested on the AP Statistics exam?
You may need to name the t-interval for slope, check conditions, read slope and SE_b from output, calculate or report the interval, and interpret the result in context.