---
title: "Unit 2 – Probability, Random Variables, and Probability Distributions"
description: "Unit 2 – Probability, Random Variables, and Probability Distributions - AP Statistics unit content"
canonical: "https://fiveable.me/ap-stats/unit-2"
type: "unit"
subject: "AP Statistics"
unit: "Unit 2 – Probability, Random Variables, and Probability Distributions"
---

# Unit 2 – Probability, Random Variables, and Probability Distributions

## Overview

Unit 2 covers two main tracks: categorical relationships (two-way tables, bar graphs, conditional distributions) and quantitative relationships (scatterplots, correlation, linear regression, residuals, and transformations). The central skill is describing and modeling associations while recognizing that correlation does not imply causation.

## AP CED Alignment

This unit hub is organized around AP Course and Exam Description (CED) topics, skills, and exam task types.
- 2.1: Introducing Statistics: Are Variables Related?
- 2.2: Representing Two Categorical Variables
- 2.3: Statistics for Two Categorical Variables
- 2.4: Representing the Relationship Between Two Quantitative Variables
- 2.5: Correlation
- 2.6: Linear Regression Models
- 2.7: Residuals
- 2.8: Least Squares Regression
- 2.9: Analyzing Departures from Linearity
- Skill Category 4 - Statistical Argumentation
- Skill Category 2 - Data Analysis
- FRQ 1 – Focus on Exploring Data
- FRQ 6 – Investigative Task

## Topics

- [2.1: Introducing Statistics: Are Variables Related?](/ap-stats/unit-2/introducing-statistics-are-variables-related/study-guide/Mh7Se81sjpqSYhL2ihl1): Frames the unit by asking whether apparent patterns between two variables are real or due to random variation. Introduces explanatory and response variables and the distinction between observational studies and experiments.
- [2.2: Representing Two Categorical Variables](/ap-stats/unit-2/representing-two-categorical-variables/study-guide/adVExzxFopPqv91AYcgp): Covers two-way tables, joint relative frequencies, and graphical displays including side-by-side bar graphs, segmented bar graphs, and mosaic plots for comparing two categorical variables.
- [2.3: Statistics for Two Categorical Variables](/ap-stats/unit-2/statistics-for-two-categorical-variables/study-guide/0SjltJrb2O0gXsmucKt1): Focuses on calculating and comparing marginal and conditional relative frequencies from a two-way table to determine whether two categorical variables are associated.
- [2.4: Representing the Relationship Between Two Quantitative Variables](/ap-stats/unit-2/representing-relationship-between-two-quantitative-variables/study-guide/3rWWsKXcnbYlqY64hQ1j): Introduces scatterplots for bivariate quantitative data and the four-part description framework: form, direction, strength, and unusual features.
- [2.5: Correlation](/ap-stats/unit-2/correlation/study-guide/LlS81pC6QricXgIKNuFM): Defines the correlation coefficient r, its formula, its properties (unit-free, range -1 to 1), and the critical principle that correlation does not imply causation.
- [2.6: Linear Regression Models](/ap-stats/unit-2/linear-regression-models/study-guide/PSt5cfDuvB5nu60DHulR): Introduces the regression equation ŷ = a + bx, how to calculate predicted values, how to interpret slope and intercept in context, and the risks of extrapolation.
- [2.7: Residuals](/ap-stats/unit-2/residuals/study-guide/zdTJQZw0UVGswyK6kkEF): Defines residuals as y - ŷ and explains how residual plots are used to assess whether a linear model is appropriate for a data set.
- [2.8: Least Squares Regression](/ap-stats/unit-2/least-squares-regression/study-guide/cRc4EhpHno3A4KvWrqyj): Covers the LSRL formulas for slope (b = r(sy/sx)) and intercept (a = y-bar - b(x-bar)), the meaning of r², and how to interpret regression coefficients in context.
- [2.9: Analyzing Departures from Linearity](/ap-stats/unit-2/analyzing-departures-linearity/study-guide/Krgk1LYlZMysG2Etzh4W): Identifies outliers, high-leverage points, and influential points in regression, and introduces log and power transformations to linearize curved data sets.

## Hardest Topics And Analytics

This snapshot uses Fiveable practice activity to show where students tend to miss questions and which review moves are worth prioritizing first.
- **67% average MCQ accuracy** (Across 8.1k multiple-choice practice attempts for this unit.)
- **8.1k MCQ attempts** (Practice activity included in this snapshot.)
- **78% average FRQ score** (Across 23 scored free-response attempts for this unit.)
- **2.3: Statistics for Two Categorical Variables**: 39% MCQ miss rate across 1074 attempts. Review this topic with attention to how it appears in AP-style multiple-choice questions.
- **2.2: Representing Two Categorical Variables**: 36% MCQ miss rate across 972 attempts. Review this topic with attention to how it appears in AP-style multiple-choice questions.
- **2.8: Least Squares Regression**: 35% MCQ miss rate across 926 attempts. Review this topic with attention to how it appears in AP-style multiple-choice questions.
- **2.4: Representing the Relationship Between Two Quantitative Variables**: 34% MCQ miss rate across 1071 attempts. Review this topic with attention to how it appears in AP-style multiple-choice questions.

## Review Notes

### 2.1: Asking whether variables are related

Topic 2.1 frames the entire unit: when you observe an apparent pattern between two variables, you must ask whether it reflects a real association or just random variation. The type of variables determines which tools you use next.

- **Explanatory variable**: The variable used to explain or predict; placed on the x-axis of a scatterplot.
- **Response variable**: The outcome being measured or predicted; placed on the y-axis.
- **Random variation**: Apparent patterns in data can arise by chance alone, so observed associations may not reflect a true relationship.
- **Causation vs. correlation**: An association between two variables does not establish that one causes the other; only a well-designed randomized experiment can support causal claims.

**Checkpoint:** Before analyzing any two-variable data set, identify which variable is explanatory and which is the response, and note whether the data come from an observational study or an experiment.

### 2.2-2.3: Two-way tables and categorical distributions

Two-way tables (contingency tables) display counts or relative frequencies for two categorical variables. The three types of relative frequency serve different purposes: joint frequencies describe individual cells, marginal frequencies describe one variable overall, and conditional frequencies let you compare distributions across groups to detect association.

- **Joint relative frequency**: A cell count divided by the grand total; describes the proportion of the whole table in that cell.
- **Marginal relative frequency**: A row or column total divided by the grand total; describes the overall distribution of one variable.
- **Conditional relative frequency**: A cell count divided by its row or column total; used to compare distributions of one variable within categories of the other.
- **Association (categorical)**: Two categorical variables are associated if the conditional distributions of one variable differ across categories of the other.
- **Graphical displays**: Side-by-side bar graphs, segmented bar graphs, and mosaic plots all show one categorical variable broken down by another; segmented and mosaic plots make comparing conditional distributions especially clear.

**Checkpoint:** Given a two-way table, practice computing all three types of relative frequency and deciding which one answers a specific question about association.

Frequency type | Denominator | What it tells you
--- | --- | ---
Joint relative frequency | Grand total | Proportion of all observations in that cell
Marginal relative frequency | Grand total | Overall distribution of one variable
Conditional relative frequency | Row or column total | Distribution of one variable within a subgroup
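
The three denominators in the table above can be checked with a short Python sketch. The counts, category names, and helper function names here are hypothetical, chosen only to illustrate the arithmetic.

```python
# Hypothetical counts: rows are grade level, columns are preferred study method.
table = {
    "Junior": {"Flashcards": 30, "Practice tests": 20},
    "Senior": {"Flashcards": 20, "Practice tests": 30},
}

grand_total = sum(sum(row.values()) for row in table.values())  # 100


def joint(row, col):
    """Joint relative frequency: cell count divided by the grand total."""
    return table[row][col] / grand_total


def marginal_col(col):
    """Marginal relative frequency: column total divided by the grand total."""
    return sum(r[col] for r in table.values()) / grand_total


def conditional_given_row(row, col):
    """Conditional relative frequency: cell count divided by its row total."""
    return table[row][col] / sum(table[row].values())


print(joint("Junior", "Flashcards"))                  # 0.3
print(marginal_col("Flashcards"))                     # 0.5
print(conditional_given_row("Junior", "Flashcards"))  # 0.6
print(conditional_given_row("Senior", "Flashcards"))  # 0.4
```

In this made-up table the conditional distributions differ (0.6 of juniors vs. 0.4 of seniors prefer flashcards), which is the kind of evidence that suggests an association. The marginal value 0.5 alone would not reveal it.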

### 2.4: Describing scatterplots

A scatterplot places the explanatory variable on the x-axis and the response variable on the y-axis, with one dot per observation. Every scatterplot description must address four features: form, direction, strength, and unusual features such as outliers or clusters.

- **Form**: Whether the pattern is linear or nonlinear.
- **Direction**: Positive association means both variables tend to increase together; negative association means one increases as the other decreases.
- **Strength**: How closely points follow the pattern; described as strong, moderate, or weak.
- **Unusual features**: Outliers (points far from the pattern) and clusters (groups of points separated from others) should be noted.

**Checkpoint:** Practice writing a complete four-part scatterplot description using a real data context, such as height vs. weight or study time vs. test score.

### 2.5: Correlation r

The correlation coefficient r quantifies the direction and strength of the linear association between two quantitative variables. It is calculated as $r = \frac{1}{n-1} \sum \left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)$, but technology is used in practice. Key properties: r is unit-free, always between -1 and 1, and sensitive to outliers.

- **r = 1 or r = -1**: Perfect linear association; all points fall exactly on a line.
- **r = 0**: No linear association; does not rule out a nonlinear pattern.
- **Unit-free**: Changing the units of x or y does not change r.
- **Outlier effect**: A single outlier can substantially increase or decrease r, so always check the scatterplot.
- **Correlation does not imply causation**: A strong r value does not mean x causes y; lurking variables may explain the association.

**Checkpoint:** Given r = 0.92, can you conclude the relationship is linear? No. A clearly curved pattern can still produce a high r, so a high r indicates a strong linear pattern only if the scatterplot also shows a linear form.
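
A minimal Python sketch of the r formula, using only the standard library, can make two of these properties concrete. The data set (y = x² for x from 1 to 6) and the function name `correlation` are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """r = (1/(n-1)) * sum of (z-score of x) * (z-score of y)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    products = (((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return sum(products) / (n - 1)


# Hypothetical data: y = x^2 is clearly curved, yet r is still high.
xs = [1, 2, 3, 4, 5, 6]
ys = [x ** 2 for x in xs]
r_curved = correlation(xs, ys)
print(round(r_curved, 3))  # 0.979

# Unit-free: rescaling x (for example, meters to centimeters) leaves r unchanged.
r_rescaled = correlation([100 * x for x in xs], ys)
print(abs(r_curved - r_rescaled) < 1e-12)  # True
```

The curved data still give r near 0.98, which is why the scatterplot has to be checked before describing a relationship as linear.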

### 2.6-2.7: Linear regression models and residuals

The regression equation ŷ = a + bx predicts the response variable from the explanatory variable. The slope b is the predicted change in y for each one-unit increase in x; the intercept a is the predicted y when x = 0, which sometimes has no meaningful interpretation. A residual is the difference between the actual value and the predicted value: residual = y - ŷ. Residual plots are used to check whether a linear model is appropriate.

- **Predicted value ŷ**: The value on the regression line for a given x; not the actual observed y.
- **Residual**: y - ŷ; positive means the actual value is above the line, negative means below.
- **Residual plot**: A plot of residuals vs. x or ŷ; a random scatter with no pattern supports a linear model.
- **Extrapolation**: Predicting outside the range of observed x-values; predictions become less reliable the further you extrapolate.
- **Curved residual pattern**: A systematic curve in the residual plot indicates the linear model is not appropriate for the data.

**Checkpoint:** If a residual plot shows a U-shaped pattern, the linear model is not a good fit. A patternless scatter of residuals supports the linear model.

Residual pattern | Interpretation
--- | ---
Random scatter around zero | Linear model is appropriate
Curved (U-shape or arch) | Nonlinear model may fit better
Fan shape (spread increases) | Variability is not constant across x-values
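
As a small worked example, the sketch below computes predicted values and residuals and flags extrapolation. The regression line ŷ = 50 + 5x, the three observations, and the helper names are all hypothetical.

```python
def predict(x, a=50.0, b=5.0):
    """Predicted value y-hat = a + b*x for a hypothetical line y-hat = 50 + 5x."""
    return a + b * x


# Hypothetical observations: (hours studied, test score).
observed = [(2, 63), (4, 68), (6, 80)]

# residual = actual y minus predicted y-hat
residuals = [y - predict(x) for x, y in observed]
print(residuals)  # [3.0, -2.0, 0.0]


def is_extrapolation(x, xs):
    """True when x lies outside the range of observed x-values."""
    return x < min(xs) or x > max(xs)


xs = [x for x, _ in observed]
print(is_extrapolation(10, xs))  # True: 10 hours is beyond the observed 2 to 6
print(is_extrapolation(4, xs))   # False
```

The first point sits 3 units above the line (positive residual), the second sits 2 units below it (negative residual), and the third falls exactly on it.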

### 2.8: Least squares regression: formulas and r²

The LSRL minimizes the sum of squared residuals and always passes through the point (x-bar, y-bar). The slope formula b = r(sy/sx) connects correlation directly to the regression line. The coefficient of determination r² tells you the proportion of variation in y that is explained by the linear relationship with x.

- **Slope formula**: b = r(sy/sx); the slope depends on both the correlation and the relative spread of the two variables.
- **Intercept formula**: a = y-bar - b(x-bar); the line always passes through the means of both variables.
- **r²**: The square of r; interpreted as the proportion of variation in the response variable explained by the explanatory variable in the linear model.
- **Interpreting slope in context**: For every one-unit increase in x, the predicted y changes by b units: an increase when b is positive, a decrease when b is negative.
- **Intercept interpretation**: The predicted value of y when x = 0; only meaningful if x = 0 is within or near the range of observed data.

**Checkpoint:** If r = 0.8, then r² = 0.64, meaning 64% of the variation in the response variable is explained by the linear relationship with the explanatory variable.
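
The formulas above can be applied directly to summary statistics. In this sketch the values of r, the means, and the standard deviations are hypothetical numbers picked so the arithmetic is easy to follow.

```python
# Hypothetical summary statistics.
r = 0.8
x_bar, s_x = 10.0, 2.0
y_bar, s_y = 50.0, 5.0

b = r * (s_y / s_x)    # slope: 0.8 * (5 / 2) = 2.0
a = y_bar - b * x_bar  # intercept: 50 - 2 * 10 = 30.0
r_squared = r ** 2     # about 0.64


def predict(x):
    """Predicted value from the least squares line y-hat = a + b*x."""
    return a + b * x


print(b, a)                 # 2.0 30.0
print(round(r_squared, 2))  # 0.64
print(predict(x_bar))       # 50.0, so the line passes through (x-bar, y-bar)
```

Plugging x-bar into the line returns y-bar, which is a quick check that the slope and intercept were computed consistently.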

### 2.9: Influential points and data transformations

Not all data sets are well described by a linear model. Topic 2.9 covers two situations: unusual points that distort the LSRL, and curved relationships that require transformation before fitting a line.

- **Outlier in regression**: A point with a large residual that does not follow the general trend of the other data.
- **High-leverage point**: A point with an x-value much larger or smaller than the rest; it has the potential to pull the line.
- **Influential point**: Any point whose removal substantially changes the slope, intercept, or correlation; outliers and high-leverage points are often influential.
- **Transformation**: Applying a function such as natural log or squaring to x or y can linearize a curved relationship so that a linear model becomes appropriate.
- **Evidence of better fit after transformation**: A more random residual plot and an r² closer to 1 after transformation indicate the transformed model fits better than the original.

**Checkpoint:** To decide whether a transformation improved the model, compare the residual plots and r² values before and after. More randomness in residuals and higher r² support the transformed model.
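
The before-and-after comparison can be sketched in Python with made-up exponential data. The data set (y = 2 · 1.5^x), the helper `lsrl`, and all variable names are assumptions for illustration; real data would not linearize this cleanly.

```python
from math import log
from statistics import mean


def lsrl(xs, ys):
    """Return (a, b, r_squared) for the least squares line y-hat = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b, sxy ** 2 / (sxx * syy)


# Hypothetical curved data: exponential growth with no noise.
xs = list(range(8))
ys = [2 * 1.5 ** x for x in xs]

# Linear model on the original data.
a_raw, b_raw, r2_raw = lsrl(xs, ys)
residuals_raw = [y - (a_raw + b_raw * x) for x, y in zip(xs, ys)]

# Linear model after taking the natural log of y.
log_ys = [log(y) for y in ys]
a_log, b_log, r2_log = lsrl(xs, log_ys)
residuals_log = [ly - (a_log + b_log * x) for x, ly in zip(xs, log_ys)]

print(r2_raw < r2_log)  # True: r-squared improves after the transformation
# Original residuals are positive at both ends and negative in the middle (U-shape).
print(residuals_raw[0] > 0, min(residuals_raw) < 0, residuals_raw[-1] > 0)
```

Here the original residuals form a U-shape while the residuals after the log transformation are essentially zero, so both pieces of evidence point to the transformed model.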

## Study Guides

- [2.2 Summary Statistics for Two Categorical Variables](/ap-stats/unit-2/statistics-for-two-categorical-variables/study-guide/0SjltJrb2O0gXsmucKt1)
- [2.3 Estimating Probabilities Using Simulation](/ap-stats/unit-2/estimating-probabilities-using-simulation/study-guide/ABwpnnUf4VCXeVbB9q72)
- [2.8 Introduction to Random Variables and Probability Distributions](/ap-stats/unit-2/intro-random-variables-probability-distributions/study-guide/B5MJ1YqQJ4D455wegCvz)
- [2.12 Sampling Distributions and the Central Limit Theorem](/ap-stats/unit-2/central-limit-theorem/study-guide/DPmpebCrsJBYfpSgOKn3)
- [2.7 Independent Events and Unions of Events](/ap-stats/unit-2/independent-events-unions-events/study-guide/aMOuOhtDQIJAtgZx3JNz)
- [2.1 Tabular and Graphical Representations for the Distributions of Two Categorical Variables](/ap-stats/unit-2/representing-two-categorical-variables/study-guide/adVExzxFopPqv91AYcgp)
- [2.6 Conditional Probability](/ap-stats/unit-2/conditional-probability/study-guide/fiabQj72yj93Hti9l8oR)
- [2.4 Introduction to Probability](/ap-stats/unit-2/intro-probability/study-guide/gfnBWfyMANOxF3vWLrbA)
- [2.5 Mutually Exclusive Events](/ap-stats/unit-2/mutually-exclusive-events/study-guide/iBljImMDLJ8bSWeWXbi6)
- [Unit 2 Overview: Probability, Random Variables, and Probability Distributions](/ap-stats/unit-2/review/study-guide/JeaowG56kC80Eu94VWs7)
- [2.11 The Normal Distribution](/ap-stats/unit-2/normal-distribution-revisited/study-guide/dx4vMcx3WjSw68f1Ov66)
- [2.9 Parameters of Random Variables](/ap-stats/unit-2/mean-standard-deviation-random-variables/study-guide/g1wJm2o6V4wCoqbsC5Px)
- [2.10 The Binomial Distribution](/ap-stats/unit-2/intro-binomial-distribution/study-guide/uvU3qsHmVgEuPjvhgaEC)

## Practice Preview

### Multiple-choice practice

- **AP-style practice question**: Skill Category 4 - Statistical Argumentation | An analyst models the resale value of a specific car model ($y$, in dollars) based on its mileage ($x$, in thousands of miles) using the regression line $\hat{y} = 24,000 - 150x$. Which of the following is the best interpretation of the slope?
- **AP-style practice question**: Skill Category 2 - Data Analysis | A scatterplot of exam scores versus study time shows a strong positive linear trend. One student reported a study time much higher than the mean but received a score exactly predicted by the regression line. How would removing this student's data point affect the regression analysis?
- **AP-style practice question**: Skill Category 2 - Data Analysis | A regression model predicts crop yield from rainfall. Point A has a rainfall value equal to the mean and a large positive residual. Point B has a rainfall value far above the mean and a large positive residual. Which statement best compares the influence of these points?
- **AP-style practice question**: Skill Category 2 - Data Analysis | In a dataset with a mean x-value of 50, Point J has x=50, Point K has x=80 on the regression line, and Point L has x=80 far from the regression line. Which statement correctly compares the leverage of these points?
- **AP-style practice question**: Skill Category 2 - Data Analysis | A regression of house price versus size yields a positive slope. Point A represents a large house with a price much higher than predicted. Point B represents a large house with a price much lower than predicted. Which statement describes the effect of adding each point individually?
- **AP-style practice question**: Skill Category 4 - Statistical Argumentation | A limnologist models dissolved oxygen ($y$) in mg/L against water temperature ($x$) in °C using the least-squares regression equation $\hat{y} = 14.6 - 0.28x$. Which of the following is the best interpretation of the slope coefficient?

### FRQ practice

- **Comparing peach variety weight distributions**: FRQ 1 – Focus on Exploring Data
- **Linear regression residuals and model fit assessment**: FRQ 6 – Investigative Task

## Key Terms

- **Explanatory Variable**: The variable used to explain or predict the response variable; placed on the x-axis of a scatterplot.
- **Response Variable**: The outcome variable being predicted or measured; placed on the y-axis of a scatterplot.
- **Two-Way Table**: A table that displays counts or relative frequencies for two categorical variables simultaneously, also called a contingency table.
- **Conditional Relative Frequency**: A cell frequency divided by its row or column total; used to compare distributions of one categorical variable within subgroups of another.
- **Marginal Relative Frequencies**: Row or column totals in a two-way table divided by the grand total; describe the overall distribution of one variable regardless of the other.
- **Scatterplot**: A graph that displays two quantitative variables as ordered pairs, with the explanatory variable on the x-axis and the response variable on the y-axis.
- **Correlation Coefficient**: The value r that measures the direction and strength of the linear association between two quantitative variables; unit-free and always between -1 and 1.
- **Correlation Does Not Imply Causation**: A strong r value does not establish that changes in x cause changes in y; lurking variables or confounding may explain the association.
- **Simple linear regression**: A model of the form ŷ = a + bx that uses one explanatory variable to predict a response variable using the least squares criterion.
- **Residual Plot**: A graph of residuals (y - ŷ) plotted against x or ŷ; a random scatter with no pattern supports the use of a linear model.
- **Coefficient of Determination**: r², the proportion of variation in the response variable explained by the linear relationship with the explanatory variable.
- **Influential Point**: A data point whose removal substantially changes the slope, intercept, or correlation of the regression line; often an outlier or high-leverage point.
- **Mosaic Plot**: A graphical display for two categorical variables where rectangle areas are proportional to frequencies, making it easy to compare conditional distributions.
- **direction of association**: Whether the relationship between two variables is positive (both increase together) or negative (one increases as the other decreases).
- **Bivariate Data**: Data collected on two variables for each individual in a sample, used to investigate whether the variables are related.

## Common Mistakes

- **Confusing correlation with causation**: A high r value, even r = 0.99, does not mean x causes y. Always note that lurking variables or confounding could explain the association, especially when data come from an observational study.
- **Using the wrong relative frequency type**: Joint, marginal, and conditional relative frequencies answer different questions. To check for association between two categorical variables, you need conditional relative frequencies, not marginal ones.
- **Interpreting the intercept without checking context**: The intercept a is the predicted y when x = 0. If x = 0 is outside the range of the data or has no real-world meaning (for example, a person with 0 hours of sleep), the intercept should not be interpreted literally.
- **Ignoring the residual plot**: A high r² does not guarantee a linear model is appropriate. Always check the residual plot. A curved pattern in the residuals means the linear model is not the right choice, even if r² looks high.
- **Extrapolating beyond the data range**: Plugging an x-value outside the observed range into ŷ = a + bx produces an unreliable prediction. The further you extrapolate, the less trustworthy the result.

## Exam Connections

- **Interpret in context**: AP Statistics consistently asks you to interpret statistical values in the context of the problem, not just state a formula. For slope, intercept, r, and r², your answer must name the actual variables and units. A response that says only 'the slope is 2.3' without context will not receive full credit.
- **Justify model appropriateness**: A common task in Unit 2 is deciding whether a linear model is appropriate. You are expected to cite specific evidence: the scatterplot shows a linear form, the residual plot shows no pattern, and r² is reasonably high. For transformed data, you compare residual plots and r² before and after transformation.
- **Distinguish association from causation**: Questions about two-variable data often include a follow-up asking whether the association implies causation. The correct response identifies the study design (observational vs. experimental), notes that confounding or lurking variables may be present, and states that causation requires random assignment.

## Final Review Checklist

- **Final Unit 2 review checklist**: Use this list to confirm you can handle every major skill in Unit 2 before the exam.
- **Two-way tables**: Given a contingency table, calculate joint, marginal, and conditional relative frequencies and use them to determine whether two categorical variables are associated.
- **Scatterplot description**: Describe any scatterplot using all four components: form (linear or nonlinear), direction (positive or negative), strength (strong, moderate, or weak), and unusual features (outliers, clusters).
- **Correlation r**: Interpret r in context, explain its properties (unit-free, between -1 and 1), and explain why a strong r does not establish causation.
- **Regression equation**: Write and use ŷ = a + bx to calculate predicted values, interpret slope and intercept in context, and recognize when the intercept has no meaningful real-world interpretation.
- **Residuals and residual plots**: Calculate a residual as y - ŷ, construct or read a residual plot, and use the pattern (or lack of pattern) to judge whether a linear model is appropriate.
- **r² and LSRL formulas**: Use b = r(sy/sx) and a = y-bar - b(x-bar) to find the LSRL, and interpret r² as the proportion of variation in y explained by the linear model.
- **Influential points and transformations**: Distinguish outliers, high-leverage points, and influential points, and explain how a log or power transformation can improve model fit using residual plots and r² as evidence.

## Study Plan

- **Step 1: Categorical data (Topics 2.1-2.3)**: Start with two-way tables. Practice computing joint, marginal, and conditional relative frequencies from a single table, then sketch a segmented bar graph from the same data. Confirm you can state whether an association exists and explain why using conditional distributions.
- **Step 2: Scatterplots and correlation (Topics 2.4-2.5)**: Describe several scatterplots using form, direction, strength, and unusual features. Then practice interpreting r values in context, including explaining why a strong r does not imply causation and why r near 1 does not guarantee a linear form.
- **Step 3: Regression models and predictions (Topic 2.6)**: Given a regression equation, practice writing slope and intercept interpretations in context using the exact units of the variables. Calculate predicted values and identify when a prediction involves extrapolation.
- **Step 4: Residuals and model assessment (Topics 2.7-2.8)**: Calculate residuals by hand for a few data points, then practice reading residual plots to judge model fit. Use the formulas b = r(sy/sx) and a = y-bar - b(x-bar) to derive the LSRL, and interpret r² in a full sentence with context.
- **Step 5: Departures from linearity (Topic 2.9)**: Review the definitions of outlier, high-leverage point, and influential point with examples. Practice comparing residual plots and r² values before and after a log transformation to explain why the transformed model is a better fit.

## More Ways To Review

- [Topic study guides](/ap-stats/unit-2#topics)
- [FRQ practice](/ap-stats/frq-practice)
- [Cram archive videos](/cram-archives?subject=ap-statistics&unit=unit-2)
- [Cheatsheets](/ap-stats/cheatsheets/unit-2)
- [Key terms](/ap-stats/key-terms)

## FAQs

### What topics are covered in AP Stats Unit 2?

AP Stats Unit 2 covers 9 topics focused on exploring relationships between two variables using correlation, regression, and data displays. Topics include two-way tables (2.2, 2.3), scatterplots (2.4), correlation (2.5), linear regression models (2.6), residuals (2.7), least squares regression (2.8), and analyzing departures from linearity (2.9). The unit builds from asking whether variables are related all the way to evaluating how well a linear model fits data. See [AP Stats Unit 2](/ap-stats/unit-2) for matched practice on each topic.

### How much of the AP Stats exam is Unit 2?

AP Stats Unit 2 makes up 5-7% of the AP exam. That weight covers everything from correlation and scatterplots to least squares regression and residuals. It's a smaller unit by exam weight, but the skills here, especially interpreting regression output and residual plots, show up repeatedly in later units and in FRQs.

### What's on the AP Stats Unit 2 progress check (MCQ and FRQ)?

The AP Stats Unit 2 progress check in AP Classroom includes both MCQ and FRQ parts drawn from the unit's 9 topics. MCQ questions test your ability to read scatterplots, interpret correlation, and identify patterns in two-way tables. FRQ questions typically ask you to describe a linear regression model, interpret residuals, or explain why correlation does not imply causation. The progress check pulls heavily from topics 2.4 through 2.9, so make sure you're comfortable with least squares regression and analyzing departures from linearity before you sit for it. Practice with matched questions at [AP Stats Unit 2](/ap-stats/unit-2).

### How do I practice AP Stats Unit 2 FRQs?

AP Stats Unit 2 FRQs most often ask you to interpret a linear regression model, describe what residuals tell you about a model's fit, or explain the meaning of correlation in context. To practice, focus on writing full sentence interpretations of slope, y-intercept, and r-squared values, since partial credit on the AP exam depends on precise wording. Good practice steps:
- Sketch and describe residual plots for patterns
- Interpret least squares regression equations in context
- Explain why a high correlation does not prove causation

Find FRQ-style practice questions at [AP Stats Unit 2](/ap-stats/unit-2).

### Where can I find AP Stats Unit 2 practice questions?

You can find AP Stats Unit 2 MCQ and practice test questions at [AP Stats Unit 2](/ap-stats/unit-2). That page has multiple-choice questions and practice problems covering all 9 topics, from two-way tables and correlation to residuals and least squares regression. For the best prep, mix MCQ practice with short written responses so you're ready for both question types on exam day.

### How should I study AP Stats Unit 2?

Start AP Stats Unit 2 by building a solid understanding of correlation before moving into regression. Correlation measures the strength and direction of a linear relationship, but it does not tell you one variable causes another, and that distinction comes up constantly on the exam. A practical study plan:
1. Review two-way tables and practice calculating relative frequencies (topics 2.2-2.3)
2. Sketch scatterplots and describe form, direction, and strength (topic 2.4)
3. Learn what the correlation coefficient r tells you and what it does not (topic 2.5)
4. Practice writing full interpretations of slope and y-intercept for linear regression models (topic 2.6)
5. Read and describe residual plots to check model fit (topics 2.7-2.8)
6. Study patterns like outliers and influential points that signal departures from linearity (topic 2.9)

Spend extra time on residuals and regression interpretation since those are the highest-payoff skills for both the progress check and the AP exam. Use [AP Stats Unit 2](/ap-stats/unit-2) to find practice matched to each topic.

