AP Statistics Unit 4 Review: Probability, Random Variables, and Probability Distributions

Verified for the 2027 exam · Compiled by AP educators · ~10–20% of the exam

AP Statistics Unit 4, Probability, Random Variables, and Probability Distributions, covers 12 topics worth 10-20% of the AP exam, centering on how probability quantifies the likelihood of random events and supports statistical inference. You'll work through conditional probability, mutually exclusive events, and independent events before moving into discrete random variables and their distributions. AP Stats Unit 4 then builds toward the binomial distribution, including its parameters, and the geometric distribution, giving you the core tools for modeling real random processes.

Unit 4 review

AP Stats Unit 4 is where the course shifts from describing data to quantifying chance. You'll learn the rules of probability (complements, conditional probability, independence), build probability distributions for discrete random variables, and meet the two named distributions of the unit, binomial and geometric. The single biggest idea is that probability describes long-run relative frequency, which is exactly what lets you make predictions and, later, inferences from random data. This unit is worth 10-20% of the AP exam, one of the largest weights in the course.

What this unit covers

Random patterns and simulation (Topics 4.1-4.2)

  • Random processes produce patterns. Streaks in coin flips happen by chance, so a pattern in data does not automatically mean the variation is non-random. The question is always whether a result is surprising under randomness.
  • Simulation models a random process. You assign every possible outcome a value determined by chance (random digits, a random number generator), run many trials, record counts, and use the relative frequency of an event to estimate its probability.
  • The law of large numbers is the quiet engine here. As the number of trials grows, the simulated relative frequency settles toward the true probability.
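
The simulation recipe above can be sketched in a few lines of Python. This is only an illustration under assumed choices: the event (a streak of four or more identical results in ten fair coin flips), the helper names, and the seed are all hypothetical, not part of the course materials.

```python
import random

def estimate_probability(event, trial, n_trials=100_000, seed=0):
    """Estimate P(event) as the relative frequency over n_trials simulated trials."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n_trials) if event(trial(rng)))
    return hits / n_trials

def ten_flips(rng):
    """One trial: ten fair coin flips, recorded as a string of H and T."""
    return "".join(rng.choice("HT") for _ in range(10))

def has_streak_of_four(flips):
    """Event: at least four identical results in a row."""
    return "HHHH" in flips or "TTTT" in flips

# Law of large numbers: the estimate steadies as the number of trials grows.
for n in (100, 10_000, 100_000):
    print(n, estimate_probability(has_streak_of_four, ten_flips, n_trials=n))
```

Listing all 1,024 possible sequences gives an exact probability of 476/1024, about 0.465, so the larger runs should land close to that value. A streak of four in ten flips is not surprising under randomness.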

Probability rules (Topics 4.3-4.6)

  • The sample space is the set of all possible non-overlapping outcomes. If outcomes are equally likely, P(E) is the number of outcomes in E divided by the total number of outcomes. Every probability lives between 0 and 1.
  • The complement rule says P(not E) = 1 - P(E). It's often the fastest path, especially for "at least one" problems.
  • Mutually exclusive (disjoint) events cannot happen at the same time, so P(A and B) = 0. Disjoint is about events sharing no outcomes, not about events being unrelated.
  • Conditional probability, P(A | B) = P(A and B) / P(B), updates a probability once you know B happened. The general multiplication rule, P(A and B) = P(A) · P(B | A), comes straight from it.
  • Independence means that knowing one event occurred does not change the probability of the other. A and B are independent if and only if P(A | B) = P(A), which gives the shortcut P(A and B) = P(A) · P(B).
  • The addition rule, P(A or B) = P(A) + P(B) - P(A and B), handles unions. Subtracting the overlap keeps you from double counting. Two-way tables, Venn diagrams, and tree diagrams are your tools for organizing all of this.
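
To see the rules above working together, here is a small sketch built on a two-way table. The counts (200 students classified by grade and pet ownership) are invented for illustration, and the helper `p` is a hypothetical name.

```python
from fractions import Fraction

# Hypothetical two-way table of counts (invented for illustration).
table = {
    ("junior", "pet"): 30, ("junior", "no pet"): 50,
    ("senior", "pet"): 90, ("senior", "no pet"): 30,
}
total = sum(table.values())  # 200 students

def p(grade=None, pet=None):
    """Probability that a randomly chosen student matches the given categories."""
    count = sum(
        n for (g, q), n in table.items()
        if (grade is None or g == grade) and (pet is None or q == pet)
    )
    return Fraction(count, total)

p_junior = p(grade="junior")             # 80/200 = 2/5
p_pet = p(pet="pet")                     # 120/200 = 3/5
p_both = p(grade="junior", pet="pet")    # 30/200 = 3/20

p_junior_given_pet = p_both / p_pet      # conditional probability: 1/4
p_either = p_junior + p_pet - p_both     # addition rule: 17/20
independent = p_junior_given_pet == p_junior  # False: 1/4 is not 2/5

print(p_junior_given_pet, p_either, independent)
```

Because P(junior | pet) differs from P(junior), grade and pet ownership are not independent in this made-up table. The general multiplication rule still holds: P(pet) · P(junior | pet) = 3/5 · 1/4 = 3/20, which matches P(junior and pet).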

Discrete random variables (Topics 4.7-4.9)

  • A random variable assigns a number to each outcome of a random process. A discrete random variable takes a countable number of values (number of puppies in a litter, sum of two dice), each with a probability, and the probabilities must sum to 1.
  • A probability distribution can be a table, graph, or function. Interpreting one means describing shape, center, and spread of the population it represents, in context.
  • The mean (expected value) is μ_X = Σ x_i · P(x_i), a weighted average. It's the long-run average value, not the value you expect on any single trial. The standard deviation σ_X measures typical distance from that mean.
  • Linear transformations: for Y = a + bX, the mean becomes a + bμ_X and the standard deviation becomes |b|σ_X. Adding a constant shifts the center but never changes spread.
  • Combining variables: means always add (μ of aX + bY is aμ_X + bμ_Y), but variances only add when X and Y are independent, and you add variances, never standard deviations. Even for a difference X - Y, the variance is σ²_X + σ²_Y. Spread grows either way.
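
The formulas above can be checked numerically. The sketch below uses a made-up distribution for X (values 0 to 3) and hypothetical helper names; it computes the mean and standard deviation directly, then confirms the transformation and combination rules by building the new distributions outcome by outcome.

```python
from math import sqrt

def mean(dist):
    """Expected value: sum of x * P(x)."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Sum of squared deviations from the mean, weighted by probability."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def transform(dist, a, b):
    """Distribution of Y = a + b*X."""
    out = {}
    for x, p in dist.items():
        out[a + b * x] = out.get(a + b * x, 0) + p
    return out

def combine(dist_x, dist_y, a=1, b=1):
    """Distribution of a*X + b*Y, assuming X and Y are independent."""
    out = {}
    for x, px in dist_x.items():
        for y, py in dist_y.items():
            v = a * x + b * y
            out[v] = out.get(v, 0) + px * py  # independence: joint = product
    return out

# Hypothetical distribution (invented for illustration); probabilities sum to 1.
X = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}

print(mean(X), sqrt(variance(X)))        # mean 1.7, SD 0.9 (up to rounding)

Y = transform(X, a=5, b=2)               # Y = 5 + 2X
print(mean(Y), sqrt(variance(Y)))        # mean 8.4, SD 1.8 (up to rounding)

D = combine(X, X, a=1, b=-1)             # difference of two independent copies
print(mean(D), variance(D))              # mean 0, variance 0.81 + 0.81 = 1.62
```

Note that the standard deviation of the difference is √1.62, about 1.27, not 0.9 - 0.9 = 0 and not 0.9 + 0.9 = 1.8. Variances add; standard deviations do not.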

Binomial and geometric distributions (Topics 4.10-4.12)

  • A binomial random variable counts the number of successes in n independent trials, each with two outcomes and the same success probability p. Probability of exactly x successes: P(X = x) = C(n, x) p^x (1-p)^(n-x).
  • Binomial parameters are clean. The mean is np and the standard deviation is √(np(1-p)). Interpret both with units and context ("on average, about 12 of the 50 sampled households...").
  • A geometric random variable counts the trial number on which the first success occurs. P(X = x) = (1-p)^(x-1) p, with mean 1/p and standard deviation √(1-p) / p (the square root covers only the 1-p).
  • The quick test for telling them apart: binomial has a fixed number of trials and counts successes; geometric has no fixed n and counts how long until the first success.
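
As a rough sketch, both probability formulas translate directly into Python using the standard library's `math.comb`. The function names are hypothetical, and the example values n = 20 and p = 0.3 are chosen only for illustration.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial count of successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def geometric_pmf(x, p):
    """P(X = x) when X is the trial number of the first success."""
    return (1 - p) ** (x - 1) * p

n, p = 20, 0.3  # illustrative parameters

# Binomial: fixed number of trials, count the successes.
print(binomial_pmf(6, n, p))                          # P(exactly 6 successes)
print(sum(binomial_pmf(x, n, p) for x in range(7)))   # P(at most 6 successes)
print(n * p, sqrt(n * p * (1 - p)))                   # mean np and SD sqrt(np(1-p))

# Geometric: no fixed n, wait for the first success.
print(geometric_pmf(3, p))                            # first success on trial 3
print(sum(geometric_pmf(x, p) for x in range(1, 4)))  # success within 3 trials
print(1 / p, sqrt(1 - p) / p)                         # mean 1/p and SD sqrt(1-p)/p
```

On the exam a calculator's built-in binomial and geometric functions do this work, but writing the formula out once makes it clearer what those functions return.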

Unit 4, Probability, Random Variables, and Probability Distributions at a glance

| Concept | What it counts or measures | Key formula | Mean | Standard deviation |
| --- | --- | --- | --- | --- |
| Discrete random variable | Numerical outcome of a random process | Probabilities sum to 1 | μ_X = Σ x_i · P(x_i) | σ_X = √(Σ(x_i - μ_X)² · P(x_i)) |
| Linear transformation Y = a + bX | Rescaled or shifted variable | Same shape as X | a + bμ_X | \|b\|σ_X |
| Sum/difference (independent X, Y) | Combined variable aX + bY | Variances add: a²σ²_X + b²σ²_Y | aμ_X + bμ_Y | √(a²σ²_X + b²σ²_Y) |
| Binomial | Number of successes in n fixed trials | P(X = x) = C(n,x) p^x (1-p)^(n-x) | np | √(np(1-p)) |
| Geometric | Trial of the first success | P(X = x) = (1-p)^(x-1) p | 1/p | √(1-p) / p |

Why Unit 4, Probability, Random Variables, and Probability Distributions matters in AP Stats

Everything after this unit is inference, and inference is just probability run in reverse. A p-value, a confidence level, a margin of error: all of them are probability statements about what random processes produce in the long run. Unit 4 builds the machinery those ideas stand on.

  • The long-run relative frequency interpretation of probability is the exact logic behind p-values and confidence levels in Units 6 through 9.
  • Conditional probability and independence formalize the association questions you asked informally with two-way tables in Unit 2.
  • Random variables and their parameters (mean, standard deviation) are how sampling distributions get described in Unit 5, which makes this unit the bridge between data collection and inference.
  • The variance rules for combining independent random variables explain why two-sample standard errors look the way they do later in the course.

How this unit connects across the course

  • Conditional probability is the formal version of the row and column proportions you computed from two-way tables when exploring association between categorical variables (Unit 2).
  • Random sampling and random assignment (Unit 3) are what make probability calculations valid in the first place. Probability only describes processes that are actually random.
  • The binomial distribution becomes the foundation for the sampling distribution of a sample proportion, and expected value and standard deviation of random variables underlie the sampling distribution of a sample mean (Unit 5).
  • The probability logic of "how surprising is this result if chance alone is at work" is exactly the reasoning behind significance tests for proportions (Unit 6), means (Unit 7), and chi-square procedures (Unit 8).

Key formulas and procedures

  • $P(E) = \frac{\text{outcomes in } E}{\text{total outcomes}}$ when all outcomes are equally likely; the starting definition of probability.
  • Complement rule: $P(E^c) = 1 - P(E)$; the go-to move for "at least one" problems.
  • Addition rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$; finds the probability of A or B without double counting the overlap.
  • Conditional probability: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$; updates a probability given that B occurred.
  • General multiplication rule: $P(A \cap B) = P(A) \cdot P(B \mid A)$; works for any two events, dependent or not.
  • Independence check: A and B are independent if and only if $P(A \mid B) = P(A)$, which gives $P(A \cap B) = P(A) \cdot P(B)$.
  • Mean of a discrete random variable: $\mu_X = \sum x_i \cdot P(x_i)$; the long-run average outcome.
  • Standard deviation of a discrete random variable: $\sigma_X = \sqrt{\sum (x_i - \mu_X)^2 \cdot P(x_i)}$; typical distance of outcomes from the mean.
  • Linear transformation $Y = a + bX$: $\mu_Y = a + b\mu_X$ and $\sigma_Y = |b|\sigma_X$; shifting changes the center only, scaling changes both.
  • Combining independent variables: mean of $aX + bY$ is $a\mu_X + b\mu_Y$; variance is $a^2\sigma_X^2 + b^2\sigma_Y^2$ (add variances, then square root at the end).
  • Binomial probability: $P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$, with $\mu = np$ and $\sigma = \sqrt{np(1-p)}$.
  • Geometric probability: $P(X = x) = (1-p)^{x-1} p$, with $\mu = 1/p$ and $\sigma = \frac{\sqrt{1-p}}{p}$.
  • Simulation procedure: assign outcomes to random values, run many trials, record counts, and use relative frequency to estimate a probability.
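
As a worked instance of the complement rule on an "at least one" problem, consider rolling a fair six-sided die four times (an example chosen here for illustration, assuming the rolls are independent). Computing the probability of no sixes and subtracting from 1 is far quicker than adding up the cases of exactly one, two, three, or four sixes:

```latex
\begin{aligned}
P(\text{at least one six}) &= 1 - P(\text{no sixes}) \\
&= 1 - \left(\tfrac{5}{6}\right)^{4} \\
&= 1 - \tfrac{625}{1296} \\
&= \tfrac{671}{1296} \approx 0.518
\end{aligned}
```

The second line uses the multiplication rule for independent events: each roll avoids a six with probability 5/6, so four rolls in a row avoid it with probability (5/6)^4.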

Unit 4, Probability, Random Variables, and Probability Distributions on the AP exam

This unit carries 10-20% of the exam weight. On the multiple-choice section, expect calculation questions built around two-way tables (find a conditional probability, decide whether two events are independent), expected value problems framed around games or payouts, and binomial or geometric probability calculations where the first job is recognizing which distribution applies.

On the free-response section, probability content often appears as a multi-part question that mixes calculation with interpretation. A typical question might have you compute a probability from a table or tree, then interpret an expected value in context, then combine random variables to find the mean and standard deviation of a total or difference. Two habits earn points consistently. First, show your setup, not just a calculator answer (name the distribution and its parameters, like "binomial with n = 20, p = 0.3"). Second, interpret every parameter in context with units. "The mean is 6" earns less than "over many samples of 20 customers, the average number who buy a warranty is 6." Probability reasoning also resurfaces inside inference free-response questions later, so the conditional probability and independence logic you build here keeps paying off.

Essential questions

  • How can we tell whether a pattern in data reflects something real or is just what randomness looks like?
  • What does it actually mean to say an event has a probability of 0.3?
  • How does knowing one event occurred change the probability of another, and when does it change nothing at all?
  • How do expected value and standard deviation let us summarize and predict the behavior of a random process?

Key terms to know

  • Sample space: The set of all possible non-overlapping outcomes of a random process.
  • Complement: All outcomes not in event E, with probability 1 - P(E).
  • Mutually exclusive (disjoint) events: Events that cannot happen at the same time, so P(A and B) = 0.
  • Conditional probability: The probability of A given that B has occurred, written P(A | B).
  • Independent events: Events where knowing one occurred does not change the probability of the other.
  • Union: The event that A or B (or both) occurs, written P(A ∪ B).
  • Intersection (joint probability): The event that A and B both occur, written P(A ∩ B).
  • Discrete random variable: A variable taking a countable number of numerical values, each with a probability, summing to 1.
  • Expected value: The mean of a random variable, the long-run average outcome over many repetitions.
  • Parameter: A single fixed number describing a population or a random variable's distribution, like μ or σ.
  • Binomial random variable: Counts successes in a fixed number of independent trials with constant success probability p.
  • Geometric random variable: Counts the trial on which the first success occurs in repeated independent trials.
  • Simulation: A model of a random process used to estimate probabilities from the relative frequency of outcomes over many trials.
  • Cumulative probability distribution: Gives the probability that a random variable is less than or equal to each value.

Common mix-ups

  • Mutually exclusive is not the same as independent. In fact, disjoint events with nonzero probabilities are always dependent, because if one happens you know the other didn't.
  • P(A | B) and P(B | A) are different numbers. The probability of testing positive given you have a disease is not the probability of having the disease given a positive test. Always check which event is the "given."
  • When combining random variables, add variances, not standard deviations, and only when the variables are independent. And subtracting variables still adds variance: Var(X - Y) = σ²_X + σ²_Y.
  • Binomial vs. geometric comes down to the question being asked. "How many successes in 10 trials?" is binomial. "How many trials until the first success?" is geometric, and it has no fixed n.
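
The gap between P(A | B) and P(B | A) is easiest to believe with numbers. The sketch below follows a tree diagram for a screening test; the prevalence, sensitivity, and false positive rate are hypothetical values picked only to illustrate the point.

```python
from fractions import Fraction

# Hypothetical rates (invented for illustration).
prevalence = Fraction("0.02")           # P(disease)
sensitivity = Fraction("0.95")          # P(positive | disease)
false_positive_rate = Fraction("0.10")  # P(positive | no disease)

# General multiplication rule along each branch of the tree.
p_pos_and_disease = prevalence * sensitivity
p_pos_and_healthy = (1 - prevalence) * false_positive_rate

# The two branches are mutually exclusive, so their probabilities add.
p_pos = p_pos_and_disease + p_pos_and_healthy

# Conditional probability with the "given" event reversed.
p_disease_given_pos = p_pos_and_disease / p_pos

print(float(sensitivity))           # P(positive | disease) = 0.95
print(p_disease_given_pos)          # P(disease | positive) = 19/117
print(float(p_disease_given_pos))   # about 0.16
```

With these made-up rates, a person with the disease tests positive 95% of the time, yet a person who tests positive has the disease only about 16% of the time, because most positives come from the much larger healthy group.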

Frequently Asked Questions

What topics are covered in AP Stats Unit 4?

AP Stats Unit 4 covers probability, random variables, and probability distributions across 12 topics. Key topics include Introduction to Probability (4.3), Mutually Exclusive Events, Conditional Probability, Independent Events, Random Variables and Probability Distributions, Mean and Standard Deviation of Random Variables, Combining Random Variables, the Binomial Distribution, and the Geometric Distribution. Here's a quick breakdown by theme:

  • Probability foundations: simulation, mutually exclusive events, conditional probability, independence
  • Random variables: introducing distributions, mean and standard deviation, combining variables
  • Named distributions: binomial distribution (including parameters) and geometric distribution

See all 12 topics at /ap-stats/unit-4.

How much of the AP Stats exam is Unit 4?

AP Stats Unit 4 makes up 10-20% of the AP exam, making it one of the more heavily tested units. The unit covers probability, random variables, and probability distributions, including conditional probability, the binomial distribution, and the geometric distribution. That range means roughly one in ten to one in five multiple-choice questions could draw directly on these concepts.

What's on the AP Stats Unit 4 progress check (MCQ and FRQ)?

The AP Stats Unit 4 progress check in AP Classroom includes both MCQ and FRQ parts drawn from the unit's 12 topics on probability and probability distributions. The MCQ portion tests concepts like conditional probability, mutually exclusive events, independent events, and parameters of the binomial distribution. The FRQ portion typically asks you to set up probability calculations, interpret random variable distributions, or work through binomial or geometric distribution problems in context. Practicing with questions matched to these exact topics before the progress check helps a lot. You can find aligned practice at /ap-stats/unit-4.

How do I practice AP Stats Unit 4 FRQs?

AP Stats Unit 4 FRQs most often focus on probability calculations, interpreting random variable distributions, and applying the binomial distribution or geometric distribution to real contexts. A typical question gives you a scenario, asks you to find a probability or expected value, and then asks you to interpret it in context. That interpretation step is where most points are lost. To practice effectively:

  • Work through problems on conditional probability and independence, writing out your reasoning step by step
  • Practice binomial distribution problems by identifying n, p, and the exact probability statement before calculating
  • For geometric distribution questions, make sure you can explain what the mean represents in context
  • Check your work against scoring guidelines to see exactly where points are awarded

Find FRQ-style practice questions for this unit at /ap-stats/unit-4.

Where can I find AP Stats Unit 4 practice questions?

The best place to find AP Stats Unit 4 practice questions, including multiple-choice and practice test problems, is /ap-stats/unit-4. That page has resources matched to all 12 topics in the unit, from basic probability rules through the binomial distribution and geometric distribution. For MCQ practice, focus on questions that test conditional probability, independent events, and reading probability distribution tables. For a mini practice test feel, string together questions from each topic in order so you cover the full unit before your progress check or exam.

How should I study AP Stats Unit 4?

Start by building a solid foundation in probability before moving to random variables and named distributions. Unit 4 has a clear progression, and skipping ahead to the binomial distribution without understanding conditional probability and independence makes the later topics much harder. Here's a study plan that works well:

  1. Probability rules first. Work through mutually exclusive events, conditional probability, and independence (4.3-4.6) until the formulas feel automatic.
  2. Random variables next. Practice calculating and interpreting the mean and standard deviation of a random variable in context, not just the numbers.
  3. Named distributions last. For the binomial distribution, memorize when to use it (fixed n, two outcomes, independent trials, constant p) and practice identifying parameters. For the geometric distribution, focus on what the mean tells you about waiting time.
  4. Write out interpretations. On the AP exam, a correct number with no context earns partial credit at best. Practice finishing every answer with a sentence that uses the units and situation from the problem.

Find practice resources for each of these steps at /ap-stats/unit-4.