Stratified Sampling

Stratified sampling is a random sampling method where the population is divided into non-overlapping subgroups (strata) based on a shared characteristic, and a separate random sample is taken from every stratum, guaranteeing each subgroup is represented and reducing variability in estimates.

Verified for the 2027 AP Statistics exam. Last updated June 2026.

What is Stratified Sampling?

Stratified sampling is a probability-based sampling method with two steps. First, split the population into strata, groups whose members share some characteristic that matters to your question (grade level, income bracket, region). Second, take a simple random sample from every single stratum. Because randomness drives the selection inside each group, this is a chance-based method, which is exactly what the CED says you need for trustworthy conclusions (3.1.A).
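The two steps can be sketched in a few lines of Python. This is only an illustration, not something the exam asks for: the roster, the `stratified_sample` helper, and the choice of 10 students per grade are all hypothetical.

```python
import random

def stratified_sample(population, stratum_of, n_per_stratum, seed=None):
    """Take a simple random sample of n_per_stratum from every stratum."""
    rng = random.Random(seed)
    # Step 1: split the population into non-overlapping strata.
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    # Step 2: take an SRS (without replacement) inside each stratum.
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical roster: 400 students as (id, grade) pairs, 100 per grade.
roster = [(i, 9 + i % 4) for i in range(400)]
picked = stratified_sample(roster, stratum_of=lambda s: s[1],
                           n_per_stratum=10, seed=1)
```

Here `picked` maps each grade to its own random sample of 10 students, so no grade can be missing from the final sample.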

The payoff is precision. Suppose you survey a school where freshmen and seniors feel very differently about a new schedule. A plain SRS might, by bad luck, grab mostly seniors. Stratifying by grade and sampling within each grade makes that impossible. Every subgroup shows up in the sample in a controlled way, so your estimates vary less from sample to sample. The mental model that helps most: stratify when the groups are different from each other but similar within themselves. The strata capture the big differences, and random sampling handles the rest.
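A small simulation can make the precision claim concrete. The sketch below assumes a made-up school of 200 freshmen (10% support the schedule) and 200 seniors (90% support it); those numbers, the sample size of 40, and the function names are assumptions chosen for illustration only.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical school: 1 = supports the new schedule, 0 = does not.
freshmen = [1] * 20 + [0] * 180   # 10% support
seniors = [1] * 180 + [0] * 20    # 90% support
school = freshmen + seniors       # true overall support is 50%

def srs_estimate(n):
    # Simple random sample from the whole school.
    picks = rng.sample(school, n)
    return sum(picks) / n

def stratified_estimate(n):
    # Equal-sized strata, so half the sample comes from each grade.
    half = n // 2
    picks = rng.sample(freshmen, half) + rng.sample(seniors, half)
    return sum(picks) / len(picks)

# Repeat each design many times and compare how much the estimates vary.
srs_results = [srs_estimate(40) for _ in range(2000)]
strat_results = [stratified_estimate(40) for _ in range(2000)]

srs_sd = statistics.stdev(srs_results)
strat_sd = statistics.stdev(strat_results)
print(f"SRS spread: {srs_sd:.3f}  stratified spread: {strat_sd:.3f}")
```

Both designs center on the true value of 0.5, but under these assumptions the stratified estimates should come out noticeably less spread out, because no sample can be lopsided toward one grade.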

Why Stratified Sampling matters in AP Statistics

Stratified sampling lives in Unit 3: Collecting Data, supporting Topic 3.1 (Introducing Statistics) and Topic 3.4 (Potential Problems with Sampling). Learning objective AP Stats 3.1.A asks you to evaluate data collection methods, and the essential knowledge is blunt about it: methods that don't rely on chance produce untrustworthy conclusions. Stratified sampling is one of your go-to chance-based methods. It also connects directly to AP Stats 3.4.A on identifying bias, because stratifying is a deliberate fix for undercoverage. If a subgroup might get skipped by other methods, building it in as a stratum guarantees it can't be left out. Unit 3 concepts are also the gatekeeper for everything in Units 6-9, since inference is only valid when the data came from a sound random sampling method.

How Stratified Sampling connects across the course

Cluster Sampling (Unit 3)

Cluster sampling is the mirror image of stratified sampling. With strata you sample SOME individuals from EVERY group; with clusters you sample EVERY individual from SOME groups. Strata should be different from each other, while each cluster should be a mini version of the whole population.

Undercoverage Bias (Topic 3.4, Unit 3)

Undercoverage happens when part of the population has a reduced chance of being sampled. Stratified sampling is the structural cure, because forcing a random sample from every stratum means no subgroup can be accidentally skipped.

Margin of Error (Unit 6)

When the strata differ on the variable being measured, stratifying reduces sampling variability, and lower variability means a smaller margin of error for the same sample size. This is why 'more precise estimates' is the standard justification for choosing a stratified design over an SRS.
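For readers who want to see where the reduction comes from, here is a rough sketch that goes beyond what the exam requires. It assumes proportional allocation (each stratum's share of the sample matches its share of the population) and ignores the finite population correction. The symbols are introduced here for illustration: $W_h$ is stratum $h$'s share of the population, $\mu_h$ and $\sigma_h^2$ are its mean and variance, and $\mu$ and $\sigma^2$ are the overall mean and variance.

```latex
\sigma^2 = \underbrace{\sum_h W_h \sigma_h^2}_{\text{within strata}}
         + \underbrace{\sum_h W_h (\mu_h - \mu)^2}_{\text{between strata}}

\operatorname{Var}(\bar{x}_{\text{SRS}}) \approx \frac{\sigma^2}{n},
\qquad
\operatorname{Var}(\bar{x}_{\text{strat}}) \approx \frac{1}{n} \sum_h W_h \sigma_h^2
```

Under these assumptions the stratified design drops the between-strata term entirely, so the more the stratum means differ, the bigger the gain. If the stratum means are all equal, the two variances match and stratifying buys nothing.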

Convenience Sampling (Unit 3)

Convenience sampling is the cautionary tale stratified sampling fixes. Interviewing whoever is easy to reach (people leaving a gym, say) systematically favors certain responses, while stratified sampling uses chance within planned subgroups so inference to the population is actually justified.

Is Stratified Sampling on the AP Statistics exam?

Multiple-choice questions usually hand you a scenario and ask you to either name the sampling method or pick the best method for the situation. Watch for the giveaway phrasing 'divided into groups and a random sample is taken from each group.' That sentence is stratified sampling every time. You'll also see questions like the gym-at-6-PM scenario, where a non-random method makes inference inappropriate, and your job is to recognize that a chance-based design like stratified sampling would fix it.

On FRQs, Unit 3 design questions can ask you to describe a sampling plan or explain why stratifying beats an SRS. The credited answer names the stratifying variable, says you take a random sample within each stratum, and explains the benefit: stratifying guarantees representation of each subgroup and reduces variability in the estimate. 'It's more accurate' alone won't earn the point.

Stratified Sampling vs Cluster Sampling

Both methods chop the population into groups, which is why they're the most-confused pair in Unit 3. The difference is what you do next. Stratified sampling takes a random sample FROM EVERY stratum, and strata are built to be internally similar (all sophomores, all low-income households). Cluster sampling randomly selects a few WHOLE groups and surveys everyone in them, and each cluster should look like a miniature of the full population (homeroom classes, city blocks). Quick test for the MCQ: sampled from all groups means stratified, all of some groups means cluster.
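The contrast is easy to see side by side in code. The sketch below uses a hypothetical school of 300 students in 12 mixed-grade homerooms; the data layout, the `group_by` helper, and the sample sizes are assumptions made for illustration.

```python
import random

rng = random.Random(7)

# Hypothetical school: 12 homerooms of 25 students, grades mixed in each room.
students = [
    {"id": i, "homeroom": i // 25, "grade": 9 + (i % 4)}
    for i in range(300)
]

def group_by(people, key):
    """Split a list of records into groups that share the same key value."""
    groups = {}
    for p in people:
        groups.setdefault(p[key], []).append(p)
    return groups

# Stratified: SOME individuals from EVERY group (grade levels are the strata).
by_grade = group_by(students, "grade")
stratified = [p for members in by_grade.values()
              for p in rng.sample(members, 5)]

# Cluster: EVERY individual from SOME groups (homerooms are the clusters).
by_homeroom = group_by(students, "homeroom")
chosen_rooms = rng.sample(sorted(by_homeroom), 3)
cluster = [p for room in chosen_rooms for p in by_homeroom[room]]
```

The stratified sample touches all four grades with 5 students each, while the cluster sample contains all 25 students from just 3 of the 12 homerooms.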

Key things to remember about Stratified Sampling

  • Stratified sampling divides the population into non-overlapping strata and takes a separate random sample from each stratum.

  • Strata should be similar within themselves and different from each other, like grade levels or income brackets.

  • The big advantage over a simple random sample is reduced variability, which leads to more precise estimates and a smaller margin of error.

  • Stratified sampling is a chance-based method, so it satisfies the CED requirement (3.1.A) that trustworthy conclusions come from random selection.

  • Don't confuse it with cluster sampling, where you randomly pick a few whole groups and survey everyone in them instead of sampling from every group.

  • Stratifying prevents undercoverage of a subgroup, but it cannot fix nonresponse bias if chosen individuals refuse to answer.

Frequently asked questions about Stratified Sampling

What is stratified sampling in AP Stats?

It's a random sampling method where you divide the population into subgroups called strata based on a shared characteristic, then take a simple random sample from every stratum. It guarantees each subgroup is represented and is tested in Unit 3 (Topics 3.1 and 3.4).

What's the difference between stratified and cluster sampling?

Stratified sampling takes some individuals from every group; cluster sampling takes every individual from some randomly chosen groups. Also, strata should be internally similar while each cluster should resemble the whole population.

Is stratified sampling better than a simple random sample?

When the strata genuinely differ on the variable you're measuring, yes. Stratifying guarantees every subgroup is represented and produces estimates with less sampling variability than an SRS of the same size. If the population has no meaningful subgroups, an SRS works fine.

Does stratified sampling eliminate all bias?

No. It prevents undercoverage of the strata you define, but nonresponse bias can still occur if selected individuals refuse to answer, and bad question wording can still skew results. It also doesn't help if you stratify on a variable that's unrelated to what you're measuring.

How do I describe a stratified sample on an AP Stats FRQ?

Name the stratifying variable, state that you take a random sample within each stratum, and explain the benefit, such as guaranteeing representation of each subgroup or reducing variability in the estimate. Vague answers like 'it's more accurate' won't earn the point.