In AP Statistics, stratification means splitting a population into non-overlapping subgroups (strata) whose members are similar to one another, then taking a random sample from every stratum. Each subgroup is guaranteed representation, and estimates are typically more precise than those from a simple random sample (SRS) of the same size.
Stratification means dividing a population into subgroups called strata based on a characteristic you think matters (age, income level, grade, region), then drawing a random sample from each stratum. The whole point is that individuals inside a stratum are similar to each other, while the strata differ from one another. By sampling every stratum, you guarantee that no important subgroup gets missed by random chance.
Think of it this way: an SRS leaves representation up to luck. Stratification removes that luck for the characteristic you stratify on. If you stratify a school survey by grade level, you cannot accidentally end up with a sample that's 80% freshmen. Under the CED's DAT-2.D.1, every sampling method has trade-offs. Stratification's trade-off: you get guaranteed subgroup coverage and, when the strata differ on the response, reduced variability, but you need to know the stratifying characteristic for everyone in the population before you sample, which isn't always possible.
Stratification lives in Topic 3.3 (Random Sampling and Data Collection) in Unit 3. It directly supports two learning objectives. For AP Stats 3.3.A, you have to recognize a stratified random sample from a study description, which means spotting the pattern "divided into groups, then randomly sampled from every group." For AP Stats 3.3.B, you have to explain why stratification is or isn't appropriate for a given situation, backed by DAT-2.D.1, which says every sampling method has advantages and disadvantages depending on the question and the population. The advantage you should be ready to articulate: when strata are homogeneous (similar within), stratified sampling produces less variable estimates than an SRS of the same size. That's the answer the exam is fishing for almost every time stratification shows up.
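The "less variable estimates" claim can be checked with a quick simulation. The sketch below is illustrative only: the school, the grade means, and the sample sizes are made-up assumptions, not anything from the CED. It builds a population where grades differ a lot on the response but students within a grade are similar, then compares how much the sample mean bounces around under an SRS versus a stratified sample of the same total size.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 4 grades of 250 students each.
# Response = weekly homework hours; grade means differ, spread within a grade is small.
grade_means = {9: 2.0, 10: 4.0, 11: 6.0, 12: 8.0}
population = {
    grade: [rng.gauss(mu, 1.0) for _ in range(250)]
    for grade, mu in grade_means.items()
}
everyone = [hours for group in population.values() for hours in group]

def srs_mean(n):
    """Mean of one simple random sample of size n from the whole population."""
    return statistics.mean(rng.sample(everyone, n))

def stratified_mean(n_per_grade):
    """Mean of one stratified sample: an SRS of n_per_grade from every grade."""
    picked = []
    for group in population.values():
        picked.extend(rng.sample(group, n_per_grade))
    return statistics.mean(picked)

reps = 2000
srs_means = [srs_mean(40) for _ in range(reps)]
strat_means = [stratified_mean(10) for _ in range(reps)]

sd_srs = statistics.stdev(srs_means)
sd_strat = statistics.stdev(strat_means)
print(f"SD of sample means, SRS (n=40):        {sd_srs:.3f}")
print(f"SD of sample means, stratified (4x10): {sd_strat:.3f}")
```

Because the grades here are equal in size, the plain average of the stratified sample is a fair estimate of the population mean. With strata this different from one another, the stratified standard deviation should come out noticeably smaller. If you set all four grade means to the same value, the gap should largely disappear, which is the "not always appropriate" half of the trade-off.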
Cluster Sample (Unit 3)
Cluster sampling is stratification's mirror image. Stratification wants groups that are similar within and samples some people from every group. Clustering wants groups that are mini versions of the whole population and samples everyone in just a few groups. Same first step (divide the population), opposite logic.
Blocking in Experiments (Unit 3)
Blocking is stratification's twin on the experiment side of Unit 3. When you block by a variable like age or sex before randomly assigning treatments, you're doing the same thing stratification does in sampling, controlling a known source of variability so it can't muddy your results.
Comparing Distributions Across Groups (Unit 1)
Stratification is also how patterns get revealed, not just how samples get drawn. When you split data by a categorical variable and compare distributions side by side (parallel boxplots, segmented bar charts), you can uncover relationships that vanish in the aggregate. Same instinct, different unit.
Random Number Generator (Unit 3)
Stratification still requires randomness inside each stratum. A common setup is using a random number generator to pull an SRS from each subgroup. Dividing into strata and then hand-picking people is not a stratified random sample, and the exam will test whether you catch that.
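Here is a minimal sketch of that setup, using Python's built-in random number generator to draw an SRS inside each stratum. The function name `stratified_sample`, the roster, and the sample size of 5 per grade are all hypothetical choices for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Return {stratum: SRS of n_per_stratum units} for every stratum."""
    rng = random.Random(seed)
    # Step 1: divide the population into strata.
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    # Step 2: draw an SRS (without replacement) from EVERY stratum.
    return {
        label: rng.sample(members, n_per_stratum)
        for label, members in sorted(strata.items())
    }

# Hypothetical roster of (student_id, grade) pairs, 50 students per grade.
roster = [(i, 9 + i % 4) for i in range(200)]
sample = stratified_sample(roster, stratum_of=lambda s: s[1], n_per_stratum=5, seed=1)
for grade, chosen in sample.items():
    print(grade, [student_id for student_id, _ in chosen])
```

The random step is the call to `rng.sample` inside each stratum. Replacing it with a hand-picked list would keep the strata but would no longer be a stratified random sample.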
Stratification shows up most often in multiple-choice stems that describe a study and ask you to identify the sampling method or judge whether it's appropriate. For example, a stem might describe a market researcher stratifying by age group, or a sociologist dividing a city into wealthy, middle-class, and low-income areas before sampling neighborhoods. Watch for that second example: dividing into areas, randomly selecting neighborhoods within each area, then randomly selecting households is actually a multistage design that starts with stratification, and the exam loves layered designs like that. On FRQs, the classic task is explaining why a stratified sample is better than an SRS in context. The full-credit answer names the stratifying variable, says the strata differ on the response of interest, and concludes that stratifying guarantees representation of each group and reduces variability in the estimate. "It's more accurate" with no reasoning earns nothing.
Both methods chop the population into groups, which is exactly why they get confused. The difference is what you do next. Stratified: sample SOME individuals from EVERY group, and you want groups that are homogeneous inside (all freshmen together, all seniors together). Cluster: sample EVERY individual from a FEW randomly chosen groups, and you want each group to be heterogeneous, a little snapshot of the whole population (like randomly picking 3 homerooms that each contain a mix of all grades). Quick check on test day: every group represented means stratified; whole groups selected means cluster.
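The contrast is easy to see side by side in code. This is a sketch on a made-up school (12 homerooms of 20 students, each homeroom mixing all four grades); the field names and sizes are assumptions for illustration only.

```python
import random

rng = random.Random(7)

# Hypothetical school: 12 homerooms of 20 students, each homeroom mixing all four grades.
students = [
    {"id": i, "grade": 9 + i % 4, "homeroom": i // 20}
    for i in range(240)
]

# Stratified: SOME students (5) from EVERY grade.
stratified = []
for grade in (9, 10, 11, 12):
    in_grade = [s for s in students if s["grade"] == grade]
    stratified.extend(rng.sample(in_grade, 5))

# Cluster: EVERY student from a FEW (3) randomly chosen homerooms.
chosen_rooms = rng.sample(range(12), 3)
cluster = [s for s in students if s["homeroom"] in chosen_rooms]

print("grades in stratified sample:", sorted({s["grade"] for s in stratified}))
print("homerooms in cluster sample:", sorted({s["homeroom"] for s in cluster}))
print("sizes:", len(stratified), len(cluster))
```

In the stratified sample, randomness picks individuals within each grade. In the cluster sample, randomness picks whole homerooms, and everyone in a chosen homeroom is included.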
Stratification divides a population into non-overlapping subgroups (strata) and takes a random sample from every single stratum.
Strata should be homogeneous within and different between, which is the exact opposite of what you want for clusters.
The big advantage of stratified sampling is guaranteed representation of each subgroup and lower variability in estimates compared to an SRS of the same size.
Stratified sampling still requires random selection within each stratum, so dividing into groups and hand-picking people does not count.
On FRQs, justify stratification by naming the stratifying variable and explaining that the groups differ on the variable being measured, always in context.
Blocking in experiments is the same idea as stratification in sampling: control a known source of variability before randomizing.
Stratification is dividing a population into subgroups (strata) based on a shared characteristic, like age or income, then taking a random sample from each stratum. It guarantees every subgroup is represented and typically gives more precise estimates than a simple random sample.
Stratified sampling is often better than an SRS, but not automatically. If the strata genuinely differ on the variable you're measuring, stratification reduces variability and guarantees subgroup representation. If the stratifying variable is unrelated to the response, or you can't classify everyone into strata beforehand, an SRS may be the better (or only) option. That trade-off reasoning is exactly what learning objective 3.3.B asks for.
Stratified sampling takes some individuals from every group, and the groups should be similar inside. Cluster sampling takes all individuals from a few randomly chosen groups, and each group should look like a mini version of the whole population. Memory hook: stratified = some from all, cluster = all from some.
A stratified sample still counts as a random sample, as long as the selection within each stratum is random (for example, using a random number generator to draw an SRS from each group). The strata themselves are chosen deliberately, but the individuals are not, and that within-stratum randomness is what makes it a valid probability sampling method.
Stratification and blocking follow the same logic in different settings. Stratification happens in sampling, when you divide the population into strata before randomly selecting individuals. Blocking happens in experiments, when you divide experimental units into similar groups before randomly assigning treatments. Both control a known variable to reduce variability.