📊Advanced Communication Research Methods

Sampling Methods

Why This Matters

Sampling methods form the backbone of statistical inference—and the AP Statistics exam tests whether you understand why certain methods produce valid conclusions while others don't. You're not just being asked to identify "stratified" versus "cluster" sampling; you're being tested on whether you can explain how each method affects bias, variability, and the validity of inference. Every confidence interval, hypothesis test, and chi-square procedure you'll encounter assumes data was collected properly, so understanding sampling is prerequisite knowledge for Units 5–9.

The key distinction that drives this entire topic is probability sampling versus non-probability sampling. Probability methods give every member of the population a known, non-zero chance of selection—this is what allows us to make legitimate generalizations. When you see questions about the 10% condition, independence assumptions, or why a study's conclusions might be flawed, you're applying sampling concepts. Don't just memorize definitions—know what makes each method statistically valid (or not) and when each is most appropriate.


Probability Sampling Methods

These methods give every population member a known chance of selection, which is the fundamental requirement for valid statistical inference. Without probability sampling, confidence intervals and hypothesis tests lose their mathematical foundation.

Simple Random Sample (SRS)

  • Every possible sample of size n has an equal probability of being selected—this is the gold standard that other methods are compared against
  • Selection uses random number generators or tables of random digits to eliminate human bias in choosing participants
  • Enables direct generalization to the population because the randomization process, not researcher judgment, determines who's included
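
As a minimal sketch of the random selection described above, the snippet below draws an SRS with Python's standard `random` module. The sampling frame of 500 student ID numbers and the helper name `simple_random_sample` are hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n units without replacement so every size-n subset is equally likely."""
    rng = random.Random(seed)          # seeded only so the example is repeatable
    return rng.sample(population, n)   # random.sample selects without replacement

# Hypothetical sampling frame: student ID numbers 1 through 500
frame = list(range(1, 501))
srs = simple_random_sample(frame, 25, seed=1)
print(sorted(srs))
```

The key point is that the generator, not the researcher, decides which 25 IDs are chosen.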

Stratified Random Sample

  • Population is divided into homogeneous strata (groups where members share a key characteristic)—then an SRS is taken from each stratum
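  • Reduces variability in estimates compared with an SRS of the same size when members within each stratum are similar on the variable being measured, because every important subgroup is guaranteed representation
  • One of the two data-collection designs (along with a randomized experiment) appropriate for chi-square tests of homogeneity when comparing distributions across distinct populations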
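
The "divide into strata, then take an SRS from each" procedure can be sketched as follows. The roster of 400 students, the grade-level labels, and the per-stratum sample sizes are all assumptions made up for this example.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Group units into strata, then take an SRS of the requested size from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for label in sorted(strata):   # sorted only to make the run repeatable
        sample.extend(rng.sample(strata[label], n_per_stratum[label]))
    return sample

# Hypothetical roster: (student_id, grade_level), 100 students per grade
grades = ("freshman", "sophomore", "junior", "senior")
students = [(i, grades[i % 4]) for i in range(400)]

sizes = {g: 10 for g in grades}   # assumed allocation: 10 per stratum
picked = stratified_sample(students, lambda s: s[1], sizes, seed=2)
print(len(picked), "students selected")
```

Every grade level is guaranteed exactly 10 students, which an SRS of 40 from the whole roster would not promise.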

Cluster Sample

  • Population is divided into heterogeneous clusters (each cluster should mirror the diversity of the whole population)—then entire clusters are randomly selected
  • Dramatically reduces cost and logistics when populations are geographically spread out or no complete list exists
  • Introduces higher sampling variability than SRS because individuals within clusters may be more similar to each other than to the broader population
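
A cluster sample randomizes at the group level and then keeps everyone in the chosen groups. The sketch below assumes a hypothetical school with 20 homerooms of 15 students each; the names are illustrative only.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters, then include every member of each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # randomization happens here only
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical population: 20 homerooms with 15 students each
homerooms = {
    f"room_{r:02d}": [f"room_{r:02d}_student_{s}" for s in range(15)]
    for r in range(20)
}

chosen_rooms, cluster_members = cluster_sample(homerooms, 4, seed=3)
print(chosen_rooms, len(cluster_members))
```

Note that only a list of homerooms is needed to run the selection, not a list of every student, which is where the logistical savings come from.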

Systematic Random Sample

  • Randomly select a starting point, then choose every kth individual from an ordered list—where k equals population size divided by desired sample size
  • Spreads the sample evenly across the sampling frame and is simpler to implement than pure SRS
  • Risk of bias exists if the list has a periodic pattern that aligns with the sampling interval (e.g., every 10th house on a street might always be a corner lot)
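
The "random start, then every kth individual" rule can be sketched like this. The ordered frame of 1,000 positions is hypothetical, and the sketch assumes the population size is at least the desired sample size.

```python
import random

def systematic_sample(ordered_list, n, seed=None):
    """Pick a random start among the first k positions, then take every kth unit."""
    rng = random.Random(seed)
    k = len(ordered_list) // n   # sampling interval: population size / sample size
    start = rng.randrange(k)     # random start in positions 0 .. k-1
    return [ordered_list[start + i * k] for i in range(n)]

# Hypothetical ordered frame: positions 0..999 on a list
frame = list(range(1000))
sys_sample = systematic_sample(frame, 50, seed=4)
print(sys_sample[:5])
```

With 1,000 units and a sample of 50, the interval is k = 20, so the only random decision is which of the first 20 positions to start from.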

Compare: Stratified vs. Cluster sampling—both divide populations into groups, but stratified samples from every group while cluster samples entire groups. Stratified reduces variability; cluster often increases it. If an FRQ describes dividing by geography and sampling whole areas, that's cluster. If it describes ensuring representation of subgroups, that's stratified.
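
To see the variability contrast in numbers, here is a small simulation under a deliberately extreme assumption: a made-up population of 10 groups whose members are similar within a group and very different across groups. The same groups are treated once as strata (2 units from each) and once as clusters (one whole group), so both designs use 20 units.

```python
import random
import statistics

rng = random.Random(6)

# Hypothetical population: 10 groups of 20, homogeneous within, different across
groups = [[10 * g + rng.gauss(0, 2) for _ in range(20)] for g in range(10)]

def stratified_mean():
    """Sample 2 units from every group (groups act as strata)."""
    picks = [x for grp in groups for x in rng.sample(grp, 2)]
    return statistics.fmean(picks)

def cluster_mean():
    """Sample one entire group (groups act as clusters)."""
    return statistics.fmean(rng.choice(groups))

strat_sd = statistics.stdev([stratified_mean() for _ in range(2000)])
clust_sd = statistics.stdev([cluster_mean() for _ in range(2000)])
print(f"SD of stratified estimates: {strat_sd:.2f}")
print(f"SD of cluster estimates:    {clust_sd:.2f}")
```

Because the groups here are internally alike, they make good strata and poor clusters; a population whose groups each mirror the whole would narrow the gap.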


Non-Probability Sampling Methods

These methods do not give every population member a known, non-zero chance of selection, so results cannot be generalized to the population through statistical inference. The AP exam frequently tests whether you can identify these as sources of bias.

Convenience Sample

  • Participants are selected based on easy accessibility to the researcher—whoever happens to be available
  • Quick and inexpensive but produces selection bias because the accessible group likely differs systematically from the population
  • Fails the random-selection condition required for inference; conclusions apply only to those sampled, not the broader population

Voluntary Response Sample

  • Participants self-select into the study by choosing to respond—common in online polls, call-in surveys, and product reviews
  • Produces severe bias toward strong opinions because people with moderate views rarely bother to participate
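  • Classic example of voluntary response bias (self-selection), which differs from undercoverage: in undercoverage the sampling frame leaves out part of the population, while here everyone may be reachable but respondents choose themselves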
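
The pull toward strong opinions can be illustrated with a toy simulation. Everything here is an assumption for illustration: opinions sit on a 1 to 5 scale, most people are moderate, and the response probabilities in `respond_prob` are invented so that people at the extremes are far more likely to answer.

```python
import random

rng = random.Random(4)

# Hypothetical population: opinions on a 1-5 scale, mostly moderate
population = rng.choices([1, 2, 3, 4, 5], weights=[10, 20, 40, 20, 10], k=100_000)

# Assumed (made-up) chance that a person with each opinion bothers to respond
respond_prob = {1: 0.30, 2: 0.05, 3: 0.01, 4: 0.05, 5: 0.30}
respondents = [x for x in population if rng.random() < respond_prob[x]]

def share_extreme(opinions):
    """Fraction of opinions at either end of the scale (1 or 5)."""
    return sum(1 for x in opinions if x in (1, 5)) / len(opinions)

pop_extreme = share_extreme(population)
resp_extreme = share_extreme(respondents)
print(f"Extreme opinions in population:  {pop_extreme:.2f}")
print(f"Extreme opinions among responses: {resp_extreme:.2f}")
```

Thousands of people respond, yet the respondents look nothing like the population, because who responds depends on the very opinion being measured.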

Quota Sample

  • Researcher sets quotas to match population proportions for key characteristics, then fills quotas using non-random selection
  • Resembles stratified sampling but lacks randomization—the final step uses convenience or judgment, not random selection
  • Cannot support valid inference despite appearing representative, because selection within quotas introduces unknown biases

Compare: Stratified sampling vs. Quota sampling—both aim for proportional representation of subgroups, but stratified uses random selection within strata while quota uses researcher judgment. Only stratified supports valid statistical inference.


Complex Sampling Designs

Real-world studies often combine methods to balance statistical validity with practical constraints. The AP exam expects you to recognize these hybrid approaches.

Multistage Sampling

  • Combines multiple sampling methods in stages—typically starting with cluster sampling, then applying SRS or stratified sampling within selected clusters
  • Balances efficiency with representativeness by reducing travel/cost while maintaining probability-based selection
  • Used in major national surveys like the Census Bureau's Current Population Survey—a common real-world example on AP exams
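
A two-stage version of this design (clusters first, then an SRS inside each selected cluster) might look like the sketch below. The 30 tracts of 200 households and the function name `two_stage_sample` are hypothetical and do not describe how any real survey is run.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly choose clusters. Stage 2: take an SRS within each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)               # stage 1
    sample = []
    for name in chosen:
        sample.extend(rng.sample(clusters[name], n_per_cluster))    # stage 2
    return chosen, sample

# Hypothetical population: 30 tracts with 200 households each
tracts = {
    f"tract_{t:02d}": [f"tract_{t:02d}_hh_{h}" for h in range(200)]
    for t in range(30)
}

chosen_tracts, households = two_stage_sample(tracts, 6, 10, seed=7)
print(chosen_tracts, len(households))
```

Interviewers visit only 6 tracts, but within each tract the 10 households are still chosen by chance rather than by convenience.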

Compare: Simple cluster vs. Multistage sampling. Cluster samples everyone in selected clusters, while multistage adds another random selection step within clusters. For a fixed total sample size, multistage can reach more clusters and so typically produces more precise estimates, but it requires more complex analysis.


The Probability vs. Non-Probability Distinction

This conceptual division determines whether your inference is mathematically valid. Every inference procedure in Units 6–9 assumes probability sampling.

Probability Sampling (Category)

  • Every member has a known, non-zero probability of selection—this mathematical property is what makes inference work
  • Includes SRS, stratified, cluster, and systematic methods—all allow calculation of sampling distributions and standard errors
  • Supports the 10% condition and independence assumptions required for confidence intervals and hypothesis tests
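
One way to see why the 10% condition matters: when an SRS is drawn without replacement from a population of size N, the exact standard deviation of a sample proportion equals the usual formula multiplied by a finite population correction, sqrt((N - n) / (N - 1)). The sketch below uses assumed values (p = 0.5, N = 10,000) to show how close that factor stays to 1 when n is at most 10% of N.

```python
import math

def sd_phat_independent(p, n):
    """Usual formula, which treats the n draws as independent."""
    return math.sqrt(p * (1 - p) / n)

def sd_phat_without_replacement(p, n, N):
    """Exact SD of p-hat for an SRS of n drawn without replacement from N."""
    return sd_phat_independent(p, n) * math.sqrt((N - n) / (N - 1))

N = 10_000   # assumed population size
p = 0.5      # assumed population proportion
ratios = {}
for n in (100, 1_000, 5_000):
    ratios[n] = sd_phat_without_replacement(p, n, N) / sd_phat_independent(p, n)
    print(f"n = {n:>5} ({n / N:.0%} of N): exact SD / usual SD = {ratios[n]:.3f}")
```

At 10% of the population the usual formula overstates the SD by only about 5%, while at 50% the gap is far too large to ignore.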

Non-Probability Sampling (Category)

  • Selection probabilities are unknown or zero for some members—no mathematical basis for generalization exists
  • Includes convenience, voluntary response, and quota methods—useful for exploratory research but not confirmatory inference
  • Results in selection bias and undercoverage that cannot be corrected through larger sample sizes

Compare: Probability vs. Non-probability sampling—the distinction isn't about sample size or effort; it's about whether randomization determines selection. A carefully designed convenience sample of 10,000 is still biased, while a proper SRS of 100 supports valid inference.
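
The 10,000 versus 100 contrast can be checked with a toy simulation. The population below is invented: 30% of individuals are assumed to be "easy to reach" and to have higher values on average, and the convenience sample takes only from that group.

```python
import random
import statistics

rng = random.Random(5)

# Hypothetical population of 200,000: easy-to-reach people average 60, others average 50
population = []
for _ in range(200_000):
    easy = rng.random() < 0.30
    population.append((easy, rng.gauss(60 if easy else 50, 10)))

true_mean = statistics.fmean([v for _, v in population])

# Convenience sample: the first 10,000 easy-to-reach individuals
convenience = [v for easy, v in population if easy][:10_000]

# SRS of only 100 from the whole population
srs_values = [v for _, v in rng.sample(population, 100)]

conv_error = abs(statistics.fmean(convenience) - true_mean)
srs_error = abs(statistics.fmean(srs_values) - true_mean)
print(f"Error of convenience sample (n = 10,000): {conv_error:.2f}")
print(f"Error of SRS (n = 100):                   {srs_error:.2f}")
```

Increasing the convenience sample further would only estimate the easy-to-reach group's mean more precisely; it would not move the estimate toward the population mean.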


Quick Reference Table

Concept | Best Examples
Equal probability of selection | Simple Random Sample
Reducing variability through subgroups | Stratified Random Sample
Cost-effective for spread-out populations | Cluster Sample, Multistage Sampling
Required for chi-square homogeneity test | Stratified Random Sample
Sources of selection bias | Convenience Sample, Voluntary Response Sample
Appears representative but isn't random | Quota Sample
Supports valid statistical inference | SRS, Stratified, Cluster, Systematic (all probability methods)
Risk from periodic patterns in lists | Systematic Random Sample

Self-Check Questions

  1. A researcher divides a city into neighborhoods, randomly selects 5 neighborhoods, and surveys every household in those neighborhoods. A second researcher divides residents by income level and randomly selects participants from each income group. Which method will likely produce estimates with lower variability, and why?

  2. An online poll asks visitors to a news website to vote on a political issue and receives 50,000 responses. Why can't we construct a valid 95% confidence interval for the population proportion from this data, despite the large sample size?

  3. Both stratified and quota sampling aim to ensure representation of subgroups. What specific feature distinguishes them, and how does this affect the validity of inference?

  4. A study uses systematic sampling with k = 20 from an alphabetized list of employees. Under what circumstance might this method introduce bias, and how could the researchers check for this problem?

  5. For a chi-square test of homogeneity comparing customer satisfaction across three store locations, the CED specifies that data should be collected using a stratified random sample or randomized experiment. Explain why a cluster sample (randomly selecting entire stores) would be inappropriate for this inference.