AP Statistics

3.4 Potential Problems with Sampling

Verified for the 2025 AP Statistics exam

For the next couple of sections, be sure to hammer this definition down; I'll be repeating the same sentence again and again. Repeat after me:

Bias occurs when certain responses are systematically favored over others.

Bias occurs when certain responses are systematically favored over others, either intentionally or unintentionally.

That's right! Previously, we've discussed methods of sampling accurately, but now it is imperative to discuss how not to gather data. A statistical study is biased if it is systematically likely to underestimate or overestimate the value you are looking for. Bias can result in a sample or a study that is not representative of the population being studied, which can lead to inaccurate or misleading conclusions.

Bias in Sampling

Voluntary Response

Voluntary response bias, also known as self-selection bias, occurs when a sample is composed entirely of volunteers, that is, people who choose for themselves whether to participate in a study. This can occur when participants are recruited through open invitations such as online surveys or television and radio call-in programs.

Because voluntary response samples are not randomly selected, they are often not representative of the population being studied. For example, people who are more interested in the topic being studied or who have strong opinions on the topic may be more likely to volunteer to participate in the study. This can result in a sample that is biased and does not accurately represent the views or characteristics of the population.

Example

You are a researcher who is interested in studying the attitudes of young adults towards social media. You decide to conduct an online survey to gather data on this topic. You create a link to the survey and post it on your personal social media accounts, as well as on several online forums and discussion groups related to social media. You also share the link with your friends and ask them to share it with their friends.

Within a few days, you receive a large number of responses to your survey. However, as you begin to analyze the data, you realize that there are some problems with the sample. Many of the respondents are young adults who are very active on social media and have strong opinions about it, while others are older adults who do not use social media as much. There are also a disproportionate number of responses from people who live in urban areas, while responses from people in rural areas are much less common.

Upon further investigation, you realize that the sample for your survey is not representative of the population of young adults in the United States. Instead, it is composed mostly of volunteers who chose to participate in the survey, and these volunteers are not representative of the overall population.
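To see how self-selection can pull an estimate away from the truth, here is a minimal Python sketch of a survey like the one above. Everything in it is a made-up assumption for illustration: the population of 10,000 people, the attitude score, and the rule that people who spend more hours on social media are both more favorable and more likely to answer.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: (daily hours on social media, attitude score).
# Assumption: heavier users tend to report a more favorable attitude.
population = []
for _ in range(10_000):
    hours = rng.uniform(0, 6)
    attitude = hours + rng.gauss(0, 1)
    population.append((hours, attitude))

# Voluntary response: assume the chance of seeing the link and answering
# rises with time spent online (probability = hours / 6).
volunteers = [a for h, a in population if rng.random() < h / 6]

# Simple random sample of 500 from the same population, for comparison.
srs = [a for _, a in rng.sample(population, 500)]

population_mean = mean(a for _, a in population)
volunteer_mean = mean(volunteers)
srs_mean = mean(srs)

print(f"population mean attitude:    {population_mean:.2f}")
print(f"volunteer sample mean:       {volunteer_mean:.2f}")
print(f"simple random sample mean:   {srs_mean:.2f}")
```

Under these assumptions the volunteer average lands above the population average, while the random sample stays close to it. The size of the gap comes from the assumed response rule, not from real data.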

Undercoverage

Undercoverage bias occurs when a portion of the population has a reduced chance of being included in the sample, resulting in a sample that is not representative of the population. This can occur for a variety of reasons, such as when the sampling frame is incomplete or when certain groups are more difficult to reach or sample than others.

Example

If you're studying the attitudes of college students towards climate change and you use a sampling frame that only includes students at four-year colleges and universities, you are introducing undercoverage bias into your sample. This is because students at two-year colleges and trade schools have no chance of being selected, even though they are part of the overall population of college students in the United States.
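The arithmetic behind this example can be sketched in a few lines of Python. The group shares and the percentages who are "concerned" are hypothetical numbers chosen only to make the calculation concrete; they are not real survey results.

```python
# Hypothetical population of college students, split into two groups.
# "share" is the fraction of the population in the group; "concerned" is
# the assumed fraction of that group that is concerned about climate change.
groups = {
    "four-year": {"share": 0.6, "concerned": 0.70},
    "two-year/trade": {"share": 0.4, "concerned": 0.50},
}

# True population proportion: weighted average over every group.
true_prop = sum(g["share"] * g["concerned"] for g in groups.values())

# A frame that lists only four-year students can, at best, estimate
# the proportion within that one group.
frame_prop = groups["four-year"]["concerned"]

# Bias = what the study targets on average minus the true value.
bias = frame_prop - true_prop

print(f"true proportion:          {true_prop:.2f}")
print(f"proportion in the frame:  {frame_prop:.2f}")
print(f"bias from undercoverage:  {bias:+.2f}")
```

Notice that sample size never appears in the calculation. Drawing more students from the same incomplete frame gives a more precise estimate of the wrong number.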

Nonresponse

Nonresponse bias occurs when individuals who are chosen for the sample but for whom data cannot be obtained (because they cannot be reached or refuse to respond) differ from those for whom data can be obtained. This can happen whenever data is collected through methods such as surveys or interviews, where selected individuals can decline to take part.

Example

If you're conducting a survey of college students' attitudes towards climate change and you send the survey to a random sample of 1000 students, but only 500 students complete and return the survey, you may be introducing nonresponse bias into your sample. This is because the students who completed and returned the survey may differ in some way from the students who did not respond, even though both groups were chosen for the sample.
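Here is a rough Python sketch of that situation. The rates are assumptions picked for illustration: 60% of the sampled students are concerned about climate change, concerned students return the survey 70% of the time, and everyone else returns it 20% of the time.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Assumed rates (hypothetical, for illustration only).
CONCERN_RATE = 0.6           # fraction of students who are concerned
RESPONSE_IF_CONCERNED = 0.7  # chance a concerned student returns the survey
RESPONSE_IF_NOT = 0.2        # chance an unconcerned student returns it

# The random sample of 1000 students: True means "concerned".
sample = [rng.random() < CONCERN_RATE for _ in range(1000)]

# Keep only the students who actually complete and return the survey.
respondents = [
    concerned
    for concerned in sample
    if rng.random() < (RESPONSE_IF_CONCERNED if concerned else RESPONSE_IF_NOT)
]

response_rate = len(respondents) / len(sample)
sample_prop = sum(sample) / len(sample)
respondent_prop = sum(respondents) / len(respondents)

print(f"response rate:                  {response_rate:.2f}")
print(f"concerned, full random sample:  {sample_prop:.2f}")
print(f"concerned, respondents only:    {respondent_prop:.2f}")
```

With these assumed rates, roughly half of the 1000 students respond, and the respondents look noticeably more concerned than the full random sample. The sample was chosen correctly; the bias enters only because responding is related to the opinion being measured.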

Wording in Questions

Question wording bias occurs when the wording of a survey question or the way a question is asked influences the response that is given. This can happen when the question is confusing, misleading, or ambiguous, or when the question asks for too much or too little information. ✉️

Example

Consider the following two questions:

"Do you think that the government should provide free healthcare for all citizens?"
"Would you support a government healthcare program that would provide free healthcare for all citizens?"

These two questions are asking about the same topic, but they are worded differently. The first question asks what the respondent thinks the government should do, while the second asks whether the respondent would support a specific government program. Even a small change in framing like this can shift how people answer. As a result, the responses to these two questions may differ, even though they are asking about the same topic.

Convenience Sampling

Convenience sampling occurs when the sample is selected based on ease of access or availability, rather than through a random selection process. As a result, the sample is unlikely to be representative of the population and may be biased.

Example

Let's go back to the researcher who wants to study the attitudes of college students towards climate change!

Rather than using a random sampling method to select a representative sample of students, the researcher decides to conduct the study on a college campus that is close to their home. The researcher then interviews students who are available and willing to participate, such as students who are passing by or who are hanging out in a common area.

In this example, the sample is not representative of all college students in the United States, but rather a convenience sample of students from a single college campus. This sample may be biased because it only includes students who are available and willing to participate, and it does not include students who are not on campus or who are not interested in participating in the study. As a result, the results of the study may not accurately represent the attitudes of college students towards climate change.

Why does this matter, then?

If sampling is done without considering the factors that could invalidate your results, then the resulting data and the conclusions drawn from it may not be representative of the population.


Practice Problem

You are a researcher who is interested in studying the attitudes of small business owners towards the economy. You decide to conduct a survey to gather data on this topic.

To collect data for the survey, you create a list of all the small business owners in your city and send the survey to a random sample of 1000 business owners. However, you only receive responses from 500 of the business owners, and many of the respondents are from larger businesses with more than 50 employees.

Upon further analysis, you realize that your sample is not representative of the population of small business owners in your city.

a) Identify the type of bias present in this sample.

b) Explain how this bias may have affected the results of the survey.

Answer

a) In this situation, the sample is suffering from nonresponse bias because not all of the business owners who were chosen for the sample responded to the survey. This means that the sample is not representative of the population of small business owners in the city, as it only includes responses from 500 business owners, many of whom are from larger businesses.

b) This bias may have affected the results of the survey in several ways. For example, the attitudes of small business owners who did not respond to the survey may differ from those who did respond, even though both groups were chosen for the sample.

Additionally, the overrepresentation of larger businesses among the respondents may have skewed the results in a way that is not representative of the attitudes of small business owners overall. As a result, the results of the survey may not accurately represent the attitudes of small business owners towards the economy.

🎥 Watch: AP Stats - Sampling Methods and Sources of Bias

Key Terms to Review (6)

Bias: Bias refers to a systematic error that leads to an incorrect or misleading representation of a population or phenomenon. It can affect how data is collected, analyzed, and interpreted, ultimately skewing results and conclusions in various statistical contexts.

Convenience Sampling: Convenience sampling is a non-probability sampling technique where samples are selected based on their easy availability and proximity to the researcher. This method often leads to biased results since it does not adequately represent the population, making it a common point of concern when discussing sampling methods and their associated issues.

Nonresponse Bias: Nonresponse bias occurs when individuals selected for a sample do not respond to a survey or study, leading to potential inaccuracies in the data collected. This type of bias can skew results because the nonrespondents may have different characteristics or opinions than those who participated, which affects the representativeness of the sample. Understanding this bias is essential when analyzing sampling distributions and point estimates, as it directly influences the reliability of conclusions drawn from the data.

Question Wording Bias: Question wording bias occurs when the way a question is phrased influences the responses given by participants, leading to skewed or inaccurate results. This bias can arise from the specific language, tone, or structure of the question, ultimately impacting the validity of survey data. The design and wording of questions are critical in ensuring that responses accurately reflect the views of the population being studied.

Undercoverage Bias: Undercoverage bias occurs when certain segments of a population are inadequately represented in a sample, leading to skewed or inaccurate results. This type of bias can arise from the sampling method used, where specific groups may be systematically excluded or less likely to be included, which ultimately affects the validity of the conclusions drawn from the data. Understanding undercoverage bias is crucial because it can misrepresent the true characteristics of the population and lead to erroneous interpretations.

Voluntary Response Bias: Voluntary response bias occurs when individuals select themselves into a survey or study, leading to an unrepresentative sample. This type of bias often arises when participants are given the option to respond voluntarily, resulting in a higher likelihood of certain groups, particularly those with strong opinions or experiences, being overrepresented. The impact of voluntary response bias can skew results and distort findings, making it a significant concern in the world of statistics.
