
3.4 Potential Problems with Sampling

6 min read · December 29, 2022

Kanya Shah

Jed Quiaoit

Aly Moosa


For the next couple of sections, be sure to hammer this definition down. I'll be repeating the same sentence again and again, so repeat after me:

Bias occurs when certain responses are systematically favored over others. 👎

Bias occurs when certain responses are systematically favored over others. 👎

Bias occurs when certain responses are systematically favored over others. 👎

Bias occurs when certain responses are systematically favored over others. 👎

Bias occurs when certain responses are systematically favored over others, either intentionally or unintentionally.

That's right! Previously, we discussed methods of sampling accurately, but now it is imperative to discuss how not to gather data. A statistical study is biased if it is likely to systematically underestimate or overestimate the value you are looking for. Bias can result in a sample or a study that is not representative of the population being studied, which can lead to inaccurate or misleading conclusions. 😕

Bias in Sampling 

Voluntary Response

Voluntary response bias, also known as self-selection bias, occurs when a sample is made up entirely of volunteers, that is, people who choose for themselves whether to participate in a study. This can happen when participants are recruited through open invitations such as online surveys or call-ins to television or radio programs. 🙋‍♂️

Because voluntary response samples are not randomly selected, they are often not representative of the population being studied. For example, people who are more interested in the topic being studied or who have strong opinions on the topic may be more likely to volunteer to participate in the study. This can result in a sample that is biased and does not accurately represent the views or characteristics of the population.

Example

You are a researcher who is interested in studying the attitudes of young adults towards social media. You decide to conduct an online survey to gather data on this topic. You create a link to the survey and post it on your personal social media accounts, as well as on several online forums and discussion groups related to social media. You also share the link with your friends and ask them to share it with their friends.

Within a few days, you receive a large number of responses to your survey. However, as you begin to analyze the data, you realize that there are some problems with the sample. Many of the respondents are young adults who are very active on social media and have strong opinions about it, while others are older adults who do not use social media as much. There are also a disproportionate number of responses from people who live in urban areas, while responses from people in rural areas are much less common.

Upon further investigation, you realize that the sample for your survey is not representative of the population of young adults in the United States. Instead, it is comprised mostly of volunteers who chose to participate in the survey, and these volunteers are not representative of the overall population. 😔
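To see how self-selection can distort an estimate, here is a minimal Python sketch. Every number in it is a made-up assumption for illustration (30% of the population holds a strong opinion, those people answer an open survey link 20% of the time, and everyone else answers 2% of the time), and the function name `simulate_voluntary_response` is hypothetical.

```python
import random


def simulate_voluntary_response(seed=0, population_size=100_000):
    """Compare a simple random sample with a self-selected sample.

    All rates below are illustrative assumptions, not real survey data.
    """
    rng = random.Random(seed)

    # True = this person holds a strong opinion about social media.
    population = [rng.random() < 0.30 for _ in range(population_size)]
    true_rate = sum(population) / population_size

    # Simple random sample: every person is equally likely to be chosen.
    srs = rng.sample(population, 1000)
    srs_rate = sum(srs) / len(srs)

    # Voluntary response: people decide for themselves whether to answer,
    # and those with strong opinions are assumed to be 10x more likely to.
    volunteers = [
        opinion
        for opinion in population
        if rng.random() < (0.20 if opinion else 0.02)
    ]
    volunteer_rate = sum(volunteers) / len(volunteers)

    return true_rate, srs_rate, volunteer_rate


true_rate, srs_rate, volunteer_rate = simulate_voluntary_response()
print(f"population:    {true_rate:.2f}")
print(f"random sample: {srs_rate:.2f}")
print(f"volunteers:    {volunteer_rate:.2f}")
```

Under these assumptions the random sample should land close to the population rate of about 0.30, while the volunteer sample is dominated by people with strong opinions (in expectation 0.06 / 0.074, or roughly 0.81). Collecting more volunteers does not fix this, because the overestimate comes from who chooses to respond, not from the sample size.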

Undercoverage

Undercoverage bias occurs when a portion of the population has a reduced chance (or no chance) of being included in the sample, resulting in a sample that is not representative of the population. This can occur for a variety of reasons, such as when the sampling frame is incomplete or when certain groups are more difficult to reach or sample than others. 😡

Example

If you're studying the attitudes of college students towards climate change and you use a sampling frame that only includes students at four-year colleges and universities, you may be introducing undercoverage bias into your sample. This is because students at two-year colleges and trade schools are left out of the frame, even though they are part of the overall population of college students in the United States. 😔
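The effect of an incomplete sampling frame can be sketched in a few lines of Python. The numbers are assumptions chosen only for illustration (60% of students attend four-year schools, and 70% of them are concerned about climate change versus 50% of everyone else), and `simulate_undercoverage` is a hypothetical name.

```python
import random


def simulate_undercoverage(seed=1, population_size=50_000, sample_size=2000):
    """Draw a random sample from a frame that omits part of the population.

    All rates below are illustrative assumptions, not real data.
    """
    rng = random.Random(seed)

    # Each student is a pair: (attends_four_year_school, is_concerned).
    population = []
    for _ in range(population_size):
        four_year = rng.random() < 0.60
        concerned = rng.random() < (0.70 if four_year else 0.50)
        population.append((four_year, concerned))

    true_rate = sum(concerned for _, concerned in population) / population_size

    # The frame leaves out two-year and trade-school students entirely.
    frame = [student for student in population if student[0]]

    # The sample is random, but only within the incomplete frame.
    sample = rng.sample(frame, sample_size)
    sample_rate = sum(concerned for _, concerned in sample) / sample_size

    return true_rate, sample_rate


true_rate, sample_rate = simulate_undercoverage()
print(f"all college students: {true_rate:.2f}")
print(f"four-year frame only: {sample_rate:.2f}")
```

With these assumed rates the full population sits near 0.62, but the sample drawn from the four-year frame sits near 0.70. Random selection alone is not enough: it has to be random selection from a frame that covers the whole population.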

Nonresponse

Nonresponse bias occurs when individuals who are chosen for the sample but from whom data cannot be obtained (or who refuse to respond) differ from those from whom data can be obtained. This can happen whenever a study depends on people choosing to reply, as with surveys or interviews. 👻

Example

If you're conducting a survey of college students' attitudes towards climate change and you send the survey to a random sample of 1000 students, but only 500 students complete and return the survey, you may be introducing nonresponse bias into your sample. This is because the students who completed and returned the survey may differ in some way from the students who did not respond, even though both groups were chosen for the sample.
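Here is a rough Python sketch of that scenario. It assumes, purely for illustration, that half of all students are concerned about climate change and that concerned students return the survey 70% of the time while the rest return it 30% of the time, which gives about 500 responses out of 1000. The name `simulate_nonresponse` is hypothetical.

```python
import random


def simulate_nonresponse(seed=2, sample_size=1000):
    """Compare everyone who was sampled with the subset who responded.

    All rates below are illustrative assumptions, not real data.
    """
    rng = random.Random(seed)

    # The random sample itself is fine: about half the students are concerned.
    # (Each draw stands in for one student picked from a large population.)
    sample = [rng.random() < 0.50 for _ in range(sample_size)]

    # Concerned students are assumed to be more likely to return the survey.
    respondents = [
        concerned
        for concerned in sample
        if rng.random() < (0.70 if concerned else 0.30)
    ]

    full_rate = sum(sample) / len(sample)
    respondent_rate = sum(respondents) / len(respondents)
    return len(respondents), full_rate, respondent_rate


n_responses, full_rate, respondent_rate = simulate_nonresponse()
print(f"responses received:  {n_responses}")
print(f"everyone sampled:    {full_rate:.2f}")
print(f"respondents only:    {respondent_rate:.2f}")
```

In this sketch the original random sample is close to 0.50, but the respondents alone come out near 0.70. The researcher only ever sees the respondents, so the gap stays hidden unless the nonrespondents are followed up.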

Wording in Questions

Question wording bias occurs when the wording of a survey question or the way a question is asked influences the response that is given. This can happen when the question is confusing, misleading, or ambiguous, or when the question asks for too much or too little information. ✉️

Example

Consider the following two questions:

  • "Do you think that the government should provide free healthcare for all citizens?"

  • "Would you support a government healthcare program that would provide free healthcare for all citizens?"

These two questions are asking about the same topic, but they are worded differently. The first question uses a leading phrase ("Do you think that") that may bias the respondent towards agreeing with the statement, while the second question presents the idea of a government healthcare program more neutrally. As a result, the responses to these two questions may differ, even though they are asking about the same topic.

Convenience Sampling

Convenience sampling occurs when the sample is selected based on ease of access or availability, rather than through a random selection process. As a result, the sample is unlikely to be representative of the population and may be biased. 🥪

Example

Let's go back to the researcher who wants to study the attitudes of college students towards climate change!

Rather than using a random sampling method to select a representative sample of students, the researcher decides to conduct the study on a college campus that is close to their home. The researcher then interviews students who are available and willing to participate, such as students who are passing by or who are hanging out in a common area.

In this example, the sample is not representative of all college students in the United States, but rather a convenience sample of students from a single college campus. This sample may be biased because it only includes students who are available and willing to participate, and it does not include students who are not on campus or who are not interested in participating in the study. As a result, the results of the study may not accurately represent the attitudes of college students towards climate change.
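The campus example can be sketched in Python as well. The five campuses and their rates are invented for illustration (the nearby campus is assumed to be unusually concerned about climate change at 85%, the others at 50%), and `simulate_convenience_sample` is a hypothetical name.

```python
import random


def simulate_convenience_sample(seed=3, campus_size=10_000, sample_size=500):
    """Compare sampling one nearby campus with sampling all campuses.

    All rates below are illustrative assumptions, not real data.
    """
    rng = random.Random(seed)

    # Assumed share of concerned students on each of five campuses.
    # The first entry is the campus close to the researcher's home.
    campus_rates = [0.85, 0.50, 0.50, 0.50, 0.50]
    campuses = [
        [rng.random() < rate for _ in range(campus_size)]
        for rate in campus_rates
    ]
    everyone = [student for campus in campuses for student in campus]
    true_rate = sum(everyone) / len(everyone)

    # Convenience sample: only students from the nearby campus.
    convenience = rng.sample(campuses[0], sample_size)
    convenience_rate = sum(convenience) / sample_size

    # Simple random sample: students drawn from all campuses.
    srs = rng.sample(everyone, sample_size)
    srs_rate = sum(srs) / sample_size

    return true_rate, convenience_rate, srs_rate


true_rate, convenience_rate, srs_rate = simulate_convenience_sample()
print(f"all campuses:       {true_rate:.2f}")
print(f"convenience sample: {convenience_rate:.2f}")
print(f"random sample:      {srs_rate:.2f}")
```

Under these assumptions the overall rate is about 0.57, the random sample lands near it, and the convenience sample reports something close to 0.85. The nearby campus might just as easily have been less concerned than average. The point is that the researcher has no way of knowing which way the result is off.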

Why does this matter, then? 

If sampling is done without considering the factors that could invalidate your results, then the resulting data and analysis may not be representative of the population.

https://firebasestorage.googleapis.com/v0/b/fiveable-92889.appspot.com/o/images%2FCiOtO9fVEAAB3RU.jpg?alt=media&token=0091d591-9f31-4acc-91a8-0f87767537ae

Courtesy of Twitter

Practice Problem

You are a researcher who is interested in studying the attitudes of small business owners towards the economy. You decide to conduct a survey to gather data on this topic.

To collect data for the survey, you create a list of all the small business owners in your city and send the survey to a random sample of 1000 business owners. However, you only receive responses from 500 of the business owners, and many of the respondents are from larger businesses with more than 50 employees.

Upon further analysis, you realize that your sample is not representative of the population of small business owners in your city.

a) Identify the type of bias present in this sample.

b) Explain how this may have affected the results of the survey.

Answer

a) In this situation, the sample is suffering from nonresponse bias because not all of the business owners who were chosen for the sample responded to the survey. This means that the sample is not representative of the population of small business owners in the city, as it only includes responses from 500 business owners, many of whom are from larger businesses.

b) This may have affected the results of the survey in several ways. For example, the attitudes of small business owners who did not respond to the survey may differ from those who did respond, even though both groups were chosen for the sample.

Additionally, the oversampling of larger businesses may have skewed the results in a way that is not representative of the attitudes of small business owners overall. As a result, the results of the survey may not accurately represent the attitudes of small business owners towards the economy.
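A quick bit of arithmetic shows how much the mix of respondents can matter. The figures below are hypothetical, since the problem does not give any: suppose 40% of owners of smaller firms and 80% of owners of larger firms feel good about the economy, larger firms make up 10% of the list, and they make up half of the 500 respondents.

```python
def weighted_rate(shares, rates):
    """Overall rate for groups with the given shares (shares sum to 1)."""
    return sum(share * rate for share, rate in zip(shares, rates))


# Assumed optimism rates: [smaller firms, larger firms].
rates = [0.40, 0.80]

population_mix = [0.90, 0.10]  # assumed mix on the full list of owners
respondent_mix = [0.50, 0.50]  # assumed mix among the 500 respondents

population_rate = weighted_rate(population_mix, rates)
respondent_rate = weighted_rate(respondent_mix, rates)

print(f"all owners:  {population_rate:.2f}")
print(f"respondents: {respondent_rate:.2f}")
```

With these made-up figures the true rate is 0.44 but the respondents report 0.60, an overestimate caused entirely by who answered rather than by what any individual owner thinks.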

🎥 Watch: AP Stats - Sampling Methods and Sources of Bias

Key Terms to Review (6)

Bias

: Bias refers to a systematic deviation from the true value or an unfair influence that affects the results of statistical analysis. It can occur during data collection, sampling, or analysis and leads to inaccurate or misleading conclusions.

Convenience Sampling

: Convenience sampling is a non-probability sampling method where individuals are selected based on their easy accessibility or availability. This method often leads to biased results because the sample may not accurately represent the entire population.

Nonresponse Bias

: Nonresponse bias refers to the potential distortion of results caused by individuals who choose not to participate or fail to respond in a survey or study. This can lead to biased conclusions if those who do not respond differ systematically from those who do.

Question Wording Bias

: Question wording bias is when the way a question is phrased influences how people respond, leading to biased results. The wording can unintentionally lead respondents towards a particular answer or create confusion.

Undercoverage Bias

: Undercoverage bias refers to a type of bias that occurs when certain groups within the population are not adequately represented in the sample.

Voluntary Response Bias

: Voluntary response bias refers to a type of bias that arises when individuals self-select themselves into participating in surveys or studies.



© 2024 Fiveable Inc. All rights reserved.

AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.

