---
title: "AP Statistics 1.10: The Investigative Question Revisited and Data Collection"
description: "Review AP Statistics 1.10, including components of an investigative question, observational studies, experiments, surveys, censuses, confounding, and when generalization is justified."
canonical: "https://fiveable.me/ap-stats/unit-1/investigative-question-revisited-data-collection/study-guide/f842Kr6YNnYX4G0dtAC8"
type: "study-guide"
subject: "AP Statistics"
unit: "Unit 1 – Exploring One-Variable Data and Collecting Data"
lastUpdated: "2026-06-30"
---

# AP Statistics 1.10: The Investigative Question Revisited and Data Collection

## Summary

Review AP Statistics 1.10, including components of an investigative question, observational studies, experiments, surveys, censuses, confounding, and when generalization is justified.

## Guide

By this point in [Unit 1](/ap-stats/unit-1 "fv-autolink"), you have seen how a statistical study starts with a question and produces data. This topic revisits that full process: what kind of study is being run, what [population](/ap-stats/key-terms/population "fv-autolink") the conclusions can apply to, and whether the design supports generalization or cause-and-effect claims.

## Why This Matters for the AP Statistics Exam

This topic is where [AP Statistics](/ap-stats "fv-autolink") starts treating study design as an argument instead of a vocabulary list. On the exam, you may need to identify whether a study is observational or experimental, decide whether a census or sample was used, explain whether conclusions can be generalized to a population, and distinguish [association](/ap-stats/key-terms/association "fv-autolink") from causation.

## Key Takeaways

- An investigative question should make clear what [variables](/ap-stats/unit-1/language-variation-variables/study-guide/nKpeaxi1H3Ht9aFhTHKt "fv-autolink") matter, what kind of analysis is needed, and what kind of conclusion the study might support.
- A census collects data from every member of a population, while a sample collects data from only part of the population.
- In an [observational study](/ap-stats/key-terms/observational-study "fv-autolink"), [treatments](/ap-stats/key-terms/treatments "fv-autolink") are not imposed. In an experiment, treatments are assigned to experimental units.
- A survey is an observational study using a standard set of questions.
- [Confounding variables](/ap-stats/key-terms/confounding-variables "fv-autolink") can create an alternative explanation for an observed relationship.
- [Random selection](/ap-stats/key-terms/random-selection "fv-autolink") supports generalization to a population. [Random assignment](/ap-stats/key-terms/random-assignment "fv-autolink") supports cause-and-effect conclusions.

## Reconnecting to the Investigative Question

An **investigative question** should guide the whole statistical study. In the revised course, that means the question should do three things:

1. Make clear what variable or variables are being studied.
2. Hint at the kind of analysis or comparison that will be needed.
3. Clarify what kind of conclusion might be possible.

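If the question is vague, the study design and conclusions will be weak too. For example, "Does sleep matter?" names no response variable and no group of interest, while "Do students at this school who sleep at least 8 hours score higher on weekly quizzes than students who sleep less?" identifies the variables, the comparison, and the population the conclusion is about.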

## Census or Sample?

A **census** records information from every individual or item in the population. A **sample** includes only part of the population.

Most statistical studies use samples because the population is too large, too expensive, or too difficult to measure completely. A census removes [sampling variability](/ap-stats/key-terms/sampling-variability "fv-autolink"), but it is often impractical.
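
The contrast between a census and a sample can be sketched with a small simulation. This is only an illustration under stated assumptions: the population is a made-up list of the integers 1 to 1000, and the helper names `census_mean` and `sample_mean` are hypothetical, not part of any course material.

```python
import random
import statistics

# Hypothetical population: 1,000 made-up values (the integers 1..1000).
population = list(range(1, 1001))

def census_mean(values):
    """A census measures every unit, so repeating it gives the same result."""
    return statistics.mean(values)

def sample_mean(values, n, rng):
    """A simple random sample of n units; the result depends on who is drawn."""
    return statistics.mean(rng.sample(values, n))

rng = random.Random(2024)  # fixed seed only so the sketch is repeatable

# Draw 200 separate samples of size 50 and record each sample mean.
sample_means = [sample_mean(population, 50, rng) for _ in range(200)]

print("census mean:", census_mean(population))
print("smallest sample mean:", min(sample_means))
print("largest sample mean:", max(sample_means))
```

The census mean never changes, while the sample means spread out around it. That spread is the sampling variability a census avoids.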

## Observational Study or Experiment?

The fastest way to separate study types is to ask whether the researcher **imposed a treatment**.

- In an **observational study**, the researcher records what already happens and does not assign treatments.
- In an **experiment**, the researcher assigns treatments to experimental units and then measures the response.

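Within observational studies, you may also see surveys, retrospective studies, and prospective studies. A retrospective study examines existing data on individuals, while a prospective study follows individuals forward in time and collects data as it goes.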

## Confounding and the Limits of Conclusions

A **[confounding variable](/ap-stats/key-terms/confounding-variable "fv-autolink")** offers an alternative explanation for an observed relationship. It is tied to both the [explanatory variable](/ap-stats/key-terms/explanatory-variable "fv-autolink") and the response variable, which makes the relationship hard to interpret cleanly.

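This matters most in observational studies, where the researcher does not control which individuals end up in which group. If students who sleep more also tend to have lighter course loads, course load may be a confounding variable when you study sleep and quiz scores: the higher scores could reflect the lighter load rather than the extra sleep.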
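
A short simulation can make the sleep and course load example concrete. Everything here is an assumption made for illustration: the data are simulated, the numbers (means of 8 and 6 hours of sleep, scores of 85 and 70) are invented, and sleep is deliberately given no direct effect on quiz score.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed only so the sketch is repeatable

# Simulated (made-up) students. Course load drives BOTH sleep and quiz score;
# sleep has no direct effect on the score in this simulation.
students = []
for i in range(2000):
    light_load = (i % 2 == 0)  # alternate so half the students have a light load
    sleep = rng.gauss(8.0 if light_load else 6.0, 0.5)
    score = rng.gauss(85.0 if light_load else 70.0, 5.0)
    students.append({"light_load": light_load, "sleep": sleep, "score": score})

def score_gap(group, cutoff):
    """Mean score of students sleeping at least `cutoff` hours, minus the rest."""
    more = [s["score"] for s in group if s["sleep"] >= cutoff]
    less = [s["score"] for s in group if s["sleep"] < cutoff]
    return statistics.mean(more) - statistics.mean(less)

# Ignoring course load, longer sleepers appear to score much higher.
overall_gap = score_gap(students, 7.0)

# Holding course load fixed (light-load students only), the gap nearly vanishes.
light_only = [s for s in students if s["light_load"]]
within_light_gap = score_gap(light_only, 8.0)

print("gap ignoring course load:", round(overall_gap, 2))
print("gap within light-load students:", round(within_light_gap, 2))
```

The large overall gap comes entirely from course load in this simulation, which is why an observational association between sleep and scores cannot be read as cause and effect.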

## Generalization and Cause-and-Effect

AP Statistics keeps two ideas separate:

- **Generalization** asks whether conclusions can apply to a population.
- **Cause and effect** asks whether the explanatory variable actually caused changes in the response.

To [generalize](/ap-stats/unit-3/inference-experiments/study-guide/ijQtfZ5uUJiFJtYjB74v "fv-autolink") to a population, the sample or experimental units should be randomly selected from that population.

To support cause and effect, the study needs random assignment of treatments in a well-designed experiment.

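Random selection is not the same as random assignment. Random selection determines which individuals are in the study. Random assignment determines which treatment each experimental unit receives. A study can use one, both, or neither.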
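
The two kinds of randomness can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical sampling frame of 500 student labels and two groups of 20. The names `population`, `treatment_group`, and `control_group` are invented for the example.

```python
import random

rng = random.Random(1)  # fixed seed only so the sketch is repeatable

# Hypothetical sampling frame of 500 students.
population = [f"student_{i}" for i in range(500)]

# Random selection: chance decides WHO is in the study.
sample = rng.sample(population, 40)

# Random assignment: chance decides WHICH TREATMENT each selected unit gets.
shuffled = sample[:]          # copy so the original sample order is untouched
rng.shuffle(shuffled)
treatment_group = shuffled[:20]
control_group = shuffled[20:]

print("in study:", len(sample))
print("treatment:", len(treatment_group), "control:", len(control_group))
```

The first step supports generalizing to the 500 students. The second step is what would support a cause-and-effect conclusion about the treatment.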

## How to Use This on the AP Statistics Exam

### MCQ

- Ask first: did the researcher assign treatments or just observe?
- Check whether the study used a sample or a census.
- If the question asks about conclusions, separate "can generalize" from "can claim cause."

### Free Response

- State clearly whether the study is observational or experimental.
- Name the exact feature that supports the conclusion: random selection for generalization, random assignment for cause-and-effect.
- Use the study's real variables in your explanation rather than generic labels.
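
The pairing of design feature and conclusion can be written as a small decision helper. This is a sketch of the reasoning, not an official scoring rule: the function name `scope_of_inference` and the wording of the returned phrases are assumptions made for illustration.

```python
def scope_of_inference(random_selection: bool, random_assignment: bool) -> str:
    """Describe which conclusions a study design supports."""
    # Random selection is the feature that supports generalization.
    generalize = (
        "can generalize to the population sampled"
        if random_selection
        else "cannot safely generalize beyond the units studied"
    )
    # Random assignment is the feature that supports cause and effect.
    cause = (
        "can support a cause-and-effect conclusion"
        if random_assignment
        else "can show association only, not causation"
    )
    return f"{generalize}; {cause}"

# A randomly selected survey with no treatments imposed:
print(scope_of_inference(random_selection=True, random_assignment=False))

# An experiment on volunteers with randomly assigned treatments:
print(scope_of_inference(random_selection=False, random_assignment=True))
```

In a written response, replace the generic phrases with the study's actual population, treatments, and response variable.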

### Common Trap

Mixing up random selection and random assignment is one of the most common AP Statistics mistakes. Before writing a conclusion, check which of the two the study description actually mentions.

## Common Misconceptions

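- **A large sample automatically supports causation.** It does not. A larger sample reduces sampling variability, but only random assignment of treatments supports a cause-and-effect conclusion.
- **An observational study can prove causation if the association is strong.** It cannot. Even a strong association can be explained by a confounding variable.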
- **A census is always the best choice.** It may be impractical or unnecessary.
- **Random selection and random assignment are interchangeable.** They are not.

## Related AP Statistics Guides

- [Unit 1 Overview: Exploring One-Variable Data and Collecting Data](/ap-stats/unit-1/review/study-guide/UILH3nCm0GydbWfK8VfE)
- [1.11 Random Sampling](/ap-stats/unit-1/random-sampling-data-collection/study-guide/nQz8XwRMmIKKBS59qrew)
- [1.12 Potential Problems with Sampling](/ap-stats/unit-1/potential-problems-with-sampling/study-guide/nndgaR2dJGCIQs2UsTYa)
- [1.13 Experimental Design](/ap-stats/unit-1/experimental-design/study-guide/gsdVWumN3cEYmXOIVv95)

## Vocabulary

- **approximately normally distributed**: A description of data sets that closely follow the pattern of a normal distribution with a mound-shaped, symmetric curve.
- **empirical rule**: A rule stating that for a normal distribution, approximately 68% of observations fall within 1 standard deviation of the mean, 95% within 2 standard deviations, and 99.7% within 3 standard deviations.
- **mean**: The average value of a dataset, represented by μ in the context of a population.
- **normal curve**: The bell-shaped graph of a normal distribution that is symmetric and mound-shaped.
- **normal distribution**: A probability distribution that is mound-shaped and symmetric, characterized by a population mean (μ) and population standard deviation (σ).
- **normally distributed random variable**: A random variable that follows a normal distribution, allowing for the calculation of probabilities for specific intervals.
- **parameter**: A numerical summary that describes a characteristic of an entire population.
- **percentile**: A value such that p% of the data is less than or equal to it, used to describe the position of a data point within a distribution.
- **population mean**: The average of all values in an entire population, denoted as μ.
- **population means**: The average values of two distinct populations being compared, denoted as μ₁ and μ₂.
- **population standard deviation**: A measure of the spread or dispersion of all values in a population, denoted by σ, which is a parameter of the normal distribution.
- **proportion**: A part or share of a whole, expressed as a fraction, decimal, or percentage.
- **relative position**: The location of a data point within a data set, often expressed in comparison to other values or as a measure of how it ranks relative to the distribution.
- **standard deviation**: A measure of how spread out data values are from the mean, represented by σ in the context of a population.
- **standard normal table**: A reference table that provides the cumulative probabilities (areas under the curve) for the standard normal distribution.
- **z-score**: A standardized score calculated as (xᵢ - μ)/σ that measures how many standard deviations a data value is from the mean.

## FAQs

### What is the normal distribution in AP Stats?

The normal distribution is a symmetric, mound-shaped model used to represent some population distributions. It is described by a population mean and population standard deviation.

### What does N(mu, sigma) mean?

N(mu, sigma) means a normal distribution with population mean mu and population standard deviation sigma. The standard normal distribution is N(0, 1).

### How do z-scores work in a normal distribution?

A z-score tells how many standard deviations a value is above or below the mean. Positive z-scores are above the mean, and negative z-scores are below it.

### What is the empirical rule in AP Statistics?

For a normal distribution, about 68% of values are within 1 standard deviation of the mean, 95% within 2, and 99.7% within 3.

### When should you not use a normal model?

Do not use a normal model when the data are clearly skewed, multimodal, or not approximately symmetric and mound-shaped.

### How does Topic 1.10 show up on the AP Stats exam?

Questions may ask you to identify whether a study is observational or experimental, decide whether a census or a sample was used, explain whether conclusions can be generalized to a population, and distinguish association from causation.

## Structured Data

```json
{"@context":"https://schema.org","@type":"FAQPage","inLanguage":"en","mainEntity":[{"@type":"Question","@id":"https://fiveable.me/ap-stats/unit-1/investigative-question-revisited-data-collection/study-guide/f842Kr6YNnYX4G0dtAC8#what-is-the-normal-distribution-in-ap-stats","name":"What is the normal distribution in AP Stats?","acceptedAnswer":{"@type":"Answer","text":"The normal distribution is a symmetric, mound-shaped model used to represent some population distributions. It is described by a population mean and population standard deviation."}},{"@type":"Question","@id":"https://fiveable.me/ap-stats/unit-1/investigative-question-revisited-data-collection/study-guide/f842Kr6YNnYX4G0dtAC8#what-does-nmu-sigma-mean","name":"What does N(mu, sigma) mean?","acceptedAnswer":{"@type":"Answer","text":"N(mu, sigma) means a normal distribution with population mean mu and population standard deviation sigma. The standard normal distribution is N(0, 1)."}},{"@type":"Question","@id":"https://fiveable.me/ap-stats/unit-1/investigative-question-revisited-data-collection/study-guide/f842Kr6YNnYX4G0dtAC8#how-do-z-scores-work-in-a-normal-distribution","name":"How do z-scores work in a normal distribution?","acceptedAnswer":{"@type":"Answer","text":"A z-score tells how many standard deviations a value is above or below the mean. Positive z-scores are above the mean, and negative z-scores are below it."}},{"@type":"Question","@id":"https://fiveable.me/ap-stats/unit-1/investigative-question-revisited-data-collection/study-guide/f842Kr6YNnYX4G0dtAC8#what-is-the-empirical-rule-in-ap-statistics","name":"What is the empirical rule in AP Statistics?","acceptedAnswer":{"@type":"Answer","text":"For a normal distribution, about 68% of values are within 1 standard deviation of the mean, 95% within 2, and 99.7% within 3."}},{"@type":"Question","@id":"https://fiveable.me/ap-stats/unit-1/investigative-question-revisited-data-collection/study-guide/f842Kr6YNnYX4G0dtAC8#when-should-you-not-use-a-normal-model","name":"When should you not use a normal model?","acceptedAnswer":{"@type":"Answer","text":"Do not use a normal model when the data are clearly skewed, multimodal, or not approximately symmetric and mound-shaped."}},{"@type":"Question","@id":"https://fiveable.me/ap-stats/unit-1/investigative-question-revisited-data-collection/study-guide/f842Kr6YNnYX4G0dtAC8#how-does-topic-110-show-up-on-the-ap-stats-exam","name":"How does Topic 1.10 show up on the AP Stats exam?","acceptedAnswer":{"@type":"Answer","text":"Questions may ask you to identify whether a study is observational or experimental, decide whether a census or a sample was used, explain whether conclusions can be generalized to a population, and distinguish association from causation."}}]}
```