---
title: "AP Statistics Collect Data: Statistical Practice Guide"
description: "Learn AP Statistics Collect Data: identify information, justify ethical data methods, choose inference procedures, and write null and alternative hypotheses."
canonical: "https://fiveable.me/ap-stats/statistical-practices/collect-data/study-guide/EwIR3UxX2LdHZEh4wi1k"
type: "study-guide"
subject: "AP Statistics"
unit: "Statistical Practices"
lastUpdated: "2026-06-18"
---

# AP Statistics Collect Data: Statistical Practice Guide

## Summary

Learn AP Statistics Collect Data: identify information, justify ethical data methods, choose inference procedures, and write null and alternative hypotheses.

## Overview

[AP Statistics](/ap-stats "fv-autolink") Collect Data is the statistical practice where you figure out what information you need, choose a sound and ethical way to gather it, and pick the right inference method to answer a question. In short, you decide how the data should come into existence and how you will eventually test or estimate something with it.

This practice shows up everywhere on the exam, not just in the data collection topics. Anytime a question asks you to set up a study, pick a procedure, name the hypotheses, or identify a Type I or [Type II error](/ap-stats/key-terms/type-ii-error "fv-autolink"), you are using Collect Data skills.

## What Collect Data Means

This practice covers the planning side of [statistics](/ap-stats/key-terms/statistic "fv-autolink"). Before you calculate or interpret anything, you make decisions that determine whether your conclusions will be valid.

The description of this practice sums it up: identify and justify methods for [collecting data](/ap-stats/unit-3 "fv-autolink") and conducting [statistical inference](/ap-stats/key-terms/statistical-inference "fv-autolink"). That splits into two connected jobs:

- Plan how data are gathered so the results are trustworthy and ethical.
- Plan the inference framework, including which procedure fits and what you are testing.

These two jobs are linked, because how you collect data determines which conclusions you can draw later. Random sampling lets you generalize to the [population](/ap-stats/key-terms/population "fv-autolink") the sample came from. A [randomized experiment](/ap-stats/key-terms/randomized-experiment "fv-autolink"), in which treatments are randomly assigned, lets you claim a treatment caused an effect.
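As a rough illustration of the two kinds of randomness, the Python sketch below first draws a simple random sample and then randomly assigns the sampled units to groups. The roster, sample size, seeds, and function names are invented for illustration and are not part of the AP framework.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Pick n units without replacement; every group of n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def randomly_assign(units, seed=None):
    """Shuffle the units, then split them into treatment and control groups."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Hypothetical population of 200 students.
students = [f"student_{i}" for i in range(1, 201)]

# Random selection supports generalizing to all 200 students.
sample = simple_random_sample(students, 40, seed=1)

# Random assignment supports a cause and effect conclusion.
groups = randomly_assign(sample, seed=2)
print(len(groups["treatment"]), len(groups["control"]))
```

A study can use either step, both, or neither, and each combination supports a different scope of conclusion.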

## What This Practice Requires

You do not crunch numbers in this practice. Instead, you make and defend choices. A strong response usually does some of the following:

- States exactly what data or information answers the question.
- Names a sampling or experimental method and explains why it fits.
- Flags ethical concerns and [bias](/ap-stats/key-terms/bias "fv-autolink") risks.
- Selects a specific inference procedure that matches the data type and question.
- Writes correct [null and alternative hypotheses](/ap-stats/key-terms/null-and-alternative-hypotheses "fv-autolink") in context.
- Identifies error types and how the pieces of a test relate.

Practical tip: always reread the prompt and underline the task before answering. Collecting Data questions in particular reward stating a clear yes or no decision first and then justifying it.

## Subskills You Need

### 2.A: Identify information to answer a question or solve a problem

Decide what [variables](/ap-stats/unit-1/language-variation-variables/study-guide/nKpeaxi1H3Ht9aFhTHKt "fv-autolink"), measurements, or counts you actually need. If a question asks whether a new study method raises test scores, the information you need is test scores for students who did and did not use the method.

- Match the data to the question, not to what is easy to grab.
- Identify the population, the variable of interest, and the [parameter](/ap-stats/key-terms/parameter "fv-autolink") being estimated or tested.

### 2.B: Justify an appropriate method for ethically gathering and representing data

Choose and defend a data collection design, and watch for ethics and bias.

- For surveys, use random sampling methods like simple random, stratified, or [cluster sampling](/ap-stats/key-terms/cluster-sampling "fv-autolink") to reduce bias.
- For cause and effect, use an experiment with a comparison (control) group, random assignment of treatments, and [replication](/ap-stats/key-terms/replication "fv-autolink").
- Ethical concerns include confidentiality, informed consent, and not misrepresenting results.
- Representing data ethically means using graphs and summaries that do not distort the truth.
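To make the difference between stratified and cluster sampling concrete, here is a minimal Python sketch. The roster layout (student id, grade level, homeroom), the sample sizes, and the function names are hypothetical choices for illustration only.

```python
import random

def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Take a simple random sample of fixed size from EVERY stratum."""
    rng = random.Random(seed)
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    chosen = []
    for label in sorted(strata):
        chosen.extend(rng.sample(strata[label], n_per_stratum))
    return chosen

def cluster_sample(units, cluster_of, n_clusters, seed=None):
    """Randomly pick whole clusters, then keep EVERYONE in the picked clusters."""
    rng = random.Random(seed)
    clusters = {}
    for unit in units:
        clusters.setdefault(cluster_of(unit), []).append(unit)
    picked = rng.sample(sorted(clusters), n_clusters)
    return [unit for label in picked for unit in clusters[label]]

# Hypothetical roster: (student_id, grade_level, homeroom).
roster = [(i, 9 + i % 4, f"room_{i % 10}") for i in range(200)]

by_grade = stratified_sample(roster, lambda s: s[1], 5, seed=3)  # 5 per grade
by_room = cluster_sample(roster, lambda s: s[2], 2, seed=3)      # 2 whole rooms
print(len(by_grade), len(by_room))
```

Stratifying guarantees every grade is represented, while clustering trades some of that balance for convenience because only a few homerooms have to be visited.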

### 2.C: Identify appropriate statistical inference methods

Pick the procedure that matches the situation. You are not running it here, just naming it correctly.

- [Categorical variable](/ap-stats/key-terms/categorical-variable "fv-autolink"), one group: confidence interval or significance test for a [proportion](/ap-stats/unit-1/representing-categorical-variable-with-tables/study-guide/JUZVd7cRAnbarZyNoEAg "fv-autolink").
- Two groups, categorical: procedures for the difference of two proportions.
- [Quantitative variable](/ap-stats/key-terms/quantitative-variable "fv-autolink"), one or two groups: t procedures for a mean or a difference of means.
- More than two categories or two-way tables: chi-square procedures.
- Relationship between two quantitative variables: inference for the slope of a regression line.

Ask two questions: Is the variable categorical or quantitative? Am I estimating with an [interval](/ap-stats/unit-1/representing-quantitative-variable-with-graphs/study-guide/VWtyLVDvjzEgtbAi6v6j "fv-autolink") or testing a claim?
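One way to turn the list above into a decision chart is a small lookup table. The Python sketch below is only an illustration: the function name, the labels for each situation, and the exact wording of the returned procedure names are assumptions, not official AP terminology.

```python
def name_procedure(variable, structure, goal):
    """Return a procedure name for a data situation.

    variable:  "categorical" or "quantitative"
    structure: "one group", "two groups", "many categories", or "two variables"
    goal:      "estimate" (interval) or "test" (significance test)
    """
    if goal not in ("estimate", "test"):
        raise ValueError("goal must be 'estimate' or 'test'")
    kind = "confidence interval" if goal == "estimate" else "significance test"
    chart = {
        ("categorical", "one group"):
            f"one-sample z {kind} for a proportion",
        ("categorical", "two groups"):
            f"two-sample z {kind} for a difference of two proportions",
        ("quantitative", "one group"):
            f"one-sample t {kind} for a mean",
        ("quantitative", "two groups"):
            f"two-sample t {kind} for a difference of two means",
        # Chi-square procedures are tests only, so the goal is ignored here.
        ("categorical", "many categories"):
            "chi-square test",
        ("quantitative", "two variables"):
            f"t {kind} for the slope of a regression line",
    }
    try:
        return chart[(variable, structure)]
    except KeyError:
        raise ValueError(f"no procedure listed for {variable!r}, {structure!r}")

print(name_procedure("categorical", "one group", "test"))
print(name_procedure("quantitative", "two groups", "estimate"))
```

The two arguments `variable` and `goal` mirror the two questions above; `structure` captures how many groups or variables are involved.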

### 2.D: Identify types of errors and relationships among components in statistical inference methods

Understand how the parts of a test connect.

- Type I error: rejecting a true null hypothesis. Its [probability](/ap-stats/unit-4/intro-probability/study-guide/gfnBWfyMANOxF3vWLrbA "fv-autolink") is the significance level, often called alpha.
- Type II error: failing to reject a false null hypothesis.
- Power is the probability of correctly rejecting a false null hypothesis.
- Lowering alpha reduces Type I error risk but, with the sample size held fixed, tends to raise Type II error risk.
- Larger sample sizes generally increase power.
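The relationships among alpha, Type II error, power, and sample size can be checked numerically. The Python sketch below assumes a one-sided test of a proportion (null value `p0`, alternative "greater than") and uses the normal approximation; the specific values 0.5, 0.6, 100, and 400 are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided_prop_test(p0, p_true, n, alpha):
    """Approximate power of the test H0: p = p0 vs Ha: p > p0.

    Uses the normal approximation to the sampling distribution of p-hat.
    """
    z = NormalDist()  # standard normal
    # Reject H0 when the sample proportion exceeds this cutoff.
    cutoff = p0 + z.inv_cdf(1 - alpha) * sqrt(p0 * (1 - p0) / n)
    # Spread of p-hat when the true proportion is p_true.
    se_true = sqrt(p_true * (1 - p_true) / n)
    return 1 - z.cdf((cutoff - p_true) / se_true)

base = power_one_sided_prop_test(0.5, 0.6, n=100, alpha=0.05)
stricter = power_one_sided_prop_test(0.5, 0.6, n=100, alpha=0.01)
bigger = power_one_sided_prop_test(0.5, 0.6, n=400, alpha=0.05)

for label, power in [("alpha=0.05, n=100", base),
                     ("alpha=0.01, n=100", stricter),
                     ("alpha=0.05, n=400", bigger)]:
    # P(Type II error) is 1 minus power.
    print(f"{label}: power={power:.3f}, P(Type II)={1 - power:.3f}")
```

Under these assumptions, the run with the smaller alpha should show lower power (higher Type II risk), and the run with the larger sample should show higher power.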

### 2.E: Identify the null and alternative hypotheses

Write what you are testing in clear terms.

- The null hypothesis usually states no effect, no difference, or a specific parameter value.
- The alternative states the effect, difference, or direction you suspect.
- Hypotheses are always about population parameters, not sample statistics.

Example in words: null says the true proportion equals 0.5; alternative says the true proportion is greater than 0.5.
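In symbols, with $p$ standing for the true population proportion, the same pair of hypotheses can be written as:

$$
H_0: p = 0.5 \qquad H_a: p > 0.5
$$

Note that the symbol is the parameter $p$, not the sample statistic $\hat{p}$.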

## How It Shows Up on the AP Exam

Every subskill in this practice can appear on both multiple-choice and free-response questions.

- Multiple choice may ask you to pick the correct sampling design, the right inference procedure, the correct hypotheses, or to identify a Type I versus Type II error.
- Free response often asks you to justify a design choice, name and set up a procedure, write hypotheses, or describe an error and its consequence in context.

The exam has 40 multiple-choice questions and 6 free-response questions, including an investigative task. Collect Data skills support many of these, especially the study design topics and the setup steps of every inference question.

Practical tip: when a free-response question says justify or explain, you need a reason tied to the context, not just a label.

## Examples Across the Course

These examples show how Collect Data appears in different parts of the course.

1. **Study design (Collecting Data).** A school wants to know if a tutoring program improves grades. You identify the needed data, propose a randomized experiment with a treatment and control group, and explain that randomization supports a cause and effect conclusion (2.A, 2.B).

2. **Inference for proportions.** A city claims more than 60 percent of residents support a new park. You choose a one-sample test for a proportion and write the hypotheses: null that the true proportion equals 0.60, alternative that it is greater than 0.60 (2.C, 2.E).

3. **Inference for means.** Two factories produce bolts and you want to compare average lengths. You identify quantitative data from each factory and select a two-sample t procedure for the [difference of two means](/ap-stats/unit-5/sampling-distributions-for-differences-sample-means/study-guide/hPyIdIhuKKF731eU2qOT "fv-autolink") (2.A, 2.C).

4. **Chi-square setting.** A researcher checks whether favorite music genre is associated with age group using a two-way table. You identify a chi-square test for independence as the appropriate method (2.C).

5. **Errors in context.** A drug trial tests the null hypothesis that the drug has no effect. You explain that a Type I error means concluding the drug works when it does not, and connect a smaller alpha to a lower [chance](/ap-stats/unit-3/do-data-we-collected-tell-truth/study-guide/e1IBCDyqgTmEE88ZrUcY "fv-autolink") of that mistake (2.D).
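For reference, the hypotheses behind examples 2 and 3 might be written in symbols as follows. Here $p$ is the true proportion of residents who support the park, and $\mu_1$ and $\mu_2$ are the true mean bolt lengths at the two factories. The two-sided alternative for the bolts is an assumption, since the example only says the averages are being compared.

$$
\begin{aligned}
\text{Park:}\quad & H_0: p = 0.60, \quad H_a: p > 0.60 \\
\text{Bolts:}\quad & H_0: \mu_1 - \mu_2 = 0, \quad H_a: \mu_1 - \mu_2 \neq 0
\end{aligned}
$$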

## How to Practice Collect Data

- Sort practice problems by variable type first. Categorical or quantitative drives the procedure choice.
- For every study you read, ask whether random sampling, random assignment (a randomized experiment), both, or neither was used, then state what that allows you to conclude.
- Write hypotheses in symbols and in plain words until both come naturally.
- Build a quick decision chart linking data scenarios to inference procedures.
- After picking an answer, force yourself to give one sentence of justification. The exam rewards reasons.

## Common Mistakes

- Writing hypotheses about a sample statistic instead of a population parameter.
- Choosing an inference procedure that does not match the variable type, such as using a mean procedure for categorical data.
- Confusing Type I and Type II errors. Type I is a false alarm; Type II is a miss.
- Claiming cause and effect from an observational study that lacked randomized treatment assignment.
- Generalizing to a whole population when the sample was not randomly selected.
- Labeling a design without justifying why it reduces bias or supports the conclusion.

## Quick Review

- Collect Data is the planning practice: decide what information you need, how to gather it ethically, and how to test or estimate.
- 2.A: identify the needed data, population, and parameter.
- 2.B: justify an ethical, low-bias collection method.
- 2.C: match the right inference procedure to the data and question.
- 2.D: connect alpha, Type I and Type II errors, and power.
- 2.E: write null and alternative hypotheses about parameters in context.
- Random sampling supports generalizing. Randomized experiments support cause and effect.
- When in doubt, name the task, choose the procedure, and back it up with a reason.
