Overview
Analyze Data is Skill Category 3 in AP Statistics. It is the part of statistical work where you build representations of data and produce numerical outputs: you take raw data, a model, or a problem setup and turn it into graphs, tables, summary statistics, probabilities, distribution parameters, and inference results.
In short: Analyze Data is the "do the math and make the picture" skill. Earlier skills set up the question and plan the data. Later skills interpret what your numbers mean. This skill is where the calculations actually happen.
What Analyze Data Means
This skill is about construction and calculation, not interpretation. You are answering questions like:
- What does this distribution look like when I graph it?
- What are the mean, median, standard deviation, and quartiles?
- What is the probability or expected count here?
- What are the parameters of this probability distribution?
- What does this confidence interval or test statistic come out to?
The grouping description says it directly: construct representations of data and distributions, and calculate numerical statistical outputs. You produce the outputs. Describing and justifying them belongs to the Interpret Results skill.
What This Practice Requires
To do well here you need to:
- Choose the right type of graph or table for the variable type (categorical vs quantitative, one variable vs two).
- Run calculations accurately, often using a graphing calculator with statistical capabilities.
- Track which formula or distribution applies to the situation.
- Report results with correct values, and on free response, show enough work to support the result.
Formulas and tables are provided on the AP Exam, so you do not have to memorize every formula. You do need to know which one to use and how to apply it.
Subskills You Need
3.A: Construct tabular and graphical representations of data and distributions. Build the displays that show data. Examples include frequency and relative frequency tables, two-way tables, bar graphs, dotplots, histograms, stemplots, boxplots, and scatterplots. You may also shade a region under a normal curve to represent a probability.
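As a rough illustration of 3.A, here is a minimal Python sketch that builds a frequency and relative frequency table for one categorical variable. The function name `frequency_table` and the survey responses are made up for this example; on the exam you would build the same table by hand.

```python
from collections import Counter

def frequency_table(observations):
    """Return (category, count, relative frequency) rows, sorted by category."""
    counts = Counter(observations)
    total = len(observations)
    return [(category, count, count / total)
            for category, count in sorted(counts.items())]

# Hypothetical survey responses, invented for illustration only.
responses = ["bus", "car", "bus", "walk", "bus", "car", "walk", "bus"]
table = frequency_table(responses)

# Print one row per category: name, count, relative frequency.
for category, count, rel_freq in table:
    print(f"{category:<6}{count:>4}{rel_freq:>8.3f}")
```

The relative frequencies should always add to 1, which is a quick check that no category was dropped.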
3.B: Calculate summary statistics, relative positions, and predicted responses. This covers mean, median, standard deviation, range, IQR, and the five-number summary. Relative position includes percentiles and z-scores. Predicted responses come from a regression line, such as plugging an x-value into ŷ = a + bx.
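The relative-position calculations in 3.B can be sketched in a few lines of Python. The data set is hypothetical, the helper names are illustrative, and the percentile here uses one common convention (percent of values at or below the given value); your class or textbook may define it slightly differently.

```python
from statistics import mean, stdev

def z_score(value, center, spread):
    """Number of standard deviations that value lies above the center."""
    return (value - center) / spread

def percentile_rank(value, data):
    """Percent of data values at or below value (one common convention)."""
    at_or_below = sum(1 for x in data if x <= value)
    return 100 * at_or_below / len(data)

# Hypothetical quiz scores, invented for illustration.
scores = [70, 75, 80, 85, 90]
x_bar = mean(scores)   # sample mean
s = stdev(scores)      # sample standard deviation (divides by n - 1)

z = z_score(90, x_bar, s)           # roughly 1.26
pct = percentile_rank(85, scores)   # 4 of 5 values are at or below 85
```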
3.C: Calculate and estimate expected counts, percentages, probabilities, and intervals. This includes probabilities from simulations and from probability rules, expected counts in two-way tables, percentages from distributions, and confidence intervals.
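To see how a simulation estimate compares with a probability-rule answer, here is a small Python sketch. The scenario (at least one 6 in four rolls of a fair die), the function name, and the seed are all assumptions chosen for illustration.

```python
import random

def estimate_probability(trials, seed=1):
    """Estimate P(at least one 6 in four rolls of a fair die) by simulation."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        rolls = [rng.randint(1, 6) for _ in range(4)]
        if 6 in rolls:
            hits += 1
    return hits / trials

estimate = estimate_probability(20_000)

# Probability-rule answer via the complement: 1 - P(no 6 in four rolls).
exact = 1 - (5 / 6) ** 4   # about 0.518
```

With many trials the simulated proportion should land close to the rule-based value, but it will rarely match it exactly.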
3.D: Calculate means, standard deviations, and parameters for probability distributions. This is the parameter side of probability models: the mean and standard deviation of a discrete random variable, and parameters for the binomial and geometric distributions. It also covers combining random variables.
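The 3.D calculations can be written out as a short Python sketch. The probability table and parameter values below are hypothetical, and the geometric formulas assume the random variable counts the trial on which the first success occurs. The rule for combining standard deviations assumes the two random variables are independent.

```python
from math import sqrt

def rv_mean(values, probs):
    """Mean of a discrete random variable: sum of x * P(x)."""
    return sum(x * p for x, p in zip(values, probs))

def rv_sd(values, probs):
    """Standard deviation: square root of the sum of (x - mu)^2 * P(x)."""
    mu = rv_mean(values, probs)
    return sqrt(sum((x - mu) ** 2 * p for x, p in zip(values, probs)))

# Hypothetical probability distribution table.
values = [0, 1, 2, 3]
probs = [0.1, 0.3, 0.4, 0.2]
mu_x = rv_mean(values, probs)   # about 1.7
sd_x = rv_sd(values, probs)     # about 0.9

# Binomial with n = 20, p = 0.3: mean np, SD sqrt(np(1 - p)).
n, p = 20, 0.3
binom_mean = n * p
binom_sd = sqrt(n * p * (1 - p))

# Geometric with p = 0.25: mean 1/p, SD sqrt(1 - p) / p.
geom_p = 0.25
geom_mean = 1 / geom_p
geom_sd = sqrt(1 - geom_p) / geom_p

# SD of a sum of two INDEPENDENT random variables: variances add.
sd_sum = sqrt(3 ** 2 + 4 ** 2)   # SDs of 3 and 4 combine to 5
```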
3.E: Calculate appropriate statistical inference method results. This is computing the actual inference output: test statistics, p-values, and confidence interval bounds for proportions, means, chi-square procedures, and slopes.
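As one example of a 3.E calculation, the sketch below computes the test statistic and two-sided p-value for a one-sample z test for a proportion. The counts and the null value are hypothetical, and the function name is illustrative; a calculator's built-in test should report matching values.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z(successes, n, p0):
    """Return (z, two-sided p-value) for H0: p = p0."""
    p_hat = successes / n
    # The standard error uses the null value p0, not p_hat, for a test.
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical sample: 60 successes in 100 trials, testing p = 0.5.
z, p_value = one_proportion_z(successes=60, n=100, p0=0.5)
```

Here z comes out to about 2.0 and the two-sided p-value to about 0.046.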
How It Shows Up on the AP Exam
All five subskills appear in both multiple-choice and free-response questions.
- Multiple choice: You might calculate a z-score, read or compute a value from a table, find an expected count, or pick the correct test statistic value.
- Free response: You often construct a graph or table, then compute statistics, a probability, or an inference result. Showing the calculation setup matters, not just the final number.
The exam gives you formulas and tables, and you are expected to bring an approved graphing calculator with statistical capabilities. Practical tip: know your calculator's stat functions well enough that you can produce values quickly and check them against the provided formulas.
Examples Across the Course
These examples come from different units so you can see how broadly this skill spirals.
Exploring One-Variable Data: Given a data set of quiz scores, construct a boxplot (3.A) and calculate the five-number summary and IQR (3.B). Then find the z-score for a single student's score using the mean and standard deviation of the data set (3.B).
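A five-number summary like the one in this example could be computed as in the sketch below. It assumes the median-of-halves convention for quartiles (the median is left out of both halves when n is odd), which is the approach many graphing calculators use; other software may give slightly different quartiles. The quiz scores are invented.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using median-of-halves quartiles."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # When n is odd, skip the middle value so it is in neither half.
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Hypothetical quiz scores, invented for illustration.
scores = [12, 15, 15, 17, 18, 19, 20, 20, 22, 25]
minimum, q1, med, q3, maximum = five_number_summary(scores)
iqr = q3 - q1
```

These five values are exactly what you need to draw the box and whiskers of a basic boxplot.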
Exploring Two-Variable Data (Regression Analysis): From a scatterplot and a least-squares regression line, calculate a predicted response for a given explanatory value (3.B), or compute a residual using the formula residual = observed minus predicted.
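Here is a minimal sketch of the predicted-response and residual calculations. The regression coefficients and the observed point are hypothetical, and the function names are illustrative.

```python
def predict(a, b, x):
    """Predicted response from the line y-hat = a + b*x."""
    return a + b * x

def residual(observed_y, predicted_y):
    """Residual = observed minus predicted."""
    return observed_y - predicted_y

# Hypothetical least-squares line: y-hat = 3.0 + 2.0x.
y_hat = predict(a=3.0, b=2.0, x=4)          # 11.0
res = residual(observed_y=12.5, predicted_y=y_hat)   # 1.5
```

A positive residual means the observed value sits above the line, so the line underpredicted that point.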
Probability, Random Variables, and Probability Distributions: Given a discrete probability distribution table, calculate the expected value (mean) and standard deviation of the random variable (3.D). For a binomial setting, compute the probability of a specific number of successes (3.C).
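For the binomial part of this example, the sketch below mirrors what a calculator's binompdf and binomcdf functions compute. The setting (n = 10, p = 0.5) is hypothetical, and the Python function names are this sketch's own.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with parameters n and p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binomial_cdf(k, n, p):
    """P(X <= k): add the individual probabilities from 0 through k."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))

# Hypothetical setting: 10 trials, success probability 0.5.
exactly_three = binomial_pmf(3, 10, 0.5)   # 120 / 1024
at_most_three = binomial_cdf(3, 10, 0.5)   # 176 / 1024
```

Keeping "exactly k" and "at most k" as separate functions helps avoid computing the wrong one of the two.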
Inference for Categorical Data: In a two-way table, calculate the expected counts under the assumption of no association (3.C), then compute the chi-square test statistic (3.E).
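The two steps in this example can be sketched as follows: each expected count is (row total × column total) / grand total, and the chi-square statistic sums (observed − expected)² / expected over all cells. The two-way table below is made up for illustration.

```python
def expected_counts(observed):
    """Expected counts under no association: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over every cell."""
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))

# Hypothetical 2x2 table of observed counts.
observed = [[30, 20],
            [20, 30]]
expected = expected_counts(observed)               # every cell is 25.0
chi_sq = chi_square_statistic(observed, expected)  # 4.0
```

The p-value would then come from the chi-square distribution with (rows − 1)(columns − 1) degrees of freedom, using a calculator or the provided table.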
Inference for Quantitative Data: Means: Construct a confidence interval for a population mean and calculate a t test statistic and its p-value (3.C and 3.E).
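A one-sample t interval and t statistic could be set up as in the sketch below. The summary statistics and the null value are hypothetical. The critical value t* = 2.064 is taken from a t table for 95% confidence with 24 degrees of freedom; it is passed in as a number because this sketch does not compute t-distribution values itself.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_star):
    """Confidence interval: x_bar plus or minus t_star * s / sqrt(n)."""
    margin = t_star * s / sqrt(n)
    return x_bar - margin, x_bar + margin

def t_statistic(x_bar, s, n, mu_0):
    """Test statistic for H0: mu = mu_0."""
    return (x_bar - mu_0) / (s / sqrt(n))

# Hypothetical sample: n = 25, mean 72, standard deviation 10.
lower, upper = t_interval(x_bar=72, s=10, n=25, t_star=2.064)
t = t_statistic(x_bar=72, s=10, n=25, mu_0=70)   # 1.0
```

The interval here is roughly (67.87, 76.13). The p-value for the t statistic would come from a calculator's t test or tcdf function with df = n − 1.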
How to Practice Analyze Data
- Sort each problem by type first. Categorical or quantitative? One variable or two? A sample or a probability model? The type tells you which display and formula to use.
- Drill calculator routines: 1-Var Stats, regression, normalcdf, binompdf/binomcdf, and the inference tests. Speed and accuracy here free up time for interpretation.
- For free response, write the setup. For example, write the test statistic formula with numbers plugged in before reporting the value.
- Match each provided formula to a scenario so the formula sheet becomes a tool, not a puzzle.
- After computing, sanity-check: Is the probability between 0 and 1? Is the standard deviation non-negative? Does the interval contain reasonable values?
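The sanity checks in the last bullet can be written as simple yes/no tests. This is only a sketch with made-up helper names; the interval check assumes an interval built symmetrically around a point estimate, so the estimate should fall strictly inside it.

```python
def is_valid_probability(p):
    """A probability must lie between 0 and 1 inclusive."""
    return 0 <= p <= 1

def is_valid_sd(sd):
    """A standard deviation is never negative."""
    return sd >= 0

def is_valid_interval(lower, upper, point_estimate):
    """The point estimate should fall between the interval's bounds."""
    return lower < point_estimate < upper

# Example checks on hypothetical results.
checks = [
    is_valid_probability(0.1172),
    is_valid_sd(2.05),
    is_valid_interval(67.872, 76.128, 72),
]
```

If any check fails, the likely culprit is a setup or keystroke error rather than an unusual data set.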
Common Mistakes
- Confusing the standard deviation of a single data set with the standard deviation of a sampling distribution or a random variable.
- Choosing the wrong display, such as a bar graph for quantitative data or a histogram for categorical data.
- Using a z formula for a mean when the population standard deviation is unknown. In that case a t procedure applies.
- Reporting only a final number on free response without showing the setup that supports it.
- Mixing up expected counts (calculated from the model) with observed counts (the actual data).
- Failing to identify which probability or interval was requested, then computing a related but different quantity.
Quick Review
- Analyze Data (Skill 3) is about building displays and calculating outputs, not interpreting them.
- 3.A: tables and graphs for data and distributions.
- 3.B: summary statistics, z-scores, percentiles, predicted responses.
- 3.C: expected counts, percentages, probabilities, intervals.
- 3.D: means, standard deviations, and parameters for probability distributions.
- 3.E: inference results like test statistics, p-values, and interval bounds.
- Identify the variable type and scenario first, then pick the right display and formula.
- Formulas and tables are provided, and an approved graphing calculator is expected. Show your setup on free response.