The sample proportion, written p̂ ("p-hat"), is the number of successes in a sample divided by the sample size n. It's the statistic you use to estimate the population proportion p, and it powers the sampling distributions in Unit 5 and the z-intervals and z-tests in Unit 6 of AP Statistics.
The sample proportion is exactly what it sounds like. Take a sample, count how many observations are "successes," and divide by the sample size. If 144 out of 400 voters support a tax proposal, p̂ = 144/400 = 0.36. The hat on p̂ is the universal stats signal for "this came from a sample, not the whole population."
What makes p̂ more than basic arithmetic is its sampling distribution. Per the AP Statistics Course and Exam Description (CED), when you take random samples of size n from a population with true proportion p, the values of p̂ center at p (so p̂ is an unbiased estimator) with standard deviation √(p(1-p)/n). If np ≥ 10 and n(1-p) ≥ 10, that distribution is also approximately normal. Those facts are the engine behind every confidence interval and significance test for proportions on the exam. The thing to internalize is that p̂ varies from sample to sample while p stays fixed. Inference is the art of using one p̂ to say something honest about p.
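To watch the center and spread show up, here is a minimal simulation sketch in Python. It assumes a true proportion of p = 0.36 and n = 400, borrowed from the voter example purely for illustration (in real life p is unknown). The function name `simulate_p_hats`, the seed, and the 2,000 repetitions are arbitrary choices, not anything from the CED.

```python
import math
import random
import statistics

def simulate_p_hats(p, n, reps, seed=0):
    """Draw `reps` random samples of size n and return each sample's p-hat."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    p_hats = []
    for _ in range(reps):
        # Each observation is a "success" with probability p.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(successes / n)
    return p_hats

p, n = 0.36, 400  # assumed true proportion and sample size
p_hats = simulate_p_hats(p, n, reps=2000)

sim_mean = statistics.mean(p_hats)      # should sit near p
sim_sd = statistics.stdev(p_hats)       # should sit near the formula value
theory_sd = math.sqrt(p * (1 - p) / n)  # sqrt(p(1-p)/n)

print(f"simulated mean of p-hat: {sim_mean:.4f}")
print(f"simulated SD of p-hat:   {sim_sd:.4f}")
print(f"theoretical SD:          {theory_sd:.4f}")
```

With these settings the simulated mean should land close to 0.36 and the simulated standard deviation close to the theoretical 0.024, though the exact digits depend on the seed.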
Sample proportion is the thread connecting Unit 5 (Sampling Distributions) to Unit 6 (Inference for Categorical Data: Proportions). In Topic 5.5, you determine the mean and standard deviation of the sampling distribution of p̂ (AP Stats 5.5.A) and check when it's approximately normal (AP Stats 5.5.B). Topic 5.6 extends this to differences, p̂₁ - p̂₂. Then Unit 6 puts p̂ to work. It's the point estimate in a one-sample z-interval (AP Stats 6.2.D, where the interval is p̂ ± z*√(p̂(1-p̂)/n)) and the evidence you compare against p₀ in a one-sample z-test (Topic 6.4). If you understand how p̂ behaves, the formulas in Unit 6 stop being memorization and start being obvious.
Population Proportion (Units 5-6)
p is the fixed, usually unknown truth about the whole population. p̂ is your sample's snapshot of it. Every inference procedure for proportions is just a structured way of using p̂ to estimate or test claims about p.
Sampling Distribution (Unit 5)
The sampling distribution of p̂ describes what you'd see if you took every possible sample of size n. Its mean is p and its standard deviation is √(p(1-p)/n), which is why bigger samples give p̂ values that cluster tighter around the truth.
Confidence Interval (Unit 6)
A one-sample z-interval is built directly on p̂. It's the center of the interval, and it's also inside the standard error formula. The whole interval is just p̂ plus or minus a few standard errors of wiggle room.
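As a rough sketch of that "p̂ plus or minus a few standard errors" idea, the Python below builds a one-sample z-interval from raw counts. The function name `one_prop_z_interval` and the 95% confidence level are assumptions made for illustration, and the data reuse the 144-out-of-400 voter example.

```python
import math
from statistics import NormalDist

def one_prop_z_interval(successes, n, confidence=0.95):
    """Return (low, high) for a one-sample z-interval for a proportion."""
    p_hat = successes / n
    # Critical value z* leaves (1 - confidence)/2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Standard error uses p-hat because p is unknown.
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * standard_error
    return p_hat - margin, p_hat + margin

low, high = one_prop_z_interval(144, 400)
print(f"95% interval: ({low:.3f}, {high:.3f})")
```

For these counts the interval runs from roughly 0.313 to 0.407, centered on p̂ = 0.36.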
1-Prop Z-Test (Unit 6)
The test statistic measures how many standard deviations your p̂ sits from the hypothesized value p₀. A p̂ that would be shockingly far from p₀ if H₀ were true is your evidence against the null.
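Here is a small Python sketch of that test statistic, z = (p̂ - p₀)/√(p₀(1-p₀)/n). The hypothesized value p₀ = 0.40 is made up for illustration, the function name `one_prop_z_test` is hypothetical, and the p-value shown is for a two-sided alternative, which is just one possible choice.

```python
import math
from statistics import NormalDist

def one_prop_z_test(successes, n, p0):
    """Return (z, two-sided p-value) for H0: p = p0."""
    p_hat = successes / n
    # The test assumes H0 is true, so the standard deviation uses p0, not p-hat.
    sd_under_h0 = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / sd_under_h0
    # Two-sided p-value: area in both tails beyond |z|.
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value

z, p_value = one_prop_z_test(144, 400, p0=0.40)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.3f}")
```

In this made-up scenario p̂ = 0.36 sits about 1.63 standard deviations below p₀, which gives a two-sided p-value of roughly 0.10.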
Multiple-choice questions hand you a p̂ (or the raw counts to compute one) and ask you to build the margin of error, construct the interval, or judge what would change it. For example, given p̂ = 0.52 from 400 voters, you'd need to recognize the margin of error as z*√(0.52(0.48)/400). Other stems test whether you know that increasing n shrinks the interval, or that the large counts condition for an interval means checking that np̂ and n(1-p̂) are both at least 10. On FRQs, inference problems for proportions expect the full four-step routine, and p̂ shows up at every step. You name it as your statistic, use it to check the large counts condition, plug it into the standard error, and interpret your result in context. The fastest way to lose points is mixing up p̂ and p in your hypotheses, since hypotheses are always about the parameter p, never the statistic p̂.
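The margin-of-error example above can be worked out in a few lines of Python. The stem leaves z* unspecified, so the 95% confidence level here is an assumption, and the variable names are illustrative only.

```python
import math
from statistics import NormalDist

p_hat, n = 0.52, 400
confidence = 0.95  # assumed; the example does not state a confidence level

# Large counts check for an interval uses the observed p-hat.
successes = n * p_hat        # about 208
failures = n * (1 - p_hat)   # about 192
counts_ok = successes >= 10 and failures >= 10

z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin_of_error = z_star * math.sqrt(p_hat * (1 - p_hat) / n)

print(f"large counts met: {counts_ok}")
print(f"margin of error:  {margin_of_error:.3f}")
```

At 95% confidence the margin of error comes out to about 0.049, or roughly 4.9 percentage points.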
p is the true proportion for the entire population, a fixed parameter you almost never know. p̂ is the proportion computed from one sample, and it changes from sample to sample. On the exam, hypotheses must be written in terms of p (H₀: p = p₀), never p̂, because there's nothing to test about p̂ when you already know its value. Also watch the formulas: the standard deviation of the sampling distribution uses p, but when p is unknown, the standard error swaps in p̂.
The sample proportion p̂ equals the number of successes divided by the sample size n, and it serves as the point estimate for the population proportion p.
The sampling distribution of p̂ has mean p and standard deviation √(p(1-p)/n), which means p̂ is unbiased and its variability shrinks as n grows.
The sampling distribution of p̂ is approximately normal when np ≥ 10 and n(1-p) ≥ 10, the large counts condition you must verify before any z-interval or z-test. Since p is unknown in practice, you check it with p̂ for an interval and with p₀ for a test.
A one-sample z-interval for a proportion is p̂ ± z*√(p̂(1-p̂)/n), so p̂ is both the center of the interval and an ingredient in the standard error.
Always write hypotheses about the parameter p, not the statistic p̂, because you already know p̂ from your data.
For two groups, the difference p̂₁ - p̂₂ has its own sampling distribution centered at p₁ - p₂, extending the same logic to comparisons.
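For the two-group case in the last takeaway, here is a brief Python sketch. The takeaway only states the center, so the standard deviation formula √(p₁(1-p₁)/n₁ + p₂(1-p₂)/n₂) is the standard result for independent random samples rather than something quoted above. The proportions, sample sizes, and the function name `diff_sampling_distribution` are all made up for illustration.

```python
import math

def diff_sampling_distribution(p1, n1, p2, n2):
    """Mean and SD of p-hat1 - p-hat2 for two independent random samples."""
    mean = p1 - p2
    # Variances add for independent samples, then take the square root.
    sd = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return mean, sd

# Hypothetical populations: p1 = 0.50, p2 = 0.40, samples of 100 each.
mean_diff, sd_diff = diff_sampling_distribution(0.50, 100, 0.40, 100)
print(f"mean = {mean_diff:.2f}, SD = {sd_diff:.2f}")
```

With these hypothetical numbers the difference centers at 0.10 with a standard deviation of 0.07.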
The sample proportion p̂ is the number of successes in a sample divided by the sample size. If 42 of 150 trees show disease, p̂ = 42/150 = 0.28. It's the statistic you use to estimate the unknown population proportion p.
p is the fixed, unknown population proportion (a parameter), while p̂ is the proportion from one sample (a statistic) that varies from sample to sample. Hypotheses are always written about p, like H₀: p = 0.5, never about p̂.
No, the sampling distribution of p̂ is not always normal. It is only approximately normal when the sample is large enough, specifically when np ≥ 10 and n(1-p) ≥ 10. With small samples or extreme proportions, the distribution is skewed and z-procedures aren't valid.
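A quick way to see the large counts condition pass or fail is a tiny checker like the Python sketch below. The function name `large_counts_ok` is hypothetical, and the second and third scenarios are invented to show a small sample and an extreme proportion.

```python
def large_counts_ok(n, p, threshold=10):
    """True when both expected successes and expected failures reach the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

print(large_counts_ok(400, 0.36))  # expected counts about 144 and 256
print(large_counts_ok(20, 0.10))   # only about 2 expected successes
print(large_counts_ok(50, 0.95))   # only about 2.5 expected failures
```

Only the first scenario passes. The other two fail because one of the expected counts falls well below 10.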
Whether the standard error uses p̂ or a hypothesized value depends on the procedure. For a confidence interval, use p̂ in the standard error √(p̂(1-p̂)/n) because p is unknown. For a one-sample z-test, use the hypothesized value p₀, since the test assumes H₀ is true.
On average, yes, a larger sample gives a p̂ closer to p. The standard deviation of p̂ is √(p(1-p)/n), so quadrupling n cuts the typical distance between p̂ and p in half. That's also why larger samples produce narrower confidence intervals.
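The "quadruple n, halve the spread" pattern is easy to confirm numerically. This Python sketch assumes p = 0.36 purely for illustration, and `sd_of_p_hat` is a hypothetical helper name.

```python
import math

def sd_of_p_hat(p, n):
    """Standard deviation of the sampling distribution of p-hat."""
    return math.sqrt(p * (1 - p) / n)

p = 0.36  # assumed true proportion, for illustration only
sds = {n: sd_of_p_hat(p, n) for n in (100, 400, 1600)}

for n, sd in sds.items():
    print(f"n = {n:4d}  SD of p-hat = {sd:.3f}")
```

Each fourfold jump in n (100 to 400 to 1600) cuts the standard deviation in half, from 0.048 to 0.024 to 0.012.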