Hypothesis Testing for a Single Mean
Hypothesis testing gives you a structured way to decide whether sample data supports a specific claim about a population parameter. For means, you'll use either z-tests or t-tests depending on what you know about the population. For proportions, you'll use z-tests when certain conditions are met.
This section covers the full process: choosing the right test, setting up hypotheses, computing test statistics, and interpreting results in context.

Hypothesis Tests for Population Mean
The first decision you need to make is which test statistic to use. That depends on two things: whether you know the population standard deviation (σ), and how large your sample is.
- Use the z-statistic when σ is known or when the sample size is large (n ≥ 30). With large samples, the Central Limit Theorem ensures the sampling distribution is approximately normal.
- Use the t-statistic when σ is unknown and the sample is small (n < 30). The t-distribution has heavier tails than the normal distribution, which accounts for the extra uncertainty from estimating σ with the sample standard deviation s. Degrees of freedom are df = n − 1.
Here's the full testing procedure, step by step:
Step 1: State the hypotheses.
- Null hypothesis (H₀): μ = μ₀ (the population mean equals some claimed value)
- Alternative hypothesis (Hₐ): Choose based on the research question:
- Hₐ: μ < μ₀ (left-tailed)
- Hₐ: μ > μ₀ (right-tailed)
- Hₐ: μ ≠ μ₀ (two-tailed)
Step 2: Calculate the test statistic.
- Z-test: z = (x̄ − μ₀) / (σ/√n)
- T-test: t = (x̄ − μ₀) / (s/√n)
In both formulas, x̄ is the sample mean, μ₀ is the hypothesized value, and n is the sample size. The denominator is the standard error, which measures how much the sample mean typically varies from the true mean.
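The two formulas above differ only in which spread estimate goes into the standard error. A minimal sketch in Python (function names are illustrative):

```python
from math import sqrt

def z_statistic(xbar, mu0, sigma, n):
    """Z test statistic: (xbar - mu0) / (sigma / sqrt(n)).
    Use when sigma is known or n >= 30."""
    return (xbar - mu0) / (sigma / sqrt(n))

def t_statistic(xbar, mu0, s, n):
    """T test statistic: (xbar - mu0) / (s / sqrt(n)).
    Use when sigma is unknown and n < 30; df = n - 1."""
    return (xbar - mu0) / (s / sqrt(n))

# Example: sample of 64 with x̄ = 72, testing H0: μ = 70, σ = 8 known.
z = z_statistic(72, 70, 8, 64)   # (72 - 70) / (8/8) = 2.0
```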
Step 3: Find the p-value or critical value.
- The p-value is the probability of getting a test statistic as extreme as (or more extreme than) what you observed, assuming H₀ is true. Use a z-table, t-table, or calculator.
- Alternatively, find the critical value that corresponds to your significance level (α) and compare it directly to your test statistic.
Step 4: Make your decision.
- If the p-value ≤ α, reject H₀. The data provides sufficient evidence to support Hₐ.
- If the p-value > α, fail to reject H₀. The data does not provide sufficient evidence to support Hₐ.
Failing to reject H₀ does not mean you've proven H₀ is true. It just means the sample didn't give you enough evidence to say otherwise.
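The four steps can be wired together for the z-test case. A sketch using Python's standard library (`statistics.NormalDist` supplies the normal CDF; the function name and return shape are my own choices, not a standard API):

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(xbar, mu0, sigma, n, alpha=0.05, tail="two"):
    """Run a one-sample z-test; returns (z, p_value, reject_H0)."""
    z = (xbar - mu0) / (sigma / sqrt(n))   # Step 2: test statistic
    nd = NormalDist()
    if tail == "left":                      # Step 3: p-value by tail
        p = nd.cdf(z)
    elif tail == "right":
        p = 1 - nd.cdf(z)
    else:                                   # two-tailed
        p = 2 * (1 - nd.cdf(abs(z)))
    return z, p, p <= alpha                 # Step 4: reject when p <= alpha

# Example: n = 36, x̄ = 52, testing H0: μ = 50 vs Hₐ: μ ≠ 50, σ = 6.
z, p, reject = one_sample_z_test(52, 50, 6, 36)
# z = 2.0; two-tailed p ≈ 0.0455, so we reject at α = 0.05.
```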

Statistical Power and Effect Size
Power is the probability of correctly rejecting H₀ when it's actually false. In other words, it's your test's ability to detect a real difference. Power depends on three things:
- Sample size (n): Larger samples give more power because they reduce the standard error.
- Significance level (α): A larger α (say 0.10 vs. 0.05) increases power but also increases the risk of a Type I error.
- Effect size: The bigger the true difference between the actual parameter and the hypothesized value, the easier it is to detect.
Effect size quantifies how large a difference is in standardized terms. For a single mean, a common measure is Cohen's d:
d = (x̄ − μ₀) / s
This tells you the difference in units of standard deviations, which makes it easier to judge practical importance regardless of the original scale.
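As a quick illustration of the formula (the rough small/medium/large benchmarks below are Cohen's conventional labels):

```python
def cohens_d(xbar, mu0, s):
    """Cohen's d = (xbar - mu0) / s: difference in standard-deviation units.
    Rough benchmarks: ~0.2 small, ~0.5 medium, ~0.8 large."""
    return (xbar - mu0) / s

# A 5-point difference against s = 10 is a medium effect:
d = cohens_d(105, 100, 10)   # 0.5
```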
The Central Limit Theorem underpins all of this: as n increases, the sampling distribution of x̄ approaches a normal distribution, no matter what the population distribution looks like. That's why z-tests work for large samples even when the population isn't normal.

Hypothesis Testing for a Single Proportion
Population Proportion Hypothesis Tests
Before running a proportion z-test, you need to verify three conditions:
- The sample was randomly selected from the population.
- The independence condition: the population is at least 10 times larger than the sample (the 10% rule).
- The success-failure condition: np₀ ≥ 10 and n(1 − p₀) ≥ 10, where p₀ is the hypothesized proportion. This ensures the sampling distribution of p̂ is approximately normal.
Note: the success-failure condition uses p₀ (the hypothesized value), not p̂, because you're checking conditions under the assumption that H₀ is true.
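The checkable conditions translate directly into code. A small sketch (the function name is illustrative; the random-sampling condition can't be verified numerically, so only the last two are checked):

```python
def proportion_test_conditions(n, p0, population_size=None):
    """Check the 10% rule and the success-failure condition under H0.
    Pass population_size=None if the population is effectively infinite."""
    ten_percent = population_size is None or population_size >= 10 * n
    success_failure = n * p0 >= 10 and n * (1 - p0) >= 10
    return ten_percent and success_failure

# n = 100, p0 = 0.5: np0 = 50 and n(1 - p0) = 50, both >= 10 -> OK.
ok = proportion_test_conditions(100, 0.5)
# n = 20, p0 = 0.1: np0 = 2 < 10 -> condition fails.
too_small = proportion_test_conditions(20, 0.1)
```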
Step 1: State the hypotheses.
- H₀: p = p₀
- Hₐ: p < p₀, p > p₀, or p ≠ p₀
Step 2: Calculate the test statistic.
z = (p̂ − p₀) / √(p₀(1 − p₀)/n)
Here, p̂ is the sample proportion (successes divided by n), and the denominator is the standard error of the proportion under H₀.
Step 3: Find the p-value using the standard normal distribution.
Step 4: Compare and decide using the same logic as the mean test: reject H₀ if p-value ≤ α, fail to reject if p-value > α.
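The proportion test follows the same skeleton as the mean test, with p₀ in the standard error. A sketch with Python's standard library (function name and return shape are my own):

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(successes, n, p0, alpha=0.05, tail="two"):
    """One-proportion z-test; returns (z, p_value, reject_H0)."""
    phat = successes / n
    se = sqrt(p0 * (1 - p0) / n)   # standard error uses p0, not phat
    z = (phat - p0) / se
    nd = NormalDist()
    if tail == "left":
        p = nd.cdf(z)
    elif tail == "right":
        p = 1 - nd.cdf(z)
    else:
        p = 2 * (1 - nd.cdf(abs(z)))
    return z, p, p <= alpha

# Example: 60 successes in n = 100, testing H0: p = 0.5 vs Hₐ: p ≠ 0.5.
z, p, reject = one_prop_z_test(60, 100, 0.5)
# p̂ = 0.6, SE = 0.05, z = 2.0, two-tailed p ≈ 0.0455 -> reject at α = 0.05.
```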
Interpreting Statistical Results
P-value interpretation
The p-value is not the probability that H₀ is true. It's the probability of observing data this extreme (or more so) if H₀ were true. A small p-value (typically ≤ 0.05) means the observed result would be unlikely under H₀, so you reject it. A large p-value means the data is consistent with H₀.
Confidence interval interpretation
A confidence interval gives a range of plausible values for the population parameter. If you construct a 95% confidence interval, that means if you repeated the sampling process many times, about 95% of the resulting intervals would contain the true parameter.
You can connect confidence intervals to hypothesis tests: if the hypothesized value μ₀ (or p₀) falls inside the confidence interval, you would fail to reject H₀ at the corresponding significance level. If it falls outside, you would reject.
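This duality is easy to demonstrate for a two-sided z-test: a 95% interval and an α = 0.05 test give the same verdict. A sketch (the data values are made up for illustration):

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """z-interval for the mean: xbar ± z* · sigma/sqrt(n)."""
    zstar = NormalDist().inv_cdf(0.5 + confidence / 2)   # e.g. 1.96 for 95%
    margin = zstar * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# Sample: n = 36, x̄ = 52, σ = 6. Is H0: μ = 50 plausible?
lo, hi = z_confidence_interval(52, 6, 36)   # ≈ (50.04, 53.96)
rejects_50 = not (lo <= 50 <= hi)
# 50 lies just outside the 95% interval, so a two-tailed test at
# α = 0.05 rejects H0 — matching its p-value of ≈ 0.0455 < 0.05.
```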
Drawing conclusions in context
- Always state your conclusion in the context of the original problem. Don't just say "reject H₀." Say something like: "There is sufficient evidence at the α = 0.05 level to conclude that the mean battery life is less than 500 hours."
- Consider practical significance alongside statistical significance. A result can be statistically significant but too small to matter in practice. For example, a drug that lowers blood pressure by 0.5 mmHg might be statistically significant with a huge sample, but clinically meaningless.
- Acknowledge limitations: sampling bias, measurement error, and whether the sample truly represents the population all affect how much weight you should give the results.