When you're analyzing data, the shape of its distribution determines which statistical tools you can use and what conclusions you can draw. You're being tested on your ability to recognize distribution shapes, understand their parameters, and, most critically, know when to apply each one. A normal distribution lets you use z-scores and standard confidence intervals; a Poisson distribution handles count data; a t-distribution covers small samples where the population standard deviation is unknown. Choosing the wrong distribution leads to invalid results.
Think of distributions as the foundation of statistical inference. Every hypothesis test, every confidence interval, and every predictive model assumes some underlying distribution. The concepts you need to master include probability density functions, parameters and their meanings, when distributions approximate each other, and real-world applications. Don't just memorize shapes—know what each distribution models and why you'd reach for it in a given scenario.
These distributions are balanced around their center, making them mathematically convenient and widely applicable. Symmetry means the mean and median coincide (and, for a single-peaked distribution such as the normal, the mode as well), which simplifies calculations and interpretation.
Compare: Normal vs. t-Distribution. Both are symmetric and bell-shaped, but the t-distribution has heavier tails to account for the extra uncertainty in small samples. If an FRQ gives you a small sample without the population standard deviation, reach for the t-distribution.
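To make "heavier tails" concrete, here is a minimal sketch that compares the two densities three standard units from the center. The helper names `t_pdf` and `tail_ratio` and the chosen degrees of freedom are illustrative assumptions, not part of the guide; the t density is written out from its standard formula so that only the Python standard library is needed.

```python
import math
from statistics import NormalDist


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    # lgamma keeps the normalizing constant stable for large df.
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def tail_ratio(x: float, df: float) -> float:
    """How many times denser the t-distribution is than the standard normal at x."""
    return t_pdf(x, df) / NormalDist().pdf(x)


# Hypothetical degrees of freedom: a tiny sample, a moderate one, a very large one.
for df in (4, 29, 1000):
    print(f"df={df:>4}: t density at x=3 is {tail_ratio(3.0, df):.2f}x the normal density")
```

The ratio shrinks toward 1 as the degrees of freedom grow, which is the sense in which the t-distribution approaches the normal for large samples.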
These distributions handle scenarios where you're counting occurrences—successes in trials, events over time, or items in categories. The key is recognizing whether you have a fixed number of trials or an open-ended count.
Compare: Binomial vs. Poisson. Binomial requires a fixed number of trials with a known success probability p on each trial, while Poisson handles an open-ended count of events with a known average rate. Use binomial for "X successes out of 50 attempts" and Poisson for "X customers arriving per hour."
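The two scenarios in that comparison can be sketched directly from the probability mass functions. The success probability of 0.1 and the rate of 5 customers per hour are made-up values chosen for illustration, and the function names are hypothetical.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k successes in n independent trials, each with success probability p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, rate: float) -> float:
    """P(exactly k events in an interval whose average count is `rate`)."""
    return math.exp(-rate) * rate**k / math.factorial(k)


# Fixed number of trials: 50 attempts, each succeeding with (assumed) probability 0.1.
p_five_of_fifty = binomial_pmf(5, n=50, p=0.1)

# Open-ended count: an (assumed) average of 5 customers arriving per hour.
p_five_customers = poisson_pmf(5, rate=5.0)

print(f"P(5 successes out of 50 attempts) = {p_five_of_fifty:.4f}")
print(f"P(5 customers in an hour)         = {p_five_customers:.4f}")
```

The two probabilities come out close because the binomial here has many trials, a small p, and the same mean (50 × 0.1 = 5) as the Poisson. This is the usual situation in which a Poisson approximates a binomial.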
When you're measuring how long until something happens rather than how many times it happens, these continuous distributions apply. They're essential for reliability engineering, survival analysis, and queuing systems.
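Compare: Poisson vs. Exponential. These are two sides of the same coin. Poisson counts events per interval (discrete), while exponential measures time between events (continuous). If events occur at an average rate of λ per unit of time, the count in one unit of time is Poisson with mean λ and the wait between events is exponential with mean 1/λ, so the single rate parameter determines both distributions.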
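A small simulation can illustrate the link: draw exponential waiting times, place the events on a timeline, and count how many land in each unit-length interval. This is a sketch under assumed values (a rate of 4 events per unit of time, 10,000 intervals, a fixed seed), and `simulate_counts` is a hypothetical helper.

```python
import random
from statistics import mean


def simulate_counts(rate: float, n_intervals: int, seed: int = 0) -> list:
    """Count events per unit-length interval when gaps between events are exponential."""
    rng = random.Random(seed)
    counts = [0] * n_intervals
    clock = rng.expovariate(rate)        # time of the first event
    while clock < n_intervals:
        counts[int(clock)] += 1          # event falls in [int(clock), int(clock) + 1)
        clock += rng.expovariate(rate)   # exponential wait until the next event
    return counts


rate = 4.0  # assumed average of 4 events per unit of time
counts = simulate_counts(rate, n_intervals=10_000)

print(f"average count per interval:  {mean(counts):.2f}  (rate = {rate})")
print(f"average wait between events: {10_000 / sum(counts):.3f}  (1 / rate = {1 / rate})")
```

The per-interval counts average close to the rate, and the average gap between events comes out close to its reciprocal, matching the Poisson and exponential descriptions of the same process.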
These specialized distributions emerge when you're testing claims about data. They're derived from combinations of other distributions and are essential for formal inference.
Compare: Chi-Square vs. F-Distribution—chi-square tests categorical relationships and goodness-of-fit, while F-distribution compares variances across groups. Both are right-skewed and derived from normal distributions, but they answer different research questions.
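As a rough sketch of how an F statistic arises when comparing groups, the code below computes the ratio of between-group to within-group variance estimates for three small groups. The data are made up, and `one_way_f_statistic` is a hypothetical helper. Under the usual assumptions (normal data, equal group variances, no real difference in means) this ratio follows an F-distribution with the degrees of freedom shown.

```python
from statistics import mean


def one_way_f_statistic(groups):
    """Return (F, df_between, df_within) for a one-way comparison of group means."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up outcomes for three treatment groups (illustrative only).
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df1, df2 = one_way_f_statistic(groups)
print(f"F = {f_stat:.2f} on ({df1}, {df2}) degrees of freedom")
```

A large F means the group means differ more than the spread within groups would explain. Judging how large is "large" requires comparing the statistic to the F-distribution with those degrees of freedom, which this sketch does not do.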
Real-world data often violates the assumption of symmetry. Recognizing asymmetry is crucial because it affects which measures of center and statistical tests are appropriate.
Compare: Skewed vs. Multimodal—skewed distributions have one peak with an extended tail, while multimodal distributions have multiple peaks. Skewness suggests outliers or bounded data; multimodality suggests distinct subpopulations that may need separate analysis.
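A quick way to see why skew matters for the choice of center is to compare the mean and median on a right-skewed sample. The incomes below are invented for illustration, and the mean-versus-median comparison is only a rough heuristic, not a formal test of skewness.

```python
from statistics import mean, median

# Made-up household incomes in thousands of dollars; one very high value
# creates a long right tail.
incomes = [30, 35, 40, 45, 50, 60, 400]

income_mean = mean(incomes)
income_median = median(incomes)

print(f"mean   = {income_mean:.1f}")
print(f"median = {income_median:.1f}")

# Heuristic only: a mean well above the median hints at a right skew.
if income_mean > income_median:
    print("mean > median: consistent with a right-skewed distribution")
```

The single extreme value pulls the mean far above what a typical household earns, while the median stays near the middle of the bulk of the data.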
| Concept | Best Examples |
|---|---|
| Symmetric, continuous | Normal, Uniform, t-Distribution |
| Counting successes/events | Binomial, Poisson |
| Time until event | Exponential |
| Hypothesis testing | Chi-Square, F-Distribution, t-Distribution |
| Small sample inference | t-Distribution |
| Comparing variances/groups | F-Distribution, Chi-Square |
| Non-normal data patterns | Skewed, Multimodal |
| Rate-based modeling | Poisson (counts), Exponential (times) |
Which two distributions are mathematically related as "two sides of the same coin," with one modeling counts and the other modeling wait times?
You have survey data from 200 respondents answering a yes/no question. Which distribution models the number of "yes" responses, and what parameters would you need?
Compare and contrast the normal distribution and t-distribution: when would you use each, and what happens to the t-distribution as sample size increases?
A dataset of household incomes shows a mean substantially higher than the median. What type of distribution is this, and why would using the mean be misleading?
An FRQ asks you to test whether three treatment groups have different average outcomes. Which distribution would your test statistic follow, and why is it appropriate for this comparison?