
📈Intro to Probability for Business

Probability Distribution Types


Why This Matters

Probability distributions are the mathematical backbone of business analytics—they're how you translate real-world uncertainty into quantifiable predictions. Whether you're forecasting demand, assessing quality control risks, or modeling customer behavior, you're being tested on your ability to select the right distribution for a given scenario. This isn't just about memorizing formulas; it's about understanding when discrete vs. continuous models apply, how parameters shape outcomes, and why certain distributions emerge from specific data-generating processes.

The exam will push you to distinguish between distributions that look similar but behave differently. A Poisson and a binomial can both count events, but they model fundamentally different situations. A normal and an exponential are both continuous, but one is symmetric and the other is skewed. Don't just memorize the formulas—know what real-world process each distribution represents and what assumptions must hold for it to be valid.


Binary and Count-Based Discrete Distributions

These distributions model situations where you're counting discrete outcomes—successes, failures, or events. The key distinction is whether trials are independent, how many trials occur, and whether you're sampling with or without replacement.

Bernoulli Distribution

  • Single trial with two outcomes—the simplest building block for all binary models, where success = 1 and failure = 0
  • One parameter: $p$ (probability of success), with mean $= p$ and variance $= p(1-p)$
  • Foundation for binomial and geometric distributions—understand this first, and the others follow logically

Binomial Distribution

  • Counts successes in $n$ fixed, independent trials, where each trial has the same probability $p$ of success
  • Parameters: $n$ (trials) and $p$ (success probability); mean $= np$, variance $= np(1-p)$
  • Business applications include quality control and A/B testing—use when sample size is fixed and trials are independent
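
The binomial mean and variance formulas above can be checked numerically. The following is a minimal sketch, assuming a hypothetical inspection of 20 units with a 5% defect probability (these numbers are illustrative, not from the guide); it builds the probability mass function from its definition and compares the results with $np$ and $np(1-p)$.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical example: 20 inspected units, each defective with probability 0.05.
n, p = 20, 0.05
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Mean and variance computed directly from the pmf.
mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

print("mean from pmf:", round(mean, 6), "| np:", n * p)
print("variance from pmf:", round(variance, 6), "| np(1-p):", n * p * (1 - p))
```

Setting `n = 1` reduces this to the Bernoulli case, which is one way to see why Bernoulli is described as the building block.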

Geometric Distribution

  • Counts trials until the first success—models "how long until something happens" in discrete time
  • Single parameter $p$ (success probability); mean $= \frac{1}{p}$, capturing the expected wait time
  • Useful for customer acquisition and churn analysis—when you need to know how many attempts before conversion
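
As a rough illustration of the $\frac{1}{p}$ mean, the sketch below assumes a hypothetical 10% conversion probability per sales attempt and uses the "number of trials up to and including the first success" convention, which matches the mean stated above. The infinite sum is truncated at a large cutoff, so the result is an approximation.

```python
def geometric_pmf(k, p):
    """P(first success occurs on trial k), for k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p

p = 0.1  # hypothetical conversion probability per attempt

# Expected number of attempts, approximated by truncating the infinite sum.
expected_trials = sum(k * geometric_pmf(k, p) for k in range(1, 2000))

# Probability of converting within the first 5 attempts.
prob_within_5 = 1 - (1 - p) ** 5

print("expected attempts:", round(expected_trials, 4), "| 1/p:", 1 / p)
print("P(conversion within 5 attempts):", round(prob_within_5, 5))
```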

Hypergeometric Distribution

  • Counts successes when sampling without replacement—critical distinction from binomial, which assumes replacement
  • Three parameters: $N$ (population), $K$ (successes in population), $n$ (sample size)
  • Required for small populations or finite lots—use in quality inspection when you can't assume independence between draws

Compare: Binomial vs. Hypergeometric—both count successes in a sample, but binomial assumes independent trials (sampling with replacement or large population), while hypergeometric accounts for dependence when sampling without replacement. If an FRQ describes a small lot or finite population, hypergeometric is your answer.
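
To see how much the choice matters in a small lot, here is a minimal sketch comparing the two models. The lot size, defect count, and sample size are hypothetical values chosen for illustration; the binomial uses $p = K/N$ as if each draw were independent.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(k successes) when drawing n items without replacement
    from a population of N that contains K successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binomial_pmf(k, n, p):
    """P(k successes) in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical lot: 50 items, 5 defective, sample of 10.
N, K, n = 50, 5, 10
p = K / N

hyper0 = hypergeom_pmf(0, N, K, n)   # exact: no defectives in the sample
binom0 = binomial_pmf(0, n, p)       # approximation that ignores dependence

hyper_mean = sum(k * hypergeom_pmf(k, N, K, n) for k in range(min(n, K) + 1))

print("P(0 defectives), hypergeometric:", round(hyper0, 4))
print("P(0 defectives), binomial:      ", round(binom0, 4))
print("hypergeometric mean:", round(hyper_mean, 6), "| n*K/N:", n * K / N)
```

Both models share the same mean, $nK/N$, but the probabilities of individual outcomes differ when the sample is a large fraction of the population.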


Event Rate Distributions

These distributions model how often events occur over time or space. The underlying mechanism is a Poisson process—events happen randomly and independently at some average rate.

Poisson Distribution

  • Counts events in a fixed interval of time, space, or other continuous measure—arrivals, defects, accidents
  • Single parameter $\lambda$ (average rate); uniquely, mean $= \lambda$ and variance $= \lambda$
  • Assumes events are independent and occur at constant rate—violations (clustering, seasonality) break the model
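
The equality of mean and variance can be verified from the probability mass function. This is a minimal sketch, assuming a hypothetical rate of 3 events per interval; the infinite sum is truncated at 100 terms, which is more than enough for a rate this small.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

lam = 3.0  # hypothetical average number of events per interval
ks = range(100)  # truncation point; the tail beyond this is negligible here

mean = sum(k * poisson_pmf(k, lam) for k in ks)
variance = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)

print("mean:", round(mean, 6), "| variance:", round(variance, 6), "| lambda:", lam)
```

In practice, a sample variance far above the sample mean is a common warning sign that the constant-rate or independence assumption does not hold.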

Exponential Distribution

  • Models time between Poisson events—the continuous counterpart to the discrete Poisson count
  • Single parameter $\lambda$ (rate); mean $= \frac{1}{\lambda}$, with the memoryless property (past waiting doesn't affect future)
  • Essential for queuing theory and reliability analysis—service times, equipment failure, customer interarrival times
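
The memoryless property can be stated as $P(T > s + t \mid T > s) = P(T > t)$. The sketch below checks it numerically using the exponential survival function $P(T > t) = e^{-\lambda t}$; the failure rate and time values are hypothetical.

```python
from math import exp

def exp_survival(t, lam):
    """P(T > t) for T ~ Exponential(rate=lam)."""
    return exp(-lam * t)

lam = 0.5          # hypothetical: 0.5 failures per year
s, t = 2.0, 3.0    # already survived s years; what about t more?

# P(T > s + t | T > s) versus P(T > t).
conditional = exp_survival(s + t, lam) / exp_survival(s, lam)
unconditional = exp_survival(t, lam)

print("P(T > s+t | T > s):", round(conditional, 6))
print("P(T > t):          ", round(unconditional, 6))
print("mean time to failure (1/lambda):", 1 / lam)
```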

Gamma Distribution

  • Generalizes exponential to model time until the $k$-th event, for when you need more than one event to occur
  • Two parameters: shape $k$ and scale $\theta$ (or rate $\beta = \frac{1}{\theta}$); mean $= k\theta$
  • Used in insurance claims and project duration modeling—when waiting for multiple independent stages to complete
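
For integer shape $k$ (sometimes called the Erlang case), the gamma distribution function follows from Poisson counts: the $k$-th event has not yet happened at time $t$ exactly when fewer than $k$ events have occurred. The sketch below uses that relationship, with hypothetical values $k = 3$ and rate $\beta = 2$, and recovers the mean $k\theta$ by numerically integrating the survival function.

```python
from math import exp, factorial

def erlang_cdf(t, k, rate):
    """P(T_k <= t): probability the k-th event has occurred by time t
    (gamma distribution with integer shape k and the given rate)."""
    if t <= 0:
        return 0.0
    x = rate * t
    # Fewer than k events by time t means the k-th event is still pending.
    fewer_than_k = sum(exp(-x) * x**i / factorial(i) for i in range(k))
    return 1.0 - fewer_than_k

k, rate = 3, 2.0      # hypothetical: wait for the 3rd event at 2 events per unit time
theta = 1 / rate      # scale parameter

# For a nonnegative variable, the mean equals the integral of P(T > t).
# Trapezoidal rule on [0, 40]; the tail beyond 40 is negligible here.
step = 0.001
grid = [i * step for i in range(40_001)]
survival = [1 - erlang_cdf(t, k, rate) for t in grid]
numeric_mean = step * (sum(survival) - 0.5 * (survival[0] + survival[-1]))

print("numeric mean:", round(numeric_mean, 4), "| k * theta:", k * theta)
```

With `k = 1` the function reduces to the exponential distribution function, $1 - e^{-\beta t}$.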

Compare: Poisson vs. Exponential. Poisson counts how many events in a fixed time; exponential measures how long until the next event. They're two sides of the same coin: if arrivals follow Poisson with rate $\lambda$, interarrival times follow exponential with the same $\lambda$.
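
A small simulation can make this relationship concrete. The sketch below generates exponential interarrival times with a hypothetical rate of 4 per hour, counts arrivals in each one-hour window, and checks that the counts behave like Poisson counts (mean and variance both near the rate). The seed and sample size are arbitrary choices, and the output is approximate because it is simulated.

```python
import random

random.seed(0)   # fixed seed so the run is repeatable
lam = 4.0        # hypothetical arrival rate per hour
hours = 10_000   # number of one-hour windows to simulate

# Build arrival times by accumulating exponential gaps,
# then tally how many arrivals land in each hour.
counts = [0] * hours
t = random.expovariate(lam)
while t < hours:
    counts[int(t)] += 1
    t += random.expovariate(lam)

mean_count = sum(counts) / hours
var_count = sum((c - mean_count) ** 2 for c in counts) / hours

print("mean arrivals per hour:", round(mean_count, 3))
print("variance of arrivals per hour:", round(var_count, 3))
```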


Continuous Distributions for Measurement Data

These distributions model variables that can take any value within a range. They're defined by probability density functions, where area under the curve—not individual points—represents probability.

Normal Distribution

  • Symmetric, bell-shaped curve defined by mean $\mu$ (center) and standard deviation $\sigma$ (spread)
  • Central Limit Theorem makes this critical: for large samples, sample means are approximately normal even when the underlying distribution is not (provided it has finite variance)
  • Standard normal ($Z$-distribution) has $\mu = 0$, $\sigma = 1$ and is used for hypothesis testing and confidence intervals
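
Standardizing with $z = \frac{x - \mu}{\sigma}$ is what connects any normal distribution to the standard normal. The following sketch uses Python's standard-library `statistics.NormalDist`; the weekly demand figures are hypothetical.

```python
from statistics import NormalDist

# Hypothetical weekly demand: mean 500 units, standard deviation 40 units.
demand = NormalDist(mu=500, sigma=40)

# Probability that demand exceeds 560 units, computed two equivalent ways.
z = (560 - demand.mean) / demand.stdev        # z-score of 560
p_exceed = 1 - demand.cdf(560)                # directly on the demand scale
p_exceed_z = 1 - NormalDist().cdf(z)          # via the standard normal

# Probability of landing within one standard deviation of the mean.
within_one_sd = demand.cdf(540) - demand.cdf(460)

print("z-score:", z)
print("P(demand > 560):", round(p_exceed, 4), "| via Z:", round(p_exceed_z, 4))
print("P(within 1 sd):", round(within_one_sd, 4))
```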

Uniform Distribution

  • All outcomes equally likely between minimum $a$ and maximum $b$: the "I have no idea" distribution
  • Mean $= \frac{a+b}{2}$, variance $= \frac{(b-a)^2}{12}$, with a flat probability density across the entire range
  • Used in simulation and as a prior when no information exists—also models rounding errors and random number generation
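
Because the density is flat, probabilities for a continuous uniform variable are just ratios of interval lengths. A minimal sketch, assuming a hypothetical delivery time that is equally likely to fall anywhere between 2 and 8 days:

```python
def uniform_stats(a, b):
    """Mean and variance of a continuous uniform distribution on [a, b]."""
    return (a + b) / 2, (b - a) ** 2 / 12

def uniform_prob(lo, hi, a, b):
    """P(lo <= X <= hi) for X uniform on [a, b]: overlap length over total length."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0) / (b - a)

a, b = 2, 8  # hypothetical delivery window in days
mean, variance = uniform_stats(a, b)
p_3_to_5 = uniform_prob(3, 5, a, b)

print("mean:", mean, "| variance:", variance)
print("P(3 <= X <= 5):", round(p_3_to_5, 4))
```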

Beta Distribution

  • Defined only on $[0, 1]$, which makes it well suited to modeling proportions, probabilities, and percentages
  • Two shape parameters $\alpha$ and $\beta$ create flexible shapes: symmetric, skewed, U-shaped, or uniform
  • Bayesian statistics workhorse—used as a prior for unknown probabilities and in PERT project estimation
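
One reason the beta distribution works well as a prior is that it updates cleanly with binomial data: a Beta($\alpha$, $\beta$) prior combined with $s$ successes and $f$ failures gives a Beta($\alpha + s$, $\beta + f$) posterior. The sketch below illustrates that standard result with a uniform Beta(1, 1) prior and hypothetical survey counts; it is an illustration, not a full Bayesian workflow.

```python
def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

def beta_variance(alpha, beta):
    """Variance of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    return alpha * beta / (total**2 * (total + 1))

# Uniform prior: Beta(1, 1) treats every proportion in [0, 1] as equally plausible.
prior_alpha, prior_beta = 1, 1

# Hypothetical survey: 18 of 100 respondents chose the new product.
successes, failures = 18, 82

# Conjugate update: add successes to alpha and failures to beta.
post_alpha = prior_alpha + successes
post_beta = prior_beta + failures

print("prior mean:", beta_mean(prior_alpha, prior_beta))
print("posterior mean:", round(beta_mean(post_alpha, post_beta), 4))
print("posterior variance:", round(beta_variance(post_alpha, post_beta), 6))
```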

Compare: Normal vs. Uniform—normal concentrates probability near the mean with tails extending infinitely; uniform spreads probability evenly with hard boundaries. Use normal when values cluster around a center; use uniform when all values in a range are equally plausible.


Quick Reference Table

| Concept | Best Examples |
| --- | --- |
| Binary/success-failure outcomes | Bernoulli, Binomial |
| Counting until first success | Geometric |
| Sampling without replacement | Hypergeometric |
| Events per time interval | Poisson |
| Time between events | Exponential, Gamma |
| Symmetric measurement data | Normal |
| Equal probability across range | Uniform |
| Proportions and probabilities | Beta |

Self-Check Questions

  1. A quality inspector examines 15 items from a shipment of 100, where 8 are known to be defective. Which distribution models the number of defectives found—binomial or hypergeometric? Why?

  2. Compare the Poisson and binomial distributions: under what conditions does the Poisson approximate the binomial, and what real-world scenario would make this approximation useful?

  3. If customer arrivals follow a Poisson distribution with $\lambda = 12$ per hour, what distribution describes the time between consecutive arrivals, and what is its mean?

  4. You need to model the probability that a new product captures between 15% and 25% of market share. Which distribution is most appropriate, and what makes it suitable for this scenario?

  5. Explain why the Central Limit Theorem makes the normal distribution essential for business statistics, even when the underlying data isn't normally distributed.