🎲 Intro to Probability Unit 8 – Discrete Distributions: Bernoulli to Poisson

Discrete distributions are the backbone of probability theory, modeling random events with countable outcomes. From the simple Bernoulli to the complex Poisson, these distributions help us understand and predict various phenomena in science, engineering, and everyday life. This unit covers five key discrete distributions: Bernoulli, Binomial, Geometric, Negative Binomial, and Poisson. Each distribution has unique properties and applications, from modeling coin flips to predicting rare events, providing essential tools for statistical analysis and decision-making.

Key Concepts

  • Discrete probability distributions assign probabilities to discrete random variables
  • Probability mass function (PMF) defines the probability of each possible value of a discrete random variable
  • Cumulative distribution function (CDF) gives the probability that a random variable is less than or equal to a specific value
  • Expected value represents the average value of a random variable over many trials
  • Variance measures the spread or dispersion of a random variable around its expected value
    • Calculated as the average squared deviation from the mean
  • Independent and identically distributed (i.i.d.) random variables have the same probability distribution and are mutually independent
  • Memoryless property states that the probability of waiting an additional amount of time (or number of trials) does not depend on how long you have already waited
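The expected value, variance, and CDF definitions above can be checked numerically. The sketch below uses a fair six-sided die as a hypothetical example (not one from this unit) and computes each quantity directly from the PMF:

```python
# PMF of a fair six-sided die: each face has probability 1/6
pmf = {x: 1 / 6 for x in range(1, 7)}

# Expected value: sum of value * probability
mean = sum(x * p for x, p in pmf.items())

# Variance: average squared deviation from the mean
var = sum((x - mean) ** 2 * p for x, p in pmf.items())

# CDF at k: probability that X <= k
def cdf(k):
    return sum(p for x, p in pmf.items() if x <= k)

print(mean)    # 3.5
print(cdf(4))  # 2/3: four of the six faces are <= 4
```

The same pattern (weight each outcome by its PMF value) works for every discrete distribution in this unit.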

Bernoulli Distribution

  • Models a single trial with two possible outcomes: success (probability $p$) and failure (probability $1-p$)
  • Random variable $X$ follows a Bernoulli distribution with parameter $p$, denoted as $X \sim \text{Bern}(p)$
  • Probability mass function: $P(X = x) = p^x (1-p)^{1-x}$ for $x \in \{0, 1\}$
    • $P(X = 1) = p$ and $P(X = 0) = 1-p$
  • Expected value: $E(X) = p$
  • Variance: $\text{Var}(X) = p(1-p)$
  • Used to model binary outcomes (coin flips, defective/non-defective items, pass/fail exams)
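A minimal sketch of these formulas in Python, using an illustrative $p = 0.3$ (the value is arbitrary, chosen only for the example):

```python
def bern_pmf(x, p):
    """Bernoulli PMF: P(X = x) = p**x * (1 - p)**(1 - x) for x in {0, 1}."""
    return p ** x * (1 - p) ** (1 - x)

p = 0.3  # illustrative success probability

# E(X) = sum of x * P(X = x) over x in {0, 1}, which reduces to p
mean = sum(x * bern_pmf(x, p) for x in (0, 1))

# Var(X) = sum of (x - mean)^2 * P(X = x), which reduces to p(1-p)
var = sum((x - mean) ** 2 * bern_pmf(x, p) for x in (0, 1))

print(mean)  # 0.3
print(var)   # 0.21, i.e. 0.3 * 0.7
```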

Binomial Distribution

  • Models the number of successes in a fixed number of independent Bernoulli trials with constant success probability
  • Random variable $X$ follows a binomial distribution with parameters $n$ and $p$, denoted as $X \sim \text{Bin}(n, p)$
  • Probability mass function: $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$ for $k = 0, 1, \ldots, n$
    • $\binom{n}{k}$ is the binomial coefficient, representing the number of ways to choose $k$ successes from $n$ trials
  • Expected value: $E(X) = np$
  • Variance: $\text{Var}(X) = np(1-p)$
  • Models the number of successes in a fixed number of trials (number of heads in 10 coin flips, number of defective items in a batch)
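The coin-flip example above can be worked out directly from the PMF. This sketch builds the full distribution for 10 fair flips and verifies that the mean and variance match $np$ and $np(1-p)$:

```python
from math import comb  # binomial coefficient C(n, k)

def binom_pmf(k, n, p):
    """Binomial PMF: P(X = k) = C(n, k) * p**k * (1 - p)**(n - k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Number of heads in 10 fair coin flips (example from the text)
n, p = 10, 0.5
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

total = sum(pmf)                              # probabilities sum to 1
mean = sum(k * q for k, q in enumerate(pmf))  # equals np = 5
var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))  # equals np(1-p) = 2.5

print(mean, var)  # 5.0 2.5
```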

Geometric Distribution

  • Models the number of trials until the first success in a sequence of independent Bernoulli trials with constant success probability
  • Random variable $X$ follows a geometric distribution with parameter $p$, denoted as $X \sim \text{Geom}(p)$
  • Probability mass function: $P(X = k) = (1-p)^{k-1} p$ for $k = 1, 2, \ldots$
  • Expected value: $E(X) = \frac{1}{p}$
  • Variance: $\text{Var}(X) = \frac{1-p}{p^2}$
  • Memoryless property: $P(X > m+n \mid X > m) = P(X > n)$ for any non-negative integers $m$ and $n$
  • Models waiting times until the first success (number of coin flips until the first head, number of quality checks until the first defective item)
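Both the PMF and the memoryless property can be verified numerically. The sketch below uses an illustrative $p = 0.2$ and the fact that $P(X > n) = (1-p)^n$ (all of the first $n$ trials fail):

```python
def geom_pmf(k, p):
    """Geometric PMF: P(X = k) = (1 - p)**(k - 1) * p for k = 1, 2, ..."""
    return (1 - p) ** (k - 1) * p

def geom_tail(n, p):
    """P(X > n): the first n trials are all failures."""
    return (1 - p) ** n

p = 0.2  # illustrative success probability

# E(X) via a truncated sum; the geometric tail beyond k = 500 is negligible
mean = sum(k * geom_pmf(k, p) for k in range(1, 501))  # close to 1/p = 5

# Memoryless check: P(X > m + n | X > m) = P(X > n)
m, n = 3, 5
lhs = geom_tail(m + n, p) / geom_tail(m, p)  # conditional probability
rhs = geom_tail(n, p)
print(abs(lhs - rhs))  # essentially 0
```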

Negative Binomial Distribution

  • Generalizes the geometric distribution to model the number of trials until the $r$-th success in a sequence of independent Bernoulli trials with constant success probability
  • Random variable $X$ follows a negative binomial distribution with parameters $r$ and $p$, denoted as $X \sim \text{NB}(r, p)$
  • Probability mass function: $P(X = k) = \binom{k-1}{r-1} p^r (1-p)^{k-r}$ for $k = r, r+1, \ldots$
  • Expected value: $E(X) = \frac{r}{p}$
  • Variance: $\text{Var}(X) = \frac{r(1-p)}{p^2}$
  • Models the number of trials until a fixed number of successes (number of coin flips until the 5th head, number of job interviews until 3 offers)
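The coin-flip example above (trials until the 5th head) makes a quick numerical check: summing the PMF over its support should give total probability 1 and a mean of $r/p = 10$. A sketch:

```python
from math import comb

def nb_pmf(k, r, p):
    """Negative binomial PMF: P(X = k) = C(k-1, r-1) * p**r * (1-p)**(k-r),
    defined for k = r, r+1, ... (trial k carries the r-th success)."""
    return comb(k - 1, r - 1) * p ** r * (1 - p) ** (k - r)

# Number of fair coin flips until the 5th head (example from the text)
r, p = 5, 0.5

# Truncate the infinite support at k = 200; the remaining tail is negligible
total = sum(nb_pmf(k, r, p) for k in range(r, 201))      # close to 1
mean = sum(k * nb_pmf(k, r, p) for k in range(r, 201))   # close to r/p = 10

print(total, mean)
```

Setting $r = 1$ recovers the geometric PMF, which is the sense in which this distribution generalizes it.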

Poisson Distribution

  • Models the number of rare events occurring in a fixed interval of time or space, given a known average rate of occurrence
  • Random variable $X$ follows a Poisson distribution with parameter $\lambda$, denoted as $X \sim \text{Pois}(\lambda)$
  • Probability mass function: $P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}$ for $k = 0, 1, 2, \ldots$
  • Expected value: $E(X) = \lambda$
  • Variance: $\text{Var}(X) = \lambda$
  • Poisson process: Events occur independently and at a constant average rate
    • Inter-arrival times between events follow an exponential distribution with rate $\lambda$
  • Approximates the binomial distribution when $n$ is large and $p$ is small, such that $np = \lambda$
  • Models rare events (number of car accidents per day, number of typos per page, number of customers arriving per hour)
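The binomial approximation is easy to see numerically. The sketch below compares $P(X = 3)$ under $\text{Bin}(n, p)$ and $\text{Pois}(np)$ for an illustrative $n = 1000$, $p = 0.002$ (values chosen only for the example):

```python
from math import exp, factorial, comb

def pois_pmf(k, lam):
    """Poisson PMF: P(X = k) = e**(-lam) * lam**k / k!"""
    return exp(-lam) * lam ** k / factorial(k)

def binom_pmf(k, n, p):
    """Binomial PMF, for comparison with the Poisson approximation."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Large n, small p: the Poisson with lam = n*p should be a close match
n, p = 1000, 0.002
lam = n * p  # 2.0

diff = abs(binom_pmf(3, n, p) - pois_pmf(3, lam))
print(diff)  # small: the two PMFs agree to about 3 decimal places here

# Sanity check: Poisson probabilities sum to ~1 (tail beyond k = 40 is negligible)
total = sum(pois_pmf(k, lam) for k in range(41))
```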

Applications and Examples

  • Quality control: Binomial distribution to model the number of defective items in a batch, geometric distribution to model the number of inspections until a defective item is found
  • Genetics: Binomial distribution to model the number of individuals with a specific genotype in a population
  • Finance: Geometric distribution to model the number of days until a stock price exceeds a certain threshold
  • Queueing theory: Poisson distribution to model the number of customers arriving at a service counter per hour
  • Reliability engineering: Negative binomial distribution to model the number of component failures until a system breaks down
  • Epidemiology: Poisson distribution to model the number of new cases of a rare disease in a population per year
  • Telecommunications: Poisson distribution to model the number of phone calls arriving at a call center per minute

Common Pitfalls and Tips

  • Ensure that the assumptions of each distribution are met before applying them to a problem
    • Independent trials, constant success probability, and fixed number of trials for binomial distribution
    • Rare events occurring independently and at a constant average rate for Poisson distribution
  • Be careful when using the Poisson approximation to the binomial distribution, as it is only accurate when $n$ is large and $p$ is small (a common rule of thumb is $n \geq 100$ and $np \leq 10$)
  • Remember that the geometric and negative binomial distributions count trials up to and including a success, so their support starts at $k = 1$ and $k = r$ respectively, while the binomial distribution counts the total number of successes in a fixed number of trials; some texts instead define the geometric and negative binomial by counting failures before the success, which shifts the support to start at 0
  • Use the memoryless property of the geometric distribution to simplify calculations when appropriate
  • When working with the Poisson distribution, make sure to consider the units of the parameter $\lambda$ (e.g., events per unit time or space)
  • Double-check the formulas for PMF, expected value, and variance, as they differ for each distribution
  • Practice solving problems using various approaches, such as using the PMF, CDF, or moment-generating functions, to gain a deeper understanding of the distributions


© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.