🎲 Intro to Probability Unit 5 – Discrete Random Variables & Distributions

Discrete random variables are the building blocks of probability theory, describing outcomes that can be counted or listed. They're essential for modeling real-world scenarios like coin flips, dice rolls, and survey responses. Understanding their properties and distributions is crucial for data analysis and decision-making. This unit covers key concepts like probability mass functions, cumulative distribution functions, expected values, and variance. It also explores common discrete distributions such as binomial, Poisson, and geometric, providing tools to analyze and predict outcomes in various fields like finance, engineering, and social sciences.

Key Concepts

  • Discrete random variables take on a countable number of distinct values
  • Probability mass function (PMF) assigns probabilities to each possible value of a discrete random variable
  • Cumulative distribution function (CDF) gives the probability that a discrete random variable is less than or equal to a specific value
  • Expected value represents the average value of a discrete random variable over many trials
  • Variance measures the spread or dispersion of a discrete random variable around its expected value
    • Calculated as the probability-weighted average of the squared differences between each value and the expected value
  • Independence and mutual exclusivity are important properties of discrete random variables
    • Independent events do not affect each other's probabilities
    • Mutually exclusive events cannot occur simultaneously
  • Discrete distributions describe the probabilities of different outcomes for a discrete random variable (coin flips, dice rolls)
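These pieces fit together in a few lines of code. A minimal sketch in Python using a fair six-sided die (the names `pmf`, `mean`, and `var` are illustrative, and exact arithmetic via `fractions` avoids rounding noise):

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face has probability 1/6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# The probabilities in a PMF must sum to 1
assert sum(pmf.values()) == 1

# Expected value: sum of x * P(X = x)
mean = sum(x * p for x, p in pmf.items())

# Variance: probability-weighted average squared deviation from the mean
var = sum((x - mean) ** 2 * p for x, p in pmf.items())
```

For a fair die this gives a mean of 7/2 and a variance of 35/12.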

Types of Discrete Random Variables

  • Bernoulli random variables have only two possible outcomes (success or failure)
    • Examples include coin flips (heads or tails) and yes/no survey questions
  • Binomial random variables count the number of successes in a fixed number of independent Bernoulli trials
    • Characterized by the number of trials n and the probability of success p
  • Geometric random variables represent the number of trials needed to achieve the first success in a series of independent Bernoulli trials
  • Poisson random variables model the number of events occurring in a fixed interval of time or space
    • Characterized by the average rate of occurrence λ
  • Hypergeometric random variables describe the number of successes in a fixed number of draws from a population without replacement
  • Negative binomial random variables represent the number of failures before a specified number of successes in a series of independent Bernoulli trials
  • Discrete uniform random variables assign equal probabilities to a finite set of values
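Several of these variables can be built directly from repeated Bernoulli trials. A hedged sketch using Python's standard `random` module (the helper names are illustrative, not a standard API):

```python
import random

rng = random.Random(0)  # seeded for reproducibility

def bernoulli(p):
    """One Bernoulli trial: 1 (success) with probability p, else 0."""
    return 1 if rng.random() < p else 0

def binomial(n, p):
    """Number of successes in n independent Bernoulli(p) trials."""
    return sum(bernoulli(p) for _ in range(n))

def geometric(p):
    """Number of trials up to and including the first success."""
    k = 1
    while bernoulli(p) == 0:
        k += 1
    return k

# Discrete uniform on {1, ..., 6} is simply rng.randint(1, 6)
samples = [binomial(10, 0.5) for _ in range(1000)]
```

Each draw of `binomial(10, 0.5)` necessarily lands in {0, 1, ..., 10}, and `geometric` always returns at least 1.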

Probability Mass Functions (PMF)

  • A probability mass function (PMF) is a function that gives the probability of a discrete random variable taking on a specific value
  • Denoted as P(X = x), where X is the random variable and x is a possible value
  • The sum of all probabilities in a PMF must equal 1
    • ∑_x P(X = x) = 1, where the sum runs over all possible values x
  • PMFs can be represented as tables, formulas, or graphs
    • In a PMF graph, the height of each point represents the probability of the corresponding value
  • The PMF of a sum of independent discrete random variables is the convolution of their individual PMFs
  • PMFs are used to calculate probabilities, expected values, and other properties of discrete random variables
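The convolution rule above can be checked directly: for independent X and Y, P(X + Y = s) is the sum of P(X = x)·P(Y = s − x) over all x. A sketch with two fair dice (names illustrative, exact arithmetic via `fractions`):

```python
from fractions import Fraction
from collections import defaultdict

def convolve(pmf_x, pmf_y):
    """PMF of X + Y for independent discrete X, Y (dicts: value -> prob)."""
    pmf_sum = defaultdict(Fraction)
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            pmf_sum[x + y] += px * py
    return dict(pmf_sum)

die = {x: Fraction(1, 6) for x in range(1, 7)}
two_dice = convolve(die, die)

assert sum(two_dice.values()) == 1          # still a valid PMF
assert two_dice[7] == Fraction(1, 6)        # 6 of 36 outcomes sum to 7
```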

Cumulative Distribution Functions (CDF)

  • A cumulative distribution function (CDF) gives the probability that a discrete random variable is less than or equal to a specific value
  • Denoted as F(x) = P(X ≤ x), where X is the random variable and x is a possible value
  • CDFs are non-decreasing functions, meaning that F(a) ≤ F(b) if a ≤ b
  • The CDF can be obtained by summing the PMF values for all values less than or equal to x
    • F(x) = ∑_{t ≤ x} P(X = t)
  • CDFs can be used to calculate probabilities for intervals of values
    • P(a < X ≤ b) = F(b) - F(a)
  • The CDF of a discrete random variable is a step function, with jumps at each possible value
  • CDFs are useful for comparing different discrete distributions and determining percentiles
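A short sketch of building a CDF from a PMF and using it for an interval probability, again with a fair die (names illustrative):

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # fair die

def cdf(pmf, x):
    """F(x) = P(X <= x), obtained by summing the PMF."""
    return sum(p for v, p in pmf.items() if v <= x)

# The CDF reaches 1 at the largest possible value
assert cdf(pmf, 6) == 1

# Interval probability: P(2 < X <= 5) = F(5) - F(2)
p_interval = cdf(pmf, 5) - cdf(pmf, 2)  # faces 3, 4, 5 -> 3/6
```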

Expected Value and Variance

  • The expected value (or mean) of a discrete random variable is the weighted average of all possible values
    • Denoted as E(X) or μ
    • Calculated by summing the product of each value and its probability: E(X) = ∑_x x · P(X = x)
  • The variance of a discrete random variable measures the average squared deviation from the expected value
    • Denoted as Var(X) or σ²
    • Calculated using the formula: Var(X) = E(X²) - [E(X)]²
  • The standard deviation is the square root of the variance and has the same units as the random variable
    • Denoted as σ
  • The expected value and variance have important properties:
    • Linearity of expectation: E(aX + b) = aE(X) + b
    • Variance of a constant: Var(a) = 0
    • Variance of a sum of independent random variables: Var(X + Y) = Var(X) + Var(Y)
  • These properties are useful for calculating the expected value and variance of transformed or combined discrete random variables
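These properties can be verified numerically. A sketch checking linearity of expectation and the variance of an independent sum for a fair die (all names illustrative, exact arithmetic via `fractions`):

```python
from fractions import Fraction
from itertools import product

die = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf):
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    # Var(X) = E(X^2) - [E(X)]^2
    return sum(x * x * p for x, p in pmf.items()) - expectation(pmf) ** 2

mu, var = expectation(die), variance(die)

# Linearity of expectation: E(aX + b) = aE(X) + b, checked for a=3, b=2
shifted = {3 * x + 2: p for x, p in die.items()}
assert expectation(shifted) == 3 * mu + 2

# For independent X, Y: Var(X + Y) = Var(X) + Var(Y).
# Build the PMF of the sum of two independent dice and compare.
pmf_sum = {}
for x, y in product(die, die):
    pmf_sum[x + y] = pmf_sum.get(x + y, Fraction(0)) + die[x] * die[y]
assert variance(pmf_sum) == 2 * var
```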

Common Discrete Distributions

  • Binomial distribution: models the number of successes in a fixed number of independent Bernoulli trials
    • PMF: P(X = k) = C(n, k) p^k (1-p)^(n-k), where C(n, k) = n!/(k!(n-k)!) is the binomial coefficient
    • Expected value: E(X) = np
    • Variance: Var(X) = np(1-p)
  • Poisson distribution: models the number of events occurring in a fixed interval of time or space
    • PMF: P(X = k) = e^(-λ) λ^k / k!
    • Expected value: E(X) = λ
    • Variance: Var(X) = λ
  • Geometric distribution: models the number of trials needed to achieve the first success in a series of independent Bernoulli trials
    • PMF: P(X = k) = (1-p)^(k-1) p, for k = 1, 2, ...
    • Expected value: E(X) = 1/p
    • Variance: Var(X) = (1-p)/p²
  • Hypergeometric distribution: models the number of successes in a fixed number of draws from a population without replacement
    • PMF: P(X = k) = C(K, k) C(N-K, n-k) / C(N, n), where N is the population size, K the number of successes in it, and n the number of draws
    • Expected value: E(X) = nK/N
    • Variance: Var(X) = n(K/N)(1 - K/N)(N-n)/(N-1)
  • Negative binomial distribution: models the number of failures before a specified number of successes in a series of independent Bernoulli trials
    • PMF: P(X = k) = C(k+r-1, k) (1-p)^k p^r, where r is the target number of successes
    • Expected value: E(X) = r(1-p)/p
    • Variance: Var(X) = r(1-p)/p²
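The PMFs above translate directly into code. A sketch using only Python's standard library (`math.comb` requires Python 3.8+), with the binomial moments verified against the closed forms by brute-force enumeration:

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)

def geometric_pmf(k, p):
    return (1 - p)**(k - 1) * p

# Sanity checks against the closed-form moments (n = 10, p = 0.3)
n, p = 10, 0.3
mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
assert abs(mean - n * p) < 1e-12            # E(X) = np

var = sum(k**2 * binomial_pmf(k, n, p) for k in range(n + 1)) - mean**2
assert abs(var - n * p * (1 - p)) < 1e-12   # Var(X) = np(1-p)
```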

Properties and Applications

  • Discrete random variables have several important properties:
    • The probability of any single value is between 0 and 1 (inclusive)
    • The sum of all probabilities in a PMF equals 1
    • The CDF is a non-decreasing function with values between 0 and 1 (inclusive)
  • Discrete distributions can be used to model various real-world phenomena:
    • Binomial distribution: quality control (defective items), medical trials (treatment success)
    • Poisson distribution: call center arrivals, website traffic, radioactive decay
    • Geometric distribution: number of attempts until first success (job interviews, sales)
    • Hypergeometric distribution: sampling without replacement (defective items in a batch)
    • Negative binomial distribution: number of failures before a specified number of successes (customer complaints before a product is redesigned)
  • Discrete random variables can be transformed or combined to create new random variables
    • Example: the sum of two independent discrete random variables is a new discrete random variable
  • Understanding the properties and applications of discrete random variables is crucial for modeling and analyzing real-world situations
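One way to connect a distribution to a real-world model is simulation. A sketch drawing Poisson samples with Knuth's multiplication method, using only the standard library (λ = 4 calls per hour is an illustrative arrival rate, not from the source):

```python
import random
from math import exp

rng = random.Random(42)  # seeded for reproducibility

def poisson_sample(lam):
    """Draw one Poisson(lam) value via Knuth's multiplication method:
    multiply uniforms until the product drops below e^(-lam)."""
    limit, k, prod = exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

# Simulate call-center arrivals: 100,000 one-hour intervals at lam = 4
lam, trials = 4.0, 100_000
counts = [poisson_sample(lam) for _ in range(trials)]
empirical_mean = sum(counts) / trials  # should be close to lam
```

Because the Poisson mean equals λ, the empirical average of the simulated counts should land near 4.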

Problem-Solving Techniques

  • Identify the type of discrete random variable and its parameters (e.g., binomial with n and p)
  • Write the PMF or CDF of the random variable using the appropriate formula
  • Calculate probabilities using the PMF, CDF, or properties of the distribution
    • For PMF: P(X = x)
    • For CDF: P(a < X ≤ b) = F(b) - F(a)
    • For complement: P(X > a) = 1 - P(X ≤ a)
  • Use the expected value and variance formulas to calculate these properties for the given distribution
  • Apply the linearity of expectation and properties of variance when working with transformed or combined random variables
  • Recognize when to use the PMF, CDF, or other properties of the distribution to solve a problem
  • Interpret the results in the context of the problem, considering the real-world implications
  • Verify that the solution makes sense and satisfies any given conditions or constraints
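The steps above, applied to a concrete worked example (the numbers are illustrative): X ~ Binomial(n = 10, p = 0.3).

```python
from math import comb

n, p = 10, 0.3  # identified parameters: 10 trials, success probability 0.3

def pmf(k):
    # Binomial PMF: C(n, k) p^k (1-p)^(n-k)
    return comb(n, k) * p**k * (1 - p)**(n - k)

def cdf(x):
    # F(x) = P(X <= x), summing the PMF
    return sum(pmf(k) for k in range(x + 1))

p_exact = pmf(3)               # P(X = 3)
p_at_most = cdf(3)             # P(X <= 3)
p_more = 1 - cdf(3)            # complement: P(X > 3)
p_between = cdf(5) - cdf(2)    # P(2 < X <= 5) = F(5) - F(2)

mean, var = n * p, n * p * (1 - p)  # E(X) = 3, Var(X) = 2.1
```

As a sanity check, P(X ≤ 3) and P(X > 3) must sum to 1, and the mean of 3 successes in 10 trials at p = 0.3 is plausible in context.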


© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.