Random variables are the mathematical bridge between real-world uncertainty and rigorous probability theory. When you're analyzing everything from quantum mechanics to financial markets, you're choosing which type of random variable best captures the underlying randomness. This topic connects directly to probability distributions, expected value calculations, and statistical inference: concepts that appear throughout your coursework and exams.
You're being tested on more than definitions here. Examiners want to see that you understand when to apply each distribution, what parameters define it, and how different random variables relate to one another. Can you recognize that a binomial is just repeated Bernoulli trials? Do you know why exponential and Poisson distributions are mathematically linked? Don't just memorize formulas; know what real-world scenario each random variable models and what makes it the right tool for that job.
Discrete vs. Continuous: The Fundamental Split
Before diving into specific distributions, you need to internalize the core distinction: discrete random variables count, continuous random variables measure. This determines everything from how we calculate probabilities to what functions describe them.
Discrete Random Variables
Countable outcomes: these variables take on specific, separated values you could list (even if that list is infinite)
Probability mass function (PMF) assigns exact probabilities to each value; P(X=x) makes sense and can be nonzero
Summation is used for expected value: $E[X] = \sum_x x \, P(X=x)$
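To make the summation concrete, here is a minimal sketch in plain Python (the fair-die example is illustrative, not from the text above) that computes an expected value directly from a PMF:

```python
# PMF of a fair six-sided die: each face x = 1..6 has probability 1/6
pmf = {x: 1 / 6 for x in range(1, 7)}

# Expected value by summation: E[X] = sum of x * P(X = x) over all x
expected_value = sum(x * p for x, p in pmf.items())

print(expected_value)  # 3.5, the long-run average of many rolls
```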
Continuous Random Variables
Uncountably infinite outcomes: values fill an entire interval with no gaps between possible results
Probability density function (PDF) describes relative likelihood; P(X=x)=0 for any specific value, so we integrate over intervals
Compare: Discrete vs. Continuous. Both use functions to describe probability, but PMFs give point probabilities while PDFs require integration over intervals. If an FRQ asks you to find P(X=5) for a continuous variable, the answer is always zero.
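The following sketch, assuming scipy is installed, shows both facts numerically for a standard normal variable: a single point contributes zero probability, while an interval gets its probability from the CDF (the integral of the PDF):

```python
from scipy.stats import norm

# For a continuous variable, P(X = 5) is an integral over a zero-width
# interval, so it is exactly zero
print(norm.cdf(5) - norm.cdf(5))   # 0.0

# Probabilities live on intervals: P(-1 <= X <= 1) for a standard normal
print(norm.cdf(1) - norm.cdf(-1))  # about 0.6827
```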
Binary and Count-Based Discrete Distributions
These distributions model scenarios where you're counting successes, events, or trials. The key is identifying what's being counted and under what conditions.
Bernoulli Random Variables
Single trial with two outcomes: the simplest random variable, taking value 1 (success) with probability $p$ or 0 (failure) with probability $1-p$
Building block for more complex distributions; binomial, geometric, and negative binomial all derive from repeated Bernoulli trials
Mean and variance are $E[X] = p$ and $\mathrm{Var}(X) = p(1-p)$, both determined by the single parameter $p$
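A quick simulation check, assuming numpy is available, that the sample moments of Bernoulli draws approach $p$ and $p(1-p)$ (the value $p = 0.3$ is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(seed=0)
p = 0.3

# 100,000 Bernoulli(p) trials: True (1) with probability p, else False (0)
samples = rng.random(100_000) < p

print(samples.mean())  # near p = 0.3
print(samples.var())   # near p * (1 - p) = 0.21
```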
Binomial Random Variables
Fixed number of independent trials: counts successes in $n$ Bernoulli trials, each with success probability $p$
PMF formula: $P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}$ combines counting (how many ways) with probability (how likely)
Mean $np$ and variance $np(1-p)$ scale linearly with trial count, making this ideal for sampling with replacement
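Here is a from-scratch sketch of the binomial PMF using only the standard library (the function name and example values are mine, for illustration):

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Count the ways to place k successes, then weight by their probability."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Probability of exactly 3 successes in 10 trials with p = 0.5
print(binomial_pmf(3, 10, 0.5))  # about 0.1172

# Sanity check: PMF values over k = 0..n sum to 1
print(sum(binomial_pmf(k, 10, 0.5) for k in range(11)))  # 1.0
```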
Geometric Random Variables
Trials until first success: counts the number of Bernoulli trials up to and including the first success
Uniquely memoryless among discrete distributions; past failures don't affect future success probability
PMF: $P(X=k) = (1-p)^{k-1} p$, with mean $1/p$; higher success probability means fewer expected trials
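The same style of sketch for the geometric PMF, with a truncated sum to confirm the mean $1/p$ numerically (the parameter choice is arbitrary):

```python
def geometric_pmf(k: int, p: float) -> float:
    """P(X = k): k - 1 failures, each with probability 1 - p, then a success."""
    return (1 - p) ** (k - 1) * p

p = 0.25
# E[X] = 1/p = 4; approximate the infinite sum by truncating at k = 1000
approx_mean = sum(k * geometric_pmf(k, p) for k in range(1, 1001))
print(approx_mean)  # very close to 4.0
```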
Poisson Random Variables
Events in fixed intervals: models the count of occurrences when events happen at a constant average rate $\lambda$
Single parameter $\lambda$ serves as both mean and variance; useful approximation for the binomial when $n$ is large and $p$ is small
Independence assumption: events in non-overlapping intervals are independent, making this ideal for rare event modeling
Compare: Binomial vs. Poisson. Both count discrete events, but the binomial has a fixed trial count while the Poisson models events in continuous time/space. Use Poisson when $n \to \infty$ and $p \to 0$ with $np = \lambda$ held constant.
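A short numerical check of that limit, using only the standard library (the specific values are illustrative): with $n = 1000$ and $p = 0.002$, so $\lambda = np = 2$, the two PMFs nearly coincide:

```python
from math import comb, exp, factorial

n, p = 1000, 0.002
lam = n * p  # lambda = 2

for k in range(5):
    binom_pmf = comb(n, k) * p**k * (1 - p)**(n - k)
    poisson_pmf = exp(-lam) * lam**k / factorial(k)
    print(k, round(binom_pmf, 5), round(poisson_pmf, 5))  # columns nearly match
```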
Hypergeometric Random Variables
Sampling without replacement: counts successes when drawing $n$ items from a population of $N$ containing $K$ successes
Three parameters (N, K, n) capture population structure; probabilities change with each draw
Approaches binomial when population size N is much larger than sample size n, since replacement effects become negligible
Compare: Binomial vs. Hypergeometric. Both count successes in samples, but the binomial assumes independence (replacement) while the hypergeometric accounts for changing probabilities (no replacement). Quality control with small lots? Hypergeometric. Large population surveys? The binomial approximation works.
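A sketch of that contrast with made-up quality-control numbers (a lot of 100 with 10 defectives, sampling 20): the sample is a fifth of the lot, so the binomial approximation visibly drifts:

```python
from math import comb

def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(X = k) when drawing n items without replacement from N with K successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Exact probability of 2 defectives in the sample
print(hypergeom_pmf(2, N=100, K=10, n=20))  # about 0.318

# Binomial approximation with p = K/N = 0.1 misses noticeably here
print(comb(20, 2) * 0.1**2 * 0.9**18)       # about 0.285
```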
Continuous Distributions for Measurement and Time
These distributions model quantities that can take any value in a range. Focus on what each distribution's shape tells you about the underlying phenomenon.
Uniform Random Variables
Equal likelihood across an interval: every value between $a$ and $b$ is equally probable, with PDF $f(x) = \frac{1}{b-a}$ for $a \le x \le b$
Maximum entropy distribution when you only know the range; represents complete uncertainty within bounds
Mean $(a+b)/2$ and variance $(b-a)^2/12$ depend only on the interval endpoints
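A simulation check of those two formulas, assuming numpy is available (the endpoints 2 and 10 are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(seed=1)
a, b = 2.0, 10.0

# Draw from Uniform(a, b) and compare sample moments to the closed forms
samples = rng.uniform(a, b, size=100_000)

print(samples.mean(), (a + b) / 2)       # both near 6.0
print(samples.var(), (b - a) ** 2 / 12)  # both near 5.33
```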
Normal (Gaussian) Random Variables
Bell-shaped symmetry: defined by mean $\mu$ (center) and standard deviation $\sigma$ (spread), with PDF $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/(2\sigma^2)}$
Central Limit Theorem connection: standardized sums of independent random variables converge to a normal distribution, explaining its ubiquity in nature
68-95-99.7 rule: approximately 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean
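The rule falls straight out of the normal CDF; a minimal check, assuming scipy is installed:

```python
from scipy.stats import norm

# Probability within k standard deviations of the mean, for k = 1, 2, 3
for k in (1, 2, 3):
    print(k, round(norm.cdf(k) - norm.cdf(-k), 4))  # 0.6827, 0.9545, 0.9973
```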
Exponential Random Variables
Waiting time until an event: models time between Poisson events, with rate parameter $\lambda$ and PDF $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$
Memoryless property is unique among continuous distributions; $P(X > s+t \mid X > s) = P(X > t)$
Mean $1/\lambda$ and variance $1/\lambda^2$; higher rate means shorter expected waiting time
Compare: Exponential vs. Poisson. These are two sides of the same coin: Poisson counts events in an interval; exponential measures time between events. The same parameter $\lambda$ links them: if arrivals are Poisson with rate $\lambda$, inter-arrival times are exponential with rate $\lambda$.
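A simulation sketch of that link, assuming numpy is available (the rate and sample size are arbitrary): build arrival times from exponential gaps, then verify that per-unit-interval counts show the Poisson signature of mean and variance both equal to $\lambda$:

```python
import numpy as np

rng = np.random.default_rng(seed=2)
lam = 3.0  # arrival rate: 3 events per unit time

# Exponential inter-arrival gaps with mean 1/lam, accumulated into timestamps
gaps = rng.exponential(scale=1 / lam, size=200_000)
arrival_times = np.cumsum(gaps)

# Count arrivals in each unit-length interval
horizon = int(arrival_times[-1])
counts, _ = np.histogram(arrival_times, bins=horizon, range=(0, horizon))

print(counts.mean(), counts.var())  # both near lam = 3.0
```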
Compare: Normal vs. Uniform. Both are continuous, but the normal concentrates probability near the mean while the uniform spreads it evenly. Normal arises from aggregation (CLT); uniform represents maximum ignorance within bounds.
Quick Reference Table
Concept | Best Examples
Binary outcomes | Bernoulli
Counting successes (fixed trials) | Binomial, Hypergeometric
Counting events (fixed interval) | Poisson
Trials until success | Geometric
Waiting/duration times | Exponential
Equal likelihood | Uniform (discrete or continuous)
Natural phenomena, aggregation | Normal (Gaussian)
Memoryless property | Exponential (continuous), Geometric (discrete)
Self-Check Questions
Which two distributions share the memoryless property, and what distinguishes them from each other?
You're modeling the number of defective items in a batch of 20 drawn from a shipment of 100. Should you use binomial or hypergeometric, and why?
Compare and contrast the Poisson and exponential distributions: what real-world scenario would use both, and how are their parameters related?
A student claims that P(X=2.5)=0.3 for a continuous random variable. What's wrong with this statement, and how should probabilities for continuous variables be expressed?
If you sum 50 independent Bernoulli random variables (each with p=0.4), what distribution describes the result? What distribution would the sum approximate if you instead summed 1000 such variables and standardized the result?