Random variables are the mathematical bridge between real-world uncertainty and rigorous probability theory. When you're analyzing everything from quantum mechanics to financial markets, you're choosing which type of random variable best captures the underlying randomness. This topic connects directly to probability distributions, expected value calculations, and statistical inference—concepts that appear throughout your coursework and exams.
You're being tested on more than definitions here. Examiners want to see that you understand when to apply each distribution, what parameters define it, and how different random variables relate to one another. Can you recognize that a binomial is just repeated Bernoulli trials? Do you know why exponential and Poisson distributions are mathematically linked? Don't just memorize formulas—know what real-world scenario each random variable models and what makes it the right tool for that job.
Discrete vs. Continuous: The Fundamental Split
Before diving into specific distributions, you need to internalize the core distinction: discrete random variables count, continuous random variables measure. This determines everything from how we calculate probabilities to what functions describe them.
Discrete Random Variables
Countable outcomes—these variables take on specific, separated values you could list (even if that list is infinite)
Probability mass function (PMF) assigns exact probabilities to each value; P(X=x) makes sense and can be nonzero
Summation is used for expected value: E[X] = ∑ₓ x·P(X=x)
Continuous Random Variables
Uncountably infinite outcomes—values fill an entire interval with no gaps between possible results
Probability density function (PDF) describes relative likelihood; P(X=x)=0 for any specific value, so we integrate over intervals
Compare: Discrete vs. Continuous—both use functions to describe probability, but PMFs give point probabilities while PDFs require integration over intervals. If an FRQ asks you to find P(X=5) for a continuous variable, the answer is always zero.
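The point-probability vs. interval distinction can be checked directly. A minimal Python sketch (the binomial and uniform parameters here are illustrative choices, not from the text):

```python
from math import comb

# Discrete: a binomial PMF assigns a nonzero probability to a single point.
def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

print(binom_pmf(5, 10, 0.5))  # P(X=5) for Binomial(10, 0.5): about 0.246

# Continuous: Uniform(0, 10). P(X=5) is exactly 0; only intervals carry
# probability, obtained by integrating the PDF f(x) = 1/(b-a).
def unif_interval_prob(lo, hi, a=0.0, b=10.0):
    return (min(hi, b) - max(lo, a)) / (b - a)

print(unif_interval_prob(4.9, 5.1))  # P(4.9 <= X <= 5.1) = 0.02
```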
Binary and Count-Based Discrete Distributions
These distributions model scenarios where you're counting successes, events, or trials. The key is identifying what's being counted and under what conditions.
Bernoulli Random Variables
Single trial with two outcomes—the simplest random variable, taking value 1 (success) with probability p or 0 (failure) with probability 1−p
Building block for more complex distributions; binomial, geometric, and negative binomial all derive from repeated Bernoulli trials
Mean and variance are E[X]=p and Var(X)=p(1−p), both determined by the single parameter p
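A quick simulation can confirm these moments; the sketch below uses an arbitrary p, seed, and sample size:

```python
import random

random.seed(0)
p = 0.3
# Draw 100,000 Bernoulli(p) samples: 1 with probability p, else 0.
samples = [1 if random.random() < p else 0 for _ in range(100_000)]
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
print(mean, var)  # close to p = 0.3 and p(1-p) = 0.21
```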
Binomial Random Variables
Fixed number of independent trials—counts successes in n Bernoulli trials, each with success probability p
PMF formula: P(X=k) = C(n, k) p^k (1−p)^(n−k) combines counting (how many ways) with probability (how likely)
Mean np and variance np(1−p) scale linearly with trial count, making this ideal for sampling with replacement
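Both moments can be verified directly from the PMF using Python's `math.comb`; the parameter values below are illustrative:

```python
from math import comb

n, p = 10, 0.4
# Full PMF of Binomial(n, p), one probability per k = 0..n.
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
print(mean, var)  # n*p = 4.0 and n*p*(1-p) = 2.4
```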
Geometric Random Variables
Trials until first success—counts the number of Bernoulli trials up to and including the first success
Uniquely memoryless among discrete distributions; past failures don't affect future success probability
PMF: P(X=k) = (1−p)^(k−1)·p, with mean 1/p—higher success probability means fewer expected trials
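The mean 1/p can be checked by truncating the infinite sum ∑ k·(1−p)^(k−1)·p, which converges quickly; a sketch with an illustrative p:

```python
p = 0.25
# Truncate the series far enough that the geometric tail is negligible.
mean = sum(k * (1 - p) ** (k - 1) * p for k in range(1, 2000))
print(mean)  # converges to 1/p = 4.0
```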
Poisson Random Variables
Events in fixed intervals—models count of occurrences when events happen at a constant average rate λ
Single parameter λ serves as both mean and variance; useful approximation for binomial when n is large and p is small
Independence assumption—events in non-overlapping intervals are independent, making this ideal for rare event modeling
Compare: Binomial vs. Poisson—both count discrete events, but binomial has a fixed trial count while Poisson models events in continuous time/space. Use Poisson when n→∞ and p→0 with np=λ held constant.
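The limiting relationship can be seen numerically by comparing the two PMFs; the values of n and p below are illustrative:

```python
from math import comb, exp, factorial

n, p = 1000, 0.005           # large n, small p
lam = n * p                  # λ = np = 5 held constant

for k in range(4):
    binom = comb(n, k) * p**k * (1 - p)**(n - k)
    pois = exp(-lam) * lam**k / factorial(k)
    print(k, round(binom, 5), round(pois, 5))  # the two columns nearly agree
```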
Hypergeometric Random Variables
Sampling without replacement—counts successes when drawing n items from a population of N containing K successes
Three parameters (N, K, n) capture population structure; probabilities change with each draw
Approaches binomial when population size N is much larger than sample size n, since replacement effects become negligible
Compare: Binomial vs. Hypergeometric—both count successes in samples, but binomial assumes independence (replacement) while hypergeometric accounts for changing probabilities (no replacement). Quality control with small lots? Hypergeometric. Large population surveys? Binomial approximation works.
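The replacement effect can be made concrete by comparing the two PMFs for a moderately small population; the N, K, n values below are illustrative:

```python
from math import comb

N, K, n = 100, 10, 20  # population, successes in population, draws

def hyper_pmf(k):
    # Successes chosen from K, failures from N-K, out of all size-n draws.
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binom_pmf(k, p=K / N):
    # Approximation that pretends each draw is independent with p = K/N.
    return comb(n, k) * p**k * (1 - p)**(n - k)

for k in range(4):
    print(k, round(hyper_pmf(k), 4), round(binom_pmf(k), 4))
# Close but not equal: N is only 5x the sample size, so replacement matters.
```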
Continuous Distributions for Measurement and Time
These distributions model quantities that can take any value in a range. Focus on what each distribution's shape tells you about the underlying phenomenon.
Uniform Random Variables
Equal likelihood across an interval—every value between a and b is equally probable, with PDF f(x) = 1/(b−a) for a ≤ x ≤ b
Maximum entropy distribution when you only know the range; represents complete uncertainty within bounds
Mean (a+b)/2 and variance (b−a)²/12 depend only on the interval endpoints
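Both formulas can be checked by numerically integrating against the uniform PDF; a midpoint-rule sketch with illustrative endpoints:

```python
a, b = 2.0, 8.0
N = 100_000                      # midpoint-rule grid resolution
f = 1 / (b - a)                  # uniform PDF, constant on [a, b]
dx = (b - a) / N
xs = [a + (i + 0.5) * dx for i in range(N)]
mean = sum(x * f * dx for x in xs)
var = sum((x - mean) ** 2 * f * dx for x in xs)
print(mean, var)  # (a+b)/2 = 5.0 and (b-a)^2/12 = 3.0
```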
Normal (Gaussian) Random Variables
Bell-shaped symmetry—defined by mean μ (center) and standard deviation σ (spread), with PDF f(x) = (1/(σ√(2π)))·e^(−(x−μ)²/(2σ²))
Central Limit Theorem connection—sums of independent random variables converge to normal, explaining its ubiquity in nature
68-95-99.7 rule—approximately 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean
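These percentages follow from the normal CDF, which for any normal distribution gives P(|X−μ| ≤ kσ) = erf(k/√2); a quick check using the standard library:

```python
from math import erf, sqrt

# Probability that a normal variable lands within k standard deviations.
coverage = {k: erf(k / sqrt(2)) for k in (1, 2, 3)}
print(coverage)  # approximately 0.6827, 0.9545, 0.9973
```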
Exponential Random Variables
Waiting time until an event—models time between Poisson events, with rate parameter λ and PDF f(x) = λe^(−λx) for x ≥ 0
Memoryless property is unique among continuous distributions; P(X>s+t∣X>s)=P(X>t)
Mean 1/λ and variance 1/λ²—higher rate means shorter expected waiting time
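The memoryless property can be verified from the survival function P(X > x) = e^(−λx); a sketch with illustrative λ, s, and t:

```python
from math import exp

lam = 0.5
S = lambda x: exp(-lam * x)   # survival function P(X > x)
s, t = 3.0, 2.0
cond = S(s + t) / S(s)        # conditional P(X > s+t | X > s)
print(cond, S(t))             # both equal e^(-lam*t): having waited s changes nothing
```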
Compare: Exponential vs. Poisson—these are two sides of the same coin. Poisson counts events in an interval; exponential measures time between events. Same parameter λ links them: if arrivals are Poisson with rate λ, inter-arrival times are exponential with rate λ.
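The link can be demonstrated by simulation: generate exponential inter-arrival times via inverse-transform sampling (X = −ln(U)/λ) and count arrivals per unit window; the seed, rate, and window count below are arbitrary choices:

```python
import random
from math import log

random.seed(1)
lam = 2.0          # arrival rate
T = 50_000         # number of unit-length observation windows

# Accumulate exponential gaps and tally how many arrivals land in each window.
t, counts = 0.0, [0] * T
while True:
    t += -log(random.random()) / lam
    if t >= T:
        break
    counts[int(t)] += 1

mean_count = sum(counts) / T
print(mean_count)  # close to lam = 2.0: Poisson counts emerge from exponential gaps
```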
Compare: Normal vs. Uniform—both are continuous, but normal concentrates probability near the mean while uniform spreads it evenly. Normal arises from aggregation (CLT); uniform represents maximum ignorance within bounds.
Quick Reference Table
| Concept | Best Examples |
| --- | --- |
| Binary outcomes | Bernoulli |
| Counting successes (fixed trials) | Binomial, Hypergeometric |
| Counting events (fixed interval) | Poisson |
| Trials until success | Geometric |
| Waiting/duration times | Exponential |
| Equal likelihood | Uniform (discrete or continuous) |
| Natural phenomena, aggregation | Normal (Gaussian) |
| Memoryless property | Exponential (continuous), Geometric (discrete) |
Self-Check Questions
Which two distributions share the memoryless property, and what distinguishes them from each other?
You're modeling the number of defective items in a batch of 20 drawn from a shipment of 100. Should you use binomial or hypergeometric, and why?
Compare and contrast the Poisson and exponential distributions: what real-world scenario would use both, and how are their parameters related?
A student claims that P(X=2.5)=0.3 for a continuous random variable. What's wrong with this statement, and how should probabilities for continuous variables be expressed?
If you sum 50 independent Bernoulli random variables (each with p=0.4), what distribution describes the result? What distribution would the sum approximate if you instead summed 1000 such variables and standardized the result?