📈Theoretical Statistics Unit 1 – Probability Theory Basics
Probability theory forms the foundation of statistical analysis, providing tools to quantify uncertainty and make predictions. This unit covers key concepts like sample spaces, events, and probability axioms, as well as techniques for calculating probabilities and working with random variables.
Students learn about various probability distributions, including discrete and continuous types, and their applications. The unit also explores important concepts like expectation, variance, and covariance, which are essential for understanding and analyzing random phenomena in real-world situations.
Random variable (X) is a function that assigns a real number to each outcome in a sample space
Discrete random variable has a countable number of possible values (number of heads in 10 coin flips)
Continuous random variable has an uncountable number of possible values within a range (time taken to complete a task)
Probability mass function (PMF) for a discrete random variable X, denoted by pX(x), gives the probability of X taking on a specific value x
Properties: pX(x)≥0 for all x, and ∑xpX(x)=1
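As a minimal sketch, both PMF properties can be checked directly for a fair six-sided die (the die is just an illustrative choice of discrete random variable):

```python
# PMF of a fair six-sided die: p_X(x) = 1/6 for x in {1, ..., 6}.
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Property 1: p_X(x) >= 0 for all x.
assert all(p >= 0 for p in pmf.values())

# Property 2: the probabilities sum to exactly 1.
total = sum(pmf.values())
assert total == 1
```

Using `Fraction` keeps the arithmetic exact, so the sum-to-one check holds with equality rather than up to floating-point tolerance.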
Cumulative distribution function (CDF) for a random variable X, denoted by FX(x), gives the probability of X being less than or equal to a specific value x
Formula: FX(x)=P(X≤x)
Properties: 0≤FX(x)≤1, FX(x)→0 as x→−∞, FX(x)→1 as x→∞, and FX(x) is non-decreasing
Probability density function (PDF) for a continuous random variable X, denoted by fX(x), is used to calculate probabilities for ranges of values
Properties: fX(x)≥0 for all x, and ∫−∞∞fX(x)dx=1
Relationship with CDF: FX(x)=∫−∞xfX(t)dt
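The CDF-as-integral relationship can be verified numerically. The sketch below integrates an exponential PDF with a midpoint Riemann sum and compares it to the known closed-form CDF 1 − e^(−λx); the rate λ = 2 is chosen only for illustration:

```python
import math

lam = 2.0  # rate parameter, assumed for illustration

def pdf(t):
    # Exponential PDF: lam * exp(-lam * t) for t >= 0, else 0.
    return lam * math.exp(-lam * t) if t >= 0 else 0.0

def cdf_numeric(x, n=100_000):
    # Midpoint Riemann sum of the PDF from 0 to x (the PDF is 0 below 0).
    h = x / n
    return sum(pdf((k + 0.5) * h) for k in range(n)) * h

x = 1.5
closed_form = 1 - math.exp(-lam * x)  # known exponential CDF
assert abs(cdf_numeric(x) - closed_form) < 1e-6
```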
Expectation and Variance
Expectation (mean) of a discrete random variable X, denoted by E[X] or μX, is the weighted average of all possible values
Formula: E[X]=∑xx⋅pX(x)
Expectation of a continuous random variable X is calculated using the PDF
Formula: E[X]=∫−∞∞x⋅fX(x)dx
Linearity of expectation for random variables X and Y and constants a and b: E[aX+bY]=aE[X]+bE[Y]
Variance of a random variable X, denoted by Var(X) or σX2, measures the average squared deviation from the mean
Formula for discrete X: Var(X)=E[(X−μX)2]=∑x(x−μX)2⋅pX(x)
Formula for continuous X: Var(X)=∫−∞∞(x−μX)2⋅fX(x)dx
Standard deviation σX is the square root of the variance
Properties of variance: Var(aX+b)=a2Var(X) for constants a and b, and Var(X+Y)=Var(X)+Var(Y) for independent random variables X and Y
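The first variance property can be confirmed exactly with a small discrete example. The sketch below computes Var(X) for a fair die and checks Var(aX + b) = a²Var(X) for arbitrary illustrative constants:

```python
from fractions import Fraction

# X = outcome of a fair six-sided die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def mean(pmf):
    # E[X] = sum of x * p_X(x)
    return sum(x * p for x, p in pmf.items())

def var(pmf):
    # Var(X) = E[(X - mu)^2]
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

vx = var(pmf)                       # Var(X) = 35/12 for a fair die
a, b = 3, 7                         # arbitrary constants for illustration
pmf_ax_b = {a * x + b: p for x, p in pmf.items()}
assert var(pmf_ax_b) == a**2 * vx   # Var(aX + b) = a^2 Var(X)
```

The shift b drops out entirely, matching the property: adding a constant moves the mean but not the spread.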
Covariance Cov(X,Y) measures the linear relationship between two random variables X and Y
Formula: Cov(X,Y)=E[(X−μX)(Y−μY)]
Correlation coefficient ρX,Y standardizes covariance to be between -1 and 1
Formula: ρX,Y = Cov(X,Y) / (σX σY)
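As a sketch, covariance and correlation can be computed directly from a small joint PMF; the four probabilities below are made-up values for illustration:

```python
import math

# Joint PMF of (X, Y) on four points (illustrative probabilities).
joint = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.4}

def marginal_mean(joint, idx):
    # Mean of the idx-th coordinate under the joint PMF.
    return sum(xy[idx] * p for xy, p in joint.items())

mx, my = marginal_mean(joint, 0), marginal_mean(joint, 1)

# Cov(X, Y) = E[(X - mu_X)(Y - mu_Y)]
cov = sum((x - mx) * (y - my) * p for (x, y), p in joint.items())
var_x = sum((x - mx) ** 2 * p for (x, y), p in joint.items())
var_y = sum((y - my) ** 2 * p for (x, y), p in joint.items())

# rho = Cov(X, Y) / (sigma_X * sigma_Y), always in [-1, 1].
rho = cov / math.sqrt(var_x * var_y)
assert -1 <= rho <= 1
```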
Common Probability Distributions
Bernoulli distribution models a single trial with two possible outcomes (success with probability p, failure with probability 1-p)
PMF: pX(x) = p^x (1−p)^(1−x) for x∈{0,1}
Mean: E[X]=p, Variance: Var(X)=p(1−p)
Binomial distribution models the number of successes in a fixed number of independent Bernoulli trials
PMF: pX(x) = (n choose x) p^x (1−p)^(n−x) for x∈{0,1,…,n}
Mean: E[X]=np, Variance: Var(X)=np(1−p)
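The binomial PMF is straightforward to implement with the standard-library binomial coefficient, and the mean formula E[X] = np can be checked by direct summation (n = 10, p = 0.3 are illustrative):

```python
import math

def binom_pmf(x, n, p):
    # P(X = x) = C(n, x) * p^x * (1 - p)^(n - x)
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.3  # illustrative parameters
probs = [binom_pmf(x, n, p) for x in range(n + 1)]

assert abs(sum(probs) - 1) < 1e-12            # PMF sums to 1
mean = sum(x * q for x, q in zip(range(n + 1), probs))
assert abs(mean - n * p) < 1e-12              # E[X] = np
```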
Poisson distribution models the number of rare events occurring in a fixed interval of time or space
PMF: pX(x) = e^(−λ) λ^x / x! for x∈{0,1,2,…}
Mean: E[X]=λ, Variance: Var(X)=λ
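The Poisson identity mean = variance = λ can be verified numerically by summing a truncated version of the infinite support (λ = 4 is an illustrative choice; the tail beyond x = 100 is negligible there):

```python
import math

lam = 4.0  # illustrative rate

def pois_pmf(x):
    # P(X = x) = e^(-lam) * lam^x / x!
    return math.exp(-lam) * lam**x / math.factorial(x)

xs = range(101)  # truncated support; tail mass beyond 100 is negligible
assert abs(sum(pois_pmf(x) for x in xs) - 1) < 1e-12

mean = sum(x * pois_pmf(x) for x in xs)
var = sum((x - mean) ** 2 * pois_pmf(x) for x in xs)
assert abs(mean - lam) < 1e-9 and abs(var - lam) < 1e-9
```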
Uniform distribution models a random variable with constant probability density over a specified range
PDF (continuous): fX(x) = 1/(b−a) for x∈[a,b]
Mean: E[X] = (a+b)/2, Variance: Var(X) = (b−a)^2/12
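The uniform mean (a+b)/2 follows from integrating x·fX(x) over [a, b]; a quick numerical check with illustrative endpoints:

```python
# Continuous uniform on [a, b]; endpoints chosen only for illustration.
a, b = 2.0, 8.0
density = 1.0 / (b - a)   # constant PDF: f_X(x) = 1/(b - a)

# Midpoint Riemann sum of x * f_X(x) over [a, b] recovers E[X] = (a + b)/2.
n = 100_000
h = (b - a) / n
e_x = sum((a + (k + 0.5) * h) * density for k in range(n)) * h
assert abs(e_x - (a + b) / 2) < 1e-9
```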
Normal (Gaussian) distribution models many natural phenomena and has a bell-shaped PDF
PDF: fX(x) = (1/(σ√(2π))) e^(−(x−μ)^2/(2σ^2)) for x∈(−∞,∞)
Mean: E[X]=μ, Variance: Var(X)=σ2
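A useful sanity check on the normal PDF is that it integrates to 1. The sketch below does this numerically over ±10σ around μ (the parameters are illustrative; the mass outside that window is negligible):

```python
import math

mu, sigma = 1.0, 2.0  # illustrative parameters

def norm_pdf(x):
    # f_X(x) = (1 / (sigma * sqrt(2*pi))) * exp(-(x - mu)^2 / (2 * sigma^2))
    return math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

# Midpoint Riemann sum over [mu - 10*sigma, mu + 10*sigma].
lo, hi, n = mu - 10 * sigma, mu + 10 * sigma, 200_000
h = (hi - lo) / n
area = sum(norm_pdf(lo + (k + 0.5) * h) for k in range(n)) * h
assert abs(area - 1) < 1e-9
```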
Exponential distribution models the time between rare events in a Poisson process
PDF: fX(x) = λe^(−λx) for x≥0
Mean: E[X] = 1/λ, Variance: Var(X) = 1/λ^2
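The exponential mean 1/λ and variance 1/λ² can be checked by simulation. `random.expovariate` takes the rate λ directly; the rate, seed, and sample size below are illustrative choices, and the tolerances are loose enough that sampling noise will not trip the checks:

```python
import random
import statistics

lam = 2.0          # illustrative rate
random.seed(0)     # fixed seed for reproducibility

# random.expovariate(lam) draws from the exponential distribution with rate lam.
samples = [random.expovariate(lam) for _ in range(100_000)]

sample_mean = statistics.fmean(samples)
sample_var = statistics.variance(samples)
assert abs(sample_mean - 1 / lam) < 0.02    # E[X] = 1/λ = 0.5
assert abs(sample_var - 1 / lam**2) < 0.02  # Var(X) = 1/λ² = 0.25
```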
Applications and Problem-Solving Techniques
Identify the sample space and events relevant to the problem
Determine the type of probability distribution that best models the situation (discrete or continuous, specific distribution)
Use the given information to find the parameters of the distribution (success probability, mean, variance)
Apply the appropriate probability rules and formulas to calculate the desired probabilities or values
Example: using the binomial PMF to find the probability of a specific number of successes in a fixed number of trials
Utilize conditional probability and Bayes' theorem when dealing with dependent events or updating probabilities based on new information
Recognize when to use the law of total probability to break down a complex problem into simpler subproblems
Apply the properties of expectation and variance to solve problems involving random variables
Example: using linearity of expectation to find the mean of a sum of random variables
Interpret the results in the context of the original problem and communicate the findings clearly
Verify the reasonableness of the solution by checking if the probabilities are within the valid range [0, 1] and if the results make sense intuitively
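The linearity-of-expectation technique mentioned above (finding the mean of a sum of random variables) can be checked exactly for two fair dice, where the brute-force mean over all 36 outcomes must agree with E[X] + E[Y]:

```python
from itertools import product

# Mean of one fair six-sided die.
e_single = sum(range(1, 7)) / 6    # 3.5

# E[X + Y] computed directly over the 36 equally likely outcomes.
e_sum = sum(x + y for x, y in product(range(1, 7), repeat=2)) / 36

# Linearity: E[X + Y] = E[X] + E[Y]; note independence is NOT required.
assert e_sum == e_single + e_single    # 7.0
```

The same shortcut scales to sums of many variables (e.g., the expected number of heads in n flips is n·p) without ever enumerating the joint outcome space.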