🧰 Engineering Applications of Statistics Unit 3 – Random Variables & Probability Models
Random variables and probability models form the backbone of statistical analysis in engineering. These concepts allow engineers to quantify uncertainty and variability in systems, from material properties to process outcomes. By assigning numerical values to random events, we can predict and analyze complex phenomena.
Probability distributions, expected values, and variance provide powerful tools for modeling real-world scenarios. Engineers use these concepts to assess risks, optimize designs, and make data-driven decisions. Understanding joint distributions and transformations of random variables enables more sophisticated analysis of interrelated systems and processes.
Random variables assign numerical values to outcomes of random experiments
Probability distributions describe the likelihood of different values occurring for a random variable
Expected value represents the average value of a random variable over many trials
Variance measures the spread or dispersion of a random variable around its expected value
Calculated by taking the average of the squared differences between each value and the mean
Joint probability distributions describe the probability of two or more random variables occurring together
Transformations of random variables involve applying functions to change their properties or create new random variables
Random variables and probability distributions have numerous applications in engineering for modeling uncertainty and variability
Types of Random Variables
Discrete random variables have countable values (integers, finite set of numbers)
Examples include the number of defective items in a batch or the number of customers arriving in an hour
Continuous random variables can take on any value within a specified range
Often represented by real numbers on a continuous scale (weight, temperature, time)
Mixed random variables have both discrete and continuous components
Bernoulli random variables have only two possible outcomes (success or failure, 0 or 1)
Used to model binary events like a coin flip or a component functioning or failing
Binomial random variables count the number of successes in a fixed number of independent Bernoulli trials
Poisson random variables model the number of events occurring in a fixed interval of time or space
Useful for modeling rare events (number of defects per unit area, customer arrivals per hour)
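The sketch below (Python with NumPy; all parameter values are illustrative, not from the text) draws samples from each of the variable types above, with `rng.binomial(n=1, p)` serving as a Bernoulli draw:

```python
import numpy as np

rng = np.random.default_rng(42)

# Bernoulli: a single success/failure trial (component works with probability p)
component_works = rng.binomial(n=1, p=0.9)       # returns 0 or 1

# Binomial: number of defective items among n independent items
defects_in_batch = rng.binomial(n=100, p=0.02)

# Poisson: number of events in a fixed interval at mean rate lam
arrivals_per_hour = rng.poisson(lam=3.5)

# Continuous: a measurement on a continuous scale (e.g., fill volume in mL)
fill_volume = rng.normal(loc=500.0, scale=10.0)

print(component_works, defects_in_batch, arrivals_per_hour, fill_volume)
```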
Probability Distributions
Probability mass functions (PMFs) define the probability of each possible value for discrete random variables
The sum of all probabilities in a PMF must equal 1
Probability density functions (PDFs) describe the relative likelihood of different values for continuous random variables
The area under the PDF curve between two values represents the probability of the variable falling in that range
Cumulative distribution functions (CDFs) give the probability that a random variable is less than or equal to a certain value
Common discrete distributions include Bernoulli, binomial, Poisson, and geometric
Common continuous distributions include uniform, normal (Gaussian), exponential, and beta
The parameters of a distribution (mean, variance, etc.) determine its shape and properties
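A minimal sketch (assuming SciPy is available) of the PMF/PDF/CDF relationships above: the PMF sums to 1, and the area under a PDF between two points equals a difference of CDF values.

```python
from scipy import stats

# Discrete: binomial PMF values sum to 1
n, p = 10, 0.3
print(sum(stats.binom.pmf(k, n, p) for k in range(n + 1)))  # ~1.0

# CDF of a discrete variable: P(X <= 3)
print(stats.binom.cdf(3, n, p))

# Continuous: P(a < X <= b) is the area under the PDF, i.e. a CDF difference
a, b = -1.0, 1.0
print(stats.norm.cdf(b) - stats.norm.cdf(a))  # ~0.6827 for a standard normal
```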
Expected Value and Variance
The expected value (mean) of a discrete random variable X is calculated as $E[X] = \sum_{x} x \cdot P(X = x)$
Multiply each possible value by its probability and sum the results
For a continuous random variable, the expected value is $E[X] = \int_{-\infty}^{\infty} x \cdot f(x)\,dx$
The variance of a random variable X is $\text{Var}(X) = E[(X - E[X])^2] = E[X^2] - (E[X])^2$
Measures how far the values are from the mean on average
The standard deviation is the square root of the variance: $\sigma = \sqrt{\text{Var}(X)}$
Chebyshev's inequality bounds the probability of a random variable deviating from its mean by a certain number of standard deviations
For any distribution, at least $1 - \frac{1}{k^2}$ of the values lie within $k$ standard deviations of the mean
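A worked check of these formulas for a fair six-sided die (this also previews the first practice problem below); the Chebyshev comparison uses $k = 2$ as an illustrative choice:

```python
import numpy as np

values = np.arange(1, 7)          # faces of a fair die
probs = np.full(6, 1 / 6)         # uniform PMF

mean = np.sum(values * probs)                 # E[X] = sum of x * P(X = x)
var = np.sum(values**2 * probs) - mean**2     # E[X^2] - (E[X])^2
std = np.sqrt(var)
print(mean, var, std)                         # 3.5, ~2.917, ~1.708

# Chebyshev: at least 1 - 1/k^2 of the probability mass lies
# within k standard deviations of the mean
k = 2
mass_within = probs[np.abs(values - mean) < k * std].sum()
print(mass_within, ">=", 1 - 1 / k**2)        # 1.0 >= 0.75
```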
Joint Probability Distributions
Joint probability mass functions (JPMFs) give the probability of two or more discrete random variables occurring together
P(X=x,Y=y) is the probability that X takes on value x and Y takes on value y simultaneously
Joint probability density functions (JPDFs) describe the relative likelihood of different combinations of values for continuous random variables
Marginal distributions are obtained by summing (discrete) or integrating (continuous) the joint distribution over the other variables
Conditional distributions describe the probabilities for one variable given the value of another
For discrete variables: $P(Y = y \mid X = x) = \dfrac{P(X = x, Y = y)}{P(X = x)}$
Independence means the probability of one variable does not depend on the value of the other
For independent variables, the joint distribution is the product of the marginal distributions
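The sketch below builds these operations from a small, made-up joint PMF table: marginals by summing, a conditional by dividing, and an independence check via the product rule.

```python
import numpy as np

# joint[i, j] = P(X = x_i, Y = y_j); the numbers are hypothetical
joint = np.array([[0.10, 0.20],
                  [0.30, 0.40]])

marginal_x = joint.sum(axis=1)    # sum over y for each x
marginal_y = joint.sum(axis=0)    # sum over x for each y

# Conditional distribution of Y given X = x_0
cond_y_given_x0 = joint[0, :] / marginal_x[0]

# Independent iff the joint equals the outer product of the marginals
independent = np.allclose(joint, np.outer(marginal_x, marginal_y))
print(marginal_x, marginal_y, cond_y_given_x0, independent)
```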
Transformations of Random Variables
Linear transformations involve multiplying a random variable by a constant and/or adding a constant
If $Y = aX + b$, then $E[Y] = aE[X] + b$ and $\text{Var}(Y) = a^2\,\text{Var}(X)$
Non-linear transformations apply a function to a random variable to create a new one
The PDF of the transformed variable is related to the original PDF through the Jacobian of the transformation: for a monotone $Y = g(X)$, $f_Y(y) = f_X(g^{-1}(y))\left|\frac{d}{dy}g^{-1}(y)\right|$
Convolution is used to find the distribution of the sum of two independent random variables
The PDF of the sum is the convolution of the individual PDFs
The Central Limit Theorem states that the sum of many independent, identically distributed random variables with finite variance is approximately normally distributed
Useful for approximating complex distributions with the normal distribution
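A short simulation sketch (parameters chosen only for illustration) of the two facts above: the linear-transformation rules for mean and variance, and the CLT making sums of a skewed variable look normal.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=100_000)   # a decidedly non-normal X

# Linear transformation Y = aX + b
a, b = 3.0, 5.0
y = a * x + b
print(y.mean(), a * x.mean() + b)   # E[Y] vs. a*E[X] + b: agree closely
print(y.var(), a**2 * x.var())      # Var(Y) vs. a^2 Var(X): agree closely

# CLT: standardized sums of 30 independent exponentials look ~normal
sums = rng.exponential(scale=2.0, size=(100_000, 30)).sum(axis=1)
z = (sums - sums.mean()) / sums.std()
print(np.mean(np.abs(z) < 1))       # ~0.68, matching a standard normal
```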
Applications in Engineering
Modeling the strength of materials as random variables to assess reliability and failure probabilities
Analyzing measurement errors and uncertainties using probability distributions
Using the Poisson distribution to model the occurrence of rare events like equipment failures or defects
Applying the normal distribution to describe variability in manufacturing processes and quality control
Employing joint distributions to study the relationship between multiple variables (stress and strain, demand and supply)
Transforming random variables to create new distributions that better fit empirical data or simplify calculations
Utilizing the Central Limit Theorem to justify the use of normal-based statistical methods for large sample sizes
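As one concrete instance of the first application, a minimal stress-strength reliability sketch: strength and load are both modeled as independent normals, so their difference is also normal. All parameter values here are assumed for illustration, not taken from any real design.

```python
from math import sqrt
from scipy.stats import norm

mu_s, sd_s = 1800.0, 150.0   # strength (kg), hypothetical
mu_l, sd_l = 1500.0, 100.0   # applied load (kg), hypothetical

# Failure when load exceeds strength; D = strength - load is normal
mu_d = mu_s - mu_l
sd_d = sqrt(sd_s**2 + sd_l**2)
print(norm.cdf(0.0, loc=mu_d, scale=sd_d))   # P(D < 0), ~0.048 here
```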
Practice Problems and Examples
A fair six-sided die is rolled. Let X be the number shown on the top face. Find the PMF, expected value, and variance of X.
The time until failure of a certain component follows an exponential distribution with a mean of 10 hours. What is the probability that the component lasts more than 15 hours?
Two fair coins are flipped. Let X be the number of heads observed. Find the PMF of X.
Now let Y be the number of tails observed. Find the joint PMF of X and Y.
The weights of apples harvested from a tree follow a normal distribution with a mean of 150 grams and a standard deviation of 20 grams. What proportion of apples weigh between 120 and 170 grams?
A machine fills bottles with a liquid. The volume of liquid in each bottle follows a normal distribution with a mean of 500 mL and a standard deviation of 10 mL. Find the probability that the total volume in a random sample of 50 bottles exceeds 25,500 mL.
A company produces steel cables whose breaking strengths are normally distributed with a mean of 1800 kg and a standard deviation of 150 kg. If the cables are designed to withstand a load of 1500 kg, what percentage of cables will fail under this load?
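Numeric answer sketches for the distribution-based problems above (assuming SciPy; the die and coin problems follow directly from the PMF definitions and are quick to do by hand):

```python
from scipy.stats import expon, norm

# Component lifetime ~ exponential, mean 10 h: P(T > 15) = exp(-15/10)
print(expon.sf(15, scale=10))                             # ~0.2231

# Apple weights ~ N(150, 20): P(120 < W < 170)
print(norm.cdf(170, 150, 20) - norm.cdf(120, 150, 20))    # ~0.7745

# 50 bottles, each ~ N(500, 10): total ~ N(25000, 10 * sqrt(50))
print(norm.sf(25_500, loc=25_000, scale=10 * 50**0.5))    # ~7.7e-13

# Cable strength ~ N(1800, 150): fraction failing below a 1500 kg load
print(norm.cdf(1500, 1800, 150))                          # ~0.0228, about 2.3%
```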