4.2 Mean or Expected Value and Standard Deviation

3 min read • June 27, 2024

When dealing with random variables, expected value and standard deviation are key concepts. They describe the average outcome and the spread of possible values, both of which matter when making predictions and analyzing data.

These measures are especially important for discrete variables, like dice rolls or coin flips. By calculating expected value and standard deviation, we can better grasp the behavior of random events and make informed decisions based on probability.

Expected Value and Standard Deviation

Expected value of discrete variables

  • $E(X)$ or $\mu$ represents the average value of a random variable $X$ over many trials
  • Calculated by multiplying each possible value $x$ by its probability $P(X = x)$ and summing the products: $E(X) = \sum_x x \cdot P(X = x)$
  • Example: For a fair six-sided die, $E(X) = 1 \cdot \frac{1}{6} + 2 \cdot \frac{1}{6} + 3 \cdot \frac{1}{6} + 4 \cdot \frac{1}{6} + 5 \cdot \frac{1}{6} + 6 \cdot \frac{1}{6} = 3.5$
  • Provides a central value around which the random variable is distributed
  • Useful for making predictions and comparing different probability distributions
  • The probability mass function describes the probability distribution for discrete random variables
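
The expected value calculation above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `expected_value` and the dictionary representation of the probability mass function are assumptions made for this example, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

def expected_value(pmf):
    """Return E(X), the sum of x * P(X = x) over a probability mass function.

    `pmf` is a dict mapping each possible value x to its probability
    (a hypothetical representation chosen for this sketch).
    """
    if sum(pmf.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in pmf.items())

# Fair six-sided die: each face has probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}
die_mean = expected_value(die_pmf)
print(float(die_mean))  # 3.5
```

Note that 3.5 is not a value the die can actually show; the expected value is a long-run average, not a most likely outcome.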

Law of large numbers interpretation

  • Connects experimental probability (relative frequency) with theoretical probability as the number of trials increases
  • Experimental probability converges to the theoretical probability for a large sample size
  • Example: Flipping a fair coin (theoretical probability of heads = 0.5) many times results in the proportion of heads approaching 0.5
  • Justifies the use of theoretical probability for making predictions in real-world situations
  • Explains why large sample sizes are preferred for accurate estimation of parameters
  • Related to the central limit theorem, which describes the distribution of sample means for large samples
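
A short simulation can make the coin-flip example concrete. The sketch below assumes a fair coin modeled with Python's standard `random` module; the helper name `proportion_of_heads`, the seed, and the trial counts are arbitrary choices for illustration.

```python
import random

def proportion_of_heads(num_flips, rng):
    """Simulate `num_flips` fair coin flips and return the fraction that are heads."""
    heads = sum(rng.random() < 0.5 for _ in range(num_flips))
    return heads / num_flips

rng = random.Random(0)  # fixed seed (arbitrary) so repeated runs give the same numbers
for n in (10, 100, 10_000, 100_000):
    # The printed proportion is the experimental probability for n flips.
    print(n, proportion_of_heads(n, rng))
```

With small `n` the proportion can sit far from 0.5, while the larger runs should typically land much closer, which is the behavior the law of large numbers describes.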

Variance and standard deviation computation

  • $Var(X)$ measures the average squared distance between the random variable $X$ and its mean $\mu$
  • Standard deviation $\sigma$ is the square root of the variance, providing a measure of dispersion in the original units
  • Steps to calculate:
    1. Find the expected value (mean) $\mu$ of the random variable $X$
    2. For each possible value $x$, calculate $(x - \mu)^2$ and multiply by its probability $P(X = x)$
    3. Sum the products to obtain the variance $Var(X)$
    4. Take the square root of the variance to find the standard deviation $\sigma$
  • Example: For a discrete random variable $X$ with $P(X = 1) = 0.2$, $P(X = 2) = 0.5$, and $P(X = 3) = 0.3$, the mean is $\mu = 1 \cdot 0.2 + 2 \cdot 0.5 + 3 \cdot 0.3 = 2.1$, so $Var(X) = (1 - 2.1)^2 \cdot 0.2 + (2 - 2.1)^2 \cdot 0.5 + (3 - 2.1)^2 \cdot 0.3 = 0.49$ and $\sigma = \sqrt{0.49} = 0.7$
  • Higher variance and standard deviation indicate greater variability in the random variable's values
  • The coefficient of variation (CV) is a standardized measure of dispersion, calculated as the ratio of the standard deviation to the mean
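
The four calculation steps can be followed in code. The sketch below applies them to a fair six-sided die; the function name `mean_variance_sd` and the dictionary form of the distribution are assumptions made for this example.

```python
import math

def mean_variance_sd(pmf):
    """Return (mu, variance, sigma) for a discrete distribution.

    `pmf` maps each possible value x to P(X = x).
    """
    # Step 1: expected value (mean).
    mu = sum(x * p for x, p in pmf.items())
    # Steps 2 and 3: probability-weighted squared deviations, summed.
    variance = sum((x - mu) ** 2 * p for x, p in pmf.items())
    # Step 4: square root of the variance.
    sigma = math.sqrt(variance)
    return mu, variance, sigma

die_pmf = {face: 1 / 6 for face in range(1, 7)}
mu, variance, sigma = mean_variance_sd(die_pmf)
cv = sigma / mu  # coefficient of variation: standard deviation relative to the mean
print(round(mu, 4), round(variance, 4), round(sigma, 4), round(cv, 4))
```

For the die, the exact variance is 35/12 (about 2.92), giving a standard deviation of about 1.71, roughly 49% of the mean.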

Population and Sample Statistics

  • Population parameters describe characteristics of entire populations
  • Sample statistics estimate population parameters using data from a subset (sample) of the population
  • The normal distribution is a common probability distribution for continuous random variables, often used to model population distributions
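
To illustrate the difference between parameters and statistics, the sketch below draws a sample from a normally distributed population and compares the sample statistics with the population parameters. The population mean of 100, standard deviation of 15, sample size, and seed are hypothetical values chosen only for this example.

```python
import random
import statistics

POP_MEAN = 100.0  # hypothetical population parameter (mu)
POP_SD = 15.0     # hypothetical population parameter (sigma)

rng = random.Random(42)  # fixed seed (arbitrary) for repeatable output
sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(1_000)]

sample_mean = statistics.mean(sample)  # estimates mu
sample_sd = statistics.stdev(sample)   # estimates sigma, using the n - 1 denominator

print("sample mean:", round(sample_mean, 2))
print("sample sd:  ", round(sample_sd, 2))
```

The sample statistics will generally be close to, but not exactly equal to, the parameters they estimate; a different seed or sample gives slightly different values.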

Key Terms to Review (17)

Central Limit Theorem: The central limit theorem states that the sampling distribution of the sample mean will be approximately normal, regardless of the shape of the population distribution, as the sample size increases. This theorem is a fundamental concept in statistics that underpins many statistical inferences and analyses.
Coefficient of Variation: The coefficient of variation (CV) is a statistical measure that quantifies the relative dispersion or variability of a dataset. It is calculated as the ratio of the standard deviation to the mean, and is often expressed as a percentage. The coefficient of variation provides a standardized way to compare the spread of different datasets, even if they have different units or means.
Discrete Random Variable: A discrete random variable is a variable that can only take on a countable number of distinct values, usually integers. It represents a quantity that is measured or observed in a random experiment, where the outcome can only be one of a set of specific, non-overlapping values.
E(X): E(X), or the expected value of a random variable X, represents the long-term average or mean of the possible values that X can take on. It is a measure of central tendency that describes the typical or central value of the probability distribution of X.
Expected Value: Expected value is a statistical concept that represents the average or central tendency of a probability distribution. It is the sum of the products of each possible outcome and its corresponding probability, and it provides a measure of the typical or expected result of a random experiment or process.
Experimental Probability: Experimental probability is the likelihood of an outcome based on the results of an experiment or observation, rather than theoretical calculations. It is a measure of how often a particular event occurs in a series of trials or repeated experiments.
Law of Large Numbers: The law of large numbers is a fundamental principle in probability theory that states that as the number of independent trials or observations increases, the average of the results will converge towards the expected value or mean of the underlying probability distribution. This law underpins many important statistical concepts and applications.
Normal Distribution: The normal distribution, also known as the Gaussian distribution, is a continuous probability distribution that is symmetrical and bell-shaped. It is a fundamental concept in statistics and probability theory, with widespread applications across various fields, including the topics covered in this course.
Population: In the context of statistics, a population refers to the entire set of individuals, objects, or measurements of interest that a researcher wants to study or draw conclusions about. It represents the complete group that is the focus of the statistical analysis, from which a sample may be drawn for further investigation.
Probability Distribution: A probability distribution is a mathematical function that describes the likelihood or probability of different possible outcomes or values occurring in a given situation or experiment. It is a fundamental concept in the field of statistics and probability that helps quantify and analyze the uncertainty associated with random variables.
Relative Frequency: Relative frequency is a statistical measure that expresses the proportion or percentage of observations in a dataset that fall into a particular category or bin. It is a way to describe the frequency of occurrence of a specific value or event relative to the total number of observations in the dataset.
Sample: A sample is a subset of a larger population that is selected to represent the characteristics of the entire population. It is a crucial concept in statistics, probability, and data analysis, as it allows researchers to draw inferences about the population based on the information gathered from the sample.
Theoretical Probability: Theoretical probability is the likelihood or chance of an event occurring, calculated based on the underlying mathematical model or theory, rather than observed data. It is a fundamental concept in probability theory that provides a systematic way to quantify the expected likelihood of events.
Var(X): Var(X), or the variance of a random variable X, is a measure of the spread or dispersion of the values that X can take on. It quantifies how much the values of X tend to deviate from the expected value or mean of X.
Variance: Variance is a statistical measure that quantifies the amount of variation or dispersion in a dataset. It represents the average squared deviation from the mean, providing a way to understand the spread or distribution of data points around the central tendency.
μ (Mu): μ, or mu, is a Greek letter that represents the population mean or average in statistical analysis. It is a fundamental concept that is crucial in understanding various statistical topics, including measures of central tendency, probability distributions, and hypothesis testing.
σ: σ, or the Greek letter sigma, is a statistical term that represents the standard deviation of a dataset. The standard deviation is a measure of the spread or dispersion of the data points around the mean, and it is a fundamental concept in probability and statistics that is used across a wide range of topics in this course.
© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.