Measures of Central Tendency
Calculation of central tendency measures
Mean is what most people think of as "the average." You add up every value in a data set and divide by how many values there are:
$$\bar{x} = \frac{\sum x}{n}$$
where $\sum x$ is the sum of all values and $n$ is the number of values.
For example, if your test scores are 80, 85, 90, 95, and 100, the mean is $\frac{80 + 85 + 90 + 95 + 100}{5} = \frac{450}{5} = 90$.
One thing to watch out for: the mean is sensitive to outliers. If one score were 20 instead of 80, the mean would drop to 78, even though most scores are still high.
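A minimal Python sketch of the mean calculation, using the test scores from the example. The function name `mean` and the variable names are illustrative choices, not part of the original text.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)

scores = [80, 85, 90, 95, 100]
with_outlier = [20, 85, 90, 95, 100]  # same scores, but 80 replaced by 20

print(mean(scores))        # 90.0
print(mean(with_outlier))  # 78.0
```

A single low score moves the mean by 12 points, which is the outlier sensitivity described above.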
Median is the middle value when you line up all the data from least to greatest.
- If you have an odd number of values, the median is the one right in the center.
- If you have an even number of values, you take the average of the two middle values.
For example, with the data set 3, 7, 9, 12, 15, the median is 9. With the data set 3, 7, 9, 12, the median is $\frac{7 + 9}{2} = 8$.
The median is resistant to outliers, which makes it a better choice for skewed data like income or housing prices.
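The odd and even cases can be sketched in a few lines of Python. This is one straightforward way to do it (sort, then pick the middle); the function name `median` is an illustrative choice.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: the center value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the two middle values

print(median([3, 7, 9, 12, 15]))       # 9
print(median([3, 7, 9, 12]))           # 8.0
print(median([20, 85, 90, 95, 100]))   # 90, unaffected by the low outlier
```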
Mode is the value that appears most often in a data set. A data set can have:
- No mode (every value appears once, like 1, 2, 3, 4, 5)
- One mode, called unimodal (like 1, 2, 2, 3, 4, where 2 is the mode)
- Multiple modes, called bimodal or multimodal (like 1, 2, 2, 3, 3, 4, where both 2 and 3 are modes)
The mode is especially useful for categorical data like favorite color, where calculating a mean wouldn't make sense, and for discrete data like shoe size, where the mean may not correspond to an actual size.
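The three cases can be illustrated with a short Python sketch built on `collections.Counter`. The helper name `modes` is hypothetical, and it follows this guide's convention that a data set where every value appears once has no mode (some textbooks and libraries treat that case differently).

```python
from collections import Counter

def modes(values):
    """Return a list of the most frequent values, or [] if every value appears once."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # convention used in this guide: no repeated value means no mode
    return [value for value, count in counts.items() if count == highest]

print(modes([1, 2, 3, 4, 5]))           # []      (no mode)
print(modes([1, 2, 2, 3, 4]))           # [2]     (unimodal)
print(modes([1, 2, 2, 3, 3, 4]))        # [2, 3]  (bimodal)
print(modes(["red", "blue", "red"]))    # ['red'] (works for categorical data too)
```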

Best measure for data representation
Picking the right measure of central tendency depends on what your data looks like:
- Use the mean when values are distributed roughly symmetrically with no major outliers. If the data points cluster around a central value without big gaps, the mean gives you the most informative summary.
- Use the median when the data is skewed or contains outliers. For example, a few very expensive houses in a neighborhood would pull the mean way up, so real estate reports typically use median home price instead.
- Use the mode when working with categorical or discrete data. If you want to know the most common blood type or the best-selling product, the mode is the only measure that applies.
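To see why skewed data favors the median, here is a small sketch using Python's standard `statistics` module. The home prices are made-up numbers chosen only to illustrate the effect of one very expensive house.

```python
import statistics

# Hypothetical neighborhood: five similar homes and one very expensive one.
prices = [210_000, 225_000, 240_000, 250_000, 265_000, 1_900_000]

mean_price = statistics.mean(prices)
median_price = statistics.median(prices)

print(f"mean:   {mean_price:,.0f}")    # mean:   515,000
print(f"median: {median_price:,.0f}")  # median: 245,000
```

The mean is more than double what five of the six homes cost, while the median stays close to a typical price.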

Real-world applications of central tendency
When you're solving a problem that asks for a "typical" or "representative" value, follow these steps:
- Identify whether the data is numerical (test scores, heights) or categorical (eye color, car brands).
- Choose the right measure based on the data type and distribution.
- Calculate that measure using the given data.
- Interpret the result in context. A median income of $45,000 tells you something different than a mean income of $62,000, and understanding why they differ (skew from high earners) is part of the answer.
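The steps above can be sketched as a small helper. This is only an illustration under stated assumptions: the function name `typical_value` and the `skewed` flag are hypothetical, and deciding whether data is skewed is left to the caller rather than detected automatically.

```python
import statistics
from numbers import Number

def typical_value(data, skewed=False):
    """Pick and compute a measure of central tendency; returns (measure_name, value)."""
    # Step 1: identify whether the data is numerical or categorical.
    numerical = all(isinstance(x, Number) for x in data)

    # Steps 2 and 3: choose the measure, then calculate it.
    if not numerical:
        return "mode", statistics.multimode(data)   # list of the most common values
    if skewed:
        return "median", statistics.median(data)
    return "mean", statistics.mean(data)

print(typical_value(["O", "A", "O", "B"]))      # ('mode', ['O'])
print(typical_value([80, 85, 90, 95, 100]))     # mean of 90
# Hypothetical incomes with one high earner, flagged as skewed by the caller:
print(typical_value([30_000, 40_000, 45_000, 50_000, 250_000], skewed=True))  # median of 45000
```

Step 4, interpreting the result in context, still has to be done by a person.
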
Probability Concepts
Probability concepts for outcome prediction
Probability measures how likely an event is to happen. It's always a value between 0 (impossible) and 1 (certain), and you can also express it as a percentage.
For example, the probability of rolling a 3 on a standard die is $\frac{1}{6}$, since there's 1 favorable outcome out of 6 total.
Sample space is the set of all possible outcomes of an experiment. For rolling a die, the sample space is {1, 2, 3, 4, 5, 6}. For flipping a coin, it's {heads, tails}.
Event is any subset of the sample space. It can be a single outcome or a collection of outcomes. Rolling an even number on a die is an event that includes {2, 4, 6}, so $P(\text{even}) = \frac{3}{6} = \frac{1}{2}$.
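Sample spaces and events map naturally onto Python sets. The sketch below assumes every outcome is equally likely (true for a fair die) and uses `fractions.Fraction` to keep exact values; the helper name `probability` is an illustrative choice.

```python
from fractions import Fraction

def probability(event, sample_space):
    """P(event) = favorable outcomes / total outcomes, for equally likely outcomes."""
    favorable = event & sample_space  # only count outcomes that are actually possible
    return Fraction(len(favorable), len(sample_space))

die = {1, 2, 3, 4, 5, 6}        # sample space for one roll

print(probability({3}, die))         # 1/6
print(probability({2, 4, 6}, die))   # 1/2
```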
Complementary events are two events that together cover the entire sample space with no overlap. Their probabilities always add up to 1:
$$P(A) + P(\text{not } A) = 1$$
This is useful as a shortcut. If the probability of rain is 0.3, the probability of no rain is $1 - 0.3 = 0.7$.
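A quick check of the complement rule on a die, again assuming equally likely outcomes. The complement of an event is everything in the sample space that is not in the event, so a set difference gives it directly. Variable names here are illustrative.

```python
from fractions import Fraction

die = {1, 2, 3, 4, 5, 6}
rolling_a_three = {3}
not_a_three = die - rolling_a_three   # complement: {1, 2, 4, 5, 6}

p_three = Fraction(len(rolling_a_three), len(die))
p_not_three = Fraction(len(not_a_three), len(die))

print(p_three + p_not_three)   # 1
print(1 - p_three)             # 5/6, the shortcut gives the same answer as counting

# The rain example, with 0.3 written as an exact fraction:
p_rain = Fraction(3, 10)
print(1 - p_rain)              # 7/10
```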
Independent events are events where one happening doesn't change the probability of the other. To find the probability of both occurring, multiply their individual probabilities:
$$P(A \text{ and } B) = P(A) \times P(B)$$
For example, rolling a 4 on a die and then flipping heads on a coin: $P(4 \text{ and heads}) = \frac{1}{6} \times \frac{1}{2} = \frac{1}{12}$.
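One way to convince yourself of the multiplication rule is to list every combined outcome and count. The sketch below enumerates all die-and-coin pairs with `itertools.product` and compares the counted probability with the product of the individual probabilities. It assumes a fair die and a fair coin.

```python
from fractions import Fraction
from itertools import product

die = [1, 2, 3, 4, 5, 6]
coin = ["heads", "tails"]

# Every (roll, flip) pair: 6 x 2 = 12 equally likely outcomes.
outcomes = list(product(die, coin))

favorable = [(roll, flip) for roll, flip in outcomes if roll == 4 and flip == "heads"]
p_counted = Fraction(len(favorable), len(outcomes))

p_four = Fraction(1, 6)
p_heads = Fraction(1, 2)

print(p_counted)          # 1/12
print(p_four * p_heads)   # 1/12
```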
Mutually exclusive events are events that cannot happen at the same time. Rolling an even number and rolling an odd number on a single die are mutually exclusive. The probability of either one occurring is the sum of their probabilities:
$$P(A \text{ or } B) = P(A) + P(B)$$
So $P(\text{even or odd}) = \frac{1}{2} + \frac{1}{2} = 1$, which makes sense since every roll is one or the other.
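In set terms, mutually exclusive events are disjoint sets. The sketch below (with a hypothetical helper `p_either`) applies the addition rule only after checking that the two events share no outcomes, and assumes equally likely outcomes.

```python
from fractions import Fraction

die = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}
odd = {1, 3, 5}

def p_either(a, b, sample_space):
    """P(A or B) = P(A) + P(B), valid only when A and B are mutually exclusive."""
    if not a.isdisjoint(b):
        raise ValueError("events overlap, so the simple addition rule does not apply")
    p_a = Fraction(len(a), len(sample_space))
    p_b = Fraction(len(b), len(sample_space))
    return p_a + p_b

print(even.isdisjoint(odd))       # True: no roll is both even and odd
print(p_either(even, odd, die))   # 1
print(p_either({1}, {2}, die))    # 1/3
```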
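Conditional probability is the probability of an event happening given that another event has already occurred. The formula is:
$$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$$
For example, a standard deck has 12 face cards, and 4 of those are kings. The probability of drawing a king given that you've drawn a face card is $P(\text{king} \mid \text{face card}) = \frac{4/52}{12/52} = \frac{4}{12} = \frac{1}{3}$.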
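The king-given-face-card example can be checked by building a 52-card deck and counting. This sketch assumes each card is equally likely to be drawn; the rank and suit labels are just one way to represent the deck.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))   # 52 (rank, suit) pairs

face_cards = {card for card in deck if card[0] in {"J", "Q", "K"}}
kings = {card for card in deck if card[0] == "K"}

p_face = Fraction(len(face_cards), len(deck))                  # 12/52
p_king_and_face = Fraction(len(kings & face_cards), len(deck)) # 4/52

# Conditional probability: P(king | face) = P(king and face) / P(face)
p_king_given_face = p_king_and_face / p_face

print(p_king_given_face)   # 1/3
```

Knowing the card is a face card shrinks the sample space from 52 cards to 12, which is why the answer is the same as counting 4 kings out of 12 face cards.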