Frequency and Frequency Tables
Frequency tables and measurement levels give you the tools to organize raw data and figure out what you can actually do with it. Frequency tables show how often each value appears, while measurement levels tell you which statistical operations make sense for your data. Getting these right early on matters because they directly affect every analysis choice you'll make later.
Frequency and Relative Frequency Calculations
Frequency ($f$) is simply the count of how many times a particular value or category shows up in your dataset. If you survey 50 cars in a parking lot and count 10 red ones, the frequency of red cars is 10.
Relative frequency takes that count and turns it into a proportion of the whole dataset. You calculate it by dividing the frequency ($f$) by the total number of observations ($n$):

$$\text{relative frequency} = \frac{f}{n}$$
For those 10 red cars out of 50 total: $\frac{10}{50} = 0.20$, or 20%. This is more useful than raw frequency when you want to compare groups of different sizes.
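If you want to check the arithmetic in code, here is a minimal Python sketch of the same calculation. The function name `relative_frequency` is just an illustrative choice, not something from a library.

```python
def relative_frequency(f: int, n: int) -> float:
    """Return f / n: the share of n observations that fall in one category."""
    if n <= 0:
        raise ValueError("n must be a positive number of observations")
    if not 0 <= f <= n:
        raise ValueError("f must be between 0 and n")
    return f / n


# The parking-lot example: 10 red cars out of 50 surveyed.
red = relative_frequency(10, 50)
print(red)           # 0.2
print(f"{red:.0%}")  # 20%
```

The two checks at the top catch the most common slips: dividing by an empty dataset, or entering a count larger than the total.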
Frequency Tables with Cumulative Data
A frequency table organizes every distinct value or category alongside its frequency, relative frequency, and cumulative totals. To build one:
- List all unique values or categories in the first column
- Count and record the frequency of each in the second column
- Compute the relative frequency for each (frequency divided by the total, $f/n$) in the third column
- Calculate the cumulative frequency by adding each row's frequency to the running total of all rows above it
- Calculate the cumulative relative frequency the same way, but with relative frequencies
Here's a completed example using car color data ($n = 50$):
| Color | Frequency | Relative Frequency | Cumulative Frequency | Cumulative Relative Frequency |
|---|---|---|---|---|
| Red | 10 | 0.20 | 10 | 0.20 |
| Blue | 15 | 0.30 | 25 | 0.50 |
| Green | 12 | 0.24 | 37 | 0.74 |
| Black | 13 | 0.26 | 50 | 1.00 |
Notice that the cumulative relative frequency in the last row should equal 1.0 (or 100%), apart from small rounding differences. If it doesn't, there's a counting or arithmetic error somewhere. The cumulative columns are especially helpful for answering questions like "What proportion of cars are blue or red?" You can read that directly from the Blue row's cumulative relative frequency: 0.50, or 50%.
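The table-building steps can also be sketched in Python. This is one possible approach using only the standard library; the function name `frequency_table` and the dictionary keys are illustrative choices, and the list of colors is reconstructed from the counts in the table rather than taken from raw survey data.

```python
from collections import Counter
from itertools import accumulate


def frequency_table(observations):
    """Return one row per distinct value, in the order values first appear."""
    counts = Counter(observations)            # frequency of each value
    n = sum(counts.values())                  # total number of observations
    running_totals = accumulate(counts.values())
    rows = []
    for (value, f), cum_f in zip(counts.items(), running_totals):
        rows.append({
            "value": value,
            "frequency": f,
            "relative_frequency": f / n,
            "cumulative_frequency": cum_f,
            # Dividing the running count by n avoids adding up rounded decimals.
            "cumulative_relative_frequency": cum_f / n,
        })
    return rows


colors = ["Red"] * 10 + ["Blue"] * 15 + ["Green"] * 12 + ["Black"] * 13
table = frequency_table(colors)
for row in table:
    print(row)
```

Computing the cumulative relative frequency from the running count, instead of summing the relative frequencies row by row, keeps the last row at exactly 1.0 even when the individual proportions don't have tidy decimal forms.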
Levels of Measurement
The four levels of measurement form a hierarchy. Each level includes the properties of the ones below it, and unlocks additional statistical operations.
Levels of Measurement Comparison
Nominal data consists of categories or labels with no inherent order. You can count how often each category appears, but ranking or averaging them makes no sense. Examples: eye color (blue, brown, green), blood type (A, B, AB, O), zip codes.
Ordinal data has categories with a meaningful order or rank, but the distances between ranks aren't consistent or measurable. You know that "strongly agree" is higher than "agree," but you can't say the gap between them equals the gap between "agree" and "neutral." Examples: education level (high school, bachelor's, master's), satisfaction ratings on a Likert scale.
Interval data has both meaningful order and equal spacing between values, but no true zero point. A temperature of 0°F doesn't mean "no temperature," so you can't say 80°F is "twice as hot" as 40°F. You can, however, say that the difference between 40°F and 60°F is the same as the difference between 60°F and 80°F. Examples: temperature in °C or °F, calendar years.
Ratio data has everything interval data has, plus a true zero that means "none of this quantity exists." This makes ratios meaningful: someone who weighs 200 pounds genuinely weighs twice as much as someone who weighs 100 pounds. Examples: height (inches), weight (pounds), income (dollars), age (years).
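One way to see the interval/ratio difference is to change units and check whether a ratio survives. The sketch below is illustrative only (the helper names are made up for this example): the weight ratio stays at 2 in pounds or kilograms, while the "80°F is twice 40°F" ratio falls apart as soon as you convert to Celsius. Differences between temperatures, on the other hand, stay equal to each other in both scales.

```python
import math

LB_TO_KG = 0.45359237  # kilograms per pound (exact by definition)


def pounds_to_kilograms(pounds: float) -> float:
    return pounds * LB_TO_KG


def fahrenheit_to_celsius(temp_f: float) -> float:
    return (temp_f - 32) * 5 / 9


# Ratio scale: the ratio is the same no matter which unit you use.
weight_ratio_lb = 200 / 100
weight_ratio_kg = pounds_to_kilograms(200) / pounds_to_kilograms(100)
print(weight_ratio_lb, weight_ratio_kg)   # both are 2 (up to floating point)

# Interval scale: the ratio depends on where the arbitrary zero sits.
temp_ratio_f = 80 / 40
temp_ratio_c = fahrenheit_to_celsius(80) / fahrenheit_to_celsius(40)
print(temp_ratio_f, temp_ratio_c)         # 2 in Fahrenheit, about 6 in Celsius

# Differences are still comparable on an interval scale.
gap_low = fahrenheit_to_celsius(60) - fahrenheit_to_celsius(40)
gap_high = fahrenheit_to_celsius(80) - fahrenheit_to_celsius(60)
print(math.isclose(gap_low, gap_high))    # True
```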

Data Classification and Measurement Scales
Types of Data
Discrete data can only take specific, separate values, typically whole numbers. You can't have 2.7 students in a class or sell 4.5 cars. You can count discrete data, but you can't subdivide it meaningfully.
Continuous data can take any value within a range, including decimals and fractions. Height, weight, and temperature are continuous because a person could be 5.6237 feet tall, even if you'd normally round that.
The distinction matters because discrete and continuous data call for different types of graphs and statistical tests. For example, bar charts work well for discrete data, while histograms are better suited for continuous data.
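The same split shows up when you prepare data for a graph. Here is a rough sketch with made-up numbers: for discrete data you count each distinct value (one bar per value), while for continuous data you first group values into equal-width bins (one histogram bar per bin). The datasets, the function name `histogram_counts`, and the bin settings are all hypothetical choices for illustration.

```python
from collections import Counter

# Hypothetical discrete data: cars owned per household.
cars_per_household = [0, 1, 1, 2, 2, 2, 3, 1, 0, 2]
bar_heights = Counter(cars_per_household)   # one bar per distinct value
print(sorted(bar_heights.items()))          # [(0, 2), (1, 3), (2, 4), (3, 1)]

# Hypothetical continuous data: heights in feet.
heights_ft = [5.12, 5.48, 5.6237, 5.75, 5.91, 6.02, 5.33, 5.67]


def histogram_counts(values, start, width, n_bins):
    """Count values in n_bins equal-width bins: [start, start + width), ..."""
    counts = [0] * n_bins
    for v in values:
        i = int((v - start) // width)       # index of the bin that contains v
        if 0 <= i < n_bins:                 # values outside the range are skipped
            counts[i] += 1
    return counts


# Five bins of width 0.25 ft, covering 5.00 ft up to (not including) 6.25 ft.
print(histogram_counts(heights_ft, start=5.0, width=0.25, n_bins=5))  # [1, 2, 2, 2, 1]
```

Notice that the continuous data needed an extra decision (where bins start and how wide they are), which is why histograms of the same data can look different depending on the bins you pick.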
Measurement Scales
The four levels (nominal, ordinal, interval, ratio) build on each other in a hierarchy. Nominal is the most limited; ratio is the most flexible. As you move up the hierarchy, you gain access to more statistical tools:
- Nominal: mode only
- Ordinal: mode, median
- Interval: mode, median, mean, standard deviation
- Ratio: all of the above, plus meaningful ratios and proportions
Identifying your data's measurement level before you start analyzing is one of the most practical habits you can build. It prevents mistakes like averaging zip codes or computing a ratio of temperatures in Fahrenheit.
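To make that habit concrete, here is a small Python sketch that only computes the statistics the hierarchy above allows for a given level. It uses the standard `statistics` module; the `summarize` function, its `order` parameter, and the sample data are hypothetical and meant as a starting point, not a complete tool.

```python
import statistics

LEVELS = ("nominal", "ordinal", "interval", "ratio")


def summarize(values, level, order=None):
    """Return only the summary statistics that make sense at this level."""
    rank = LEVELS.index(level)                # raises ValueError for unknown levels
    summary = {"mode": statistics.mode(values)}
    if level == "ordinal":
        if order is None:
            raise ValueError("ordinal data needs an explicit category order")
        ranked = sorted(values, key=order.index)
        summary["median"] = ranked[(len(ranked) - 1) // 2]   # lower middle value
    elif rank >= 2:                           # interval or ratio
        summary["median"] = statistics.median(values)
        summary["mean"] = statistics.mean(values)
        summary["stdev"] = statistics.stdev(values)
    summary["ratios_meaningful"] = level == "ratio"
    return summary


# Nominal: zip codes get a mode, but never a mean.
zip_summary = summarize(["10001", "94105", "10001", "60601"], "nominal")
print(zip_summary)

# Ordinal: the median is a category, found using the stated order.
scale = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]
answers = ["agree", "neutral", "agree", "strongly agree", "disagree"]
likert_summary = summarize(answers, "ordinal", order=scale)
print(likert_summary)

# Interval: mean and standard deviation are fine, ratios are not.
temp_summary = summarize([40, 60, 60, 80], "interval")
print(temp_summary)
```

Because the zip codes are tagged as nominal, the function never gets the chance to average them, which is exactly the kind of mistake the level check is there to prevent.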