Honors Statistics

Types of Variables

Why This Matters

In statistics, everything starts with understanding your data, and that means knowing what type of variable you're working with. The type of variable determines which statistical tests you can use, how you should display your data, and what conclusions you can draw. Classifying variables correctly matters because choosing the wrong analysis for your data type is one of the most common mistakes in statistics.

This topic connects directly to data collection, graphical displays, measures of center and spread, correlation, and regression. When a question asks you to "describe the distribution" or "justify your choice of test," your answer depends on correctly identifying variable types. Don't just memorize definitions. Know how each variable type behaves and what it allows (or doesn't allow) you to calculate.


Categorical vs. Quantitative: The First Split

Every variable falls into one of two broad camps: categorical (qualitative) or quantitative. This distinction drives every analysis decision you'll make. Categorical variables sort individuals into groups; quantitative variables measure or count something numerical.

Qualitative (Categorical) Variables

  • Describes categories or groups. You cannot perform meaningful arithmetic on these values.
  • Summarized using frequencies and proportions, displayed with bar charts or pie charts (not histograms).
  • Examples: eye color, zip code, favorite music genre. Even numeric codes like jersey numbers are categorical if doing math on them is meaningless. The "average jersey number" on a team tells you nothing useful.

Quantitative Variables

  • Represents numerical measurements or counts. Arithmetic operations like finding the mean are meaningful.
  • Summarized using center, spread, and shape, displayed with histograms, boxplots, or stemplots.
  • Examples: height, income, number of siblings. A good test: if you can average it and the result makes sense, it's quantitative.

Compare: Qualitative vs. Quantitative: both describe individuals in a dataset, but only quantitative variables allow you to calculate meaningful statistics like mean or standard deviation. Always justify your choice of graph by referencing the variable type.
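To make the contrast concrete, here is a minimal sketch in Python using only the standard library. The data values and the helper name `proportions` are made up for illustration. It summarizes a categorical variable with counts and proportions, and a quantitative variable with a mean and standard deviation.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical data, invented for illustration only.
eye_colors = ["brown", "blue", "brown", "green", "brown", "blue"]  # categorical
heights_cm = [160.0, 172.5, 168.0, 181.0, 175.5]                   # quantitative


def proportions(values):
    """Return each category's share of all observations."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


# Categorical: frequencies and proportions are the meaningful summaries.
eye_color_counts = Counter(eye_colors)
eye_color_props = proportions(eye_colors)

# Quantitative: arithmetic summaries such as mean and standard deviation make sense.
height_mean = mean(heights_cm)
height_sd = stdev(heights_cm)

print(eye_color_counts)
print(eye_color_props)
print(round(height_mean, 1), round(height_sd, 1))
```

Notice that calling `mean(eye_colors)` would fail outright, while averaging numeric codes such as jersey numbers would run but produce a number with no useful interpretation.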


Levels of Measurement: How Much Information Does Your Variable Carry?

Not all categorical variables are equal, and not all quantitative variables behave the same way. The level of measurement tells you what operations and comparisons are valid. Moving from nominal to ratio, each level adds more mathematical meaning.

Nominal Variables

  • Categories with no natural order. You can only check if values are the same or different.
  • Mode is the only appropriate measure of center; mean and median are meaningless here.
  • Examples: blood type (A, B, AB, O), political party affiliation, types of pets. There's no logical way to rank these categories.

Ordinal Variables

  • Categories with a meaningful rank or order, but the gaps between categories aren't necessarily equal.
  • Median is appropriate, but mean is problematic because you can't assume equal spacing between ranks. The difference between "strongly disagree" and "disagree" might not be the same size as the difference between "agree" and "strongly agree."
  • Examples: survey responses (strongly disagree to strongly agree), class rank, education level (high school, bachelor's, master's).

Compare: Nominal vs. Ordinal: both are categorical, but ordinal variables have a logical sequence. If a question involves satisfaction ratings, recognize that "satisfied > neutral > dissatisfied" is ordinal, not nominal.
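The difference in allowable summaries can be sketched in Python. The responses and the names `LEVELS`, `RANK`, and `ordinal_median` are hypothetical, chosen only for this illustration. The mode works for a nominal variable, while the median of an ordinal variable relies only on the order of the categories, not on the spacing between them.

```python
from statistics import median_low, mode

# Nominal: no order, so the mode is the only sensible "center".
blood_types = ["O", "A", "O", "B", "AB", "O", "A"]  # hypothetical sample
most_common_blood_type = mode(blood_types)

# Ordinal: categories have a rank, so we can locate the middle response.
LEVELS = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]
RANK = {label: position for position, label in enumerate(LEVELS)}

responses = ["agree", "neutral", "strongly agree", "agree",
             "disagree", "agree", "neutral"]  # hypothetical sample


def ordinal_median(values):
    """Return the middle category, using only the ordering of LEVELS."""
    ranks = [RANK[value] for value in values]
    # median_low always returns an actual data point, so it maps back to a label.
    return LEVELS[median_low(ranks)]


median_response = ordinal_median(responses)

print(most_common_blood_type)
print(median_response)
```

The ranks here are used only to sort the responses. Averaging them would quietly assume equal gaps between categories, which is exactly what an ordinal scale does not guarantee.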

Interval Variables

  • Meaningful, consistent differences between values, but no true zero point, so ratios don't make sense.
  • Both mean and median are appropriate. You can add and subtract, but you can't meaningfully say "twice as much."
  • Examples: temperature in Celsius or Fahrenheit (0°F doesn't mean "no temperature"), calendar years (year 0 isn't "no time"), and SAT scores.

Ratio Variables

  • True zero point exists, meaning zero represents the complete absence of the quantity.
  • All arithmetic operations are meaningful, including statements like "twice as much" or "half as long."
  • Examples: weight, height, age, income, and time elapsed. Saying someone is "twice as old" only works because age has a true zero.

Compare: Interval vs. Ratio: both are quantitative with equal intervals, but only ratio variables have a true zero. Temperature in Kelvin is ratio ($0 \text{ K}$ = no molecular motion), while Fahrenheit is interval. This distinction matters when you're interpreting ratios in context.
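A short worked example shows why "twice as much" fails on an interval scale. This is a sketch with made-up temperatures and lengths, and the function names are illustrative. The same two temperatures give a different ratio in each unit, while a ratio-scale variable such as length gives the same ratio no matter which unit you use.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a Celsius temperature to Fahrenheit."""
    return celsius * 9 / 5 + 32


def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin."""
    return celsius + 273.15


# Hypothetical temperatures for illustration.
cool_c, warm_c = 10.0, 20.0

# Interval scales: the "ratio" depends on where the arbitrary zero sits.
ratio_celsius = warm_c / cool_c
ratio_fahrenheit = celsius_to_fahrenheit(warm_c) / celsius_to_fahrenheit(cool_c)

# Kelvin has a true zero, so this ratio is the one that can be interpreted.
ratio_kelvin = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

# Ratio scale: length has a true zero, so changing units leaves the ratio alone.
short_cm, long_cm = 10.0, 20.0
ratio_cm = long_cm / short_cm
ratio_inches = (long_cm / 2.54) / (short_cm / 2.54)

print(ratio_celsius, ratio_fahrenheit, ratio_kelvin)
print(ratio_cm, ratio_inches)
```

In Celsius the warmer reading looks like "twice" the cooler one, but the Fahrenheit and Kelvin ratios disagree, so the claim "twice as hot" has no stable meaning. Differences do survive: a gap of 10 degrees Celsius is always a gap of 18 degrees Fahrenheit.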


Discrete vs. Continuous: How Does the Variable Behave?

Within quantitative variables, there's another important distinction based on what values are possible. Discrete variables jump between values; continuous variables flow smoothly across a range.

Discrete Variables

  • Can only take specific, countable values, typically whole numbers with gaps between possible outcomes.
  • Often arise from counting rather than measuring. You could, in theory, list all possible values.
  • Examples: number of text messages sent, number of pets owned, and shoe sizes. Even with half-sizes, shoe sizes take only distinct, separated values.

Continuous Variables

  • Can take any value within a range, including fractions and decimals, to any level of precision.
  • Arise from measuring rather than counting. The precision is limited only by your instrument.
  • Examples: exact height, reaction time, blood pressure. Between any two measurements, there's always a more precise value possible.

Compare: Discrete vs. Continuous: both are quantitative, but continuous variables require intervals or ranges when creating frequency tables (you can't list every possible value). When choosing between a bar chart and a histogram, ask yourself: "Is this counted or measured?"
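The frequency-table point can be sketched in Python. The data and the helper names `value_frequencies` and `binned_frequencies` are hypothetical. A discrete variable gets one row per value, while a continuous variable has to be grouped into intervals first. Reaction times are stored as whole milliseconds here only because that is how a hypothetical timer would record them; the underlying variable is still continuous.

```python
from collections import Counter

# Hypothetical data for illustration.
pets_owned = [0, 1, 1, 2, 0, 3, 1, 2, 1, 0]                             # discrete (counted)
reaction_times_ms = [210, 340, 280, 450, 310, 260, 390, 520, 300, 240]  # continuous (measured)


def value_frequencies(values):
    """Frequency table with one row per distinct value, in sorted order."""
    return dict(sorted(Counter(values).items()))


def binned_frequencies(values, start, width, n_bins):
    """Frequency table over half-open intervals [low, high) of equal width."""
    counts = [0] * n_bins
    for value in values:
        index = (value - start) // width
        if 0 <= index < n_bins:  # values outside the bins are ignored
            counts[index] += 1
    return {
        (start + i * width, start + (i + 1) * width): counts[i]
        for i in range(n_bins)
    }


pets_table = value_frequencies(pets_owned)
reaction_table = binned_frequencies(reaction_times_ms, start=200, width=100, n_bins=4)

print(pets_table)
print(reaction_table)
```

The bin width of 100 ms is an arbitrary choice for this sketch. A different width would give a differently shaped table, which is why you should state your intervals when you describe a continuous distribution.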


Experimental Roles: Independent and Dependent Variables

In studies and experiments, variables play different roles depending on what you're investigating. The independent variable is what you change or compare; the dependent variable is what you measure as a result.

Independent Variables

  • The explanatory or predictor variable. It's manipulated by the researcher or used to form comparison groups.
  • Plotted on the x-axis in scatterplots; used to explain variation in the response.
  • Examples: drug dosage in a clinical trial, hours of study time, type of fertilizer applied.

Dependent Variables

  • The response or outcome variable. It's measured to see if it changes when the independent variable changes.
  • Plotted on the y-axis in scatterplots; this is what you're trying to predict or explain.
  • Examples: blood pressure after treatment, exam score, crop yield.

A helpful way to remember: the dependent variable depends on the independent variable. In a sentence, you'd say "Does [independent] affect [dependent]?" For instance, "Does study time affect exam score?"

Compare: Independent vs. Dependent: the independent variable does the explaining, the dependent variable gets explained. In regression, you predict $\hat{y}$ (dependent) from $x$ (independent). Questions often ask you to identify which is which. Look for what's being manipulated or compared versus what's being measured as an outcome.
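To see the two roles in action, here is a minimal least-squares sketch in Python. The study-time data and the function names are invented for illustration, and the slope and intercept use the standard least-squares formulas. Study hours play the role of $x$ (independent) and exam scores the role of $y$ (dependent).

```python
from statistics import mean

# Hypothetical data: x is the explanatory variable, y is the response.
study_hours = [1, 2, 3, 4, 5]
exam_scores = [62, 68, 71, 79, 85]


def least_squares_line(x, y):
    """Return (slope, intercept) of the least-squares regression line."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(x_value, slope, intercept):
    """Predicted response (y-hat) for a given value of the explanatory variable."""
    return intercept + slope * x_value


slope, intercept = least_squares_line(study_hours, exam_scores)
predicted_score = predict(3.5, slope, intercept)

print(slope, intercept, predicted_score)
```

Swapping the two lists would fit a different line, one that predicts study hours from exam scores. That is why identifying which variable is independent has to come before any calculation.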


Quick Reference Table

| Concept | Type | Best Examples |
| --- | --- | --- |
| Categorical (Qualitative) | - | Eye color, zip code, favorite genre |
| Quantitative | - | Height, income, test score |
| Nominal | Categorical | Blood type, political party, pet type |
| Ordinal | Categorical | Survey ratings, class rank, education level |
| Interval | Quantitative | Temperature (°F/°C), calendar year |
| Ratio | Quantitative | Weight, age, distance, time elapsed |
| Discrete | Quantitative | Number of siblings, shoe size, cars owned |
| Continuous | Quantitative | Exact height, reaction time, blood pressure |

Self-Check Questions

  1. A survey asks respondents to rate their agreement on a scale from 1 (strongly disagree) to 5 (strongly agree). What type of variable is this, and why would calculating the mean be problematic?

  2. Which two variable types both involve categories but differ in whether the categories can be ranked? Give an example of each.

  3. Temperature measured in Kelvin and temperature measured in Fahrenheit are both quantitative, but they represent different levels of measurement. Explain the distinction and why it matters for interpretation.

  4. A researcher records the number of hours students sleep (to the nearest tenth) and the number of classes they skip per week. Classify each variable as discrete or continuous, and explain your reasoning.

  5. In a study examining whether caffeine affects reaction time, identify the independent and dependent variables. If you were to create a scatterplot, which variable would go on each axis?