June 3, 2020

We start things off with a review of **categorical** and **quantitative** data. **Categorical data** records an attribute or group label and is usually summarized with counts, percentages, or proportions, while **quantitative data** records numerical measurements or counts. If it makes sense to average the values, the data is quantitative.

When we collect data, we will often record two different variables for each individual at the same time. Sometimes both variables are categorical, such as “class” and “does homework on time.” In that case, we might want to know whether certain classes, such as juniors, finish their homework on time more often than other classes.

Bivariate categorical data can be represented using a two-way frequency table, a side-by-side or segmented bar graph, or a mosaic plot.
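As a rough illustration of the “class” and “does homework on time” comparison, the sketch below tallies a two-way table of counts and then computes the proportion of each class that finishes on time. The records are made up for this example, and the names `records`, `counts`, and `on_time_proportion` are hypothetical.

```python
from collections import Counter

# Hypothetical survey records, invented for illustration only.
# Each record is (class, does homework on time).
records = [
    ("junior", "yes"), ("junior", "yes"), ("junior", "yes"), ("junior", "no"),
    ("senior", "yes"), ("senior", "no"), ("senior", "no"), ("senior", "no"),
]

# Two-way table of counts: one cell per (class, answer) combination.
counts = Counter(records)


def on_time_proportion(class_name):
    """Proportion of students in one class who do homework on time."""
    yes = counts[(class_name, "yes")]
    total = yes + counts[(class_name, "no")]
    return yes / total


for class_name in ("junior", "senior"):
    print(class_name, on_time_proportion(class_name))
```

Comparing proportions within each class, rather than raw counts, keeps the comparison fair when the classes have different numbers of students.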

Other times, we’ll have two quantitative variables, such as plant height and amount of fertilizer used, and we’ll want to see whether more fertilizer is correlated with taller plants. This type of data is usually represented using a scatterplot.
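One common way to put a number on what a scatterplot of fertilizer against plant height shows is the correlation coefficient r. The sketch below computes r by hand for a small set of made-up measurements. The data and the names `fertilizer_g`, `height_cm`, and `pearson_r` are assumptions for illustration, not values from any real experiment.

```python
import math

# Hypothetical measurements, invented for illustration only.
fertilizer_g = [0, 10, 20, 30, 40]           # amount of fertilizer used
height_cm = [12.0, 15.5, 19.0, 21.5, 26.0]   # resulting plant height


def pearson_r(xs, ys):
    """Correlation coefficient r for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(fertilizer_g, height_cm)
print(round(r, 3))
```

A value of r near 1 suggests a strong positive linear association, a value near -1 a strong negative one, and a value near 0 little or no linear association. It is still worth looking at the scatterplot itself, since r only describes linear patterns.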

In both cases, we are trying to see whether the two variables are related to each other. If we know that two variables are related, then we may be able to predict the value of one variable from the value of the other. Keep in mind that finding that two variables are not related can be just as strong an analytical discovery as finding that they are.

**🎥Watch: AP Stats - Scatterplots and Association**

