We have talked a lot about distributions and how to describe, summarize, and represent them. Now it is time to put that into practice by comparing multiple sets of data. Comparing sets of histograms and boxplots will bring the ideas above to life.
Comparing Groups with Histograms
Records are kept by each state in the United States on the number of pupils enrolled in public schools and the number of teachers employed by public schools for each school year. From these records, the ratio of the number of pupils to the number of teachers (P-T ratio) can be calculated for each state. The histograms below show the P-T ratio for every state during the 2001–2002 school year. The histogram on the left displays the ratios for the 24 states that are west of the Mississippi River, and the histogram on the right displays the ratios for the 26 states that are east of the Mississippi River.
Source: The College Board AP Classroom
Part (a) asks us to estimate the median of each group (to estimate, not to compute). For the states west of the Mississippi (n = 24), n/2 = 12, so the median falls between the 12th and 13th values in the ordered list, and both of these values fall in the interval 15–16. For the states east of the Mississippi (n = 26), n/2 = 13, so the median falls between the 13th and 14th values in the ordered list, and both of these values also fall in the interval 15–16. So both groups have a median of at least 15 and at most 16 students per teacher.
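The counting argument above can be sketched in Python. This is only an illustration: the function name `median_bins` is made up here, and the bin counts are HYPOTHETICAL values chosen to add up to 24, not the counts from the College Board histogram.

```python
from itertools import accumulate

def median_bins(bin_edges, counts):
    """Return the bin(s) (low, high) that hold the middle value(s) of grouped data."""
    n = sum(counts)
    # 1-based positions of the middle value(s) in the ordered list:
    # one position when n is odd, two when n is even.
    positions = [(n + 1) // 2] if n % 2 else [n // 2, n // 2 + 1]
    cumulative = list(accumulate(counts))
    bins = []
    for pos in positions:
        # First bin whose cumulative count reaches this position.
        idx = next(i for i, c in enumerate(cumulative) if c >= pos)
        bins.append((bin_edges[idx], bin_edges[idx + 1]))
    return bins

# HYPOTHETICAL counts for 24 values in one-unit bins 12-13, 13-14, ..., 21-22.
edges = list(range(12, 23))
hypothetical_counts = [1, 3, 5, 6, 3, 2, 1, 1, 1, 1]

print(median_bins(edges, hypothetical_counts))  # [(15, 16), (15, 16)]
```

With these made-up counts, the 12th and 13th values both land in the bin 15–16, which mirrors the reasoning used for the West group.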
b. Write a few sentences comparing the distributions of P-T ratios for states in the two groups (west and east) during the 2001–2002 school year.
Here you apply the three features of a distribution, shape, center, and spread, one at a time. Always start with shape. The shapes of the two histograms look different: the histogram for the West is unimodal and skewed to the right, whereas the histogram for the East is unimodal and nearly symmetric.
For the center, we already found in part (a) that the medians of the two distributions are about the same: between 15 and 16 for both.
Last, report the spread. Look at how tightly the values are concentrated around the center of each distribution. The histograms show that the West values vary more than the East values. Although the data are grouped, we can still approximate the range. The range for the West is at most 22 – 12 = 10, and the range for the East is at most 19 – 12 = 7. The East has less variability than the West.
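The "at most" reasoning for the range can be written as a small Python sketch. The helper `range_bounds` is a hypothetical name, and the bin width of 1 is an assumption inferred from intervals such as 15–16. The sketch also reports the smallest range the grouped data would allow, which is an extra observation beyond what the question asks for.

```python
def range_bounds(lowest_bin_start, highest_bin_end, bin_width=1):
    """Bounds on the range of grouped data: (smallest possible, largest possible)."""
    # Largest case: the minimum sits at the bottom of the lowest bin
    # and the maximum at the top of the highest bin.
    largest = highest_bin_end - lowest_bin_start
    # Smallest case: the minimum sits near the top of the lowest bin
    # and the maximum at the bottom of the highest bin.
    smallest = max(largest - 2 * bin_width, 0)
    return smallest, largest

# Bin limits taken from the text; bin_width=1 is an ASSUMPTION.
west = range_bounds(12, 22)
east = range_bounds(12, 19)
print(west, east)  # (8, 10) (5, 7)
```

Under that assumption, even the smallest possible West range (8) exceeds the largest possible East range (7), which supports the claim that the West varies more.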
c. Using your answers in parts (a) and (b), explain how you think the mean P-T ratio during the 2001–2002 school year will compare for the two groups (west and east).
The two histograms have different shapes. Because the West distribution is skewed to the right, its mean will be pulled above its median by the large values in the right tail. Because the East distribution is fairly symmetric, its mean will be close to its median. Since the two medians are about the same, we can conclude that the mean for the West group will probably be greater than the mean for the East group.
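To see how a right tail pulls the mean above the median, here is a minimal Python sketch. Both lists are HYPOTHETICAL values invented for illustration, not state P-T ratios. They share the same median, and only the two largest values differ.

```python
from statistics import mean, median

# HYPOTHETICAL data: same median, different right tails.
symmetric = [13, 14, 15, 15, 16, 16, 17, 18]
right_skewed = [13, 14, 15, 15, 16, 16, 19, 22]

for name, data in [("symmetric", symmetric), ("right-skewed", right_skewed)]:
    print(name, "median =", median(data), "mean =", mean(data))
# symmetric median = 15.5 mean = 15.5
# right-skewed median = 15.5 mean = 16.25
```

Stretching the two largest values leaves the median unchanged but raises the mean, which is the pattern expected for the West group.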
Comparing Groups with Boxplots
Basketball visualization - Investigative Task
A team of psychologists studied the concept of visualization in basketball, where players visualize making a basket before shooting the ball. They conducted an experiment in which 20 basketball players with similar abilities were randomly assigned to two groups. The 10 players in group 1 received visualization training, and the 10 players in group 2 did not.
Each player stood 22 feet from the basket at the same location on the basketball court. Each player was then instructed to attempt to make the basket until two consecutive baskets were made. The players who received visualization training were instructed to use visualization techniques before attempting to make the basket. The total number of attempts, including the last two attempts, were recorded for each player.
The total number of attempts for each of the 20 players are summarized in the following boxplots.
Source: The College Board
We have two groups, with 10 basketball players randomly assigned to each. We learn from the question that group 1 received visualization training but group 2 did not. There are a few things we can compare to answer the question. Both groups have the same minimum number of attempts, and all the other measures differ. The first quartile is 3 attempts for group 1 and 4 attempts for group 2, so about a quarter of the players in group 1 made two consecutive baskets within 3 attempts, compared with 4 attempts in group 2. Look at the median: it is much lower for group 1 than for group 2. Group 1 has an outlier, but it is still less than the maximum of group 2. It looks as though the training had an impact on group 1, since every summary measure other than the minimum is lower than in group 2. But we are not asked to generalize this finding yet.
Finally, to answer the question, it is enough to report the medians. Because the median number of attempts for players who received visualization training (4) is less than the median number of attempts for players who did not receive training (7), those who received visualization training tend to need fewer attempts to make two consecutive baskets.
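The summary measures read off a boxplot can also be computed directly when the raw data are available. The sketch below is an illustration only: the two lists are HYPOTHETICAL attempt counts, invented so that their quartiles and medians match the values quoted above (they are not the study's data), the function names are made up, the quartiles are taken as medians of the lower and upper halves, and the 1.5 × IQR rule for outliers is the common boxplot convention, assumed here.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max); needs at least two values."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # lower half of the ordered data
    upper = data[-half:]   # upper half (the middle value is skipped when n is odd)
    return (data[0], median(lower), median(data), median(upper), data[-1])

def outliers(values):
    """Values beyond 1.5 * IQR from the quartiles (common boxplot convention)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# HYPOTHETICAL attempt counts for 10 players per group.
group1_hypothetical = [2, 3, 3, 4, 4, 4, 5, 6, 7, 14]    # visualization training
group2_hypothetical = [2, 4, 4, 5, 6, 8, 9, 11, 13, 16]  # no training

print(five_number_summary(group1_hypothetical), outliers(group1_hypothetical))
print(five_number_summary(group2_hypothetical), outliers(group2_hypothetical))
```

With these made-up values, group 1 has a median of 4 and one flagged outlier, while group 2 has a median of 7 and no outliers, so the comparison of medians comes out the same way as in the answer above.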