AP Statistics
11 min read•Last Updated on July 11, 2024
Jerry Kosoff
Practicing with FRQs is a great way to prep for the AP exam! Work through this FRQ from Unit 6, then review sample student responses and corresponding feedback from Fiveable teacher Jerry Kosoff!
After completing a sale, a car company likes to send a follow-up survey where customers can indicate their level of satisfaction with their experience. One of the questions in the survey asks “would you recommend our company to a friend looking to purchase a vehicle?” The company wonders if people would answer the question differently based on whether they bought a new or used vehicle. From a list of all 2018 vehicle sales, the company randomly selects 105 customers who bought a new vehicle and 120 customers who bought a used vehicle. 88 of the customers who bought new vehicles answered “yes,” while 85 of the customers who bought used vehicles answered “yes.”
At the significance level of 0.05, do the data provide convincing statistical evidence that the proportion of customers who would answer “yes” to the survey question is different for new vs used vehicle sales?
P(yes|new) = 88/105 = 0.838
P(yes|used) = 85/120 = 0.708
margin of error = +/- (1.960)sqrt[(0.838(0.162)/105)+(0.708(0.292)/120)]
margin of error = +/- 0.108
confidence interval = 0.05 +/- 0.108 = (-0.058, 0.113)
No, the data do not provide convincing statistical evidence that the proportion of customers who would answer “yes” to the survey question is different for new vs used vehicle sales. Since 0 is captured in the 95% confidence interval of -0.058 and 0.113, the data shows that the true difference in proportions could be 0.
I’ll give feedback on your work below, but I want to start with noticing that you used a 2-sample confidence interval to answer the question. That is a totally valid strategy for a situation like this, but only because the alternative hypothesis was “different”; had the scenario asked “higher” or “lower” the confidence interval would not work in the same way. Typically, when given a significance level, and asked if there is “convincing statistical evidence” of something, we should be running a hypothesis test. That said, you will still be scored for your work with the confidence interval.
The scoring for a “convincing statistical evidence…” scenario includes:
Given that list (some parts are scored together to create a question with 3-4 scoring components), you can likely see that your work doesn’t have enough there to be earning much of the available credit. You calculate the appropriate margin of error, and therefore obtain a confidence interval, but never name the interval, check conditions (random samples, approximately normal sampling distribution [at least 10 successes/at least 10 failures], 10% condition), or write hypotheses. Additionally, you used “0.05” in the interval, instead of using (0.838 - 0.708 = 0.13) as your difference of proportions to add/subtract the margin of error. That would have led you to a different confidence interval where 0 was not included. Given that your interval did include 0 though, your conclusion that we do not have convincing evidence would get scored as correct, because you interpreted the answer you got correctly. Unfortunately, you would not get credit for the other components of the question.
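To make the arithmetic in the feedback above concrete, here is a minimal Python sketch of the interval with the observed difference (0.838 - 0.708) as the center. The variable names are illustrative assumptions, and the critical value 1.960 is the one used in the response.

```python
import math

# Counts from the problem statement
yes_new, n_new = 88, 105
yes_used, n_used = 85, 120

p_new = yes_new / n_new      # about 0.838
p_used = yes_used / n_used   # about 0.708
diff = p_new - p_used        # point estimate of p_new - p_used, about 0.13

z_star = 1.960  # critical value for 95% confidence, as in the response

# Unpooled standard error for a difference of two proportions
se = math.sqrt(p_new * (1 - p_new) / n_new + p_used * (1 - p_used) / n_used)
margin = z_star * se

lower, upper = diff - margin, diff + margin
print(f"difference = {diff:.3f}, margin of error = {margin:.3f}")
print(f"95% CI = ({lower:.3f}, {upper:.3f})")
```

Centered on the observed difference, both endpoints come out positive, so 0 is not in the interval.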
p_1 = the proportion of customers who bought a new vehicle and answered yes to the survey question
p_2 = the proportion of customers who bought a new vehicle and answered yes to the survey question
2-sample z test for p_1 - p_2
H_0: p_1 - p_2 = 0, H_a: p_1 - p_2 not equal 0
Conditions:
p-hat_2 = 85/120 = 0.708, n_2=120
z* = 2.303
P-val = P(z>=2.303 or z<=-2.303) = 0.021247
Since 0.021247 < alpha of 0.05, we reject the null hypothesis, because there is convincing statistical evidence that the proportion of customers who would answer “yes” to the survey question is different for new vs used vehicle sales.
Strong execution from top to bottom, presented clearly. One thing that you’re going to facepalm about: you defined the parameters p1 and p2 as the exact same thing. “p2” should say “used” instead of “new.”
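If you want to check a test statistic and p-value like the ones in this response, the calculation can be sketched in a few lines of Python. This is only one way to do it: the function name `two_proportion_z_test` is a hypothetical helper, and the normal tail area is computed with the standard library's `math.erfc` rather than a calculator or table.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided, pooled 2-sample z test for p1 - p2 (H0: p1 - p2 = 0)."""
    p1, p2 = x1 / n1, x2 / n2
    # Under H0 the proportions are equal, so the samples are pooled
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

z, p_value = two_proportion_z_test(88, 105, 85, 120)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

The result agrees with the response's z of about 2.303 and p-value of about 0.021.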
Test type- two sample proportional difference hypothesis z-test
simple random sample: stated in the problem- “the company randomly selects 105 customers…”
independence: 10(105)=1050 Assume that the population of new cars purchased in 2018 is greater than 1050.
normal: .84(105)=(88 >= 10), .162(105)=(17 >= 10)
All of the conditions for new cars are met.
Conditions for old cars:
simple random sample: stated in the problem- “the company randomly selects…120 customers…”
independence: 10(120)=1200 Assume that the population of new cars purchased in 2018 is greater than 1200.
normal: .708(120)=(85 >= 10), .292(120)=(35 >= 10)
All of the conditions for old cars are met.
Solve
z = (.84 - .708)/sqrt((.25 x .75)/105 + (.25 x .75)/120) = 2.28 --> *I looked at table z to find the p-value of -2.28, p-value=.0129(2)=.0258
Conclusion
(.0258<.05 our significance level) --> Reject the H0 in favor of the Ha. We have significant evidence that the proportion of all customers from a list of 2018 vehicle sales who bought a new vehicle and said they would recommend the company is different than the proportion of all customers from a list of 2018 vehicle sales who bought a used vehicle and said that they would recommend our company to a friend.
This is about as thorough a response as I’ve seen! Very well done - you’ve nailed all of the components.
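The condition checks in this response follow a pattern that can be sketched in Python. The helper names below are hypothetical, and the sketch mirrors the response by checking the observed counts of "yes" and "no" answers in each sample.

```python
def large_counts_ok(successes, n, minimum=10):
    """Large counts check: at least `minimum` successes and failures."""
    failures = n - successes
    return successes >= minimum and failures >= minimum

def min_population_for_ten_percent(n):
    """Smallest population size for which n is at most 10% of the population."""
    return 10 * n

# (number of "yes" answers, sample size) from the problem statement
samples = {"new": (88, 105), "used": (85, 120)}

for group, (yes, n) in samples.items():
    print(
        f"{group}: yes = {yes}, no = {n - yes}, "
        f"large counts met = {large_counts_ok(yes, n)}, "
        f"population must be at least {min_population_for_ten_percent(n)}"
    )
```

The population sizes themselves are not given in the problem, so the 10% condition still has to be stated as an assumption, as the response does.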
p1= Proportion of customers who bought a new car and answers “yes” to the survey question.
p2= Proportion of customers who bought a used car and answered “yes” to the survey question.
Ho= p1-p2=0 Ha= p1-p2 does not equal 0
We are interested in conducting a 2 sample z test for a difference in population proportions.
Conditions:
Since the p-value of 0.0212 is less than our alpha level of 0.05, we have convincing statistical evidence to reject the null hypothesis. The proportion of customers who would answer “yes” to the survey question is different for new vs. used vehicle sales.
Nice job! You’ve defined parameters, checked conditions, named the test, obtained appropriate test statistic and p-value, and made an appropriate conclusion.
Hypotheses
H_o: p_1 = p_2
H_a: p_1 ≠ p_2
Where p_1 is the true proportion of customers who bought a new vehicle and answered “yes” to the survey question.
Where p_2 is the true proportion of customers who bought a used vehicle and answered “yes” to the survey question.
Assumptions
p_hat_combined = (105(0.8381) + 120(0.7083)) / (105 + 120) = 0.7689
z = ((0.8381 - 0.7083) - 0) / sqrt((0.7689 * (1 - 0.7689) / 105) + (0.7689 * (1 - 0.7689) / 120)) = 2.3036
p-value = 2*normalcdf(2.3036, 1E99, 0, 1) = 0.0212
alpha = 0.05
p-value<alpha
Conclusion
Since the p-value<alpha, we reject the H_o. There is sufficient evidence to suggest that the proportion of customers who bought a new vehicle and answered “yes” to the survey question is different from the proportion of customers who bought a used vehicle and answered “yes” to the survey question.
Well done from top to bottom - you’ve got parameters, conditions, appropriate calculations, and appropriate conclusions. You’re ready!
H null: p1-p2=0
H alternative: p1-p2 doesn’t equal 0.
Anything with subscript 1 pertains to the new vehicle group, while anything with subscript 2 pertains to the used vehicle group.
The test we will be using is a two sample z-test (for difference in proportion).
Conditions:
p1 hat = 0.8381
p2 hat = 0.7083
Z=0.1298/((0.7361)(0.2639)(0.0179))^.5
Z=2.19
pvalue = 2(1-0.9857) = 0.0286.
Conclusion: Since the p-value of 0.0286 is less than the alpha level of 0.05, we reject the null hypothesis, leading us to the conclusion that the data do provide convincing statistical evidence that the proportion of customers who would answer yes to the survey question is different for new vs used vehicle sales.
Good work! You’ve done all required parts of a hypothesis test and answered appropriately. The only place where you might not receive full credit is in your hypotheses: you’ve used symbols, and named which group is which, but did not define the parameter in context. The parameter here was “proportion of people who would say yes to the survey”, and then can be differentiated between new/used sales.
STATE
Ho: p1=p2
Ha: p1=/p2 (=/ means not equal to)
Alpha = 0.05
p1: The proportion of customers that answered “yes” to “would you recommend our company to a friend looking to purchase a vehicle?” from buying the new vehicle
p2: The proportion of customers that answered “yes” to “would you recommend our company to a friend looking to purchase a vehicle?” from buying the used vehicle
2 sample z test for proportions
PLAN
Random: met, since stated in the problem that company selects customers randomly.
10%: 105<= 1/10 all customers who bought new vehicle
120<= 1/10 all customers who bought used vehicle
Large Counts: For new: 105(88/105)>=10 105(1-(88/105))>=10
For Used: 120(85/120)>=10 120(1-(85/120)) >=10
Since, Large counts conditions are met for both we can assume that the distribution is approximately normal.
DO
pc = (88+85)/(105+120) = 173/225 = 0.769
p-hat1 - p-hat2 = 88/105 - 85/120 = 109/840
z score = (109/840) / sqrt(0.769(1-0.769)/105 + 0.769(1-0.769)/120) = 2.3039
p-value=0.0212
CONCLUDE
Since the p-value is 0.0212 and the alpha level is 0.05, the p-value is smaller than alpha. Therefore, we reject the null hypothesis of p1=p2. We have convincing evidence that the proportion of customers who would answer “Yes” to the survey question is different for new vs used vehicle sales.
Nicely done! You’ve done all components of a hypothesis test correctly (and shout-out to the state-plan-do-conclude method that’s in the textbook I use as well).
2-sample z test for proportion
Ho: p1 = p2
Ha: p1 =/= p2
Conditions:
z= (88/105-85/120)-0/ sqrt(173/225)(52/225) x sqrt(1/105+1/120) = 2.304
p-value: .0212
Interpret: Because our p-value (.0212) is below the alpha (0.05), we reject the Ho. There is significant evidence that the proportion of customers who would answer “yes” to the survey question is different for new vs used vehicle sales.
The only part you need more on is your hypotheses; you don’t define what you mean by “p1” and “p2”, so you’d only get partial credit there (you didn’t define the parameter). Every other part is done appropriately and would earn full credit.
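The responses write the pooled standard error in two ways: as one square root with two terms, or factored as sqrt(pc(1 - pc)) x sqrt(1/n1 + 1/n2), as in the response above. A short Python sketch using exact fractions suggests the two forms agree; the variable names are illustrative assumptions.

```python
import math
from fractions import Fraction

n1, n2 = 105, 120
p_pool = Fraction(88 + 85, n1 + n2)   # pooled proportion, 173/225
q_pool = 1 - p_pool                   # complement, 52/225

# Factored form: sqrt(pc * qc) * sqrt(1/n1 + 1/n2)
se_factored = math.sqrt(p_pool * q_pool) * math.sqrt(Fraction(1, n1) + Fraction(1, n2))

# Summed form: sqrt(pc * qc / n1 + pc * qc / n2)
se_summed = math.sqrt(p_pool * q_pool / n1 + p_pool * q_pool / n2)

diff = Fraction(88, n1) - Fraction(85, n2)   # exact difference of sample proportions
z = float(diff) / se_factored

print(f"pooled proportion = {p_pool}, difference = {diff}")
print(f"SE (factored) = {se_factored:.5f}, SE (summed) = {se_summed:.5f}")
print(f"z = {z:.3f}")
```

Keeping the fractions exact until the final square root avoids the small rounding differences that show up when 0.769 or 0.84 is typed in by hand.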
pnew= proportion of customers who bought a new car and would recommend to a friend.
pused= proportion of customers who bought a used car and would recommend to a friend.
H0=pnew−pold=0 Ha=pnew−pold≠0
Conditions:
Calculations:
p=0.021
Conclusion: Since our p-value of 0.021<0.05 , we can reject the H0 . We have convincing statistical evidence that there is a difference between population proportion of the people who would recommend to a friend between the people who bought new and old cars.
Very good work. Little thing: on “calculations” part, it is typical to show both the test statistic (in this case, your z), as well as the p-value.