11 min read • January 12, 2021

Jerry Kosoff

Practicing with FRQs is a great way to prep for the AP exam! Work through this FRQ from Unit 6, then review sample student responses and corresponding feedback from Fiveable teacher Jerry Kosoff!

After completing a sale, a car company likes to send a follow-up survey where customers can indicate their level of satisfaction with their experience. One of the questions in the survey asks “would you recommend our company to a friend looking to purchase a vehicle?” The company wonders if people would answer the question differently based on whether they bought a new or used vehicle. From a list of all 2018 vehicle sales, the company randomly selects 105 customers who bought a new vehicle and 120 customers who bought a used vehicle. 88 of the customers who bought new vehicles answered “yes,” while 85 of the customers who bought used vehicles answered “yes.”

P(yes|new) = 88/105 = 0.838

P(yes|used) = 85/120 = 0.708

margin of error = +/- (1.960)sqrt[(0.838(0.162)/105)+(0.708(0.292)/120)]

margin of error = +/- 0.108

confidence interval = 0.05 +/- 0.108 = (-0.058, 0.113)

No, the data do not provide convincing statistical evidence that the proportion of customers who would answer “yes” to the survey question is different for new vs used vehicle sales. Since 0 is captured in the 95% confidence interval of -0.058 and 0.113, the data shows that the true difference in proportions could be 0.

Teacher Feedback

I’ll give feedback on your work below, but I want to start with noticing that you used a 2-sample confidence interval to answer the question. That is a totally valid strategy for a situation like this, but only because the alternative hypothesis was “different”; had the scenario asked “higher” or “lower” the confidence interval would not work in the same way. Typically, when given a significance level, and asked if there is “convincing statistical evidence” of something, we should be running a hypothesis test. That said, you will still be scored for your work with the confidence interval.

The scoring for a “convincing statistical evidence…” scenario includes:

- Stating null/alternative hypotheses
- Defining the parameters in the null/alternative hypotheses
- Choosing an appropriate test/interval by name
- Checking the conditions to run the chosen test/interval
- Writing the results from the chosen test/interval
- Correctly interpreting the results from the chosen test/interval in terms of whether we do or don’t have evidence for the alternative hypothesis.

Given that list (some parts are scored together to create a question with 3-4 scoring components), you can likely see that your work doesn’t have enough there to earn much of the available credit. You calculate the appropriate margin of error, and therefore obtain a confidence interval, but never name the interval, check conditions (random samples, approximately normal sampling distribution [at least 10 successes/at least 10 failures], 10% condition), or write hypotheses. Additionally, you used “0.05” as the center of the interval, instead of using your difference of proportions (0.838 - 0.708 = 0.13) as the value to add/subtract the margin of error from. That would have led you to a different confidence interval where 0 was not included. Given that your interval did include 0, though, your conclusion that we do not have convincing evidence would get scored as correct, because you interpreted the answer you got correctly. Unfortunately, you would not get credit for the other components of the question.
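
To see what the interval looks like when it is centered on the observed difference, here is a minimal sketch in Python using only the counts from the problem. The function name `two_prop_ci` is made up for illustration, and the critical value comes from the standard library rather than a table.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_ci(x1, n1, x2, n2, confidence=0.95):
    """Unpooled confidence interval for p1 - p2 (hypothetical helper)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2  # point estimate: this is the center of the interval
    # Critical value z*, about 1.960 for 95% confidence
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Standard error uses each sample's own proportion (no pooling)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    margin = z_star * se
    return diff, margin, (diff - margin, diff + margin)

diff, margin, (low, high) = two_prop_ci(88, 105, 85, 120)
print(f"difference = {diff:.3f}, margin of error = {margin:.3f}")
print(f"95% CI = ({low:.3f}, {high:.3f})")
```

With the difference of about 0.13 as the center and the same margin of error of about 0.108, both endpoints are positive, so 0 is not captured.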

p_1 = the proportion of customers who bought a new vehicle and answered yes to the survey question

p_2 = the proportion of customers who bought a new vehicle and answered yes to the survey question

2-sample z test for p_1 - p_2

H_0: p_1 - p_2 = 0, H_a: p_1 - p_2 not equal 0

- Random - Stated that the company “randomly selects” customers for the survey
- 10% Condition for Independence - satisfied since it is safe to assume that there are at least 105(10) = 1050 customers who bought a new vehicle at the car company, and at least 120(10) = 1200 customers who bought a used vehicle at the car company.
- Large Counts Condition - satisfied since
- n_1 * p-hat_1 = 105 * 0.838 = 87.99 >= 10
- n_1 * (1-p-hat_1) = 105 * 0.162 = 17.01 >= 10
- n_2 * p-hat_2 = 120 * 0.708 = 84.96 >= 10
- n_2 * (1-p-hat_2) = 120 * 0.292 = 35.04 >= 10
- With Large Counts Condition satisfied, the sampling distribution of p-hat_1 - p-hat_2 is approximately normal.

p-hat_1 = 88/105 = 0.838, n_1=105

p-hat_2 = 85/120 = 0.708, n_2=120

z* = 2.303

P-val = P(z>=2.303 or z<=-2.303) = 0.021247

Since 0.021247 < alpha of 0.05, we reject the null hypothesis, because there is convincing statistical evidence that the proportion of customers who would answer “yes” to the survey question is different for new vs used vehicle sales.

Teacher Feedback

Strong execution from top to bottom, presented clearly. One thing that you’re going to facepalm about: you defined the parameters p1 and p2 as the exact same thing. “p2” should say “used.”
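
For anyone who wants to verify the test statistic and p-value reported in the response above, a minimal sketch of the pooled two-sample z-test follows. The function name `two_prop_z_test` is an assumption for illustration; the arithmetic is the standard pooled formula applied to the counts in the problem.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2):
    """Two-sided, pooled z-test for H0: p1 - p2 = 0 (hypothetical helper)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Under H0 the two proportions are equal, so combine both samples
    p_pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    # Two-sided alternative: double the area in one tail
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p_value = two_prop_z_test(88, 105, 85, 120)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

Working from the raw counts (88 of 105 and 85 of 120) instead of rounded proportions keeps rounding error out of the result.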

Test type- two sample proportional difference hypothesis z-test

- H0= p1-p2=0
- Ha=p1-p2 does not = 0
- p1=the proportion of all customers from a list of 2018 vehicle sales who bought a new vehicle and would recommend our company to a friend
- p2=the proportion of all customers from a list of 2018 vehicle sales who bought a used vehicle and would recommend our company to a friend

Conditions

- Since we are dealing with a two-sample proportional difference hypothesis test, we will have to pool/combine our proportions.
- pc=88+85/105+120=0.75
- qc=1-0.75=0.25
- *we have to check for the independence of our pooled data --> .75(225)=168.75 >=10 & .25(225)=56.25 >= 10

Conditions for new cars:

- simple random sample: stated in the problem- “the company randomly selects 105 customers…”
- independence: 10(105)=1050 Assume that the population of new cars purchased in 2018 is greater than 1050.
- normal: .84(105)=(88 >= 10), .162(105)=(17 >= 10) All of the conditions for new cars are met.

Conditions for old cars:

- simple random sample: stated in the problem- “the company randomly selects…120 customers…”
- independence: 10(120)=1200 Assume that the population of new cars purchased in 2018 is greater than 1200.
- normal: .708(120)=(85 >= 10), .292(120)=(35 >= 10)
- All of the conditions for old cars are met.

Solve

- z=.84-.708/(sqrt.(.25x.75)/105 + (.25x.75)/120) = 2.28 --> *I looked at table z to find the p-value of -2.28, p-value=.0129(2)=.0258

Conclusion

- (.0258<.05 our significance level) --> Reject the H0 in favor of the Ha. We have significant evidence that the proportion of all customers from a list of 2018 vehicle sales who bought a new vehicle and said they would recommend the company is different than the proportion of all customers from a list of 2018 vehicle sales who bought a used vehicle and said that they would recommend our company to a friend.

Teacher Feedback

This is about as thorough a response as I’ve seen! Very well done - you’ve nailed all of the components.
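
The z of 2.28 in this response is a little lower than the roughly 2.30 that other responses report. The gap appears to come from rounding intermediate values (0.84, 0.708, and a pooled proportion of 0.75) before computing. A small sketch comparing the two, where `pooled_z` is a hypothetical helper:

```python
from math import sqrt

n1, n2 = 105, 120

def pooled_z(p1_hat, p2_hat, p_pooled):
    """Pooled z statistic for a difference in proportions (hypothetical helper)."""
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se

# Values as rounded in the response
z_rounded = pooled_z(0.84, 0.708, 0.75)
# Unrounded values computed directly from the counts
z_exact = pooled_z(88 / 105, 85 / 120, (88 + 85) / (105 + 120))

print(f"rounded inputs: z = {z_rounded:.2f}")
print(f"exact inputs:   z = {z_exact:.2f}")
```

Either way the p-value lands below 0.05, so the conclusion does not change, but carrying more decimal places (or using the calculator's built-in test) gives a result that matches more closely.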

p1= Proportion of customers who bought a new car and answers “yes” to the survey question.

p2= Proportion of customers who bought a used car and answered “yes” to the survey question.

Ho= p1-p2=0 Ha= p1-p2 does not equal 0

We are interested in conducting a 2 sample z test for a difference in population proportions.

Conditions:

- Random- A random sample of 105 customers who bought a new vehicle and 120 customers who bought a used vehicle is taken
- Normal-
- Sample of new cars: np = 105 * 0.838= 88 is greater than or equal to 10.
- n(1-p)= 105(0.162)= 17 is greater than or equal to 10.

- Sample of used cars: np= 120* 0.708= 85 is greater than or equal to 10.
- n(1-p)= 120(0.292)= 35 is greater than or equal to 10.

Calculator: 2-Prop Z Test {x1=88, n1=105, x2=85, n2=120, p1 does not equal p2} = p: 0.0212

Since the p-value of 0.0212 is less than our alpha level of 0.05, we have convincing statistical evidence to reject the null hypothesis. The proportion of customers who would answer “yes” to the survey question is different for new vs. used vehicle sales.

Teacher Feedback

Nice job! You’ve defined parameters, checked conditions, named the test, obtained appropriate test statistic and p-value, and made an appropriate conclusion.
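
The condition checks in this response can also be sketched in a few lines. Note that n times p-hat is just the observed count of "yes" answers, so the Large Counts check reduces to counting successes and failures. The helper `check_sample` is hypothetical, and the population sizes it reports are the minimums the 10% condition asks us to assume, not figures given in the problem.

```python
def check_sample(successes, n):
    """Large Counts and 10% condition details for one sample (hypothetical helper)."""
    failures = n - successes
    return {
        "successes": successes,
        "failures": failures,
        # Large Counts: at least 10 successes and at least 10 failures
        "large_counts_ok": successes >= 10 and failures >= 10,
        # 10% condition: the population should be at least 10 times the sample
        "min_population": 10 * n,
    }

new = check_sample(88, 105)
used = check_sample(85, 120)
for label, result in (("new", new), ("used", used)):
    print(label, result)
```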

H_o: p_1 = p_2

H_a: p_1 ≠ p_2

Where p_1 is the true proportion of customers who bought a new vehicle and answered “yes” to the survey question.

Where p_2 is the true proportion of customers who bought a used vehicle and answered “yes” to the survey question.

- Independence:
- We have 2 independent random samples of customers from 2018 vehicle sales.
- Population of new vehicle customers is at least 1050 and the population of used vehicle customers is at least 1200.

- Normality:
- n_1 * p-hat_1 = 105 * 0.8381 = 88 ≥ 10
- n_1 * (1-p-hat_1) = 105* (0.1619) = 16.9995 ≥ 10
- n_2 * p-hat_2 = 120 * 0.7083 = 84.996 ≥ 10
- n_2 * (1-p-hat_2) = 120 * 0.2917 = 35.004 ≥ 10
- Since all 4 are greater than 10, the sampling distribution is approximately normal.

p_hat_combined = (105(0.8381) + 120(0.7083)) / (105 + 120) = 0.7689

z = ((0.8381 - 0.7083) - 0) / sqrt((0.7689 * (1 - 0.7689) / 105) + (0.7689 * (1 - 0.7689) / 120)) = 2.3036

p-value = 2*normalcdf(2.3036, 1E99, 0, 1) = 0.0212

alpha = 0.05

p-value<alpha

Since the p-value<alpha, we reject the H_o. There is sufficient evidence to suggest that the proportion of customers who bought a new vehicle and answered “yes” to the survey question is different from the proportion of customers who bought a used vehicle and answered “yes” to the survey question.

Teacher Feedback

Well done from top to bottom - you’ve got parameters, conditions, appropriate calculations, and appropriate conclusions. You’re ready!
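
The `normalcdf(2.3036, 1E99, 0, 1)` step in the response above is a calculator command. A rough Python equivalent, assuming the same argument order of lower bound, upper bound, mean, and standard deviation, could look like this:

```python
from statistics import NormalDist

def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under a normal curve between lower and upper.

    Mimics the calculator command of the same name; this Python
    version is an illustrative stand-in, not the calculator's code.
    """
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

# Upper-tail area beyond z = 2.3036, doubled for a two-sided test
p_value = 2 * normalcdf(2.3036, 1e99, 0, 1)
print(f"p-value = {p_value:.4f}")
```

Doubling the tail area is what makes this a two-sided p-value, which matches the "not equal" alternative hypothesis.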

H null: p1-p2=0

H alternative: p1-p2 doesn’t equal 0.

Anything with subscript 1 pertains to the new vehicle group while anything with subscript 2 pertains to the used vehicle group.

The test we will be using is a two sample z-test (for difference in proportion).

- np1 hat and nq1 hat both greater than 10 (or 15). np2 hat and nq2 hat both greater than 10 or (15). Calculations omitted, but it is clear that both groups both have more than 10 successes and failures.
- Samples are random and independent. This is met as the question specifies that the selection of customers from both groups was random, and it is clear that the selection of a customer from one group does not affect the selection of a customer from the other group.
- It is also safe to assume that the number of customers who bought new and used cars is at least 105 * 10 = 1050 and 120 * 10 = 1200 people, respectively.

Z= (p1 hat - p2 hat)/(phat *(1-phat)(1/n1 + 1/n2))^.5 where phat is the pooled proportion = 173/235 = 0.7361

p1 hat = 0.8381

p2 hat = 0.7083

Z=0.1298/((0.7361)(0.2639)(0.0179))^.5

Z=2.19

pvalue = 2(1-0.9857) = 0.0286.

Conclusion: Since the pvalue of 0.0286 is less than the alpha level of 0.05, we reject the null hypothesis, leading us to the conclusion that the data does provide convincing statistical evidence that the proportion of customers who would answer yes to the survey question is different for new vs used vehicle sales.

Teacher Feedback

Good work! You’ve done all required parts of a hypothesis test and answered appropriately. The only place where you might not receive full credit is in your hypotheses: you’ve used symbols, and named which group is which, but did not define the parameter in context. The parameter here was “proportion of people who would say yes to the survey”, and then can be differentiated between new/used sales.
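
This response writes the pooled standard error in factored form, with phat(1 - phat) multiplied by (1/n1 + 1/n2), while other responses add two separate terms. The two forms are algebraically the same, which a quick sketch can confirm. One arithmetic detail worth double-checking when pooling: the denominator is the combined sample size, 105 + 120 = 225. Variable names here are chosen for illustration.

```python
from math import sqrt, isclose

x1, n1 = 88, 105   # new vehicles: "yes" answers, sample size
x2, n2 = 85, 120   # used vehicles: "yes" answers, sample size

# Pooled proportion: total successes over total sample size (173 / 225)
p_pooled = (x1 + x2) / (n1 + n2)
q_pooled = 1 - p_pooled

# Factored form of the pooled standard error
se_factored = sqrt(p_pooled * q_pooled * (1 / n1 + 1 / n2))
# Two-term form of the same quantity
se_two_term = sqrt(p_pooled * q_pooled / n1 + p_pooled * q_pooled / n2)

print(f"pooled proportion = {p_pooled:.4f}")
print(f"factored SE = {se_factored:.5f}, two-term SE = {se_two_term:.5f}")
print("forms agree:", isclose(se_factored, se_two_term))
```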

Ho: p1=p2

Ha: p1=/p2 (=/ means not equal to)

Alpha = 0.05

p1: The proportions of customers that answered “yes” to “would you recommend our company to a friend looking to purchase a vehicle?”from buying the new vehicle

p2: The proportions of customers that answered “yes” to “would you recommend our company to a friend looking to purchase a vehicle?”from buying the used vehicle

2 sample z test for proportions

Random: met, since stated in the problem that company selects customers randomly.

10%: 105<= 1/10 all customers who bought new vehicle

120<= 1/10 all customers who bought used vehicle

Large Counts: For new: 105(88/105)>=10 105(1-(88/105))>=10

For Used: 120(85/120)>=10 120(1-(85/120)) >=10

Since, Large counts conditions are met for both we can assume that the distribution is approximately normal.

z score = 109/840 / root 0.769(1-0.769)/105 + 0.769(1-0.769)/120 = 2.3039

p-value=0.0212

Teacher Feedback

Nicely done! You’ve done all components of a hypothesis test correctly (and shout-out to the state-plan-do-conclude method that’s in the textbook I use as well).

2-sample z test for proportion

Ho: p1 = p2

Ha: p1 =/= p2

Conditions:

- Random: We are told that the customers are “randomly selected”
- Normal: It is approximately normal because:
- (88/105)(105)=88 ≥10 (17/105)(105) = 17 ≥ 10
- (85/120)(120)=85 ≥10 (35/120)(120) = 35 ≥10

- 10% condition: There are more than 10 x 120 and 10 x 105 customers. The two samples are independent from each other.

Calculation: Pc = 88+85/105+120 = 173/225

z= (88/105-85/120)-0/ sqrt(173/225)(52/225) x sqrt(1/105+1/120) = 2.304

p-value: .0212

Interpret: Because our p-value (.0212) is below the alpha (0.05) we reject the Ho. There is significant evidence that the proportion of customers who would answer “yes” to the survey question is different for new vs used vehicle sales.

Teacher Feedback

The only part you need more on is your hypotheses; you don’t define what you mean by “p1” and “p2”, so you’d only get partial credit there (you didn’t define the parameter). Every other part is done appropriately and would earn full credit.

pnew= proportion of customers who bought a new car and would recommend to a friend.

pused= proportion of customers who bought a used car and would recommend to a friend.

H0=pnew−pold=0
Ha=pnew−pold≠0

- Random: “The company randomly selects 105 customers who bought a new vehicle and 120 customers who bought a used vehicle”
- Normal: For the new vehicle population, there is 88≥10 successes and 105−88=17≥10 failures. For the old vehicle population, there is 85≥10 successes and 120−85=35≥10 failures.
- Independent: It is reasonable to assume that the population of the people who bought new and old cars is at least 1050 and 1200 respectively.

We will be conducting a 2 sample z-test for difference of population proportions.

Calculations:

p=0.021

Teacher Feedback

Very good work. Little thing: on the “calculations” part, it is typical to show both the test statistic (in this case, your z) and the p-value.
