
8.3 Carrying Out a Chi Square Goodness of Fit Test

5 min read • January 7, 2023

Jed Quiaoit

Josh Argo


Recall from the previous section that a chi-square goodness of fit test determines if an observed frequency distribution differs significantly from a theoretical expected distribution. It compares the observed frequency in each category with the frequency expected in that category under the null hypothesis. 💪

The big picture procedure for carrying out a chi-square goodness of fit test goes:

(1) Hypotheses: State the null and alternative hypotheses. The null hypothesis is that the population distribution matches the expected (claimed) distribution, while the alternative hypothesis is that at least one category proportion differs from its claimed value.

(2) Significance Level: Choose a significance level. This is the probability of rejecting the null hypothesis when it is true. Commonly used values are 0.1, 0.05, and 0.01.

(3) Test Statistic: Calculate the chi-square statistic. The chi-square statistic is calculated using the formula:

https://firebasestorage.googleapis.com/v0/b/fiveable-92889.appspot.com/o/images%2F-UrHHqnhn3P23.jpg?alt=media&token=55c8c403-f644-428f-b078-8e4b3a5af6a5

Source: Cochrane

where "observed" is the observed frequency for each category, and "expected" is the expected frequency for each category.

(4) DF Analysis: Determine the degrees of freedom. The degrees of freedom equal the number of categories minus 1.

(5) Critical Value & Tables: Look up the critical value of chi-square in a chi-square table. The critical value is the value that corresponds to the chosen significance level and degrees of freedom.

(6) Comparisons! Compare the chi-square statistic to the critical value. If the chi-square statistic is greater than the critical value, then the null hypothesis is rejected in favor of the alternative hypothesis. If the chi-square statistic is less than or equal to the critical value, then the null hypothesis cannot be rejected.

(7) Conclusion: If the null hypothesis is rejected, then the observed frequency distribution is significantly different from the expected frequency distribution. If the null hypothesis is not rejected, then the observed frequency distribution is not significantly different from the expected frequency distribution.

Doing The Test!

Now that we have checked our necessary conditions and written our hypotheses, it is time to actually carry out the test! Our test will consist of two mathematical elements: the test statistic (χ² statistic) and our p-value. 🤖

Test Statistic

The first thing we need to calculate in order to finish our test is our χ² value, which is found using the formula in the image above. For each category, we take the observed count, subtract the expected count, square that difference, and then divide by the expected count. After we have done that for every category, we sum the results to get our χ² value for the test. 📝
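
Written as an equation, the calculation described above usually looks like the following. The symbols O_i, E_i, and k are notation assumed here for the observed count in category i, the expected count in category i, and the number of categories; the formula image may label them differently.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}
```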

As with our other test statistics (z-scores and t-scores), a χ² value close to 0 is consistent with the null hypothesis, because it shows that there is not much difference between the observed and expected counts. As that difference grows, we have more evidence that our expected counts are not accurate, which leads us to reject the null hypothesis in favor of the alternative hypothesis (which states that at least one of the null proportions is incorrect).

Example

For example, let’s return to our happiness survey with this null hypothesis: 😊

  • 10% said they were unhappy (1), 

  • 15% said they were somewhat unhappy (2), 

  • 28% said they were sometimes happy and sometimes sad (3), 

  • 30% said they were happy (4), and

  • 17% said they were always happy (5)

We take a random sample of 1000 people where 120 respond 1, 180 respond 2, 220 respond 3, 480 respond 4 and 0 respond 5.
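
To see where the expected counts for this sample come from, here is a minimal Python sketch that multiplies each null proportion by the sample size. The variable names are illustrative, and the "at least 5" threshold in the condition check is an assumption (the usual large counts rule); swap in whatever condition you checked in the previous section.

```python
# Null-hypothesis proportions for ratings 1 through 5, taken from the survey above.
null_proportions = [0.10, 0.15, 0.28, 0.30, 0.17]
sample_size = 1000

# The hypothesized proportions should add up to 1 (allowing for floating-point noise).
assert abs(sum(null_proportions) - 1.0) < 1e-9

# Expected count = sample size * hypothesized proportion.
# round(..., 6) only strips floating-point noise; it does not force whole numbers.
expected_counts = [round(sample_size * p, 6) for p in null_proportions]

# ASSUMPTION: the large counts condition is "every expected count is at least 5".
large_counts_ok = all(count >= 5 for count in expected_counts)

print(expected_counts)
print(large_counts_ok)
```

Note that the condition is checked on the expected counts, not the observed ones, so the 0 people who answered 5 do not break it.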

We would:

  1. Take our observed counts of 120, 180, 220, 480, and 0;

  2. Subtract the expected counts of 100, 150, 280, 300, and 170, respectively;

  3. Square our results;

  4. Divide each of the squared results by its respective expected count;

  5. Sum up all five of the outcomes in step 4.

Or… use your handy-dandy TI-84 (or similar) graphing calculator to do this for you (highly recommended)! A general example of calculating chi-square values (in the context of political views in a sample of 300 people) is shown below as well.

https://firebasestorage.googleapis.com/v0/b/fiveable-92889.appspot.com/o/images%2F-5x4bTBIj2yVO.png?alt=media&token=f4c5b0c8-97a1-4eb9-b05f-966fd23684d9
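
Returning to the happiness example, the five hand-calculation steps can also be sketched in a few lines of Python if you would like to check the calculator. This is only an illustration; the variable names are made up for this sketch.

```python
# Observed and expected counts for happiness ratings 1 through 5.
observed = [120, 180, 220, 480, 0]
expected = [100, 150, 280, 300, 170]

# Steps 1-4: (observed - expected)^2 / expected for each category.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

# Step 5: add up the five contributions to get the chi-square statistic.
chi_square = sum(contributions)

for rating, part in enumerate(contributions, start=1):
    print(f"rating {rating}: {part:.3f}")
print(f"chi-square statistic: {chi_square:.3f}")
```

The contributions come out to 4, 6, about 12.857, 108, and 170, for a total of about 300.857. Most of that total comes from ratings 4 and 5, which is where the sample strays furthest from the null proportions.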

Degrees of Freedom

As with our t-score tests and intervals, we have to find our degrees of freedom in order to complete our test. To find our degrees of freedom, we simply take the number of categories and subtract 1. So with our happiness scale example, which has 5 categories, we would have 4 degrees of freedom. ➖

P-Value

Recall that the p-value is the probability of obtaining a chi-square statistic that is at least as extreme as the one observed, given that the null hypothesis is true. 🅿️

Once you finally get your χ² value, you calculate your p-value by finding the probability of getting a χ² value at least that large by random chance, assuming the null hypothesis is true. As always, if our p is low, we reject the H0.

To determine the critical value, you will need to use a chi-square table or a computer program to look up the critical value of chi-square that corresponds to the chosen significance level and degrees of freedom. The p-value is then calculated based on the observed chi-square statistic and the degrees of freedom.

Once you have calculated the chi-square statistic and critical value, you can then compare the chi-square statistic to the critical value to determine whether to reject or fail to reject the null hypothesis. If the chi-square statistic is greater than the critical value, then the null hypothesis is rejected in favor of the alternative hypothesis. If the chi-square statistic is less than or equal to the critical value, then the null hypothesis cannot be rejected.
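
If you are curious what the table or calculator is doing behind the scenes, here is a rough Python sketch for the happiness example. It relies on a closed-form expression for the upper-tail area that only works when the degrees of freedom are even (true here, with 4), and it finds the critical value by simple bisection. The function names are hypothetical helpers written for this sketch, not part of any calculator or library.

```python
import math

def chi_square_upper_tail(statistic, df):
    """Area to the right of `statistic` under a chi-square curve.

    ASSUMPTION: df is a positive even number; the closed form below
    does not apply to odd degrees of freedom.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive, even df")
    half = statistic / 2
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # builds (statistic/2)^i / i!
        total += term
    return math.exp(-half) * total

def chi_square_critical_value(alpha, df, upper=1000.0):
    """Bisection search for the cutoff whose upper-tail area equals alpha."""
    low, high = 0.0, upper
    for _ in range(200):
        mid = (low + high) / 2
        if chi_square_upper_tail(mid, df) > alpha:
            low = mid             # tail area still too big: move right
        else:
            high = mid
    return (low + high) / 2

# Happiness example: observed vs. expected counts for ratings 1 through 5.
observed = [120, 180, 220, 480, 0]
expected = [100, 150, 280, 300, 170]
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1            # 5 categories - 1 = 4
alpha = 0.05
critical_value = chi_square_critical_value(alpha, df)
p_value = chi_square_upper_tail(chi_square, df)

print(f"critical value: {critical_value:.3f}")
print(f"p-value: {p_value:.3g}")
print("reject H0" if chi_square > critical_value else "fail to reject H0")
```

The critical value comes out to about 9.488, matching a standard chi-square table for 4 degrees of freedom at the 0.05 level. Comparing the statistic to the critical value and comparing the p-value to the significance level always lead to the same decision, so you only need one of the two.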

Example

After calculating our test for the happiness example, this was the calculator output that we got:

https://firebasestorage.googleapis.com/v0/b/fiveable-92889.appspot.com/o/images%2F-FNm9Rbl5YCU6.png?alt=media&token=08dbdb56-2450-4b9a-bf26-c54bdb29aeb0

Conclusion

Just as we concluded hypothesis tests in previous units, we must compare our p-value to a given α value. If it is less than our alpha, we reject the H0 and have convincing evidence for the Ha. Otherwise, we fail to reject the null and do not have convincing evidence for the Ha. Remember two things:

  1. Never “accept” anything!

  2. Include context!

In the example above, we can see that our p-value is essentially 0. Therefore we would say something like this:

Since our p-value (~0) is less than 0.05, we reject the null hypothesis. We have convincing evidence that at least one of the claimed proportions for how people rank on the happiness scale is incorrect. 😔

🎥  Watch: AP Stats Unit 8 - Chi Squared Tests

Key Terms to Review (8)

Chi-Square Goodness of Fit Test

: The chi-square goodness of fit test is a statistical test used to determine if observed categorical data fits an expected distribution. It compares the observed frequencies with the expected frequencies and assesses whether any significant differences exist.

Chi-Square Statistic

: The chi-square statistic is a measure used in hypothesis testing to determine if there is a significant difference between observed and expected frequencies in categorical data. It quantifies how much the observed data deviates from what would be expected under the null hypothesis.

Critical Value

: A critical value is a specific value that separates the rejection region from the non-rejection region in hypothesis testing. It is compared to the test statistic to determine whether to reject or fail to reject the null hypothesis.

Degrees of Freedom

: Degrees of freedom refers to the number of values in a calculation that are free to vary. In statistics, it represents the number of independent pieces of information available for estimating a parameter.

Hypothesis Testing

: Hypothesis testing is a statistical method used to make inferences about population parameters based on sample data. It involves formulating a null hypothesis and an alternative hypothesis, collecting data, calculating test statistics, and making decisions about rejecting or failing to reject the null hypothesis.

P-value

: The p-value is the probability, assuming the null hypothesis is true, of obtaining a result at least as extreme as the one observed. The smaller the p-value, the stronger the evidence against the null hypothesis.

Reject the Null Hypothesis

: When conducting a hypothesis test, rejecting the null hypothesis means that there is enough evidence to support the alternative hypothesis. In other words, it suggests that there is a significant difference or relationship between variables.

Significance Level

: The significance level, also known as alpha (α), determines how much evidence we need to reject the null hypothesis. It represents the probability of making a Type I error.



© 2024 Fiveable Inc. All rights reserved.

AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.

