✳️AP Statistics Unit 8 Vocabulary

75 essential vocabulary terms and definitions for Unit 8 – Chi–Squares

✳️Unit 8 – Chi–Squares

8.1 Introducing Statistics

Term: Definition
categorical data: Data that represents categories or groups rather than numerical measurements, such as colors, types, or classifications.
expected count: The count that would be expected in each category, or in each cell of a two-way table, if the null hypothesis were true.
observed count: The actual number of observations in each category, or in each cell of a two-way table, in the collected data.
variation: Differences in data that occur by chance due to the random nature of sampling, rather than from systematic causes.
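
To make the idea of chance variation concrete, here is a minimal simulation sketch. The fair six-sided die, the sample size of 60, and the helper name `simulate_die_rolls` are all assumptions chosen for illustration; they are not part of the unit's vocabulary. The observed counts change from sample to sample even though the expected counts stay fixed.

```python
import random
from collections import Counter

def simulate_die_rolls(n_rolls, seed):
    """Roll a fair six-sided die n_rolls times and return observed counts for faces 1-6."""
    rng = random.Random(seed)  # seeded so each run is repeatable
    rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
    counts = Counter(rolls)
    return [counts.get(face, 0) for face in range(1, 7)]

n_rolls = 60
# Expected count per face under the fair-die model: 60 * (1/6) = 10
expected = [n_rolls / 6] * 6

for seed in (1, 2, 3):  # three hypothetical random samples
    observed = simulate_die_rolls(n_rolls, seed)
    print(f"sample {seed}: observed={observed} expected={expected}")
```

The gaps between the observed and expected counts in each printed sample come from random sampling alone, since the simulated die really is fair.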

8.2 Setting Up a Chi-Square Goodness of Fit Test

Term: Definition
alternative hypothesis: The claim that contradicts the null hypothesis, representing what the researcher is trying to find evidence for.
categorical data: Data that represents categories or groups rather than numerical measurements, such as colors, types, or classifications.
chi-square distributions: Probability distributions used to test the goodness of fit between observed and expected categorical data. They take only positive values and are skewed right.
chi-square statistic: A test statistic that measures the distance between observed and expected counts relative to the expected counts: the sum, over all categories, of (observed − expected)² / expected.
chi-square test: A statistical test used to determine whether observed frequencies of categorical data match expected frequencies based on a hypothesized distribution.
degrees of freedom: A parameter of the chi-square distribution that determines its shape. For a goodness of fit test, it equals the number of categories minus 1.
distribution of proportions: The way in which proportions are spread across the categories of a categorical variable.
expected count: The count that would be expected in each category if the null hypothesis were true. In a goodness of fit test, it equals the sample size times the null proportion for that category.
goodness of fit: A statistical test that determines how well observed data match the expected distribution specified by a hypothesis.
independence: The condition that observations in a sample are not influenced by each other, typically ensured through random sampling or randomized experiments.
null hypothesis: The initial claim or assumption being tested in a hypothesis test, typically stating that there is no effect or no difference.
null proportion: The hypothesized proportion for each category under the null hypothesis in a chi-square goodness of fit test.
observed count: The actual number of observations in each category in the collected data.
proportion: A part or share of a whole, expressed as a fraction, decimal, or percentage.
random sample: A sample selected from a population in such a way that every member has an equal chance of being chosen, reducing bias and allowing for valid statistical inference.
randomized experiment: A study design where subjects are randomly assigned to treatment groups to establish cause-and-effect relationships.
sample size: The number of observations or data points collected in a sample, denoted as n.
sampling without replacement: A sampling method in which an item selected from a population cannot be selected again in subsequent draws.
statistical inference: The process of drawing conclusions about a population based on data collected from a sample.
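
The terms above fit together in a short calculation. The sketch below is illustrative only: the candy-color counts and the equal null proportions are made-up data, and the function names `expected_counts` and `chi_square_statistic` are hypothetical helpers, not part of any library.

```python
def expected_counts(n, null_proportions):
    """Expected count per category: sample size times the null proportion."""
    return [n * p for p in null_proportions]

def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: 100 candies sorted into four colors.
observed = [30, 20, 25, 25]
# Null hypothesis (assumed for this example): all four colors are equally likely.
null_proportions = [0.25, 0.25, 0.25, 0.25]

n = sum(observed)                                  # sample size
expected = expected_counts(n, null_proportions)    # 25 expected per color
chi_sq = chi_square_statistic(observed, expected)  # (25 + 25 + 0 + 0) / 25
df = len(observed) - 1                             # categories minus 1

print(f"expected={expected} chi-square={chi_sq} df={df}")
```

A larger chi-square statistic means the observed counts sit farther from the expected counts, relative to the size of those expected counts.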

8.3 Carrying Out a Chi-Square Goodness of Fit Test

Term: Definition
chi-square distribution: A probability distribution used in chi-square tests, characterized by degrees of freedom and used to determine p-values for test statistics.
chi-square test: A statistical test used to determine whether observed frequencies of categorical data match expected frequencies based on a hypothesized distribution.
degrees of freedom: A parameter of the chi-square distribution that determines its shape. For a goodness of fit test, it equals the number of categories minus 1.
expected count: The count that would be expected in each category if the null hypothesis were true. In a goodness of fit test, it equals the sample size times the null proportion for that category.
null distribution: The probability distribution of the test statistic under the assumption that the null hypothesis is true.
null hypothesis: The initial claim or assumption being tested in a hypothesis test, typically stating that there is no effect or no difference.
observed count: The actual number of observations in each category in the collected data.
p-value: The probability of observing a test statistic at least as extreme as the one calculated from the sample data, assuming the null hypothesis is true.
probability model: A mathematical framework that describes the probability distribution of outcomes under specified assumptions.
reject the null hypothesis: The decision made when the p-value is less than or equal to the significance level, indicating sufficient evidence against the null hypothesis.
significance level: The threshold probability (α) used to determine whether to reject the null hypothesis in a significance test.
significance test: A statistical procedure used to determine whether there is sufficient evidence to reject the null hypothesis based on sample data.
test statistic: A calculated value used to determine whether to reject the null hypothesis in a hypothesis test, computed from sample data.
theoretical distribution: A probability distribution based on a mathematical model, such as the normal distribution, used to approximate the distribution of a test statistic.
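
As a rough sketch of how a p-value and a decision follow from a chi-square statistic, the code below computes the upper-tail area of a chi-square distribution using only the standard library. The function `chi_square_sf` is a hand-rolled helper written for this example (a series for the incomplete gamma function), and the statistic, degrees of freedom, and significance level are assumed values. In practice a statistics package or calculator would normally do this step, for example `scipy.stats.chi2.sf`.

```python
import math

def chi_square_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square distribution with df
    degrees of freedom, via a series for the regularized incomplete gamma
    function. Intended for small classroom-sized values, not extreme tails."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-15 * total:  # add terms until they stop mattering
        term *= z / (a + k)
        total += term
        k += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

# Assumed inputs for illustration: a goodness of fit test with 4 categories.
chi_sq = 2.0
df = 3
alpha = 0.05  # significance level

p_value = chi_square_sf(chi_sq, df)  # about 0.57 for these inputs
if p_value <= alpha:
    decision = "reject the null hypothesis"
else:
    decision = "fail to reject the null hypothesis"

print(f"p-value = {p_value:.4f}: {decision}")
```

The chi-square test is always upper-tailed: only large values of the statistic count as evidence against the null hypothesis, so the p-value is the area to the right of the statistic.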

8.4 Expected Counts in Two-Way Tables

Term: Definition
categorical data: Data that represents categories or groups rather than numerical measurements, such as colors, types, or classifications.
expected count: The theoretical frequency in each cell of a contingency table that would be expected if the null hypothesis of independence or homogeneity were true.
two-way table: A table that displays the frequency distribution of two categorical variables, organized in rows and columns.
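
For a two-way table, the expected count in each cell is (row total × column total) / table total. The sketch below applies that formula to a made-up 2×2 table; the data and the function name `expected_table` are assumptions for illustration.

```python
def expected_table(observed):
    """Expected counts for a two-way table under independence or homogeneity:
    (row total * column total) / table total for each cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    table_total = sum(row_totals)
    return [[r * c / table_total for c in col_totals] for r in row_totals]

# Hypothetical observed counts: rows are two groups, columns are two responses.
observed = [
    [30, 10],
    [20, 40],
]

expected = expected_table(observed)
for obs_row, exp_row in zip(observed, expected):
    print(f"observed={obs_row} expected={exp_row}")
```

Here the row totals are 40 and 60, both column totals are 50, and the table total is 100, so the first row's expected counts are 40 × 50 / 100 = 20 each and the second row's are 60 × 50 / 100 = 30 each.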

8.5 Setting Up a Chi-Square Test for Homogeneity or Independence

Term: Definition
alternative hypothesis: The claim that contradicts the null hypothesis, representing what the researcher is trying to find evidence for.
association: The relationship between two variables where knowing the value of one variable provides information about the other variable.
categorical data: Data that represents categories or groups rather than numerical measurements, such as colors, types, or classifications.
categorical variable: A variable that takes on values that are category names or group labels rather than numerical values.
chi-square test: A statistical test used to determine whether observed frequencies of categorical data match the frequencies expected under the null hypothesis.
chi-square test for homogeneity: A statistical test used to determine whether the distributions of a categorical variable are the same across different populations or treatments.
chi-square test for independence: A statistical test used to determine whether two categorical variables in a population are associated or independent.
distribution: The pattern of how data values are spread or arranged across a range.
expected count: The theoretical frequency in each cell of a contingency table that would be expected if the null hypothesis of independence or homogeneity were true.
homogeneity: In a chi-square test, the condition where the distribution of a categorical variable is the same across different groups or populations.
independence: The condition that observations in a sample are not influenced by each other, typically ensured through random sampling or randomized experiments.
null hypothesis: The initial claim or assumption being tested in a hypothesis test, typically stating that there is no effect or no difference.
proportion: A part or share of a whole, expressed as a fraction, decimal, or percentage.
randomized experiment: A study design where subjects are randomly assigned to treatment groups to establish cause-and-effect relationships.
row and column variables: The two categorical variables displayed in a two-way table, with one variable defining the rows and the other defining the columns.
sampling without replacement: A sampling method in which an item selected from a population cannot be selected again in subsequent draws.
simple random sample: A sample selected from a population such that every possible sample of the same size has an equal chance of being chosen.
statistical inference: The process of drawing conclusions about a population based on data collected from a sample.
stratified random sample: A sampling method in which a population is divided into separate groups called strata based on shared characteristics, and a simple random sample is selected from each stratum.
two-way table: A table that displays the frequency distribution of two categorical variables, organized in rows and columns.

8.6 Carrying Out a Chi-Square Test for Homogeneity or Independence

Term: Definition
chi-square distribution: A probability distribution used in chi-square tests, characterized by degrees of freedom and used to determine p-values for test statistics.
chi-square statistic: A test statistic that measures the distance between observed and expected counts relative to the expected counts: the sum, over all cells, of (observed − expected)² / expected.
chi-square test for homogeneity: A statistical test used to determine whether the distributions of a categorical variable are the same across different populations or treatments.
chi-square test for independence: A statistical test used to determine whether two categorical variables in a population are associated or independent.
degrees of freedom: A parameter of the chi-square distribution that determines its shape. For a test on a two-way table, it equals (number of rows − 1) × (number of columns − 1).
expected count: The theoretical frequency in each cell of a contingency table that would be expected if the null hypothesis of independence or homogeneity were true.
null hypothesis: The initial claim or assumption being tested in a hypothesis test, typically stating that there is no effect or no difference.
observed count: The actual frequency or number of observations in each cell of a contingency table from the collected data.
p-value: The probability of observing a test statistic at least as extreme as the one calculated from the sample data, assuming the null hypothesis is true.
probability model: A mathematical framework that describes the probability distribution of outcomes under specified assumptions.
reject the null hypothesis: The decision made when the p-value is less than or equal to the significance level, indicating sufficient evidence against the null hypothesis.
research question: The specific question about a population or populations that a statistical test is designed to answer.
significance level: The threshold probability (α) used to determine whether to reject the null hypothesis in a significance test.
test statistic: A calculated value used to determine whether to reject the null hypothesis in a hypothesis test, computed from sample data.
two-way table: A table that displays the frequency distribution of two categorical variables, organized in rows and columns.
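
The sketch below strings these terms together for a two-way table: expected counts, the chi-square statistic, degrees of freedom, a p-value, and a decision. The 2×2 data, the 0.05 significance level, and every function name are assumptions for illustration. `chi_square_sf` is a hand-rolled standard-library helper, and a statistics package such as SciPy (`scipy.stats.chi2_contingency`) would usually be used instead.

```python
import math

def expected_table(observed):
    """Expected cell counts: (row total * column total) / table total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    table_total = sum(row_totals)
    return [[r * c / table_total for c in col_totals] for r in row_totals]

def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over every cell."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

def chi_square_sf(x, df):
    """Upper-tail chi-square probability via an incomplete gamma series.
    Adequate for classroom-sized values; very small tails lose precision."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-15 * total:
        term *= z / (a + k)
        total += term
        k += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

# Hypothetical observed counts: rows are two groups, columns are two responses.
observed = [
    [30, 10],
    [20, 40],
]
alpha = 0.05  # assumed significance level

expected = expected_table(observed)
chi_sq = chi_square_statistic(observed, expected)
df = (len(observed) - 1) * (len(observed[0]) - 1)  # (rows - 1) * (columns - 1)
p_value = chi_square_sf(chi_sq, df)
reject = p_value <= alpha  # reject the null hypothesis when p-value <= alpha

print(f"chi-square={chi_sq:.3f} df={df} p-value={p_value:.6f} reject={reject}")
```

The arithmetic is identical for a test of homogeneity and a test of independence. What differs is how the data were collected and how the hypotheses and conclusion are worded.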