| Term | Definition |
|---|---|
| alternative hypothesis | The claim that contradicts the null hypothesis, representing what the researcher is trying to find evidence for. |
| approximately normal | A distribution that closely follows the shape of a normal distribution, allowing for the use of normal probability methods. |
| categorical variable | A variable that takes on values that are category names or group labels rather than numerical values. |
| difference of two population proportions | The comparison between two population proportions, expressed as p₁ - p₂, to determine if they differ significantly. |
| independence | The condition that observations in a sample are not influenced by each other, typically ensured through random sampling or randomized experiments. |
| null hypothesis | The initial claim or assumption being tested in a hypothesis test, typically stating that there is no effect or no difference. |
| one-sided alternative hypothesis | An alternative hypothesis that specifies the direction of the difference, either p₁ < p₂ or p₁ > p₂. |
| pooled proportion | A combined estimate of the population proportion calculated from both samples when assuming the null hypothesis is true: p̂c = (n₁p̂₁ + n₂p̂₂)/(n₁ + n₂). |
| population proportion | The true proportion or percentage of a characteristic in an entire population, typically denoted as p. |
| randomized experiment | A study design where subjects are randomly assigned to treatment groups to establish cause-and-effect relationships. |
| sampling distribution | The probability distribution of a sample statistic (such as a sample proportion) obtained from repeated sampling of a population. |
| sampling without replacement | A sampling method in which an item selected from a population cannot be selected again in subsequent draws. |
| simple random sample | A sample selected from a population such that every possible sample of the same size has an equal chance of being chosen. |
| statistical inference | The process of drawing conclusions about a population based on data collected from a sample. |
| two-sample z-test | A hypothesis test that uses the standard normal distribution to determine whether two population proportions differ. |
| two-sided alternative hypothesis | An alternative hypothesis that specifies the difference could be in either direction, stated as p₁ ≠ p₂. |
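
The pooled proportion defined above can be sketched in Python as follows. This is only an illustration: the function names, the sample counts, and the default threshold of 10 expected successes and failures are assumptions, not part of the glossary.

```python
def pooled_proportion(x1, n1, x2, n2):
    """Pooled estimate (n1*p1_hat + n2*p2_hat) / (n1 + n2)."""
    # n1 * p1_hat is simply the success count x1, so the formula
    # reduces to total successes over total observations.
    return (x1 + x2) / (n1 + n2)


def pooled_counts_ok(x1, n1, x2, n2, threshold=10):
    """Check expected successes and failures in both samples under H0."""
    p_c = pooled_proportion(x1, n1, x2, n2)
    expected = [n1 * p_c, n1 * (1 - p_c), n2 * p_c, n2 * (1 - p_c)]
    return all(count >= threshold for count in expected)


# Hypothetical data: 45 of 150 successes in sample 1, 30 of 150 in sample 2.
print(pooled_proportion(45, 150, 30, 150))  # 0.25
print(pooled_counts_ok(45, 150, 30, 150))   # True
```

The pooled estimate is used here because the null hypothesis p₁ = p₂ treats both samples as draws from populations with a common proportion.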

| Term | Definition |
|---|---|
| distribution | The pattern of how data values are spread or arranged across a range. |
| population | The entire group of individuals or items from which a sample is drawn and about which conclusions are to be made. |
| sample | A subset of individuals or items selected from a population for the purpose of data collection and analysis. |
| variation | Differences in data that occur by chance due to the random nature of sampling, rather than from systematic causes. |

| Term | Definition |
|---|---|
| difference in sample proportions | The difference between two sample proportions (p̂₁ - p̂₂) used to compare proportions from two different samples. |
| difference of two population proportions | The comparison between two population proportions, expressed as p₁ - p₂, to determine if they differ significantly. |
| null hypothesis | The initial claim or assumption being tested in a hypothesis test, typically stating that there is no effect or no difference. |
| p-value | The probability of observing a test statistic as extreme as or more extreme than the one calculated from the sample data, assuming the null hypothesis is true. |
| pooled proportion | A combined estimate of the population proportion calculated from both samples when assuming the null hypothesis is true: p̂c = (n₁p̂₁ + n₂p̂₂)/(n₁ + n₂). |
| population proportion | The true proportion or percentage of a characteristic in an entire population, typically denoted as p. |
| reject the null hypothesis | The decision made when the p-value is less than or equal to the significance level, indicating sufficient evidence against the null hypothesis. |
| significance level | The threshold probability (α) used to determine whether to reject the null hypothesis in a significance test. |
| significance test | A statistical procedure used to determine whether there is sufficient evidence to reject the null hypothesis based on sample data. |
| standard error | The standard deviation of a sampling distribution, which measures the variability of a sample statistic across repeated samples. |
| test statistic | A calculated value used to determine whether to reject the null hypothesis in a hypothesis test, computed from sample data. |
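
To show how the test statistic, standard error, and p-value in this table fit together, here is a minimal sketch of a two-sample z-test for a difference in proportions. The function name, the `alternative` argument, and the sample counts are hypothetical; only the Python standard library is used.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """Return (z, p_value) for H0: p1 = p2 using the pooled proportion."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_c = (x1 + x2) / (n1 + n2)                       # pooled proportion
    se = sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))    # standard error under H0
    z = (p1_hat - p2_hat) / se                        # test statistic
    std_normal = NormalDist()                         # mean 0, sd 1
    if alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    elif alternative == "greater":                    # Ha: p1 > p2
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":                       # Ha: p1 < p2
        p_value = std_normal.cdf(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return z, p_value


# Hypothetical data: 45 of 150 successes versus 30 of 150.
z, p_value = two_proportion_z_test(45, 150, 30, 150)
print(round(z, 2), round(p_value, 4))  # 2.0 0.0455
```

With a significance level of 0.05, a p-value of about 0.0455 would lead to rejecting the null hypothesis for these made-up counts.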

| Term | Definition |
|---|---|
| approximately normal | A distribution that closely follows the shape of a normal distribution, allowing for the use of normal probability methods. |
| categorical variable | A variable that takes on values that are category names or group labels rather than numerical values. |
| confidence interval | A range of values, calculated from sample data, that is likely to contain the true population parameter with a specified level of confidence. |
| confidence level | The long-run proportion of confidence intervals, constructed by the same method from repeated random samples, that contain the true population parameter, typically expressed as a percentage such as 90%, 95%, or 99%. |
| critical value | A value from the standard normal distribution used to determine the margin of error for a given confidence level. |
| independence | The condition that observations in a sample are not influenced by each other, typically ensured through random sampling or randomized experiments. |
| margin of error | The amount by which a sample statistic is likely to vary from the corresponding population parameter, calculated as the critical value times the standard error. |
| number of failures | The count of unfavorable outcomes in a sample, denoted as n(1-p̂), used to verify the normality condition. |
| number of successes | The count of favorable outcomes in a sample, denoted as np̂, used to verify the normality condition. |
| one-sample z-interval for a proportion | A confidence interval procedure used to estimate a population proportion based on a single sample, using the standard normal (z) distribution. |
| population parameter | A numerical characteristic of an entire population, such as the mean, proportion, or standard deviation. |
| population proportion | The true proportion or percentage of a characteristic in an entire population, typically denoted as p. |
| random sample | A sample selected from a population in such a way that every member has an equal chance of being chosen, reducing bias and allowing for valid statistical inference. |
| randomized experiment | A study design where subjects are randomly assigned to treatment groups to establish cause-and-effect relationships. |
| sample proportion | The proportion of individuals in a sample that have a particular characteristic, denoted as p-hat (p̂). |
| sample size | The number of observations or data points collected in a sample, denoted as n. |
| sample statistic | A numerical value calculated from sample data that is used to estimate the corresponding population parameter. |
| sampling distribution | The probability distribution of a sample statistic (such as a sample proportion) obtained from repeated sampling of a population. |
| sampling without replacement | A sampling method in which an item selected from a population cannot be selected again in subsequent draws. |
| standard error | The standard deviation of a sampling distribution, which measures the variability of a sample statistic across repeated samples. |
| standard normal distribution | A normal distribution with mean 0 and standard deviation 1, used to determine critical values for confidence intervals. |
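
The terms above (sample proportion, critical value, standard error, margin of error) combine into the one-sample z-interval. The following is a sketch under assumed inputs: the function name and the counts are hypothetical, and the conditions for inference are taken as already verified.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_interval(successes, n, confidence=0.95):
    """Return (low, high) for a one-sample z-interval for a proportion."""
    p_hat = successes / n                                    # sample proportion
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    se = sqrt(p_hat * (1 - p_hat) / n)                       # standard error
    margin = z_star * se                                     # margin of error
    return p_hat - margin, p_hat + margin


# Hypothetical data: 120 successes in a random sample of 200.
low, high = one_sample_z_interval(120, 200)
print(round(low, 3), round(high, 3))  # roughly 0.532 0.668
```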

| Term | Definition |
|---|---|
| claim | A statement or assertion about a population parameter that can be evaluated using statistical evidence. |
| confidence interval | A range of values, calculated from sample data, that is likely to contain the true population parameter with a specified level of confidence. |
| confidence level | The long-run proportion of confidence intervals, constructed by the same method from repeated random samples, that contain the true population parameter, typically expressed as a percentage such as 90%, 95%, or 99%. |
| margin of error | The amount by which a sample statistic is likely to vary from the corresponding population parameter, calculated as the critical value times the standard error. |
| one-sample proportion | A confidence interval or hypothesis test that estimates or tests a single population proportion based on data from one sample. |
| population proportion | The true proportion or percentage of a characteristic in an entire population, typically denoted as p. |
| random sample | A sample selected from a population in such a way that every member has an equal chance of being chosen, reducing bias and allowing for valid statistical inference. |
| sample size | The number of observations or data points collected in a sample, denoted as n. |
| width of a confidence interval | The range or span of a confidence interval, calculated as the difference between the upper and lower bounds of the interval. |
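
As an illustration of how interval width responds to sample size and confidence level, and of how an interval can be used to evaluate a claim, consider this small sketch. The helper names and the numbers are assumptions chosen for the example.

```python
from math import sqrt
from statistics import NormalDist


def interval_width(p_hat, n, confidence=0.95):
    """Width of a one-sample z-interval: twice the margin of error."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z_star * sqrt(p_hat * (1 - p_hat) / n)


def claim_is_plausible(claimed_p, low, high):
    """A claimed proportion is plausible if it lies inside the interval."""
    return low <= claimed_p <= high


# Quadrupling the sample size halves the width (width scales with 1/sqrt(n)).
print(interval_width(0.6, 200) / interval_width(0.6, 800))

# A higher confidence level gives a wider interval for the same data.
print(interval_width(0.6, 200, 0.99) > interval_width(0.6, 200, 0.95))  # True

# A claim of p = 0.5 checked against a hypothetical interval (0.53, 0.67).
print(claim_is_plausible(0.5, 0.53, 0.67))  # False
```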

| Term | Definition |
|---|---|
| 10% condition | The requirement that sample size n is at most 10% of the population size N to ensure independence when sampling without replacement. |
| alternative hypothesis | The claim that contradicts the null hypothesis, representing what the researcher is trying to find evidence for. |
| approximately normal | A distribution that closely follows the shape of a normal distribution, allowing for the use of normal probability methods. |
| categorical variable | A variable that takes on values that are category names or group labels rather than numerical values. |
| independence | The condition that observations in a sample are not influenced by each other, typically ensured through random sampling or randomized experiments. |
| null hypothesis | The initial claim or assumption being tested in a hypothesis test, typically stating that there is no effect or no difference. |
| number of failures | The count of unfavorable outcomes in a sample, denoted as n(1-p̂), used to verify the normality condition. |
| number of successes | The count of favorable outcomes in a sample, denoted as np̂, used to verify the normality condition. |
| one-sample z-test for a population proportion | A hypothesis test used to determine whether a sample proportion provides evidence that a population proportion differs from a hypothesized value. |
| one-sided alternative hypothesis | An alternative hypothesis that specifies the direction of the difference from the hypothesized value p₀, either p < p₀ or p > p₀. |
| population proportion | The true proportion or percentage of a characteristic in an entire population, typically denoted as p. |
| random sample | A sample selected from a population in such a way that every member has an equal chance of being chosen, reducing bias and allowing for valid statistical inference. |
| randomized experiment | A study design where subjects are randomly assigned to treatment groups to establish cause-and-effect relationships. |
| sample proportion | The proportion of individuals in a sample that have a particular characteristic, denoted as p-hat (p̂). |
| sampling distribution | The probability distribution of a sample statistic (such as a sample proportion) obtained from repeated sampling of a population. |
| sampling without replacement | A sampling method in which an item selected from a population cannot be selected again in subsequent draws. |
| statistical inference | The process of drawing conclusions about a population based on data collected from a sample. |
| two-sided alternative hypothesis | An alternative hypothesis that specifies the difference could be in either direction, stated as p ≠ p₀, where p₀ is the hypothesized value. |
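
The conditions named in this table (10% condition, number of successes, number of failures) can be checked mechanically. The sketch below is an assumption-laden example: the function name is hypothetical, the threshold of 10 is one common choice, and the counts are computed with the hypothesized proportion p0, which is the usual convention for a test even though the table states them in terms of p̂.

```python
def check_test_conditions(n, p0, population_size=None, threshold=10):
    """Return a dict of condition checks for a one-sample z-test."""
    checks = {
        "successes": n * p0 >= threshold,        # expected successes under H0
        "failures": n * (1 - p0) >= threshold,   # expected failures under H0
    }
    if population_size is not None:
        # 10% condition for sampling without replacement.
        checks["ten_percent"] = n <= 0.10 * population_size
    return checks


# Hypothetical setup: n = 100, H0: p = 0.5, population of 5000.
print(check_test_conditions(100, 0.5, population_size=5000))
```

Random selection or random assignment cannot be verified from the numbers alone, so it is left out of the function and has to be judged from the study design.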

| Term | Definition |
|---|---|
| alternative hypothesis | The claim that contradicts the null hypothesis, representing what the researcher is trying to find evidence for. |
| null distribution | The probability distribution of the test statistic under the assumption that the null hypothesis is true. |
| null hypothesis | The initial claim or assumption being tested in a hypothesis test, typically stating that there is no effect or no difference. |
| one-sample proportion | A confidence interval or hypothesis test that estimates or tests a single population proportion based on data from one sample. |
| p-value | The probability of observing a test statistic as extreme as or more extreme than the one calculated from the sample data, assuming the null hypothesis is true. |
| population proportion | The true proportion or percentage of a characteristic in an entire population, typically denoted as p. |
| probability model | A mathematical framework that describes the probability distribution of outcomes under specified assumptions. |
| sample statistic | A numerical value calculated from sample data that is used to estimate the corresponding population parameter. |
| significance level | The threshold probability (α) used to determine whether to reject the null hypothesis in a significance test. |
| significance test | A statistical procedure used to determine whether there is sufficient evidence to reject the null hypothesis based on sample data. |
| standard error | The standard deviation of a sampling distribution, which measures the variability of a sample statistic across repeated samples. |
| test statistic | A calculated value used to determine whether to reject the null hypothesis in a hypothesis test, computed from sample data. |
| theoretical distribution | A probability distribution based on a mathematical model, such as the normal distribution, used to approximate the distribution of a test statistic. |
| z-statistic | A standardized test statistic for a population proportion calculated as (sample statistic - null value) divided by the standard deviation of the statistic. |
| z-test | A hypothesis test that uses the standard normal distribution to determine whether a sample statistic differs significantly from a population parameter. |
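
The z-statistic definition above, (sample statistic - null value) divided by the standard deviation of the statistic, can be sketched for a single proportion as follows. The function name, the `alternative` argument, and the data are hypothetical; the null distribution is taken to be the standard normal.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alternative="two-sided"):
    """Return (z, p_value) for H0: p = p0."""
    p_hat = successes / n
    sd_under_h0 = sqrt(p0 * (1 - p0) / n)   # sd of p_hat assuming H0 is true
    z = (p_hat - p0) / sd_under_h0          # z-statistic
    null_dist = NormalDist()                # standard normal null distribution
    if alternative == "two-sided":
        p_value = 2 * (1 - null_dist.cdf(abs(z)))
    elif alternative == "greater":          # Ha: p > p0
        p_value = 1 - null_dist.cdf(z)
    elif alternative == "less":             # Ha: p < p0
        p_value = null_dist.cdf(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return z, p_value


# Hypothetical data: 60 successes in 100 trials, H0: p = 0.5, Ha: p > 0.5.
z, p_value = one_proportion_z_test(60, 100, 0.5, alternative="greater")
print(round(z, 2), round(p_value, 4))  # 2.0 0.0228
```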

| Term | Definition |
|---|---|
| alternative hypothesis | The claim that contradicts the null hypothesis, representing what the researcher is trying to find evidence for. |
| null hypothesis | The initial claim or assumption being tested in a hypothesis test, typically stating that there is no effect or no difference. |
| p-value | The probability of observing a test statistic as extreme as or more extreme than the one calculated from the sample data, assuming the null hypothesis is true. |
| population proportion | The true proportion or percentage of a characteristic in an entire population, typically denoted as p. |
| reject the null hypothesis | The decision made when the p-value is less than or equal to the significance level, indicating sufficient evidence against the null hypothesis. |
| significance level | The threshold probability (α) used to determine whether to reject the null hypothesis in a significance test. |
| significance test | A statistical procedure used to determine whether there is sufficient evidence to reject the null hypothesis based on sample data. |
| statistical evidence | Information from sample data that supports or fails to support a hypothesis about a population parameter. |
| test statistic | A calculated value used to determine whether to reject the null hypothesis in a hypothesis test, computed from sample data. |
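
The decision rule in this table, rejecting the null hypothesis when the p-value is less than or equal to the significance level, can be written as a very small helper. The function name and return strings are hypothetical.

```python
def conclude(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when p_value <= alpha."""
    if not 0 <= p_value <= 1:
        raise ValueError("p_value must be between 0 and 1")
    if p_value <= alpha:
        return "reject H0"
    return "fail to reject H0"


print(conclude(0.03))              # reject H0
print(conclude(0.20))              # fail to reject H0
print(conclude(0.03, alpha=0.01))  # fail to reject H0
```

Failing to reject is not the same as accepting the null hypothesis; it only means the sample did not provide sufficient evidence against it.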

| Term | Definition |
|---|---|
| null hypothesis | The initial claim or assumption being tested in a hypothesis test, typically stating that there is no effect or no difference. |
| parameter | A numerical summary that describes a characteristic of an entire population. |
| power of a test | The probability that a statistical test will correctly reject a false null hypothesis. |
| sample size | The number of observations or data points collected in a sample, denoted as n. |
| significance level | The threshold probability (α) used to determine whether to reject the null hypothesis in a significance test. |
| standard error | The standard deviation of a sampling distribution, which measures the variability of a sample statistic across repeated samples. |
| Type I error | An error that occurs when a null hypothesis is rejected when it is actually true; the probability of committing this error is equal to the significance level (α). |
| Type II error | An error that occurs when a null hypothesis is not rejected when it is actually false. |
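
To make the relationship between significance level, sample size, power, and Type II error concrete, here is a rough sketch for a one-sided test of H0: p = p0 against Ha: p > p0, using the normal approximation. The function name and all numeric inputs are assumptions; a real power analysis would need a stated alternative value for p.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided(p0, p_true, n, alpha=0.05):
    """Approximate power of the test H0: p = p0 vs Ha: p > p0."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    # Smallest sample proportion that leads to rejecting H0.
    cutoff = p0 + z_alpha * sqrt(p0 * (1 - p0) / n)
    # Sampling distribution of p_hat when the true proportion is p_true.
    sampling = NormalDist(mu=p_true, sigma=sqrt(p_true * (1 - p_true) / n))
    return 1 - sampling.cdf(cutoff)


power = power_one_sided(p0=0.5, p_true=0.6, n=100)
print("power:", power)
print("P(Type II error):", 1 - power)

# A larger sample or a larger alpha both increase power.
print(power_one_sided(0.5, 0.6, 400) > power)              # True
print(power_one_sided(0.5, 0.6, 100, alpha=0.10) > power)  # True
```

When `p_true` equals `p0`, the function returns alpha, which matches the definition of the Type I error probability.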

| Term | Definition |
|---|---|
| 10% condition | The requirement that sample size n is at most 10% of the population size N to ensure independence when sampling without replacement. |
| approximately normal | A distribution that closely follows the shape of a normal distribution, allowing for the use of normal probability methods. |
| categorical variable | A variable that takes on values that are category names or group labels rather than numerical values. |
| confidence interval | A range of values, calculated from sample data, that is likely to contain the true population parameter with a specified level of confidence. |
| difference in proportions | The difference between two population proportions, calculated as p₁ - p₂, used to compare the prevalence of a characteristic across two populations. |
| difference of two population proportions | The comparison between two population proportions, expressed as p₁ - p₂, to determine if they differ significantly. |
| independence | The condition that observations in a sample are not influenced by each other, typically ensured through random sampling or randomized experiments. |
| population proportion | The true proportion or percentage of a characteristic in an entire population, typically denoted as p. |
| randomized experiment | A study design where subjects are randomly assigned to treatment groups to establish cause-and-effect relationships. |
| sample proportion | The proportion of individuals in a sample that have a particular characteristic, denoted as p-hat (p̂). |
| sampling distribution | The probability distribution of a sample statistic (such as a sample proportion) obtained from repeated sampling of a population. |
| sampling without replacement | A sampling method in which an item selected from a population cannot be selected again in subsequent draws. |
| simple random sample | A sample selected from a population such that every possible sample of the same size has an equal chance of being chosen. |
| standard error | The standard deviation of a sampling distribution, which measures the variability of a sample statistic across repeated samples. |
| success-failure condition | A requirement that the numbers of successes and failures in each sample (np̂ and n(1-p̂)) each meet a minimum threshold, typically 5 or 10, so that the sampling distribution is approximately normal. |
| test statistic | A calculated value used to determine whether to reject the null hypothesis in a hypothesis test, computed from sample data. |
| two-sample z-interval | A confidence interval procedure that uses the standard normal distribution to estimate the difference between two population proportions based on sample data. |
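
A two-sample z-interval for p₁ - p₂ can be sketched as below. Unlike the pooled version used in a test, the standard error here is computed from each sample proportion separately, which is the usual choice for an interval. The function name and the counts are hypothetical, and the conditions in the table are assumed to hold.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Return (low, high) for a z-interval for p1 - p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Unpooled standard error of the difference in sample proportions.
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se


# Hypothetical data: 45 of 150 successes versus 30 of 150.
low, high = two_sample_z_interval(45, 150, 30, 150)
print(round(low, 3), round(high, 3))  # roughly 0.003 0.197
```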

| Term | Definition |
|---|---|
| confidence interval | A range of values, calculated from sample data, that is likely to contain the true population parameter with a specified level of confidence. |
| difference in proportions | The difference between two population proportions, calculated as p₁ - p₂, used to compare the prevalence of a characteristic across two populations. |
| population proportion | The true proportion or percentage of a characteristic in an entire population, typically denoted as p. |
| random sampling | A method of selecting samples from a population where each member has an equal chance of being chosen, which makes the sample more likely to be representative of the population. |
| sample size | The number of observations or data points collected in a sample, denoted as n. |
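
One common way to use a confidence interval for a difference in proportions to evaluate a claim is to check whether it contains 0. The helper below is a sketch with a hypothetical name and hypothetical wording for the conclusions.

```python
def describe_difference_interval(low, high):
    """Summarize what an interval for p1 - p2 suggests about the difference."""
    if low > high:
        raise ValueError("low must not exceed high")
    if low > 0:
        return "p1 appears larger than p2"
    if high < 0:
        return "p1 appears smaller than p2"
    # 0 is inside the interval, so 'no difference' remains plausible.
    return "no convincing evidence of a difference"


print(describe_difference_interval(0.003, 0.197))   # p1 appears larger than p2
print(describe_difference_interval(-0.05, 0.12))    # no convincing evidence of a difference
```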