In AP Statistics, association means two variables are related, so knowing the value of one variable gives you information about the other. For categorical variables, you spot it when conditional relative frequencies differ across groups in a two-way table, and you test it formally with a chi-square test.
Association is the idea that two variables are connected. If knowing someone's smoking status changes your prediction about whether they have lung disease, those variables are associated. If the conditional distributions look basically the same across every group, there's no apparent association and the variables look independent.
For categorical data, association lives in two-way (contingency) tables. You compute conditional relative frequencies (cell counts divided by their row or column total, per LO 2.3.A) and compare them across groups (LO 2.3.B). If 40% of current smokers have lung disease but only 10% of never-smokers do, that gap is the association. In Unit 8, the same idea gets a formal test. You calculate expected counts assuming no association (LO 8.4.A: expected count = (row total × column total) / table total) and use a chi-square test to decide whether the observed differences are too big to blame on chance.
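To see the 40% versus 10% comparison as arithmetic, here is a minimal Python sketch. The counts are hypothetical, invented only so the percentages match the example above, and the function name `conditional_relative_frequencies` is just an illustrative choice.

```python
# Hypothetical two-way table (made-up counts chosen to match the 40% vs. 10% example).
# Rows are smoking-status groups; inner keys are the lung-disease outcomes.
table = {
    "current smoker": {"lung disease": 80, "no lung disease": 120},
    "never smoker": {"lung disease": 50, "no lung disease": 450},
}


def conditional_relative_frequencies(two_way):
    """Divide each cell count by its row total (conditioning on the row variable)."""
    result = {}
    for group, counts in two_way.items():
        row_total = sum(counts.values())
        result[group] = {outcome: n / row_total for outcome, n in counts.items()}
    return result


cond = conditional_relative_frequencies(table)
for group, dist in cond.items():
    # Print each group's conditional distribution as whole-number percentages.
    print(group, {outcome: f"{p:.0%}" for outcome, p in dist.items()})
```

Notice that the raw counts (80 versus 50) understate the gap because the two groups have different sizes. The conditional relative frequencies are what you compare.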
Association is one of the few concepts that anchors two whole units. In Unit 2 (Topic 2.3), LOs 2.3.A and 2.3.B have you calculate marginal and conditional relative frequencies and use them to decide whether two categorical variables are associated, which is descriptive work. In Unit 8 (Topics 8.4 and 8.7), association becomes the thing you're doing inference about. The chi-square test of independence asks whether an association you see in sample data is convincing evidence of an association in the population. Expected counts in Topic 8.4 are literally what the table would look like if there were NO association, which is why comparing observed to expected counts works. If you can't define association clearly, you can't interpret a chi-square conclusion correctly.
Chi-Square Test (Unit 8)
The chi-square test of independence is the formal version of the eyeball check you do in Unit 2. The null hypothesis says there is no association between the variables, and a small p-value gives you convincing evidence that an association exists in the population.
Conditional Relative Frequency (Unit 2)
This is your association detector. If the conditional distributions of one variable differ across the categories of another (say, support for a policy differs by political party), the variables are associated. If the conditional distributions match, they're independent.
Expected Count (Unit 8)
Expected counts are the counts you'd see if there were zero association. The formula (row total × column total) / table total comes straight from assuming independence, so the bigger the gap between observed and expected, the stronger the evidence of association.
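The expected-count formula can be sketched in a few lines of Python. The observed counts below are hypothetical (rows are smoking status, columns are lung disease yes/no), and `expected_counts` is an illustrative helper, not anything from the AP course materials.

```python
def expected_counts(observed):
    """Expected count for each cell = (row total * column total) / table total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    table_total = sum(row_totals)
    return [[r * c / table_total for c in col_totals] for r in row_totals]


# Hypothetical observed counts (invented for illustration).
# Rows: current smoker, never smoker. Columns: lung disease, no lung disease.
observed = [[80, 120], [50, 450]]
expected = expected_counts(observed)

for obs_row, exp_row in zip(observed, expected):
    # Show each observed count next to the count expected under no association.
    print([(o, round(e, 2)) for o, e in zip(obs_row, exp_row)])
```

Expected counts keep the same row and column totals as the observed table. They just redistribute the counts the way independence would.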
Contingency Table (Units 2 & 8)
Association between categorical variables is always read off a contingency table. The same table format carries you from descriptive comparisons in Unit 2 to chi-square inference in Unit 8.
Multiple choice loves two angles. The first is reading a two-way table and deciding whether the variables appear associated, which means comparing conditional relative frequencies, not raw counts. The second is picking the right inference procedure: when a question says "a study examines whether two categorical variables are associated," the answer is a chi-square test of independence (exactly the skill Topic 8.7 trains). You'll also compute expected counts, like finding the expected number of junior science majors from row and column totals. On FRQs, association shows up in full chi-square problems. The 2017 FRQ Q5 gave a two-way table of age at diagnosis by sex and asked whether the data provide convincing statistical evidence of an association, requiring hypotheses, conditions, the test statistic, and a conclusion in context. The biggest scoring trap is the conclusion. Say the data give (or don't give) convincing evidence of an association; never claim the data prove the variables are independent or that one variable causes the other.
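The mechanics of a full chi-square problem can be sketched in Python. Everything here is an assumption for illustration: the class-year by major counts are made up (they are not from any released exam), and `chi_square_independence` is a hypothetical helper name. The p-value line relies on the fact that, for 2 degrees of freedom specifically, the chi-square upper-tail probability equals exp(-x/2); for other tables you would use a calculator or a chi-square table.

```python
import math


def chi_square_independence(observed):
    """Return (test statistic, degrees of freedom, expected counts) for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected counts assume no association between the two variables.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # Sum of (observed - expected)^2 / expected over every cell.
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


# Hypothetical counts. Rows: sophomore, junior, senior. Columns: science, non-science.
observed = [[30, 70], [45, 55], [25, 75]]
chi2, df, expected = chi_square_independence(observed)

# Large counts condition: every expected count should be at least 5.
large_counts_ok = all(e >= 5 for row in expected for e in row)

# Shortcut valid only when df == 2: P(chi-square >= x) = exp(-x / 2).
p_value = math.exp(-chi2 / 2) if df == 2 else None

print(f"chi-square = {chi2:.2f}, df = {df}, large counts ok = {large_counts_ok}")
print(f"p-value = {p_value:.4f}")
```

With a p-value this small, the conclusion in context would be that the data give convincing evidence of an association between class year and major, not that class year causes a choice of major.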
Association says two variables are related; causation says one actually changes the other. A chi-square test or a strong pattern in a contingency table can only ever show association. The classic AP Stats mantra that association is not causation applies here in full force. If the data came from an observational study, lurking variables could explain the relationship, so you can only claim causation when the data come from a well-designed randomized experiment. Writing "smoking causes lung disease" as a chi-square conclusion from observational data will cost you points; write "there is convincing evidence of an association between smoking status and lung disease" instead.
Two variables are associated when knowing the value of one gives you information about the other; for categorical variables, this shows up as conditional relative frequencies that differ across groups.
Compare conditional relative frequencies (percentages), not raw counts, because groups in a two-way table usually have different sizes.
Expected counts, calculated as (row total × column total) / table total, represent what the table would look like if there were no association at all.
The chi-square test of independence is the inference procedure for the question "are these two categorical variables associated?" and its null hypothesis is always no association.
Association alone never establishes causation; only a well-designed randomized experiment lets you make a cause-and-effect claim.
Failing to reject the null does not prove the variables are independent; it just means you lack convincing evidence of an association.
Association means two variables are related, so knowing one variable's value helps you predict the other. For categorical variables in a two-way table, you detect it by comparing conditional relative frequencies across groups, and you test it with a chi-square test in Unit 8.
No. Association only means the variables are related; it can't tell you that one causes the other. Lurking variables can create associations in observational data, so causal claims require a randomized experiment.
Correlation (r) is a specific number measuring the strength of a linear relationship between two quantitative variables (Unit 2 scatterplots). Association is the broader concept and is the word you use for categorical variables, where correlation doesn't apply.
Compute conditional relative frequencies, like the percent of each smoking-status group that has lung disease. If those percentages differ noticeably across groups, the variables appear associated; if they're roughly equal, the variables look independent.
Use a chi-square test of independence when one sample is classified by two categorical variables in a contingency table. The 2017 FRQ Q5 asked exactly this with a table of age at diagnosis by sex, and the null hypothesis was that there is no association between the variables.