14.2 Interpretation of Logistic Regression Coefficients


Written by the Fiveable Content Team • Last updated August 2025

Odds Ratios in Logistic Regression

In logistic regression, the raw coefficients live on the log-odds scale, which is hard to think about directly. The standard move is to exponentiate them into odds ratios, which are much easier to communicate. This section covers how that transformation works and what to watch out for.

Interpreting Coefficients as Odds Ratios

A logistic regression coefficient β̂ tells you the change in the log odds of the outcome for a one-unit increase in that predictor, holding everything else constant. Because log odds aren't intuitive, you exponentiate the coefficient to get the odds ratio:

OR = e^β̂

The odds ratio represents the multiplicative change in the odds of the outcome for a one-unit increase in the predictor. That word "multiplicative" matters:

  • An odds ratio of 1.5 means the odds are multiplied by 1.5 (a 50% increase in the odds)
  • An odds ratio of 0.8 means the odds are multiplied by 0.8 (a 20% decrease in the odds)
  • An odds ratio of exactly 1 means no change in the odds

So OR > 1 signals a positive association with the outcome, and OR < 1 signals a negative association.
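The exponentiation step is a one-liner. A minimal sketch (the coefficient values below are hypothetical, chosen to reproduce the odds ratios in the bullets above):

```python
import math

def odds_ratio(beta_hat):
    """Exponentiate a log-odds coefficient to get an odds ratio."""
    return math.exp(beta_hat)

# Hypothetical coefficients chosen to match the bullets above
or_up = odds_ratio(0.405)     # ~1.5: odds multiplied by 1.5 (50% increase)
or_down = odds_ratio(-0.223)  # ~0.8: odds multiplied by 0.8 (20% decrease)
or_null = odds_ratio(0.0)     # exactly 1: no change in the odds
```

Note the sign correspondence: positive coefficients map to OR > 1, negative coefficients to OR < 1, and zero to OR = 1.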

Considerations for Interpreting Odds Ratios

The meaning of "one-unit increase" depends on the type of predictor:

  • Continuous predictor: The OR reflects a literal one-unit increase on that variable's scale. If the predictor is income in thousands of dollars, the OR applies per additional $1,000.
  • Binary predictor: The OR is the ratio of the odds of the outcome in one group versus the reference group. For example, if treatment vs. control has OR = 2.3, the treatment group has 2.3 times the odds of the outcome compared to control.
  • Categorical predictor (3+ levels): Each level gets its own OR, always compared to a chosen reference category. Changing the reference category changes the reported ORs but not the underlying model.

Confidence intervals for odds ratios help you assess precision and significance:

  • A 95% CI that excludes 1 (e.g., [1.12, 1.94]) indicates statistical significance at the 0.05 level
  • A CI that contains 1 (e.g., [0.85, 1.30]) means you can't rule out no association
  • Wider intervals reflect greater uncertainty in the estimate
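Because the CI is computed on the log-odds scale and then exponentiated, checking significance amounts to exponentiating the interval endpoints. A sketch with a hypothetical estimate and standard error:

```python
import math

def or_ci(beta_hat, se, z=1.96):
    """95% CI for an odds ratio: exponentiate the endpoints of the
    log-odds interval beta_hat +/- z * se."""
    return math.exp(beta_hat - z * se), math.exp(beta_hat + z * se)

# Hypothetical coefficient estimate and standard error
lo, hi = or_ci(0.40, 0.14)            # roughly [1.13, 1.96]
excludes_one = not (lo <= 1.0 <= hi)  # True here: significant at the 0.05 level
```

Exponentiation preserves order, so the log-odds CI excluding 0 is equivalent to the OR CI excluding 1.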

Marginal Effects of Predictors

Odds ratios tell you about multiplicative changes in odds, but stakeholders and applied researchers often want to know about changes in probability. That's where marginal effects come in.


Calculating Marginal Effects

A marginal effect is the change in the predicted probability of the outcome for a one-unit change in a predictor, with all other predictors held at specified values (typically their means for continuous variables or reference levels for categorical ones).

Because the logistic function is nonlinear, marginal effects are not constant. They depend on where you are on the sigmoid curve. A predictor might shift the probability by 8 percentage points when the baseline probability is near 0.5, but only by 1 percentage point when the baseline is near 0.05. This is a key difference from linear regression, where the effect of a predictor is the same everywhere.
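This dependence on the baseline falls out of the derivative of the logistic curve, dp/dx = β·p·(1 − p). A sketch with a hypothetical coefficient β = 0.32, which roughly reproduces the numbers in the paragraph above:

```python
def marginal_effect(beta, p):
    """Instantaneous change in predicted probability per one-unit
    increase in the predictor, at baseline probability p:
    dp/dx = beta * p * (1 - p)."""
    return beta * p * (1.0 - p)

beta = 0.32  # hypothetical log-odds coefficient
me_middle = marginal_effect(beta, 0.5)   # 0.08 -> ~8 percentage points
me_tail = marginal_effect(beta, 0.05)    # ~0.015 -> ~1.5 percentage points
```

The factor p(1 − p) peaks at 0.25 when p = 0.5 and shrinks toward zero in the tails, which is why the same coefficient translates into very different probability changes at different baselines.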

Two common approaches:

  • Marginal effect at the mean (MEM): Evaluate the effect with all other predictors set to their sample means
  • Average marginal effect (AME): Compute the marginal effect for every observation in the dataset, then average them. AMEs are generally preferred because they don't rely on a potentially unrealistic "average" observation.
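The difference between the two approaches can be sketched for a one-predictor model (toy data and coefficients are hypothetical):

```python
import math

def predict(b0, b1, x):
    """Predicted probability from a one-predictor logistic model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def mem(b0, b1, xs):
    """Marginal effect at the mean: slope evaluated at the sample mean of x."""
    p = predict(b0, b1, sum(xs) / len(xs))
    return b1 * p * (1.0 - p)

def ame(b0, b1, xs):
    """Average marginal effect: slope at each observation, then averaged."""
    slopes = []
    for x in xs:
        p = predict(b0, b1, x)
        slopes.append(b1 * p * (1.0 - p))
    return sum(slopes) / len(slopes)

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]  # toy sample, mean 0
mem_val = mem(0.0, 1.0, xs)       # slope at the "average" observation
ame_val = ame(0.0, 1.0, xs)       # average of per-observation slopes
```

In this toy sample the mean observation sits at the steepest part of the curve, so the MEM overstates the typical effect relative to the AME; with real data the two can differ in either direction.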

Marginal effects can be computed for any predictor type:

  • Continuous: The instantaneous rate of change in predicted probability per one-unit increase
  • Binary: The difference in predicted probability between the two levels
  • Categorical (3+ levels): The difference in predicted probability between each level and the reference category
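For the binary case, the marginal effect is just a difference of two fitted probabilities. A minimal sketch with hypothetical intercept and treatment coefficients:

```python
import math

def predict(b0, b_treat, treated):
    """Predicted probability with a single binary 'treatment' predictor
    (hypothetical coefficients)."""
    return 1.0 / (1.0 + math.exp(-(b0 + b_treat * treated)))

# Marginal effect of a binary predictor: the difference in predicted
# probabilities between its two levels, other predictors held fixed
p_control = predict(-1.0, 0.8, 0)  # ~0.27
p_treated = predict(-1.0, 0.8, 1)  # ~0.45
effect = p_treated - p_control     # ~0.18: about +18 percentage points
```

For a categorical predictor with 3+ levels, you would repeat this difference once per non-reference level.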

Interpreting Marginal Effects

Marginal effects are expressed directly as probability changes, making them more intuitive than odds ratios:

  • A marginal effect of 0.05 means a one-unit increase in the predictor is associated with a 5 percentage point increase in the predicted probability
  • A marginal effect of −0.02 means a 2 percentage point decrease

When the outcome itself is categorical (multinomial logistic regression), the marginal effects of a predictor across all outcome categories sum to zero. This is a built-in consistency property: the predicted probabilities must sum to 1, so shifting probability toward one category must shift it away from others.

Interaction Effects in Logistic Regression


Understanding Interaction Effects

An interaction effect occurs when the effect of one predictor on the outcome depends on the value of another predictor. You model this by including a product term. For predictors X₁ and X₂, the model becomes:

logit(p) = β₀ + β₁X₁ + β₂X₂ + β₃(X₁ × X₂)

Here β₃ captures how the effect of X₁ on the log odds changes per one-unit increase in X₂ (and vice versa). A statistically significant β₃ tells you the relationship between one predictor and the outcome genuinely varies across levels of the other predictor.
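Concretely, the log-odds slope of X₁ in this model is β₁ + β₃X₂, so it changes with X₂ whenever β₃ ≠ 0. A sketch with hypothetical coefficients:

```python
def x1_slope(b1, b3, x2):
    """Log-odds effect of a one-unit increase in X1, which now depends
    on the value of X2: b1 + b3 * x2."""
    return b1 + b3 * x2

# Hypothetical coefficients: b1 = 0.5, b3 = 0.3
effect_x2_0 = x1_slope(0.5, 0.3, 0.0)  # 0.5 when X2 = 0
effect_x2_2 = x1_slope(0.5, 0.3, 2.0)  # 1.1 when X2 = 2
```

Exponentiating these slopes gives the odds ratio for X₁ at each value of X₂; there is no longer a single OR for X₁.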

Interpreting Interaction Effects

Interpreting interactions in logistic regression is trickier than in linear regression for two reasons:

  1. The coefficients β₁ and β₂ no longer represent overall main effects. β₁ is the effect of X₁ only when X₂ equals zero (or its reference level), and vice versa.
  2. Because the logistic link is nonlinear, the interaction effect on the probability scale can differ in sign and magnitude from the interaction on the log-odds scale. You should always check marginal effects or predicted probabilities, not just the coefficient table.

Interaction plots are the clearest way to understand what's happening:

  • Parallel lines on the probability scale suggest no meaningful interaction
  • Non-parallel or crossing lines indicate that the effect of one predictor changes depending on the level of the other
  • The gap between lines at different points along the x-axis shows how the marginal effect of one predictor shifts across values of the other

Visualizing Relationships in Logistic Regression

Probability Plots

Probability plots display the predicted probability of the outcome as a function of one or more predictors, with other variables held at fixed values. They're the single best tool for communicating logistic regression results.

For a single continuous predictor, the plot takes the familiar S-shaped (sigmoid) form, bounded between 0 and 1:

  • A steeper curve indicates a stronger relationship between the predictor and the outcome
  • The inflection point (where the curve is steepest) corresponds to a predicted probability of 0.5
  • The curve flattens near 0 and 1, which is why marginal effects shrink at extreme probabilities

For a binary predictor, the plot reduces to two points (one per group). The vertical distance between them is the marginal effect.

For a categorical predictor with 3+ levels, you get one point per category. Differences between points and the reference level represent the marginal effects.

Interaction Plots

Interaction plots extend probability plots by showing how the predicted probability curve for one predictor shifts across levels of a second predictor:

  • Each line represents the predicted probability across values of one predictor, with the other predictor held at a specific level
  • Parallel lines mean the effect of the x-axis predictor is the same regardless of the other variable (no interaction)
  • Non-parallel or crossing lines reveal an interaction, and the pattern of divergence or convergence tells you how the interaction works
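The data behind such a plot is just a grid of predicted probabilities, one curve per moderator level. A sketch with hypothetical coefficients (a nonzero product term, so the curves are non-parallel):

```python
import math

def predict(b0, b1, b2, b3, x1, x2):
    """Predicted probability under
    logit(p) = b0 + b1*x1 + b2*x2 + b3*x1*x2 (hypothetical coefficients)."""
    eta = b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2
    return 1.0 / (1.0 + math.exp(-eta))

# One predicted-probability curve per level of a binary moderator x2;
# these are the lines an interaction plot would draw
xs = [i / 10 for i in range(-30, 31)]
curve0 = [predict(-0.5, 0.6, 0.4, 0.9, x, 0) for x in xs]
curve1 = [predict(-0.5, 0.6, 0.4, 0.9, x, 1) for x in xs]

# When b3 != 0 the vertical gap between the curves varies with x1,
# which is exactly what non-parallel lines show
gaps = [p1 - p0 for p0, p1 in zip(curve0, curve1)]
```

Plotting `curve0` and `curve1` against `xs` with any charting library yields the interaction plot described above; the varying entries of `gaps` are the changing marginal effect of the moderator.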

Adding confidence bands around the predicted probability curves shows uncertainty:

  • Wider bands mean less precise estimates (often at the extremes of the predictor range where data is sparse)
  • Non-overlapping bands between groups suggest statistically significant differences in predicted probabilities at those predictor values