## 6.9 Interactions

Just like with linear regression, you can include an interaction in a logistic regression model to evaluate if a predictor is an effect modifier, that is if the OR for one predictor depends on another (the effect modifier).

Example 6.3 (continued): After adjusting for age and income, does the association between lifetime marijuana use and age at first use of alcohol differ by sex?

Adding the interaction term alc_agefirst:demog_sex to the model, we get the following results.

fit.ex6.3.int <- glm(mj_lifetime ~ alc_agefirst + demog_age_cat6 + demog_sex +
demog_income + alc_agefirst:demog_sex,
family = binomial, data = nsduh)
round(summary(fit.ex6.3.int)$coef, 4) ## Estimate Std. Error z value Pr(>|z|) ## (Intercept) 7.3735 0.8292 8.8923 0.0000 ## alc_agefirst -0.3378 0.0423 -7.9876 0.0000 ## demog_age_cat626-34 -0.2951 0.3314 -0.8902 0.3733 ## demog_age_cat635-49 -0.8160 0.2987 -2.7320 0.0063 ## demog_age_cat650-64 -0.6866 0.3007 -2.2832 0.0224 ## demog_age_cat665+ -1.2600 0.3064 -4.1130 0.0000 ## demog_sexMale -2.1381 0.9847 -2.1713 0.0299 ## demog_income$20,000 - $49,999 -0.5304 0.2682 -1.9773 0.0480 ## demog_income$50,000 - $74,999 -0.0931 0.3073 -0.3029 0.7620 ## demog_income$75,000 or more    -0.3629     0.2547 -1.4250   0.1542
## alc_agefirst:demog_sexMale      0.1189     0.0555  2.1420   0.0322

To answer the question of significance, look at the row in the regression coefficients table corresponding to the interaction (alc_agefirst:demog_sex). The p-value is 0.0322, so the interaction is statistically significant. Had there been more than one row (e.g., if the interaction involved a categorical predictor with more than 2 levels) then use a Type III test to get the test of interaction. Since, in this case, there is just one term corresponding to the interaction, the Type III Wald test p-value from car::Anova() is exactly the same as the p-value in the coefficients table.

car::Anova(fit.ex6.3.int, type = 3, test.statistic = "Wald")
## Analysis of Deviance Table (Type III tests)
##
## Response: mj_lifetime
##                        Df Chisq           Pr(>Chisq)
## (Intercept)             1 79.07 < 0.0000000000000002 ***
## alc_agefirst            1 63.80   0.0000000000000014 ***
## demog_age_cat6          4 22.33              0.00017 ***
## demog_sex               1  4.71              0.02991 *
## demog_income            3  5.23              0.15555
## alc_agefirst:demog_sex  1  4.59              0.03219 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Conclusion: The association between age of first alcohol use and lifetime marijuana use differs significantly between males and females (p = .032).

### 6.9.1 Overall test of a predictor involved in an interaction

As shown in Section 5.9.11 for MLR, to carry out an overall test of a predictor involved in an interaction in a logistic regression model, compare the full model to a reduced model with both the main effect and interaction removed. As with MLR, use anova() to compare the models, but for a logistic regression you must also specify test = "Chisq" to get a p-value. If you did not remove missing data prior to fitting the full model, you must do so before comparing models if the main effect predictor being removed to create the reduced model has missing values. You can either create a new dataset with missing values removed or use the subset option.

Example 6.3 (continued): In the model that includes an interaction, is age at first alcohol use significantly associated with the outcome?

# Fit the reduced model using the subset option
# to remove missing values for the predictor
# being removed to create the reduced model
fit0 <- glm(mj_lifetime ~ demog_age_cat6 + demog_sex + demog_income,
family = binomial, data = nsduh,
subset = !is.na(alc_agefirst))
# Compare to full model
anova(fit0, fit.ex6.3.int, test = "Chisq")
## Analysis of Deviance Table
##
## Model 1: mj_lifetime ~ demog_age_cat6 + demog_sex + demog_income
## Model 2: mj_lifetime ~ alc_agefirst + demog_age_cat6 + demog_sex + demog_income +
##     alc_agefirst:demog_sex
##   Resid. Df Resid. Dev Df Deviance            Pr(>Chi)
## 1       834       1085
## 2       832        931  2      154 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The 2 df test (2 df because the models differ in two terms – the main effect and the interaction) results in a very small p-value.

Conclusion: Age at first alcohol use is significantly associated with the outcome (p < .001).

### 6.9.2 Estimate the OR at each level of the other predictor

Use gmodels::estimable() to compute separate ORs for a predictor involved in an interaction at each level of the other predictor in the interaction. When there is no interaction, or when there is but the other predictor in the interaction is zero or at its reference level, the OR for a predictor is its exponentiated regression coefficient. When there is an interaction and the other predictor is non-zero or at a non-reference level, add the appropriate interaction term multiplied by the value of the other predictor to the main effect before exponentiating. Do not add the intercept or any other terms in the model since when computing the odds ratio all other terms drop out of the equation. In particular, do not add the main effect coefficient for the other predictor in the interaction.

Example 6.3 (continued): In the model that includes an interaction, what are the AORs for age at first alcohol use for females and males?

Before using gmodels::estimable(), look at the coefficient names in the glm object to make sure you spell them correctly.

rownames(summary(fit.ex6.3.int)$coef) ## [1] "(Intercept)" "alc_agefirst" ## [3] "demog_age_cat626-34" "demog_age_cat635-49" ## [5] "demog_age_cat650-64" "demog_age_cat665+" ## [7] "demog_sexMale" "demog_income$20,000 - $49,999" ## [9] "demog_income$50,000 - $74,999" "demog_income$75,000 or more"
## [11] "alc_agefirst:demog_sexMale"
# ORs for alc_agefirst
# Other predictor in the interaction is sex

# Females (reference level of other predictor)
EST.F <- gmodels::estimable(fit.ex6.3.int,
c("alc_agefirst"               = 1),
conf.int = 0.95)

# Males (non-reference level of other predictor)
EST.M <- gmodels::estimable(fit.ex6.3.int,
c("alc_agefirst"               = 1,
"alc_agefirst:demog_sexMale" = 1),
conf.int = 0.95)

rbind(EST.F,
EST.M)
##                         Estimate Std. Error X^2 value DF           Pr(>|X^2|)
## (0 1 0 0 0 0 0 0 0 0 0)  -0.3378    0.04229     63.80  1 0.000000000000001332
## (0 1 0 0 0 0 0 0 0 0 1)  -0.2189    0.03626     36.43  1 0.000000001581411446
##                         Lower.CI Upper.CI
## (0 1 0 0 0 0 0 0 0 0 0)  -0.4217  -0.2539
## (0 1 0 0 0 0 0 0 0 0 1)  -0.2908  -0.1469

These estimates are on the log-odds-ratio scale. We need to exponentiate to get AORs. But we have to be careful because the gmodels::estimable() output includes a p-value which and we do not want to exponentiate that. The code below exponentiates just the estimate and 95% CI.

# AORs and 95% CIs (do use exp here)
rbind(exp(EST.M[c("Estimate", "Lower.CI", "Upper.CI")]),
exp(EST.F[c("Estimate", "Lower.CI", "Upper.CI")]))
##                         Estimate Lower.CI Upper.CI
## (0 1 0 0 0 0 0 0 0 0 1)   0.8034   0.7476   0.8633
## (0 1 0 0 0 0 0 0 0 0 0)   0.7133   0.6559   0.7758
# P-values (do NOT use exp here)
rbind(EST.M["Pr(>|X^2|)"],
EST.F["Pr(>|X^2|)"])
##                                   Pr(>|X^2|)
## (0 1 0 0 0 0 0 0 0 0 1) 0.000000001581411446
## (0 1 0 0 0 0 0 0 0 0 0) 0.000000000000001332

Conclusion: After adjusting for age and income, age at first alcohol use is significantly associated with lifetime marijuana use (p < .001 – from the test comparing the model with an interaction to the model with no alc_agefirst main effect or interaction). The association between age at first alcohol use and lifetime marijuana use differs significantly between males and females (interaction p = .032). An age of first alcohol use 1 year greater is associated with 19.7% lower odds of lifetime marijuana use for males (AOR = 0.803; 95% CI = 0.748, 0.863; p < .001) and 28.7% lower odds for females (AOR = 0.713; 95% CI = 0.656, 0.776; p < .001).

NOTE: The above example was for a continuous $$\times$$ categorical interaction. For a categorical $$\times$$ categorical or continuous $$\times$$ continuous interaction, imitate the code in Section 5.9.9.2 or 5.9.9.3, respectively, and then exponentiate the resulting estimates and CIs (but do not exponentiate the p-values).

### 6.9.3 Visualize an interaction

It is always a good idea to visualize the magnitude of an interaction effect. Remember, in a large sample an effect can be statistically significant without being meaningfully large (practically significant) and, in a small sample, an effect can lack statistical significance but still be meaningfully large. Visualizing the effect can help with assessing practical significance.

As with MLR, you can visualize an interaction by plotting the fitted regression curves. The code is similar to MLR, but now we have two options. We can plot the fit on the log-odds scale or on the probability scale (by adding type = "response" to predict()).

Example 6.3 (continued): Visualize the interaction between age of first alcohol use and sex. The code below produces Figure 6.4.

First, create vectors corresponding to the values over which to plot for each term in the interaction.

# Values of the continuous predictor over which to plot
ALC <- seq(min(nsduh$alc_agefirst, na.rm = T), max(nsduh$alc_agefirst, na.rm = T), by = 1)

# Levels of the categorical predictor
LEVELS <- levels(nsduh$demog_sex) Next, create a data.frame to contain predictions that has a row for each combination of the two predictors, and any single value for each other predictor. In this example, we set each other predictor (demog_age_cat6, demog_income) at its reference level. # Data frame of predictor values at which to predict preddat <- data.frame( # Repeat ALC for each level of sex alc_agefirst = rep(ALC, length(LEVELS)), # Repeat the levels of gender for each value of ALC demog_sex = rep(LEVELS, each=length(ALC)), # Other predictors at their reference levels # (alternatively, any other value in the range of the observed data) demog_age_cat6 = "18-25", demog_income = "Less than$20,000"
)

# Add columns for the predicted values
preddat$pred_logodds <- predict(fit.ex6.3.int, preddat) preddat$pred_p       <- predict(fit.ex6.3.int,
preddat,
type = "response")

# Subset for each sex
# Make sure this order is consistent with the legend below
SUB1 <- preddat$demog_sex == "Female" SUB2 <- preddat$demog_sex == "Male"
par(mfrow=c(1,2))

# Plot on the log-odds scale (pred_logodds)
plot(0, 0, col = "white",
xlab = "Age of First Alcohol Use (years)",
ylab = "Log-odds of P(lifetime marijuana use)",
las = 1, pch = 20, font.lab = 2, font.axis = 2,
xlim = c(min(preddat$alc_agefirst), max(preddat$alc_agefirst)),
ylim = c(min(preddat$pred_logodds), max(preddat$pred_logodds)))
lines(preddat$alc_agefirst[SUB1], preddat$pred_logodds[SUB1],
lwd = 2, lty = 1, col = "purple")
lines(preddat$alc_agefirst[SUB2], preddat$pred_logodds[SUB2],
lwd = 2, lty = 2, col = "dark green")
legend(20, 6, c("Female", "Male"), col=c("purple", "dark green"),
lwd=c(2,2), lty=c(1,2), bty="n")

# Plot on the probability scale (pred_p)
plot(0, 0, col = "white",
xlab = "Age of First Alcohol Use (years)",
ylab = "P(lifetime marijuana use)",
las = 1, pch = 20, font.lab = 2, font.axis = 2,
xlim = c(min(preddat$alc_agefirst), max(preddat$alc_agefirst)),
ylim = c(min(preddat$pred_p), max(preddat$pred_p)))
lines(preddat$alc_agefirst[SUB1], preddat$pred_p[SUB1],
lwd = 2, lty = 1, col = "purple")
lines(preddat$alc_agefirst[SUB2], preddat$pred_p[SUB2],
lwd = 2, lty = 2, col = "dark green")
legend(20, 1, c("Female", "Male"), col=c("purple", "dark green"),
lwd=c(2,2), lty=c(1,2), bty="n")

Conclusion: We previously concluded that the interaction was statistically significant, which is equivalent to concluding that the fitted curves are significantly different from parallel (on both the log-odds and probability scales). Figure 6.4 confirms that they are not parallel and demonstrates that the interaction effect has a large magnitude, as well, since the lines seem meaningfully far from parallel.

NOTE: The above example demonstrated the code for a continuous $$\times$$ categorical interaction. If you have a categorical $$\times$$ categorical or continuous $$\times$$ continuous interaction, modify the code accordingly by imitating the code in Section 5.9.9.2 or 5.9.9.3, respectively.