6.14 Influential observations

Checking for influential observations in logistic regression follows the same steps as for MLR (Section 5.22). Fit the model, plot the Cook’s distances and DFBETAS, and, if any observations have extreme values, conduct a sensitivity analysis to see whether removing them changes your conclusions (Section 5.25).

Example 6.3 (continued): Look for influential observations in the model that includes an interaction (Figures 6.7 and 6.8).

car::influenceIndexPlot(fit.ex6.3.int, vars = "Cook",
                        id = FALSE, main = "Cook's distance")
Index plot with one point per observation showing its Cook's distance; three to five points stick out far above the others.

Figure 6.7: Cook’s distance plot for a logistic regression

# Compute DFBETAS
DFBETAS <- dfbetas(fit.ex6.3.int)

# To see the spelling of the terms
# colnames(DFBETAS)
# Index plot for each predictor
# (results only shown for one plot)

plot(DFBETAS[, "alc_agefirst"], ylab="AlcAge")
abline(h = c(-0.2, 0.2), lty = 2)
Index plot of DFBETAS for age of first alcohol use with a dashed horizontal line at 0.2; two points fall above this bound. The line at -0.2 is not visible because no values are that negative.

Figure 6.8: Plot of DFBETAS for a logistic regression
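
The index plot above is shown for a single term. To screen every term at once, one option is to tabulate which observations fall outside the ±0.2 reference lines in any column of the DFBETAS matrix. The following is a minimal sketch on simulated data; the helper name flag_dfbetas, the simulated variables, and the use of 0.2 as the cutoff for every term are assumptions for illustration and are not part of Example 6.3.

```r
# Simulated data (hypothetical; stands in for the Example 6.3 dataset)
set.seed(123)
n  <- 200
x1 <- rnorm(n)
x2 <- rbinom(n, 1, 0.5)
y  <- rbinom(n, 1, plogis(-0.5 + 0.8 * x1 + 0.5 * x2 - 0.4 * x1 * x2))
fit.sim <- glm(y ~ x1 * x2, family = binomial)

# Hypothetical helper: list observations whose |DFBETAS| exceeds a cutoff
flag_dfbetas <- function(fit, cutoff = 0.2) {
  D   <- dfbetas(fit)
  # Row and column positions of the entries beyond the cutoff
  idx <- which(abs(D) > cutoff, arr.ind = TRUE)
  data.frame(obs     = rownames(D)[idx[, "row"]],
             term    = colnames(D)[idx[, "col"]],
             dfbetas = D[idx],
             row.names = NULL)
}

# One row per (observation, term) pair beyond the cutoff, largest first
flagged <- flag_dfbetas(fit.sim)
flagged[order(-abs(flagged$dfbetas)), ]
```

With a real model, the call would be flag_dfbetas(fit.ex6.3.int). An observation that appears for several terms is a natural candidate for the sensitivity analysis.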

Conclusion: There appear to be a few potentially influential observations based on Cook’s distance, and a few observations that are potentially influential for the main effect of age of first alcohol use.
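
A sensitivity analysis would then refit the model without the flagged observations and compare the estimates. The following is a minimal sketch on simulated data, since the example's dataset is not reproduced here; the variable names, the planted unusual observation, and the choice to drop the three largest Cook's distances are all assumptions for illustration only.

```r
# Simulated data (hypothetical; not the Example 6.3 dataset)
set.seed(42)
n   <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rbinom(n, 1, 0.5))
dat$y <- rbinom(n, 1, plogis(-0.5 + 0.8 * dat$x1 + 0.5 * dat$x2))

# Plant one unusual observation: extreme x1 paired with an unexpected outcome
dat$x1[1] <- 4
dat$y[1]  <- 0

fit.all <- glm(y ~ x1 * x2, family = binomial, data = dat)

# Row names of the observations with the three largest Cook's distances
cd       <- cooks.distance(fit.all)
drop.obs <- names(sort(cd, decreasing = TRUE))[1:3]

# Refit the same model without those rows
fit.sens <- glm(y ~ x1 * x2, family = binomial,
                data = dat[!(rownames(dat) %in% drop.obs), ])

# Odds ratios from both fits and the percent change between them
or.all  <- exp(coef(fit.all))
or.sens <- exp(coef(fit.sens))
round(cbind(all = or.all, reduced = or.sens,
            pct_change = 100 * (or.sens / or.all - 1)), 3)
```

If the odds ratios and their interpretation change little, the conclusions do not depend on these observations; if they change materially, both sets of results are worth examining.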