6.14 Influential observations

Checking for influential observations in logistic regression follows the same steps as for MLR (Section 5.22): fit the model, plot the Cook’s distances and DFBETAS, and, if any observations have extreme values, conduct a sensitivity analysis to see whether removing them changes your conclusions (Section 5.25).
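As a rough illustration of these steps, the sketch below runs them in base R on simulated data. Everything in it is an assumption made for the example: the data frame `dat`, the variables `x1`, `x2`, and `y`, and the 4/n cutoff for Cook's distance, which is a common rule of thumb and not a threshold taken from this text.

```r
# Minimal sketch on simulated data (all object names are hypothetical)
set.seed(123)
n   <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rbinom(n, 1, 0.5))
dat$y <- rbinom(n, 1, plogis(-0.5 + 0.8 * dat$x1 + 0.4 * dat$x2))

# Fit the logistic regression
fit <- glm(y ~ x1 + x2, family = binomial, data = dat)

# Cook's distances (one per observation) and
# DFBETAS (one row per observation, one column per term)
cooks <- cooks.distance(fit)
dfb   <- dfbetas(fit)

# Flag observations with a large Cook's distance
# (4/n is an assumed rule-of-thumb cutoff)
flagged <- which(cooks > 4 / n)

# Sensitivity analysis: refit without the flagged observations
# and compare the coefficients side by side
keep     <- setdiff(seq_len(n), flagged)
fit.sens <- glm(y ~ x1 + x2, family = binomial, data = dat[keep, ])
comparison <- cbind(all = coef(fit), sens = coef(fit.sens))
round(comparison, 3)
```

Indexing with `keep` instead of `dat[-flagged, ]` is deliberate: if no observation is flagged, a negative index on an empty vector would drop every row. Whether a shift in the coefficients matters is a judgment about your conclusions, not something the cutoff decides for you.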

Example 6.3 (continued): Look for influential observations in the model that includes an interaction (Figures 6.8 and 6.9).

car::influenceIndexPlot(fit.ex6.3.int, vars = "Cook",
                        id = FALSE, main = "Cook's distance")

Figure 6.8: Cook’s distance plot for a logistic regression

# Compute DFBETAS
DFBETAS <- dfbetas(fit.ex6.3.int)

# To see the exact spelling of the term names:
# colnames(DFBETAS)

# Index plot for each predictor
# (results shown for one plot only)
plot(DFBETAS[, "alc_agefirst"], ylab = "AlcAge")
# Dashed reference lines at +/- 0.2
abline(h = c(-0.2, 0.2), lty = 2)

Figure 6.9: Plot of DFBETAS for a logistic regression

Conclusion: There appear to be a few potentially influential points based on Cook’s distance and, based on DFBETAS, a few points that are influential for the main effect of age of first alcohol use.
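An index plot shows that some points are extreme, but not which ones. The sketch below lists, for each model term, the observations whose absolute DFBETAS exceeds a cutoff. The helper `flag_dfbetas()` is hypothetical and not part of any package; the default cutoff of 0.2 simply matches the dashed lines drawn in the plot above; and the demonstration uses simulated data, not the data from Example 6.3.

```r
# Hypothetical helper: for each model term, return the indices of
# observations whose |DFBETAS| exceeds the cutoff
flag_dfbetas <- function(dfb, cutoff = 0.2) {
  out <- lapply(colnames(dfb), function(term) {
    unname(which(abs(dfb[, term]) > cutoff))
  })
  names(out) <- colnames(dfb)
  out
}

# Demonstration on simulated data (names are illustrative only)
set.seed(42)
n   <- 150
sim <- data.frame(x = rnorm(n))
sim$y <- rbinom(n, 1, plogis(0.3 + sim$x))
fit.sim <- glm(y ~ x, family = binomial, data = sim)

flags <- flag_dfbetas(dfbetas(fit.sim))
flags            # observation indices, by term
lengths(flags)   # number of flagged observations per term
```

With the fitted model from the example, the same call would be `flag_dfbetas(DFBETAS)`, and the flagged indices could then be dropped in a sensitivity analysis.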