## 9.9 Regression diagnostics after MI

In previous chapters, various assumption checking and diagnostics were discussed for each type of regression model. How do we diagnose the fit of a model after MI? Any visual diagnostic plot can be examined within each imputation individually and compared between imputations. Also, diagnostic statistics can be computed within each imputation and pooled, for example, using pool.scalar(), micombine.chisquare(), or micombine.F() (the latter two from the miceadds package ).

Two examples will be demonstrated here: visually checking linearity for a continuous predictor in a linear regression model and carrying out a Hosmer-Lemeshow goodness-of-fit test for a logistic regression model.

### 9.9.1 Examining a diagnostic plot across imputations

As described in Section 5.16, use a CR plot to check the linearity assumption of a linear regression model. Display the plot for each imputation to get a visual assessment of the adequacy of the linearity assumption. For the linear regression model fit in Section 9.6, there were 14 imputations.

In practice, view all the plots for all continuous predictors. Here, for brevity, only the linearity assumption for one of the continuous predictors from Example 9.1 (BMXWAIST) is assessed and only for the first four imputations. As shown in Figure 9.8, the linearity assumption appears to be consistently met.

# To view for all imputations
# for(i in 1:length(fit.imp.lm$analyses)) { # Insert car::crPlots call here # } # Just the first four imputations par(mfrow=c(2,2)) for(i in 1:4) { car::crPlots(fit.imp.lm$analyses[[i]],
terms = ~ BMXWAIST,
pch=20, col="gray",
smooth = list(smoother=car::gamLine),
ylab = "Comp + Resid",
# In general, no need to specify ylim.
# It is included here because outliers
# make it harder to see patterns on the
# the original scale, so zooming in helps.
ylim = c(-0.05, 0.05))
} Figure 9.8: Checking the linearity assumption within each imputation

### 9.9.2 Pooling a diagnostic test over imputations

The Hosmer-Lemeshow goodness-of-fit test computes a chi-squared statistic. Compute this statistic on the fit to each imputed dataset, and then use micombine.chisquare() to pool over imputations.

# Initialize a vector to store the chi-squared values
X2 <- rep(NA, length(fit.imp.glm$analyses)) # Compute chi-squared statistic on each imputation # (and store the degrees of freedom, as well) for(i in 1:length(X2)) { HL <- ResourceSelection::hoslem.test(fit.imp.glm$analyses[[i]]$y, fit.imp.glm$analyses[[i]]$fitted.values) X2[i] <- HL$statistic # Chi-squared value
DF    <- HL\$parameter # Degrees of freedom
rm(HL)
}

# Input the chi-squared vector and the degrees of freedom
# to pool over imputations
micombine.chisquare(X2, DF)
## Combination of Chi Square Statistics for Multiply Imputed Data
## Using 22 Imputed Data Sets
## F(8, 187.31)=0.612     p=0.76728

### References

Enders, Craig K. 2010. Applied Missing Data Analysis. Guilford Press, New York, NY.
Robitzsch, Alexander, Simon Grund, and Thorsten Henke. 2023. Miceadds: Some Additional Multiple Imputation Functions, Especially for Mice. https://CRAN.R-project.org/package=miceadds.