10.5 Summary

In this chapter, we introduced Bayesian model averaging (BMA) in generalized linear models. For linear Gaussian models, we performed BMA using three approaches: the Bayesian information criterion (BIC) approximation with Occam’s window, the Markov chain Monte Carlo model composition (MC3) algorithm, and conditional Bayes factors, which account for endogeneity. We also showed how to perform dynamic BMA in state-space models, where forgetting parameters facilitate computation. For other generalized linear models, such as logit, gamma, and Poisson, we demonstrated how to use the BIC approximation to perform BMA. Finally, we presented alternative methods for calculating the marginal likelihood: the Savage-Dickey density ratio, Chib’s method, and the Gelfand-Dey method. These methods are particularly useful when the BIC approximation performs poorly because the sample size is small or moderate.
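As a reminder of how the BIC approximation turns into model weights, the following is a minimal sketch for a linear Gaussian model, not the implementation used in the chapter. It assumes a uniform prior over models, enumerates every subset of a small set of regressors, and applies only the simple symmetric form of Occam’s window. The function name `bic_bma`, the cutoff `occam_ratio`, and the simulated data are all hypothetical and chosen for illustration.

```python
import itertools
import numpy as np

def bic_bma(X, y, occam_ratio=20.0):
    """Approximate posterior model probabilities via BIC (uniform model prior assumed)."""
    n, p = X.shape
    models, bics = [], []
    for k in range(p + 1):
        for subset in itertools.combinations(range(p), k):
            # Design matrix: intercept plus the selected regressors.
            Z = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
            beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
            rss = np.sum((y - Z @ beta) ** 2)
            # Gaussian BIC, up to a constant shared by all models.
            bics.append(n * np.log(rss / n) + Z.shape[1] * np.log(n))
            models.append(subset)
    bics = np.array(bics)
    # exp(-BIC/2) approximates the marginal likelihood; shift by the minimum for stability.
    weights = np.exp(-0.5 * (bics - bics.min()))
    # Symmetric Occam's window: drop models far less probable than the best one.
    keep = weights >= weights.max() / occam_ratio
    weights = np.where(keep, weights, 0.0)
    pmp = weights / weights.sum()
    # Posterior inclusion probability of each regressor.
    pip = np.array([sum(w for m, w in zip(models, pmp) if j in m) for j in range(p)])
    return models, pmp, pip

# Simulated data (illustrative assumption): only regressors 0 and 2 matter.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 4))
y = 1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(size=n)

models, pmp, pip = bic_bma(X, y)
best = models[int(np.argmax(pmp))]
print("Highest-probability model:", best)
print("Posterior inclusion probabilities:", np.round(pip, 3))
```

Full enumeration is feasible only when the number of candidate regressors is small; with many regressors, the model space has to be explored by sampling instead, which is the role of the MC3 algorithm.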

However, standard BMA has a limitation: it implicitly assumes that one of the candidate models is the true data-generating process. As Box famously noted, “Since all models are wrong, the scientist cannot obtain a ‘correct’ one by excessive elaboration. On the contrary, following William of Occam, he should seek an economical description of natural phenomena” (Box 1976).

This perspective has motivated new developments in BMA that relax the “true model” assumption and instead treat all models as misspecified. In these approaches, the average of predictive densities is built from each model’s leave-one-out (LOO) predictive performance, and model weights are chosen to minimize a predictive loss function. One prominent example is stacking of predictive distributions, which, rather than weighting models by marginal likelihood, uses LOO cross-validation to assign weights that maximize out-of-sample predictive accuracy (Yao et al. 2018). For a comprehensive review of Bayesian methods for aggregating predictive distributions, see Yao (2021).
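To make the stacking idea concrete, here is a minimal sketch under simplifying assumptions. It takes a matrix of LOO predictive densities, one column per model, and finds simplex weights that maximize the summed log density of the weighted combination. The optimizer is a simple fixed-point (EM-type) update substituted here for brevity; it is not the constrained optimization routine of Yao et al. (2018). The names `stacking_weights` and `normal_pdf` are hypothetical, and the two candidate "models" are fixed normal densities with no fitted parameters, so their LOO densities are simply the densities evaluated at each observation.

```python
import numpy as np

def stacking_weights(loo_dens, n_iter=500, tol=1e-10):
    """Maximize sum_i log(sum_k w_k * p_ik) over the simplex by fixed-point updates."""
    n, K = loo_dens.shape
    w = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        mix = loo_dens @ w  # combined predictive density at each held-out point
        # EM-type update; it keeps the weights nonnegative and summing to one.
        w_new = w * (loo_dens / mix[:, None]).mean(axis=0)
        converged = np.max(np.abs(w_new - w)) < tol
        w = w_new
        if converged:
            break
    return w / w.sum()

def normal_pdf(y, mu, sd=1.0):
    return np.exp(-0.5 * ((y - mu) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

# Simulated data (illustrative assumption): neither candidate is the true process,
# which is a 70/30 mixture of the two.
rng = np.random.default_rng(1)
n = 1000
from_first = rng.random(n) < 0.7
y = np.where(from_first, rng.normal(-2.0, 1.0, n), rng.normal(2.0, 1.0, n))

# Column k holds the predictive density of model k at each observation.
loo_dens = np.column_stack([normal_pdf(y, -2.0), normal_pdf(y, 2.0)])
w = stacking_weights(loo_dens)
print("Stacking weights:", np.round(w, 3))
```

In this setting, weights based on marginal likelihoods would tend to concentrate on a single candidate as the sample grows, whereas stacking keeps substantial weight on both because the combination predicts better than either model alone.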

References

Box, George E. P. 1976. “Science and Statistics.” Journal of the American Statistical Association 71 (356): 791–99. https://doi.org/10.1080/01621459.1976.10480949.
Yao, Yuling. 2021. “Bayesian Aggregation.” In Wiley StatsRef: Statistics Reference Online, 1–13. John Wiley & Sons, Ltd. https://doi.org/10.1002/9781118445112.stat08301.
Yao, Yuling, Aki Vehtari, Daniel Simpson, and Andrew Gelman. 2018. “Using Stacking to Average Bayesian Predictive Distributions.” Bayesian Analysis 13 (3): 917–1003. https://doi.org/10.1214/17-BA1091.