12.7 Exercises

  1. Simulation exercise: the Bayesian LASSO continues

    Program the Gibbs sampler for the Bayesian LASSO from scratch, assuming a hierarchical structure in which the global shrinkage parameter is assigned a Gamma prior with both shape and rate parameters set to 1. Perform inference using this sampler in the Bayesian LASSO simulation exercise and compare the results with those obtained using the monomvn package.
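
    A minimal sketch of such a sampler is given below, assuming an improper \(1/\sigma^2\) prior on the error variance and the statmod package for the inverse-Gaussian draws; \(\mathbf{W}\) and \(\mathbf{y}\) come from the simulation exercise.

    ```r
    # Sketch of a Bayesian LASSO Gibbs sampler with a Gamma(1, 1) hyperprior
    # on lambda^2; statmod::rinvgauss supplies the draws for 1/tau_j^2.
    library(statmod)
    bayes_lasso <- function(y, W, S = 5000, a = 1, b = 1) {
      n <- nrow(W); p <- ncol(W)
      beta <- rep(0, p); tau2 <- rep(1, p); sig2 <- 1; lam2 <- 1
      draws <- matrix(NA, S, p)
      WtW <- crossprod(W); Wty <- crossprod(W, y)
      for (s in 1:S) {
        A    <- WtW + diag(1 / tau2, p)
        Ainv <- chol2inv(chol(A))
        beta <- as.vector(Ainv %*% Wty + t(chol(sig2 * Ainv)) %*% rnorm(p))
        res  <- y - W %*% beta
        sig2 <- 1 / rgamma(1, (n - 1 + p) / 2,
                           (sum(res^2) + sum(beta^2 / tau2)) / 2)
        tau2 <- 1 / rinvgauss(p, mean = sqrt(lam2 * sig2 / beta^2),
                              shape = lam2)
        lam2 <- rgamma(1, p + a, sum(tau2) / 2 + b)
        draws[s, ] <- beta
      }
      draws
    }
    ```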

  2. Jetter et al. (2022) employ SSVS to identify the main drivers of civil conflict in the post-Cold War era, considering a set of 35 potential determinants across 175 countries worldwide. We use a subset of their dataset, provided in Conflict.csv, where the dependent variable is conflictcw, a binary indicator of civil conflict. Perform SSVS using the BoomSpikeSlab package, specifically the lm.spike function, to identify the best subset of models.
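
    A minimal sketch, assuming Conflict.csv is in the working directory, that all remaining columns are candidate regressors, and an illustrative number of iterations.

    ```r
    # SSVS via spike-and-slab regression on the conflict data.
    library(BoomSpikeSlab)
    conflict <- read.csv("Conflict.csv")
    fit <- lm.spike(conflictcw ~ ., niter = 10000, data = conflict)
    summary(fit)             # posterior inclusion probabilities
    plot(fit, "inclusion")   # visualize the best subset of regressors
    ```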

  3. Tüchler (2008) proposes an SSVS approach for binary response models. Use the dataset Conflict.csv, where the dependent variable is conflictcw, to perform SSVS using the BoomSpikeSlab package, specifically the logit.spike function, in order to identify the best subset of models. Compare the results with those obtained in Exercise 2.
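
    The sketch below mirrors Exercise 2, replacing the Gaussian likelihood with a logistic one; again, the iteration count is illustrative.

    ```r
    # SSVS for the binary response via logit.spike.
    library(BoomSpikeSlab)
    conflict <- read.csv("Conflict.csv")
    fitl <- logit.spike(conflictcw ~ ., niter = 10000, data = conflict)
    summary(fitl)
    plot(fitl, "inclusion")  # compare with the lm.spike inclusion plot
    ```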

  4. Simulation exercise: \(K > N\)

    Use the simulation setting from the Bayesian LASSO and SSVS examples, but now assume there are 600 inputs, so that the number of inputs exceeds the sample size. In this scenario the least squares estimator has no unique solution because \(\mathbf{W}^{\top} \mathbf{W}\) is singular (its determinant is zero), and consequently standard inference procedures based on the least squares estimator cannot be applied. Bayesian inference, on the other hand, remains well-defined because the prior regularizes the problem, which is a key motivation for these methods.
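
    A small illustration, under assumed settings (a sample size of 500 and a sparse coefficient vector, both illustrative), shows that \(\mathbf{W}^{\top} \mathbf{W}\) is rank-deficient while the conjugate posterior mean remains computable.

    ```r
    # With K = 600 inputs and N = 500 observations, W'W is singular,
    # yet the prior precision makes the posterior mean well-defined.
    set.seed(10)
    N <- 500; K <- 600
    W <- matrix(rnorm(N * K), N, K)
    beta <- c(1, -2, 0.5, rep(0, K - 3))          # sparse truth (assumption)
    y <- W %*% beta + rnorm(N)
    WtW <- crossprod(W)
    qr(WtW)$rank                                  # < K: no unique OLS solution
    B0inv <- diag(K)                              # beta ~ N(0, I) prior
    bhat <- solve(WtW + B0inv, crossprod(W, y))   # posterior mean, sigma2 = 1
    head(bhat)
    ```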

  5. Simulation exercise: the BART model continues

    Compute Friedman’s partial dependence functions (Friedman 2001) for all variables in the BART model simulation example, and plot the posterior mean along with the 95% credible intervals.
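
    A minimal sketch for one input, assuming a fitted BART::wbart object fit and training matrix X from the simulation example; looping over j covers all variables. The partial dependence averages predictions over the observed values of the remaining inputs within each posterior draw.

    ```r
    # Friedman's partial dependence from BART posterior draws.
    library(BART)
    pdp_bart <- function(fit, X, j,
                         grid = seq(min(X[, j]), max(X[, j]), length.out = 20)) {
      pd <- sapply(grid, function(v) {
        Xv <- X; Xv[, j] <- v
        rowMeans(predict(fit, newdata = Xv))   # one value per posterior draw
      })
      list(grid = grid, mean = colMeans(pd),
           lower = apply(pd, 2, quantile, 0.025),
           upper = apply(pd, 2, quantile, 0.975))
    }
    res <- pdp_bart(fit, X, j = 1)
    plot(res$grid, res$mean, type = "l", ylim = range(res$lower, res$upper))
    lines(res$grid, res$lower, lty = 2); lines(res$grid, res$upper, lty = 2)
    ```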

  6. Chipman, George, and McCulloch (2010) present BART probit for classification. This method can be implemented using the BART package through the function pbart. Use the file Conflict.csv, where the dependent variable is conflictcw, to perform BART probit, implementing k-fold cross-validation to select the threshold that maximizes the sum of the true positive and true negative rates. Additionally, identify the most important predictors by evaluating different numbers of trees.
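
    A minimal sketch, assuming five folds, numeric predictors, and illustrative choices for the threshold grid and the number of trees.

    ```r
    # BART probit with k-fold cross-validation over the classification
    # threshold; the score is the sum of true positive and true negative
    # rates, averaged over folds.
    library(BART)
    conflict <- read.csv("Conflict.csv")
    y <- conflict$conflictcw
    X <- as.matrix(conflict[, names(conflict) != "conflictcw"])
    set.seed(10)
    folds <- sample(rep(1:5, length.out = length(y)))
    thresholds <- seq(0.05, 0.95, by = 0.05)
    score <- rep(0, length(thresholds))
    for (k in 1:5) {
      fit <- pbart(x.train = X[folds != k, ], y.train = y[folds != k],
                   x.test = X[folds == k, ], ntree = 50)
      phat <- fit$prob.test.mean
      yk <- y[folds == k]
      for (t in seq_along(thresholds)) {
        pred <- as.numeric(phat > thresholds[t])
        score[t] <- score[t] +
          (mean(pred[yk == 1] == 1) + mean(pred[yk == 0] == 0)) / 5
      }
    }
    thresholds[which.max(score)]  # selected threshold
    fit$varcount.mean             # variable-importance counts (last fold)
    ```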

  7. Simulation exercise: the Gaussian Process continues

    Simulate the process
    \[ f_i = \sin(2\pi x_{i1}) + \cos(2\pi x_{i2}) + \sin(x_{i1} x_{i2}) + \mu_i, \]
    where \(\mu_i \overset{\text{i.i.d.}}{\sim} N(0, 0.1^2)\), \(x_{ik} \sim U(0,1)\) for \(k = 1, 2\), and the sample size is 500.

    Define a grid of 20 evenly spaced values between 0 and 1 for each covariate \(x_{ik}\), and use this grid to perform prediction.

    Estimate the hyperparameters of the Gaussian Process by maximizing the log marginal likelihood. Then, use the km function from the DiceKriging package to fit the Gaussian Process, fixing the noise variance at the value that maximizes the log marginal likelihood.

    Finally, use the fitted model to predict the outputs on the grid points, and produce a 3D plot showing the predicted surface along with the training data points.
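
    A minimal sketch with DiceKriging, delegating the marginal-likelihood maximization to km: a first fit estimates the nugget (noise variance) by maximum likelihood, and a second fit fixes the noise at that value through noise.var. Accessing the nugget through the S4 slot is an assumption about the object layout.

    ```r
    # Gaussian Process fit and 3D prediction surface for the simulated data.
    library(DiceKriging)
    set.seed(10)
    n <- 500
    X <- data.frame(x1 = runif(n), x2 = runif(n))
    y <- sin(2 * pi * X$x1) + cos(2 * pi * X$x2) +
         sin(X$x1 * X$x2) + rnorm(n, sd = 0.1)
    fit0 <- km(design = X, response = y, covtype = "gauss",
               nugget.estim = TRUE)          # ML for all hyperparameters
    sig2 <- fit0@covariance@nugget           # estimated noise variance
    fit <- km(design = X, response = y, covtype = "gauss",
              noise.var = rep(sig2, n))      # noise fixed at the ML value
    g <- seq(0, 1, length.out = 20)
    grid <- expand.grid(x1 = g, x2 = g)
    pred <- predict(fit, newdata = grid, type = "UK")
    pm <- persp(g, g, matrix(pred$mean, 20, 20),
                xlab = "x1", ylab = "x2", zlab = "f", theta = 30, phi = 25)
    points(trans3d(X$x1, X$x2, y, pmat = pm), pch = 16, cex = 0.3)
    ```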

  8. Simulation exercise: Stochastic gradient MCMC continues

    Program from scratch the stochastic gradient Langevin dynamics algorithm for the logit simulation exercise, implementing the control variate version. First perform 1,500 stochastic gradient descent iterations to locate the posterior mode; then use the mode as the initial value for 1,000 subsequent MCMC iterations with the step size set to \(1 \times 10^{-4}\).
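
    A minimal sketch, assuming the design matrix X and binary response y from the logit simulation exercise, a N(0, 10 I) prior on the coefficients, and a batch size of 100 (the prior scale and batch size are assumptions).

    ```r
    # SGLD with control variates for the logit model: SGD locates the
    # posterior mode, whose full-data gradient anchors the variance-reduced
    # stochastic gradients used in the MCMC phase.
    sgld_cv_logit <- function(y, X, m = 100, eps = 1e-4,
                              n_sgd = 1500, n_mcmc = 1000) {
      N <- nrow(X); p <- ncol(X); B0inv <- diag(p) / 10
      grad_lik <- function(b, idx) {       # log-likelihood gradient on a batch
        pr <- 1 / (1 + exp(-X[idx, , drop = FALSE] %*% b))
        crossprod(X[idx, , drop = FALSE], y[idx] - pr)
      }
      b <- rep(0, p)
      for (s in 1:n_sgd) {                 # 1) SGD to find the mode
        idx <- sample(N, m)
        b <- as.vector(b + eps * ((N / m) * grad_lik(b, idx) - B0inv %*% b))
      }
      bhat <- b
      gfull <- grad_lik(bhat, 1:N) - B0inv %*% bhat  # full gradient at mode
      draws <- matrix(NA, n_mcmc, p)
      for (s in 1:n_mcmc) {                # 2) SGLD-CV started at the mode
        idx <- sample(N, m)
        g <- gfull + (N / m) * (grad_lik(b, idx) - grad_lik(bhat, idx)) -
             B0inv %*% (b - bhat)
        b <- as.vector(b + (eps / 2) * g + sqrt(eps) * rnorm(p))
        draws[s, ] <- b
      }
      draws
    }
    ```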

  9. Perform the simulation according to the model
    \[ y_i = 1 - 2 x_{i1} + 0.5 x_{i2} + \mu_i, \]
    where \(\mu_i \sim N(0,1)\), the sample size is 100,000, and the covariates \(\mathbf{x}_i \sim N(\mathbf{0}, \mathbf{I}_2)\). Use 5,000 MCMC iterations and a batch size of 1,000 to implement the SGLD algorithm. Set a learning rate schedule that yields sensible results.

    Assume independent priors \(\pi(\boldsymbol{\beta}, \sigma^2) = \pi(\boldsymbol{\beta}) \times \pi(\sigma^2)\), with \(\boldsymbol{\beta} \sim N(\mathbf{0}, \mathbf{I}_3)\) and \(\sigma^2 \sim IG(\alpha_0/2, \delta_0/2)\), where \(\alpha_0 = \delta_0 = 0.01\).
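
    A minimal sketch of the full exercise; the decaying step-size schedule is one sensible choice rather than one prescribed by the statement, and \(\sigma^2\) is sampled on the log scale (with the Jacobian absorbed into the prior gradient) so the SGLD update is unconstrained.

    ```r
    # SGLD for the linear model with minibatches of size 1,000.
    set.seed(10)
    N <- 100000
    X <- cbind(1, matrix(rnorm(N * 2), N, 2))
    y <- as.vector(X %*% c(1, -2, 0.5) + rnorm(N))
    a0 <- 0.01 / 2; d0 <- 0.01 / 2           # IG(alpha0/2, delta0/2) prior
    m <- 1000; S <- 5000
    theta <- c(rep(0, 3), 0)                 # (beta, log sigma2)
    draws <- matrix(NA, S, 4)
    for (s in 1:S) {
      eps <- 1e-5 * (1 + s / 1000)^(-0.55)   # decaying schedule (a choice)
      idx <- sample(N, m)
      b <- theta[1:3]; sig2 <- exp(theta[4])
      r <- as.vector(y[idx] - X[idx, ] %*% b)
      gb <- (N / m) * crossprod(X[idx, ], r) / sig2 - b  # beta gradient
      gt <- (N / m) * sum(r^2 / (2 * sig2) - 0.5) -
            a0 + d0 / sig2                   # log sigma2 gradient (Jacobian included)
      theta <- theta + (eps / 2) * c(gb, gt) + sqrt(eps) * rnorm(4)
      draws[s, ] <- theta
    }
    colMeans(draws[-(1:1000), 1:3])          # posterior means of beta (burn-in dropped)
    ```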

References

Chipman, Hugh A., Edward I. George, and Robert E. McCulloch. 2010. “BART: Bayesian Additive Regression Trees.” The Annals of Applied Statistics 4 (1): 266–98.
Friedman, Jerome H. 2001. “Greedy Function Approximation: A Gradient Boosting Machine.” The Annals of Statistics 29 (5): 1189–1232.
Jetter, Michael, Rafat Mahmood, Christopher F. Parmeter, and Andrés Ramírez-Hassan. 2022. “Post-Cold War Civil Conflict and the Role of History and Religion: A Stochastic Search Variable Selection Approach.” Economic Modelling 114: 105907. https://doi.org/10.1016/j.econmod.2022.105907.
Tüchler, Regina. 2008. “Bayesian Variable Selection for Logistic Models Using Auxiliary Mixture Sampling.” Journal of Computational and Graphical Statistics 17 (1): 76–94.