Bayesian computations

Homework 2

Author

Affiliation

Tommaso Rigon

DEMS

The homeworks are not graded, but the results may be sent to tommaso.rigon@unimib.it to receive feedbacks.

Pima indian dataset

In this homework we consider once again the Pima indian dataset, as in the previous Markdown document B.1. Importantly, note that in this homework we will not standardize the predictors to make the computational problem more challenging.

Pima <- rbind(MASS::Pima.tr, MASS::Pima.te)
y <- as.numeric(Pima$type == "Yes") # Binary outcome
X <- cbind(1, model.matrix(type ~ . - 1, data = Pima)) # Design matrix

Homework

Model description

Let \textbf{y} = (y_1,\dots,y_n)^\intercal be the vector of the observed binary responses (variable y_data) and let \textbf{X} be the corresponding design matrix (object X_data) whose generic row is \textbf{x}_i = (x_{i1},\dots,x_{ip})^\intercal, for i=1,\dots,n, suitably standardized. Consider a generalized linear model such that

y_i \mid \pi_i \overset{\text{ind}}{\sim} \text{Bern}(\pi_i), \qquad \pi_i = \Phi(\eta_i), \qquad \eta_i = \beta_1x_{i1} + \cdots + \beta_p x_{ip}, where \Phi(\cdot) is the probit link (pnorm function).As done in Markdown document B.1, we will employ a relatively vague prior centered at 0, namely \beta \sim N(0, 100 I_p).

Assignments

Implement the Albert and Chib (1993) data augmentation algorithm for sampling from the posterior distribution of \beta.