Estimating Multilevel Logistic Regression Models

Cynderella

New Member
The following multilevel logistic model with
one explanatory variable at level 1 (individual level) and
one explanatory variable at level 2 (group level) :

$$\text{logit}(p_{ij})=\pi_{0j}+\pi_{1j}x_{ij}\ldots (1)$$
$$\pi_{0j}=\gamma_{00}+\gamma_{01}z_j+u_{0j}\ldots (2)$$
$$\pi_{1j}=\gamma_{10}+\gamma_{11}z_j+u_{1j}\ldots (3)$$

where , $$u_{0j}\sim N(0,\sigma_0^2)$$ , $$u_{1j}\sim N(0,\sigma_1^2)$$ , $$\text{cov}(u_{0j},u_{1j})=\sigma_{01}$$

I want to estimate the parameter of the model and I like to use R command glmmPQL .

Substituting equation (2) and (3) in equation (1) yields ,

$$\text{logit}(p_{ij})=\gamma_{00}+\gamma_{10}x_{ij}+\gamma_{01}z_j+\gamma_{11}x_{ij}z_j+u_{0j}+u_{1j}x_{ij}\ldots (4)$$

There are 30 groups $$(j=1,...,30)$$ and 5 individual in each group .

R code :
Code:
#Simulating data from multilevel logistic distribution
library(mvtnorm)
set.seed(1234)

J <- 30             ## number of groups
n_j <- rep(5,J)     ## number of individuals in jth group
N <- sum(n_j)

g_00 <- -1
g_01 <- 0.3
g_10 <- 0.3
g_11 <- 0.3

s2_0 <- 0.13  ##variance corresponding to specific ICC
s2_1 <- 1     ##variance standardized to 1
s01  <- 0     ##covariance assumed zero

z <- rnorm(J)
x <- rnorm(N)

#Generate (u_0j,u_1j) from a bivariate normal .
mu <- c(0,0)
sig <- matrix(c(s2_0,s01,s01,s2_1),ncol=2)
u <- rmvnorm(J,mean=mu,sigma=sig,method="chol")

pi_0 <- g_00 +g_01*z + as.vector(u[,1])
pi_1 <- g_10 + g_11*z + as.vector(u[,2])
eta <- rep(pi_0,n_j)+rep(pi_1,n_j)*x
p <- exp(eta)/(1+exp(eta))

y <- rbinom(N,1,p)
Now the parameter estimation .

Code:
#### estimating parameters
library(MASS)
library(nlme)

sim_data_mat <- matrix(c(y,x,rep(z,n_j),rep(1:30,n_j)),ncol=4)
sim_data <- data.frame(sim_data_mat)
colnames(sim_data) <- c("Y","X","Z","cluster")
summary(glmmPQL(Y~X*Z,random=~1|cluster,family=binomial,data=sim_data,,niter=200))

OUTPUT :

Code:
iteration 1
Linear mixed-effects model fit by maximum likelihood
Data: sim_data

Random effects:
Formula: ~1 | cluster
(Intercept)  Residual
StdDev: 0.0001541031 0.9982503

Variance function:
Structure: fixed weights
Formula: ~invwt
Fixed effects: Y ~ X * Z
Value Std.Error  DF   t-value p-value
(Intercept) -0.8968692 0.2018882 118 -4.442404  0.0000
X            0.5803201 0.2216070 118  2.618691  0.0100
Z            0.2535626 0.2258860  28  1.122525  0.2712
X:Z          0.3375088 0.2691334 118  1.254057  0.2123
Correlation:
(Intr) X      Z
X   -0.072
Z    0.315  0.157
X:Z  0.095  0.489  0.269

Number of Observations: 150
Number of Groups: 30
* Why does it take only 1 iteration while I mentioned to take $200$ iterations inside the function glmmPQL by the argument niter=200 ?

* Also p-value of group-level variable $(Z)$ and cross-level interaction $(X:Z)$ shows they are not significant . Still why in this article, they keep the group-level variable $$(Z)$$ and cross-level interaction $$(X:Z)$$ for further analysis ?

* Also How are the degrees of freedom DF being calculated ?

* It doesn't match with the relative bias of the various estimates of the table. I tried to calculate the relative bias as :

Code:
#Estimated Fixed Effect parameters :

hat_g_00 <- -0.8968692 #overall intercept
hat_g_10 <- 0.5803201  # X
hat_g_01 <-0.2535626   # Z
hat_g_11 <-0.3375088   #X*Z

fixed <-c(g_00,g_10,g_01,g_11)
hat_fixed <-c(hat_g_00,hat_g_10,hat_g_01,hat_g_11)

#Estimated Random Effect parameters :

hat_s_0 <-0.0001541031  ##Estimated Standard deviation of random intercept
hat_s_1 <-  0.9982503

std  <- c(sqrt(0.13),1)
hat_std  <- c(0.0001541031,0.9982503)

##Relative bias of Fixed Effect :
rel_bias_fixed <- ((hat_fixed-fixed)/fixed)*100
 -10.31308  93.44003 -15.47913  12.50293

##Relative bias of Random Effect :
rel_bias_Random <- ((hat_std-std)/std)*100
 -99.95726  -0.17497