# Penalized quasi-likelihood

#### Cynderella

##### New Member
Can anybody explain me why the function glmmPQL(.) in R behaves in different
ways, depending on the number of measurements/individuals you use? To
show you this, I generated two examples. The first one includes 20
indivduals with each 100 repeated measurements (binary response), the
second one includes 40 individuals. The 'individuals' differ only in
different x values. I fitted logistic regression models with and without
random intercepts.

The coefficients and p values between dummy.glm20 and dummy.glmm20
differ. However, dummy.glm40 and dummy.glmm40 have the same coefficients
and p values, respectively. Did the solution in the second example not
converge (only one iteration step)?

Why does the AIC between e.g. dummy.glm20 and dummy.glmm20 differ so
much?

And last question: how can dummy.glm20 and dummy.glmm20 be compared with
an anova(.) function?

###Example 1
Code:
     y <- c(rep(0,40),sample(c(rep(0,20),rep(1,20)),40),rep(1,20))
dummy20 <- data.frame(ID=as.factor(c(rep(1:20,each=100))),
y=rep(y,20),x=c(1:2000))
dummy.glm20 <- glm(y~x,data=dummy20,family=binomial)
summary(dummy.glm20)
###Output
Code:
      Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -5.330e-01  9.199e-02  -5.795 6.85e-09 ***
x            1.270e-04  7.915e-05   1.604    0.109

Null deviance: 2692.0  on 1999  degrees of freedom
Residual deviance: 2689.5  on 1998  degrees of freedom
AIC: 2693.5

Comparing with 'glmmPQL'

dummy.glmm20 <- glmmPQL(y~x,random=~1 | ID,
data=dummy20,family=binomial)
summary(dummy.glmm20)
###Output
Code:
      Linear mixed-effects model fit by maximum likelihood
Data: dummy20
AIC      BIC    logLik
10270.78 10293.19 -5131.392

Random effects:
Formula: ~1 | ID
(Intercept)  Residual
StdDev:    50.55603 0.7928198

Variance function:
Structure: fixed weights
Formula: ~invwt
Fixed effects: y ~ x
Value Std.Error   DF   t-value p-value
(Intercept) -88.62246 11.686506 1979 -7.583316  <.0001
x             0.08768  0.002913 1979 30.099686  <.0001
Correlation:
(Intr)
x -0.252

Standardized Within-Group Residuals:
Min         Q1        Med         Q3        Max
-2.4592661 -0.4117634 -0.1389889  0.4223991  2.8757740

Number of Observations: 2000
Number of Groups: 20
###Example 2
Code:
       dummy40 <- data.frame(ID=as.factor(c(rep(1:40,each=100))),
y=rep(y,40),x=c(1:4000))
dummy.glm40 <- glm(y~x,data=dummy40,family=binomial)
summary(dummy.glm40)
###Output
Code:
       Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.691e-01  6.478e-02  -7.241 4.45e-13 ***
x            3.173e-05  2.796e-05   1.135    0.256

Null deviance: 5384.1  on 3999  degrees of freedom
Residual deviance: 5382.8  on 3998  degrees of freedom
AIC: 5386.8

Comparing with 'glmmPQL'

dummy.glmm40 <- glmmPQL(y~x,random=~1 | ID,
data=dummy40,family=binomial)
summary(dummy.glmm40)
###Output
Code:
       Linear mixed-effects model fit by maximum likelihood
Data: dummy40
AIC      BIC    logLik
17068.15 17093.32 -8530.073

Random effects:
Formula: ~1 | ID
(Intercept)  Residual
StdDev: 0.003066688 0.9999229

Variance function:
Structure: fixed weights
Formula: ~invwt
Fixed effects: y ~ x
Value  Std.Error   DF   t-value p-value
(Intercept) -0.4690798 0.06479625 3959 -7.239305  <.0001
x            0.0000317 0.00002797 3959  1.134708  0.2566
Correlation:
(Intr)
x -0.867

Standardized Within-Group Residuals:
Min        Q1       Med        Q3       Max
-0.842464 -0.820600 -0.798994  1.214569  1.263420

Number of Observations: 4000
Number of Groups: 40

#### hlsmith

##### Less is more. Stay pure. Stay poor.
It may be correcting for every observation. Stephen Cole has an partially elated in the American journal of epidemiology. You can probably find it searching for maximum likelihood.