1. ## Understanding Multilevel Model

I'm slowly working through multilevel models and have a pretty specific question/thought check.

Here's some R code:

Code:
```
library(lme4); data(Dyestuff)

mod <- lm(Yield ~ 1, Dyestuff)
resid(mod)
```
This is called a null model, correct? That's because you're predicting the outcome using only the grand mean as the prediction.

I assume this is correct because I can achieve the same results by:

Code:
```
Dyestuff$Yield - mean(Dyestuff$Yield)
```
So now let's look at a similar model in the lme4 package.

Code:
```
fm1 <- lmer(Yield ~ 1 + (1 | Batch), Dyestuff)
resid(fm1)
```
My take on this is that it is nearly identical to:

Code:
```
fm2 <- lm(Yield ~ Batch, Dyestuff)
resid(fm2)
```
Is this true? Is the only difference that the lmer function is using REML? So basically this model is predicting based on the group mean, as demonstrated by the code below?

Code:
```
with(Dyestuff, ave(Yield, Batch, FUN = function(x) x - mean(x)))
```
Again, is the only difference between fm1 and fm2 that fm1 is using REML and fm2 is using OLS?
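As a quick sanity check of the claim about fm2 (a sketch; lme4 is loaded here only for the Dyestuff data):

```r
# Sanity check: residuals from the fixed-effects model lm(Yield ~ Batch)
# are exactly the within-Batch deviations from the Batch means.
library(lme4)  # only for the Dyestuff data
data(Dyestuff)

fm2 <- lm(Yield ~ Batch, Dyestuff)
dev <- with(Dyestuff, ave(Yield, Batch, FUN = function(x) x - mean(x)))

all.equal(unname(resid(fm2)), unname(dev))  # TRUE
```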

Can some smarter people just check/critique my thinking?

2. ## Re: Understanding Multilevel Model

I think I'm wrong about the reason that resid(fm1) != resid(fm2). It's not REML but the fact that more variability is being accounted for by the between-group variance. Is this correct?

3. ## Re: Understanding Multilevel Model

Originally Posted by trinker
This is a called a null model correct?
Well, I mean, some people call it that (notably, Raudenbush & Bryk and Snijders & Bosker). I would just call it a regression with no predictors, or something like this. No need to introduce unnecessary new terminology.

Originally Posted by trinker
Is this true? The only difference is that the lme function is using REML?
Close but not exactly. In technical terms, fm1 treats Batches as random effects, while fm2 treats Batches as fixed effects. Empirically you can observe that the two models differ in their predicted/fitted values (i.e., try inspecting fitted(fm1) and fitted(fm2)). fm2's predicted values are simply the mean Yield for each Batch. fm1's predicted values are very similar to the simple Batch means, but the predictions have been "shrunk" to some extent toward the grand mean (i.e., the intercept). These shrunken estimates from fm1 are called BLUPs ("Best Linear Unbiased Predictors").
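The shrinkage described above can be seen directly (a sketch, assuming lme4 is installed; in this balanced design the fixed intercept equals the grand mean):

```r
# Compare the per-Batch fitted values of the two models: fm2's fits are the
# raw Batch means, while fm1's fits (the BLUPs) are pulled toward the grand mean.
library(lme4)
data(Dyestuff)

fm1 <- lmer(Yield ~ 1 + (1 | Batch), Dyestuff)
fm2 <- lm(Yield ~ Batch, Dyestuff)

gm <- mean(Dyestuff$Yield)  # grand mean (= fixed intercept here)
batch_means <- tapply(fitted(fm2), Dyestuff$Batch, mean)
blups <- tapply(fitted(fm1), Dyestuff$Batch, mean)

# Every BLUP lies between its raw Batch mean and the grand mean:
cbind(batch_means, blups, grand_mean = gm)
all(abs(blups - gm) <= abs(batch_means - gm))  # TRUE
```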

4. ## The Following User Says Thank You to Jake For This Useful Post:

trinker (02-18-2014)

5. ## Re: Understanding Multilevel Model

Thanks Jake, I'll run with this thread for a while...

I'm trying to work through Bates's book (p. 24). In chapter 1 he makes a plot of the estimates with prediction intervals. I don't know how the prediction intervals are calculated (I assume from the standard error; is this true?). If not, how were the prediction intervals calculated?

Code:
```
fm1ML <- lmer(Yield ~ (1 | Batch), Dyestuff)
dotplot(ranef(fm1ML, condVar = TRUE))

dat <- structure(list(Batch = structure(c(3L, 4L, 5L, 2L, 6L, 1L),
    .Label = c("F", "D", "A", "B", "C", "E"), class = "factor"),
    X.Intercept. = c(-17.6069081921822, 0.391264626493306, 28.5623177339851,
    -23.0846129630835, 56.7333708414772, -44.9954320466883)),
    .Names = c("Batch", "X.Intercept."), row.names = c(NA, -6L),
    class = "data.frame")

se <- summary(fm1ML)$coefficients[2]

ggplot(dat, aes(x = X.Intercept., y = Batch)) + geom_point() +
    geom_errorbarh(aes(xmin = X.Intercept. - se, xmax = X.Intercept. + se),
                   height = 0, size = 1)
```
Here are the two plots:

Are the 2 plots equivalent (i.e., is the prediction interval equal to coefficient ± SE)?

6. ## Re: Understanding Multilevel Model

Go back to the R board, trinker

An earmark of multilevel (ML) modeling as used by the authors cited above is that they use nomenclature others don't, for example "hierarchical regression" rather than ML. I agree with Jake that calling it a null model because it has no predictors makes little sense.

7. ## The Following User Says Thank You to noetsi For This Useful Post:

trinker (02-18-2014)

8. ## Re: Understanding Multilevel Model

Originally Posted by trinker
(i.e. is the prediction interval = to coefficient - SE?)
Well, almost. From the ggplot code that you posted I can see he's setting the limits of the prediction interval to [intercept estimate - standard error, intercept estimate + standard error].

9. ## The Following User Says Thank You to spunky For This Useful Post:

trinker (02-18-2014)

10. ## Re: Understanding Multilevel Model

trinker, I had an excellent course from an expert in HLM. When I get some more time (it is really busy right now with the legislature coming into session), I can pull the course materials out (you mentioned you wanted to see them).

11. ## The Following User Says Thank You to noetsi For This Useful Post:

trinker (02-18-2014)

12. ## Re: Understanding Multilevel Model

Spunky, I should have been explicit. I made the ggplot code. Bates uses lattice, but in doing so it's not explicit what is actually happening. The ggplot was my guess at something that made sense.

13. ## Re: Understanding Multilevel Model

Just a tiny comment: it's definitely called a NULL model in many fields, and calling it a model without predictors could confuse more, depending on the field. Find out what's normal for your field. Jake's spot on when it comes to shrinkage and BLUPs.

Now, does Bates, in his book, explain how to fairly test between models differing in what is included as random effects? That is what I would like to know.

14. ## Re: Understanding Multilevel Model

At least in the earlier edition of the text (Pinheiro & Bates, 2000), they mention that there are issues arising from the fact that the parameter value under the null hypothesis is on the border of the parameter space, and that technically there is a correction you can apply to the degrees of freedom of the LRT statistic to take account of this. But then they note that the commonly suggested correction (i.e., using an equal mixture of chi-squares with 2 different degrees of freedom) is not really quite right either, as verified by simulation. So their basic conclusion is that it is tricky to do this just right, no easily-implemented method seems to fix the problem exactly, but the simple, naive LRT is not too bad really (although slightly conservative), so they recommend just going with that.
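The correction mentioned above can be sketched as follows (assuming lme4; the comparison tests whether the Batch variance component is zero, so the 50:50 mixture is of chi-squares with 0 and 1 degrees of freedom):

```r
# LRT for the random intercept, with the commonly suggested boundary
# correction: the null distribution is taken as a 50:50 mixture of
# chi^2_0 and chi^2_1 rather than a plain chi^2_1.
library(lme4)
data(Dyestuff)

fm0 <- lm(Yield ~ 1, Dyestuff)                                # no random effect
fm1 <- lmer(Yield ~ 1 + (1 | Batch), Dyestuff, REML = FALSE)  # ML for the LRT

lrt <- as.numeric(2 * (logLik(fm1) - logLik(fm0)))

p_naive <- pchisq(lrt, df = 1, lower.tail = FALSE)        # slightly conservative
p_mixed <- 0.5 * pchisq(lrt, df = 1, lower.tail = FALSE)  # boundary-corrected
c(LRT = lrt, naive = p_naive, mixture = p_mixed)
```

The halving works because the chi^2_0 component of the mixture is a point mass at zero, contributing nothing to the tail probability for any positive observed statistic.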

15. ## The Following 2 Users Say Thank You to Jake For This Useful Post:

TheEcologist (02-18-2014), trinker (02-20-2014)

16. ## Re: Understanding Multilevel Model

Originally Posted by Jake
At least in the earlier edition of the text (Pinheiro & Bates, 2000), they mention that there are issues arising from the fact that the parameter value under the null hypothesis is on the border of the parameter space, and that technically there is a correction you can apply to the degrees of freedom of the LRT statistic to take account of this. But then they note that the commonly suggested correction (i.e., using an equal mixture of chi-squares with 2 different degrees of freedom) is not really quite right either, as verified by simulation. So their basic conclusion is that it is tricky to do this just right, no easily-implemented method seems to fix the problem exactly, but the simple, naive LRT is not too bad really (although slightly conservative), so they recommend just going with that.
Great! I'll look into Pinheiro and Bates (2000); I always thought they were the best resource for this. But this is very useful, thanks Jake.

17. ## Re: Understanding Multilevel Model

If you have a copy of P&B they discuss this on pp. 83-87.

18. ## The Following User Says Thank You to Jake For This Useful Post:

TheEcologist (02-18-2014)

19. ## Re: Understanding Multilevel Model

Originally Posted by Jake
At least in the earlier edition of the text (Pinheiro & Bates, 2000), they mention that there are issues arising from the fact that the parameter value under the null hypothesis is on the border of the parameter space, and that technically there is a correction you can apply to the degrees of freedom of the LRT statistic to take account of this. But then they note that the commonly suggested correction (i.e., using an equal mixture of chi-squares with 2 different degrees of freedom) is not really quite right either, as verified by simulation. So their basic conclusion is that it is tricky to do this just right, no easily-implemented method seems to fix the problem exactly, but the simple, naive LRT is not too bad really (although slightly conservative), so they recommend just going with that.
there is some bootstrapped likelihood-ratio test that some random spunky shared with you at some point to address this issue...

#justsayin

20. ## Re: Understanding Multilevel Model

Yes, if it was really important that I get a reliable answer to this question in my own research (e.g., the question was of substantive theoretical interest), then I would use something like the bootstrap-based approach that you outlined. But most of the time, in the process of everyday data analysis and model comparison using mixed models, it's really not important that I have a reliable p-value for testing random effects. In fact I usually don't bother doing formal hypothesis tests on random effects at all. Usually I let my choice of the random effects structure be dictated by the design of the study and computational/convergence issues.
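For completeness, the bootstrap idea referred to here can be sketched by simulating under the null and refitting (a sketch only; the RLRsim package's exactRLRT offers a packaged test for random-effect variances):

```r
# Parametric bootstrap LRT for the random intercept: simulate response
# vectors under the null (no Batch effect), refit both models to each
# simulated response, and compare the observed LRT to the null statistics.
library(lme4)
data(Dyestuff)

fm0 <- lm(Yield ~ 1, Dyestuff)
fm1 <- lmer(Yield ~ 1 + (1 | Batch), Dyestuff, REML = FALSE)
obs <- as.numeric(2 * (logLik(fm1) - logLik(fm0)))

set.seed(42)
null_stats <- replicate(200, {
  y <- simulate(fm0)[[1]]  # draw a response vector under the null
  as.numeric(2 * (logLik(refit(fm1, y)) - logLik(lm(y ~ 1))))
})

p_boot <- mean(null_stats >= obs)  # bootstrap p-value
```

Expect some convergence warnings in the refits, since many simulated datasets put the variance estimate on the boundary (which is exactly why the naive chi-square reference distribution is off).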

21. ## Re: Understanding Multilevel Model

Originally Posted by Jake
In fact I usually don't bother doing formal hypothesis tests on random effects at all.
As weird as it sounds, I do this as well... at some point the hypothesis testing of the random effects is much more of an afterthought, and it's the actual covariance structure that one ends up interpreting.

