Thread: Understanding Multilevel Model

  1. #1
    trinker

    I'm slowly working through multilevel models and have a pretty specific question/thought check.

    Here's some R code:

    Code: 
    library(lme4); data(Dyestuff)
    
    mod <- lm(Yield ~ 1 , Dyestuff)
    resid(mod)

    This is called a null model, correct? That is because you're predicting the outcome using only the grand mean.

    I assume this is correct because I can achieve the same results by:

    Code: 
    Dyestuff$Yield - mean(Dyestuff$Yield)

    So now let's look at a similar model in the lme4 package.

    Code: 
    fm1 <- lmer(Yield ~ 1 + (1|Batch), Dyestuff)
    resid(fm1)

    My take on this is that it is nearly identical to:

    Code: 
    fm2 <- lm(Yield ~ Batch, Dyestuff)
    resid(fm2)

    Is this true? Is the only difference that the lmer function is using REML? So basically this model is predicting based on the group mean, as demonstrated by the code below?

    Code: 
    with(Dyestuff, ave(Yield, Batch, FUN=function(x) x - mean(x)))

    Again, is the only difference between fm1 and fm2 that fm1 is using REML and fm2 is using OLS?

    Can some smarter people just check/critique my thinking?
    "If you torture the data long enough it will eventually confess."
    -Ronald Harry Coase -

  2. #2
    trinker

    I think I'm wrong about the reason that resid(fm1) != resid(fm2). It's not REML but the fact that more variability is being accounted for by the between-group variance. Is this correct?

  3. #3
    Jake

    Originally posted by trinker:
    This is called a null model, correct?
    Well, I mean, some people call it that (notably, Raudenbush & Bryk and Snijders & Bosker). I would just call it a regression with no predictors, or something like this. No need to introduce unnecessary new terminology.

    Originally posted by trinker:
    Is this true? Is the only difference that the lmer function is using REML?
    Close but not exactly. In technical terms, fm1 treats Batches as random effects, while fm2 treats Batches as fixed effects. Empirically you can observe that the two models differ in their predicted/fitted values (i.e., try inspecting fitted(fm1) and fitted(fm2)). fm2's predicted values are simply the mean Yield for each Batch. fm1's predicted values are very similar to the simple Batch means, but the predictions have been "shrunk" to some extent toward the grand mean (i.e., the intercept). These shrunken estimates from fm1 are called BLUPs ("Best Linear Unbiased Predictors").
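
    A minimal sketch of the shrinkage described above, assuming the balanced Dyestuff design (equal observations per batch) and the default REML fit. The names lambda, batch_means, shrunk, and comparison are illustrative and not from the thread. For a balanced one-way layout, each batch mean is pulled toward the grand mean by the factor lambda = var_batch / (var_batch + var_resid / n), which should reproduce the lmer predictions up to numerical tolerance:

```r
library(lme4)
data(Dyestuff)

fm1 <- lmer(Yield ~ 1 + (1 | Batch), Dyestuff)  # Batch as random effect (REML)
fm2 <- lm(Yield ~ Batch, Dyestuff)              # Batch as fixed effect (OLS)

# Raw batch means (what fm2 predicts) and the grand mean
batch_means <- tapply(Dyestuff$Yield, Dyestuff$Batch, mean)
grand_mean  <- mean(Dyestuff$Yield)

# Variance components estimated by lmer
vc           <- as.data.frame(VarCorr(fm1))
sigma2_batch <- vc$vcov[vc$grp == "Batch"]
sigma2_resid <- vc$vcov[vc$grp == "Residual"]

# Observations per batch (assumes a balanced design)
n_per_batch <- nrow(Dyestuff) / nlevels(Dyestuff$Batch)

# Shrinkage factor: the share of a batch mean's variance due to real batch differences
lambda <- sigma2_batch / (sigma2_batch + sigma2_resid / n_per_batch)

# Batch means shrunk toward the grand mean, computed by hand
shrunk <- grand_mean + lambda * (batch_means - grand_mean)

# Same quantity taken from the fitted model: intercept plus predicted batch effect
lmer_pred <- fixef(fm1)[["(Intercept)"]] + ranef(fm1)$Batch[["(Intercept)"]]

comparison <- cbind(lm_batch_mean = batch_means, by_hand = shrunk, lmer = lmer_pred)
comparison
```

    A lambda near 1 means almost no shrinkage. With few observations per batch, or little between-batch variance, lambda moves toward 0 and the predictions collapse toward the grand mean.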
    “In God we trust. All others must bring data.”
    ~W. Edwards Deming

  4. The Following User Says Thank You to Jake For This Useful Post:

    trinker (02-18-2014)

  5. #4
    trinker

    Thanks Jake, I'll run with this thread for a while...

    I'm trying to work through Bates's book (p. 24). In chapter 1 he makes a plot of the estimates with prediction intervals. I don't know how the prediction intervals are calculated (I assume from the standard error; is this true?). If not, how were the prediction intervals calculated?

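    Code: 
    library(lme4)     # lmer(), ranef(), and the Dyestuff data
    library(lattice)  # dotplot()
    library(ggplot2)
    
    # Named fm1ML, but lmer() fits by REML unless REML = FALSE is given
    fm1ML <- lmer(Yield ~ 1|Batch, Dyestuff) 
    dotplot(ranef(fm1ML, condVar = TRUE))
    
    # Batch effects copied from ranef(fm1ML), ordered by size
    dat <- structure(list(Batch = structure(c(3L, 4L, 5L, 2L, 6L, 1L), .Label = c("F", 
        "D", "A", "B", "C", "E"), class = "factor"), X.Intercept. = c(-17.6069081921822, 
        0.391264626493306, 28.5623177339851, -23.0846129630835, 56.7333708414772, 
        -44.9954320466883)), .Names = c("Batch", "X.Intercept."), row.names = c(NA, 
        -6L), class = "data.frame")
    
    # Second entry of the fixed-effects table: the standard error of the intercept
    se <- summary(fm1ML)$coefficients[2]
    
    ggplot(dat, aes(x=X.Intercept., y=Batch)) + geom_point() +
         geom_errorbarh(aes(xmin=X.Intercept.-se, xmax=X.Intercept.+se), height=0, size=1)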

    Here are the two plots (the lattice dotplot and my ggplot2 version):

    [plots not preserved]

    Are the two plots equivalent (i.e., is the prediction interval equal to the coefficient ± SE)?

  6. #5
    noetsi

    Go back to the R board trinker

    A hallmark of Multi Level (ML) as used by the authors cited above is that they use nomenclature others don't, for example "hierarchical regression" rather than ML. I agree with Jake that calling it a null model because it has no predictors makes little sense.
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  7. The Following User Says Thank You to noetsi For This Useful Post:

    trinker (02-18-2014)

  8. #6
    spunky

    Originally posted by trinker:
    (i.e., is the prediction interval equal to the coefficient ± SE)?
    well, almost. from the ggplot code that you posted I can see he's setting the limits of the prediction on the interval [intercept estimate - standard error, intercept estimate + standard error]
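
    A hedged sketch of where the dotplot intervals may come from. As far as I can tell, lme4's dotplot method for ranef objects draws each estimate plus or minus 1.96 conditional standard deviations, using the conditional variances that ranef() stores when condVar = TRUE. The 1.96 multiplier is my reading of that method, so treat it as an assumption. The names cond_sd, intervals, and se_fixed are illustrative. The last lines put the conditional SD next to the fixed-intercept standard error used in the ggplot code above, since these are different quantities:

```r
library(lme4)
data(Dyestuff)

fm1 <- lmer(Yield ~ 1 + (1 | Batch), Dyestuff)

# Predicted batch effects, with conditional variances attached as an attribute
re <- ranef(fm1, condVar = TRUE)$Batch

# "postVar" is a 1 x 1 x (number of batches) array for a single random intercept
cond_sd <- sqrt(attr(re, "postVar")[1, 1, ])

# Assumed interval: estimate +/- 1.96 * conditional SD
intervals <- data.frame(
  Batch    = rownames(re),
  estimate = re[["(Intercept)"]],
  lower    = re[["(Intercept)"]] - 1.96 * cond_sd,
  upper    = re[["(Intercept)"]] + 1.96 * cond_sd
)
intervals

# Standard error of the fixed intercept, the quantity used in the ggplot code
se_fixed <- coef(summary(fm1))["(Intercept)", "Std. Error"]
c(conditional_sd = cond_sd[1], fixed_intercept_se = se_fixed)
```

    Passing lower and upper to geom_errorbarh() in place of the plus or minus se limits would give a ggplot2 version built from the conditional variances rather than the fixed-effect standard error.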
    for all your psychometric needs! https://psychometroscar.wordpress.com/about/

  9. The Following User Says Thank You to spunky For This Useful Post:

    trinker (02-18-2014)

  10. #7
    noetsi

    trinker, I had an excellent course from an expert in HLM. When I get some more time (it is really busy right now with the legislature coming into session), I can pull the materials out (you mentioned you wanted to see them).

  11. The Following User Says Thank You to noetsi For This Useful Post:

    trinker (02-18-2014)

  12. #8
    trinker

    Spunky, I should have been explicit. I made the ggplot code. Bates uses lattice, but in doing so it's not explicit what is actually happening. The ggplot was my guess that made sense.

  13. #9
    TheEcologist

    Just a tiny comment: It's definitely called a NULL model in many fields, and calling it a model without predictors could cause more confusion, depending on the field. Find out what is normal for your field. Jake's spot on when it comes to shrinkage and BLUPs.

    Now, does Bates, in his book, explain how to fairly test between models differing in what is included as random effects? That is what I would like to know.
    The true ideals of great philosophies always seem to get lost somewhere along the road..

  14. #10
    Jake

    At least in the earlier edition of the text (Pinheiro & Bates, 2000), they mention that there are issues arising from the fact that the parameter value under the null hypothesis is on the border of the parameter space, and that technically there is a correction you can apply to the degrees of freedom of the LRT statistic to take account of this. But then they note that the commonly suggested correction (i.e., using an equal mixture of chi-squares with 2 different degrees of freedom) is not really quite right either, as verified by simulation. So their basic conclusion is that it is tricky to do this just right, no easily-implemented method seems to fix the problem exactly, but the simple, naive LRT is not too bad really (although slightly conservative), so they recommend just going with that.
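
    A minimal sketch of the naive LRT and the equal-mixture correction mentioned above, assuming a single variance component is being tested, so the mixture is of chi-squares with 0 and 1 degrees of freedom. Both models are fit by maximum likelihood (REML = FALSE) so their log-likelihoods are comparable. The names fm_ml, fm0, lrt, p_naive, and p_mixture are illustrative:

```r
library(lme4)
data(Dyestuff)

# Model with the Batch variance component, fit by ML rather than REML
fm_ml <- lmer(Yield ~ 1 + (1 | Batch), Dyestuff, REML = FALSE)

# Model without it: the Batch variance is fixed at zero
fm0 <- lm(Yield ~ 1, Dyestuff)

# Likelihood-ratio statistic
lrt <- 2 * (as.numeric(logLik(fm_ml)) - as.numeric(logLik(fm0)))

# Naive p-value: chi-square with 1 df, ignoring the boundary problem
p_naive <- pchisq(lrt, df = 1, lower.tail = FALSE)

# 50:50 mixture of chi-square(0) and chi-square(1); the chi-square(0) part
# is a point mass at zero, so for lrt > 0 this is half the naive p-value
p_mixture <- 0.5 * pchisq(lrt, df = 1, lower.tail = FALSE)

c(LRT = lrt, naive = p_naive, mixture = p_mixture)
```

    The naive p-value is always the larger of the two, which is the sense in which the naive test is conservative.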

  15. The Following 2 Users Say Thank You to Jake For This Useful Post:

    TheEcologist (02-18-2014), trinker (02-20-2014)

  16. #11
    TheEcologist

    Originally posted by Jake:
    At least in the earlier edition of the text (Pinheiro & Bates, 2000), they mention that there are issues arising from the fact that the parameter value under the null hypothesis is on the border of the parameter space, and that technically there is a correction you can apply to the degrees of freedom of the LRT statistic to take account of this. But then they note that the commonly suggested correction (i.e., using an equal mixture of chi-squares with 2 different degrees of freedom) is not really quite right either, as verified by simulation. So their basic conclusion is that it is tricky to do this just right, no easily-implemented method seems to fix the problem exactly, but the simple, naive LRT is not too bad really (although slightly conservative), so they recommend just going with that.
    Great! I'll look into Pinheiro and Bates (2000); I always thought they were the best resource for this. But this is very useful, thanks Jake.

  17. #12
    Jake

    If you have a copy of P&B they discuss this on pp. 83-87.

  18. The Following User Says Thank You to Jake For This Useful Post:

    TheEcologist (02-18-2014)

  19. #13
    spunky

    Originally posted by Jake:
    At least in the earlier edition of the text (Pinheiro & Bates, 2000), they mention that there are issues arising from the fact that the parameter value under the null hypothesis is on the border of the parameter space, and that technically there is a correction you can apply to the degrees of freedom of the LRT statistic to take account of this. But then they note that the commonly suggested correction (i.e., using an equal mixture of chi-squares with 2 different degrees of freedom) is not really quite right either, as verified by simulation. So their basic conclusion is that it is tricky to do this just right, no easily-implemented method seems to fix the problem exactly, but the simple, naive LRT is not too bad really (although slightly conservative), so they recommend just going with that.
    there is some bootstrapped likelihood-ratio test that some random spunky shared with you at some point to address this issue...

    #justsayin
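
    A rough sketch of one way a bootstrapped likelihood-ratio test could look. This is a generic parametric bootstrap, not necessarily the approach spunky shared. It simulates responses from the model without the Batch variance, recomputes the LRT on each simulated data set, and compares the observed statistic with that simulated null distribution. The helper lrt_stat and the choice of 200 replicates are illustrative assumptions:

```r
library(lme4)
data(Dyestuff)

# LRT statistic for the Batch variance component, given a response vector y
lrt_stat <- function(y, data) {
  data$y <- y
  full <- suppressMessages(lmer(y ~ 1 + (1 | Batch), data, REML = FALSE))
  null <- lm(y ~ 1, data)
  # Truncate at zero to guard against tiny negative values from the optimizer
  max(0, 2 * (as.numeric(logLik(full)) - as.numeric(logLik(null))))
}

observed <- lrt_stat(Dyestuff$Yield, Dyestuff)

# Simulate responses under the null model (no Batch variance)
set.seed(1)
n_boot   <- 200
null_fit <- lm(Yield ~ 1, Dyestuff)
sims     <- simulate(null_fit, nsim = n_boot)

# One LRT statistic per simulated response (each column of sims)
boot_stats <- vapply(sims, lrt_stat, numeric(1), data = Dyestuff)

# Bootstrap p-value, with the usual add-one adjustment
p_boot <- (1 + sum(boot_stats >= observed)) / (n_boot + 1)
p_boot
```

    More replicates give a finer-grained p-value at the cost of run time. Many of the simulated fits will sit on the boundary (zero Batch variance), which is the situation the chi-square approximation handles poorly.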

  20. #14
    Jake

    Yes, if it was really important that I get a reliable answer to this question in my own research (e.g., the question was of substantive theoretical interest), then I would use something like the bootstrap-based approach that you outlined. But most of the time, in the process of everyday data analysis and model comparison using mixed models, it's really not important that I have a reliable p-value for testing random effects. In fact I usually don't bother doing formal hypothesis tests on random effects at all. Usually I let my choice of the random effects structure be dictated by the design of the study and computational/convergence issues.

  21. #15
    spunky

    Originally posted by Jake:
    In fact I usually don't bother doing formal hypothesis tests on random effects at all.
    as weird as it sounds, i do this as well... at some point the hypothesis testing of the random effects is much more of an afterthought and it's the actual covariance structure that one ends up interpreting.
