# Weaknesses in the assumptions and/or misuses of HLM/multilevel modeling

#### trinker

##### ggplot2orBust
Hopefully the title says it.

What do people see as potential/perceived weaknesses in the assumptions and/or misuses of HLM/multilevel modeling?

#### hlsmith

##### Less is more. Stay pure. Stay poor.
I think there is a risk of people using these models simply because of the data layout, when in actuality there may be no meaningful random intercepts or slopes and nobody examined that sufficiently.
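One way to examine this, as a rough sketch only: fit the same fixed effects with and without a random intercept and compare the two fits. The data below are simulated, and the names `d`, `id1`, `predictor` and `outcome` are made up for illustration. The comparison uses `gls()` and `lme()` from nlme, both fitted by REML.

```r
library(nlme)

set.seed(1)
n_groups <- 30   # hypothetical number of groups
n_per    <- 10   # hypothetical observations per group

d <- data.frame(
  id1       = factor(rep(seq_len(n_groups), each = n_per)),
  predictor = rnorm(n_groups * n_per)
)

# Simulated outcome WITH a group-level intercept (sd = 1), purely for illustration
u <- rnorm(n_groups, sd = 1)
d$outcome <- 2 + 0.5 * d$predictor + u[as.integer(d$id1)] + rnorm(nrow(d))

# Same fixed effects, no random effects
fit_gls <- gls(outcome ~ predictor, data = d, method = "REML")

# Same fixed effects plus a random intercept per group
fit_lme <- lme(outcome ~ predictor, data = d, random = ~ 1 | id1, method = "REML")

# Likelihood ratio comparison of the two fits
cmp <- anova(fit_gls, fit_lme)
print(cmp)

# Intraclass correlation from the estimated variance components
vc  <- as.numeric(VarCorr(fit_lme)[, "Variance"])
icc <- vc[1] / sum(vc)
print(icc)
```

Because the variance is being tested at the boundary of its parameter space (zero), the reported p-value tends to be conservative, so treat it as a guide rather than a verdict. Random slopes could be examined the same way by comparing `random = ~ 1 | id1` against `random = ~ predictor | id1`.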

I have also heard that if there are few groups at the upper levels, then there is a risk of type II errors.

#### Lazar

##### Phineas Packard
There are ways of correcting for this, as Jake will note, but https://www.iser.essex.ac.uk/research/publications/working-papers/iser/2013-14.pdf is important to read. The other thing I see is that when observations are nested within individuals, researchers often fail to realise that observations one day apart are likely to be more closely related than observations a week apart. In such a case a model like:
Code:
lme(outcome ~ predictor, data = myData, random = ~ 1 | id1)
might not be as appropriate as
Code:
lme(outcome ~ predictor, data = myData, random = ~ 1 | id1, correlation = corAR1(0.2, form = ~ 1 | id1))
in which the autoregressive nature of the data is at least partly accounted for. Both calls need the nlme package loaded, and the 0.2 is only the starting value for the lag-1 correlation, not a fixed value.
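To make the comparison concrete, here is a minimal sketch on simulated data. Everything about the data is an assumption for illustration: 40 people, 12 time points each, and a true within-person autocorrelation of 0.6. It also adds a hypothetical `time` column and uses `form = ~ time | id1`, so the ordering of observations is explicit rather than taken from row order.

```r
library(nlme)

set.seed(42)
n_id   <- 40   # hypothetical number of individuals
n_time <- 12   # hypothetical number of occasions per individual

# time varies fastest, so rows are ordered by id1 and then by time
d <- expand.grid(time = seq_len(n_time), id1 = seq_len(n_id))
d$id1 <- factor(d$id1)
d$predictor <- rnorm(nrow(d))

# Person-level intercepts plus AR(1) errors within each person
u <- rnorm(n_id, sd = 1)
e <- unlist(lapply(seq_len(n_id), function(i) {
  as.numeric(arima.sim(model = list(ar = 0.6), n = n_time))
}))
d$outcome <- 1 + 0.5 * d$predictor + u[as.integer(d$id1)] + e

# Random intercept only: residuals within a person are assumed independent
fit_ri <- lme(outcome ~ predictor, data = d, random = ~ 1 | id1)

# Random intercept plus AR(1) residual correlation within each person
fit_ar1 <- lme(outcome ~ predictor, data = d, random = ~ 1 | id1,
               correlation = corAR1(0.2, form = ~ time | id1))

# The models are nested (AR(1) with correlation 0 is the first model)
print(anova(fit_ri, fit_ar1))

# Estimated lag-1 correlation on its natural scale
rho_hat <- coef(fit_ar1$modelStruct$corStruct, unconstrained = FALSE)
print(rho_hat)
```

With real data the two models may fit about equally well, in which case the simpler one is probably fine. `corAR1` also assumes equally spaced occasions; for irregular spacing something like `corCAR1` would be the closer match.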

#### spunky

##### Can't make spagetti
my reply to trinker's chatbox convo (it's a little rantsy so i thought it made more sense to put it here):

@trinker well, if you come to think about it, regardless of which design we use within social sciency land, ALL of the claims we make about our data are somewhat causal. we do it from the moment we write something like Y=a+bX+e. the '=' sign is already implying a hidden directionality in the effect that relates Y to X (i.e. Y is a 'function' of X and not vice-versa). i'm not horribly scared about it because i think we need to make bold (albeit usually incorrect) statements like that so other people who disagree with us can do their own research and prove us wrong, benefiting the overall advancement of science. it's like the studies cognitive psych people have done on how people interpret 95% confidence intervals, where it turns out that EVERYBODY interprets them as if they were 95% CREDIBLE (i.e. Bayesian) intervals. even people who understand what a 95% confidence interval means in theory use a Bayesian interpretation whenever they are in an applied context. technically they are 'wrong', but then again the interpretation of confidence intervals is so convoluted that i wouldn't blame them for jumping into Bayesianism even if they're not aware of it. in the same way, making a causal statement about Y=a+bX+e is technically wrong, but i'm not sure how to talk about such models without doing a lot of linguistic acrobatics à la Wittgenstein so that it kinda sorta almost sounds like a causal statement even though it's not.

i think the issue with linear mixed models is the same issue as with structural equation models, finite mixture models or, in general, anything that's beyond OLS multiple regression: for people who don't understand how it works, it's sufficiently complicated that it makes you feel like it's doing something it's not supposed to do. as you mentioned, the introduction of random effects suddenly feels like you can control for all this unexplained variation that was previously inaccessible to you in OLS regression. in the same way, the use of factors in factor analysis leads you to believe you have somehow managed to access these untouchable platonic entities we call psychological constructs. but i would not call them weaknesses of linear mixed models. this is just people being crazy and asking these models to do things they cannot do. and i guess that's why i wasn't sure what you meant by "weaknesses" of the assumptions. at first i thought "well, maybe he means that now you have to check for the normality of both the random effects and the residuals" or "now you need to make sure you know how to center predictors correctly", etc.

but as far as what you are mentioning now goes, i'd say you're probably better off studying the recent developments on statistical approaches to modeling causality. i always recommend that people read Judea Pearl's book (http://bayes.cs.ucla.edu/BOOK-2K/) because i think it clearly articulates what the state-of-the-art thinking is on causal modeling. it starts off with early 19th century conceptions of causality in science, moves into Fisher's model of randomization and experiments to ascertain causal claims, dwells on the Rubin-Neyman causality model and ends up with his (counterfactual) model of causality. if you can stomach or skip all the sections of the chapters where Pearl roots for his theory and shows how it is superior to everybody else's, it provides a great summary of the different approaches the world of statistics has taken to try and make causal claims from (usually non-experimental) data.
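for the more mundane assumption checks mentioned above (normality of both the random effects and the residuals, and centering predictors), here is a rough sketch of what they can look like in nlme. the data are simulated and every variable name is made up. the centering shown is one common choice (splitting a predictor into a person mean and a within-person deviation), not the only correct one.

```r
library(nlme)

set.seed(7)
n_id  <- 25   # hypothetical number of individuals
n_obs <- 8    # hypothetical observations per individual

d <- data.frame(id1 = factor(rep(seq_len(n_id), each = n_obs)))
# Predictor whose mean differs from person to person
d$predictor <- rnorm(nrow(d), mean = rep(rnorm(n_id), each = n_obs))
d$outcome <- 1 + 0.4 * d$predictor +
  rep(rnorm(n_id), each = n_obs) + rnorm(nrow(d))

# Split the predictor into a between-person part (the person mean)
# and a within-person part (the deviation from that mean)
d$pred_between <- ave(d$predictor, d$id1)
d$pred_within  <- d$predictor - d$pred_between

fit <- lme(outcome ~ pred_within + pred_between, data = d, random = ~ 1 | id1)

# Predicted random intercepts (one per person) and normalized residuals
re  <- ranef(fit)[["(Intercept)"]]
res <- resid(fit, type = "normalized")

# Numeric normality checks at both levels
print(shapiro.test(re))
print(shapiro.test(res))
```

in practice most people would look at `qqnorm(re)` and `qqnorm(res)` rather than lean on a test, since with few groups the test has little power and with many observations it flags trivial departures.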

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Lazar,

Are you referring to a cross-sectional mixed model in which people have multiple contributions, but which is not, per se, a longitudinal mixed model? I have a model like this that I am getting ready to build; I am guessing about 20% of people have multiple contributions. So you are suggesting treating people as the level-2 grouping and not leaving the covariance matrix unstructured, and would this still be appropriate for individuals with only one contribution? In addition, if I have another, higher-level grouping variable, should I call it level 3? Anyone can jump in and answer this. I was getting ready to write my own post, but this example is comparable to my situation.