1. ## Multilevel analysis

This is a thread on the wonderful world of multilevel analysis, which is called many different things by various authors.

This is the type of comment that raises real doubts in my mind about statistical analysis. It comes from a multilevel graduate class.

Observations are pooled (i.e., groups are combined and group membership ignored) to estimate model coefficients. For example, data from students within schools would be analyzed at the student level (ignoring schools) even if school features are of interest.

Between-group variation is ignored. Estimates are OK when variation between groups is negligible (e.g., school means are comparable).

When between-group variation is not negligible, larger groups will dominate the analysis.

The SEs will be invalid as well, I believe.
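A minimal simulation sketch of the standard-error point, assuming a school-level predictor with a true slope of zero, a random school intercept, and made-up values for everything else (50 schools, 40 students each, ICC of 0.2). It fits pooled OLS ignoring schools and compares the naive standard error of the slope with a cluster-robust (sandwich) one. The function names `simulate` and `pooled_slope_ses` are hypothetical, and this is an illustration, not something from the class notes.

```python
import math
import random

def simulate(n_schools=50, n_students=40, icc=0.2, seed=1):
    """Simulate students nested in schools with a school-level predictor x.

    All values are illustrative assumptions. The true slope of x is 0.
    """
    rng = random.Random(seed)
    sd_between = math.sqrt(icc)        # school-level SD (total variance = 1)
    sd_within = math.sqrt(1.0 - icc)   # student-level SD
    rows = []                          # tuples of (school id, x, y)
    for j in range(n_schools):
        x_j = rng.gauss(0, 1)              # school feature
        u_j = rng.gauss(0, sd_between)     # school random intercept
        for _ in range(n_students):
            rows.append((j, x_j, u_j + rng.gauss(0, sd_within)))
    return rows

def pooled_slope_ses(rows):
    """Pooled simple regression of y on x, ignoring schools.

    Returns (slope, naive SE, cluster-robust SE without small-sample correction).
    """
    n = len(rows)
    x_bar = sum(r[1] for r in rows) / n
    y_bar = sum(r[2] for r in rows) / n
    sxx = sum((r[1] - x_bar) ** 2 for r in rows)
    sxy = sum((r[1] - x_bar) * (r[2] - y_bar) for r in rows)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    resid = [r[2] - intercept - slope * r[1] for r in rows]
    # Naive SE: treats all residuals as independent.
    s2 = sum(e * e for e in resid) / (n - 2)
    se_naive = math.sqrt(s2 / sxx)
    # Cluster-robust SE: sum the scores within each school before squaring.
    score = {}
    for (j, x, _), e in zip(rows, resid):
        score[j] = score.get(j, 0.0) + (x - x_bar) * e
    se_cluster = math.sqrt(sum(s * s for s in score.values())) / sxx
    return slope, se_naive, se_cluster

slope, se_naive, se_cluster = pooled_slope_ses(simulate())
print(f"slope={slope:.3f}  naive SE={se_naive:.3f}  cluster SE={se_cluster:.3f}")
```

With between-school variation present, the naive SE should come out noticeably smaller than the cluster-robust one, which is the sense in which the pooled SEs are "invalid". Setting `icc=0` in `simulate` should bring the two close together.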

Given that almost everything is nested inside of something, and between-group variation will not be negligible, this would seem to invalidate much or all of linear and logistic regression. Similar limits come up all the time for other methods.

So if these types of problems are in fact common, how much of the analysis done is actually valid?

2. ## Re: Multilevel analysis

Originally Posted by noetsi
So if these types of problems are in fact common, how much of the analysis done is actually valid?
Well, John Ioannidis has made the case already that more than half of published papers are probably wrong anyway so I'm guessing... less than half is actually valid?

3. ## Re: Multilevel analysis

There is a quote, which I don't know who to attribute to, but it goes: "once you know multi-level modeling, every model becomes (looks like) a multi-level model."

Yeah, your above quote seems a little dubious, but in the author's defense, there are tests to determine whether random effects are needed or not (amount of variability and I^2 tests in meta-analyses). Analysts need to remember, though, that those tests need to be well powered.

4. ## Re: Multilevel analysis

Originally Posted by spunky
Well, John Ioannidis has made the case already that more than half of published papers are probably wrong anyway so I'm guessing... less than half is actually valid?

*heavy sigh*

5. ## Re: Multilevel analysis

Originally Posted by hlsmith
There is a quote, which I don't know who to attribute to, but it goes: "once you know multi-level modeling, every model becomes (looks like) a multi-level model."

Yeah, your above quote seems a little dubious, but in the author's defense, there are tests to determine whether random effects are needed or not (amount of variability and I^2 tests in meta-analyses). Analysts need to remember, though, that those tests need to be well powered.
What I have seen stressed is the ICC (intraclass correlation); there are rules of thumb for how high this number has to be to make ML useful. I don't think power is an issue with this, although I have not seen that addressed. My data involves thousands of cases, effectively whole populations.
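For reference, a minimal sketch of how the ICC can be estimated from a one-way ANOVA decomposition, assuming equally sized groups. The function name `icc_oneway` and the simulated data (50 units, 100 cases each, true ICC of 0.1) are hypothetical and only there to make the example runnable.

```python
import random

def icc_oneway(groups):
    """ANOVA estimate of the intraclass correlation for balanced groups.

    groups: list of lists, one inner list of outcomes per level-2 unit.
    """
    k = len(groups)          # number of level-2 units
    m = len(groups[0])       # cases per unit
    if any(len(g) != m for g in groups):
        raise ValueError("this sketch assumes equally sized groups")
    grand = sum(sum(g) for g in groups) / (k * m)
    means = [sum(g) / m for g in groups]
    # Mean square between units and mean square within units.
    msb = m * sum((mu - grand) ** 2 for mu in means) / (k - 1)
    msw = sum((y - mu) ** 2 for g, mu in zip(groups, means) for y in g) / (k * (m - 1))
    return (msb - msw) / (msb + (m - 1) * msw)

# Illustrative data: unit effect variance 0.1, within-unit variance 0.9.
rng = random.Random(0)
data = []
for _ in range(50):
    u = rng.gauss(0, 0.1 ** 0.5)
    data.append([u + rng.gauss(0, 0.9 ** 0.5) for _ in range(100)])

icc = icc_oneway(data)
design_effect = 1 + (100 - 1) * icc   # variance inflation for cluster size 100
print(f"ICC estimate: {icc:.3f}, design effect: {design_effect:.1f}")
```

The design effect line is why even a small-looking ICC can matter when each unit holds many cases: the inflation grows with cluster size.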

6. ## Re: Multilevel analysis

Yeah, I wasn't referring to your data per se but speaking from my own experience. And yes there are some tests associated with these ICC values.

7. ## Re: Multilevel analysis

Raudenbush and Bryk raise the issue that diagnostics for multilevel analysis are often different from those for regular linear regression. Does anyone know a good source for how you test the assumptions of ML (multilevel)? I don't mean the theory; I mean what you actually do in practice, including ways to address assumptions that are violated. For example, R and B don't address the use of White's SE at all, leaving me unclear whether those work in multilevel models.

They also raise questions, but provide no useful guidelines IMHO, on how many level 2 units you have to have to estimate each slope at level 2 (well slope plus intercept) when the slopes and intercepts have collinearity. Has anyone seen a suggestion on what the minimum number is? In honesty I am not sure what you do if you don't have enough level 2 units to estimate all the parameters that should be random [so require a level 2 analysis].

What makes this worse for me is that I commonly analyze populations not samples and remain uncertain if I should even test the regression assumptions other than linearity....

While I am asking, the authors mention the use of empirical Bayes residuals [we did not use empirical Bayes elements in my class at all]. Are these preferable to OLS residuals for analyzing the failure of assumptions? I know nothing of these at all....

8. ## Re: Multilevel analysis

Side note on something I had to figure out when delving into MLM: the more groups, the better. By groups I mean clusters. I believe it comes into play via degrees of freedom in the modeling. So ideally you want more clusters to have more power, and I am not talking about levels, but groups, AKA clusters, in say level 2. I am sorry that I don't have a great reference for this or any theoretical rationale.
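One common way to see why the number of clusters matters is the effective sample size under clustering. This is a sketch assuming $J$ equally sized clusters of $m$ cases each and an intraclass correlation $\rho$ (symbols introduced here for illustration):

$$
n_{\text{eff}} = \frac{Jm}{1 + (m-1)\rho}, \qquad \lim_{m \to \infty} n_{\text{eff}} = \frac{J}{\rho}
$$

Under these assumptions, adding cases within clusters runs into a ceiling of $J/\rho$, while adding clusters raises the ceiling itself.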

10. ## Re: Multilevel analysis

I am not sure what clusters are. I am going to run individual results inside our units, primarily to determine how well the units perform, since assignment to them is anything but random and the units have a clear impact on personal results. To me that suggests multilevel models, although today I read this from an expert in the field...

If, on the other hand, the statistical inference aims only at the particular set of units j included in the data set at hand, then a fixed effects model is appropriate.
https://pdfs.semanticscholar.org/f78...f3a0533049.pdf

I am not really interested in national effects, just our units, so I wonder if I should even use multilevel analysis in that case. However, the author does suggest this as well:

This is the case, e.g., when one wishes to test the effect of an explanatory variable that is defined at level two, i.e., it is a function of the level-two units only. Then testing this variable has to be based on some way of comparing the variation accounted for by this variable to the total residual variation between level-two units, and it is hard to see how this could be done meaningfully without assuming that the level-two units are a sample from a population.
I have factors that likely work on the units directly and only indirectly on the customers. Since I am not going to model indirect effects as with SEM perhaps I could model this through multilevel analysis?

I have about 50 units, which would be the level 2 in my model.

11. ## Re: Multilevel analysis

Residuals at level one which are unconfounded by the higher-level residuals can be obtained, as remarked by Hilden-Minton [27], as the OLS residuals calculated separately within each level-two cluster. These are just the same as the estimated residuals in the OLS analysis of the fixed effects model, where all level-two (or higher-level, if there are any higher levels) residuals are treated as fixed rather than random.
I am really confused about what this means. Does it mean that you have to run a separate residual analysis for each upper-level group (each J)? I commonly have 50 groups, and running fifty separate residual analyses is not a reasonable thing to do...
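A minimal sketch of one reading of the quoted passage, assuming a single level-one predictor `x`, 50 units, and simulated data (all names and values are hypothetical). It fits OLS separately inside each unit in a loop, stacks the residuals into one vector, and compares them with the residuals from a single fixed-effects OLS that gives every unit its own intercept and slope. Under that reading the residuals are computed per unit, but they can then be examined together in one pooled residual analysis rather than fifty separate ones.

```python
import numpy as np

rng = np.random.default_rng(0)
J, m = 50, 30                                # 50 units, 30 cases each (assumed)
group = np.repeat(np.arange(J), m)
x = rng.normal(size=J * m)
intercepts = rng.normal(0, 1.0, size=J)      # unit-specific intercepts
slopes = 0.5 + rng.normal(0, 0.3, size=J)    # unit-specific slopes
y = intercepts[group] + slopes[group] * x + rng.normal(size=J * m)

# (a) Separate OLS within each unit, residuals collected into one vector.
resid_within = np.empty_like(y)
for j in range(J):
    idx = group == j
    X_j = np.column_stack([np.ones(idx.sum()), x[idx]])
    beta_j, *_ = np.linalg.lstsq(X_j, y[idx], rcond=None)
    resid_within[idx] = y[idx] - X_j @ beta_j

# (b) One fixed-effects OLS with a dummy and an x-slope for every unit.
D = (group[:, None] == np.arange(J)[None, :]).astype(float)  # unit dummies
X_fe = np.hstack([D, D * x[:, None]])
beta_fe, *_ = np.linalg.lstsq(X_fe, y, rcond=None)
resid_fe = y - X_fe @ beta_fe

# The two residual vectors are expected to agree up to rounding error.
print(np.allclose(resid_within, resid_fe))

# One pooled summary instead of fifty analyses: residual SD per unit.
sd_by_unit = np.array([resid_within[group == j].std(ddof=2) for j in range(J)])
print(sd_by_unit.min(), sd_by_unit.max())
```

From here the stacked `resid_within` vector could go into a single Q-Q plot or residual-versus-fitted plot, with per-unit summaries like `sd_by_unit` used to flag units that look unusual.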

12. ## Re: Multilevel analysis

For clarification, what I was calling clusters are your units.

13. ## Re: Multilevel analysis

Originally Posted by hlsmith
For clarification, what I was calling clusters are your units.
I guessed that.

When you run multilevel and check residuals, do you check the residuals separately for each group/cluster? So if you have 10 groups, do you conduct ten residual analyses, one for each group (ignoring that you run the analysis for the first level and then do the same thing for each parameter for each group at level 2 in a 2-level analysis)?

That seems prohibitively time consuming.

14. ## Re: Multilevel analysis

We have units that I believe have an independent and/or moderating effect on our results. Units that provide service, that is. I want to test whether they have an impact and, if so, how and what drives that impact; at the same time I want to determine what drives success in our units generally.

A multilevel approach is how I decided to pursue this. But I have wondered recently if that is the best way to do it.

15. ## Re: Multilevel analysis

Moderating means interaction to me. If you have a potential interaction, you typically add an interaction term. Can you better describe your context?

Thanks.

16. ## Re: Multilevel analysis

I am not sure that is what I mean. What I mean is that factors like the vendor services we provide [nearly all services are provided by outside contractors] impact results. But those services are provided by units who [in practice if not policy] also impact customers. This could be independent of variables like vendor services, or it could interact with them as you suggest.

So our units, which counsel customers and direct them to vendor services, influence results, and factors like services influence results. Possibly they interact and possibly not, but they both influence results [or that is my theory].
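One way to write this setup as a two-level model, purely as a sketch: the symbols are hypothetical labels, not variables from the actual data. Here $y_{ij}$ is the result for customer $i$ in unit $j$, $s_{ij}$ is the vendor services that customer received, and $w_j$ is some factor measured on the unit.

$$
\begin{aligned}
\text{Level 1:}\quad & y_{ij} = \beta_{0j} + \beta_{1j}\, s_{ij} + e_{ij} \\
\text{Level 2:}\quad & \beta_{0j} = \gamma_{00} + \gamma_{01}\, w_j + u_{0j} \\
& \beta_{1j} = \gamma_{10} + \gamma_{11}\, w_j + u_{1j}
\end{aligned}
$$

In this sketch $u_{0j}$ carries a unit's independent effect on results, $\gamma_{01}$ is the effect of a factor that works on the unit directly, and $\gamma_{11}$ together with $u_{1j}$ lets the effect of services differ across units, which is the interaction (moderation) possibility. If the services effect does not vary by unit, the last equation reduces to $\beta_{1j} = \gamma_{10}$.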