Multilevel regression with two clusters

hlsmith

Not a robit
#1
Well I am revisiting a project that is giving me mental fits. I am going to give a comparable example.

Say I have a binary outcome: a picture is interpreted as containing a man or a woman, and I know the truth, so the DV is correct classification (yes or no).

Now I have fifty pictures reviewed by 10 people. So responses to each picture will be correlated, and each reviewer's answers will also be correlated. These two clusterings in the data don't seem nested. In the model I want to control for these clusterings. However, I also want to examine characteristics of these groupings, such as the reviewer's gender and whether the picture was of a man or a woman.

What are the options for setting up this model? When I first ran the model I treated picture as a random effect, but if I examine reviewer gender I feel like I neglect that it is nested in reviewer, since I use long data and don't control for reviewer clusters. Can I just put reviewer ID in the model as a categorical variable, or will the model fail to run since gender is nested in ID and is a linear combination of the ID terms (or deducible from them)? Thanks, and ask for clarification if needed.
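On the question of reviewer ID as a categorical fixed effect alongside reviewer gender, here is a minimal sketch in Python. It assumes dummy (reference) coding and a made-up 5/5 gender split; the `matrix_rank` and `design_row` helpers are hypothetical names for illustration, not part of any package. It only checks whether the gender column adds information once the ID dummies are already in the design matrix.

```python
from fractions import Fraction


def matrix_rank(matrix):
    """Rank via exact Gauss-Jordan elimination (no floating-point tolerance)."""
    rows = [[Fraction(value) for value in row] for row in matrix]
    rank = 0
    for col in range(len(rows[0])):
        # Look for a row at or below the current pivot position with a nonzero entry.
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot here: the column depends on earlier columns
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col] != 0:
                factor = rows[i][col] / rows[rank][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rank])]
        rank += 1
        if rank == len(rows):
            break
    return rank


N_REVIEWERS = 10
# Hypothetical assignment (an assumption): reviewers 0-4 are women, 5-9 are men.
is_female = [1 if r < 5 else 0 for r in range(N_REVIEWERS)]


def design_row(reviewer, include_gender):
    # Intercept plus dummies for reviewers 1..9 (reviewer 0 is the reference level).
    row = [1] + [1 if reviewer == k else 0 for k in range(1, N_REVIEWERS)]
    if include_gender:
        row.append(is_female[reviewer])
    return row


# One row per reviewer is enough: repeating rows across pictures cannot change the rank.
x_id_only = [design_row(r, False) for r in range(N_REVIEWERS)]
x_id_plus_gender = [design_row(r, True) for r in range(N_REVIEWERS)]

rank_id_only = matrix_rank(x_id_only)
rank_id_plus_gender = matrix_rank(x_id_plus_gender)
print(len(x_id_only[0]), rank_id_only)
print(len(x_id_plus_gender[0]), rank_id_plus_gender)
```

Under these assumptions the 11-column matrix still has rank 10, so the gender column is a linear combination of the ID columns. With reviewer ID as a fixed categorical term, the gender coefficient would not be separately identifiable; software will typically drop a term or warn about it rather than give a meaningful estimate.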

@Jake @spunky @GretaGarbo @Dason et al., what are your thoughts?
 

ondansetron

TS Contributor
#2
I'm new to this mixed model game, but would reviewer ID be better treated as a random effect?

In general, outside of least squares estimation, I was under the impression you can still get estimates for high collinearity (even perfect I think), but the estimates are not unique.

Correct me if I am wrong. But I will also learn from posting my thoughts: is it better to just treat reviewer ID as a random effect?
 

hlsmith

Not a robit
#3
Correct, there is a dependency issue among observations, so a breach of independence. It is a toss-up between images and reviewers. People like to say the more clusters the better, and since in this example there are 10 or 50 clusters, I selected the larger. Perhaps I run two models where I switch which is the random effect, compare the variability explained, and select the better of the two. Or just run two models: reviewer random effects when examining reviewer gender, and vice versa.
 

hlsmith

Not a robit
#4
There are always 3-level models, but they seem to have convergence issues and I'm not sure I have the data for it.

Reviewers, images clustered in reviewers, outcome clustered in images.
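Whether a nested three-level structure fits can be checked directly from the long data. The sketch below is a rough illustration in Python, assuming the design from the first post with every one of the 10 reviewers rating every one of the 50 pictures; the covariate assignments and the `is_nested` helper are hypothetical.

```python
from itertools import product

N_PICTURES = 50
N_REVIEWERS = 10

# Hypothetical cluster-level covariates, assigned arbitrarily for illustration.
reviewer_gender = {r: ("F" if r % 2 == 0 else "M") for r in range(N_REVIEWERS)}
picture_sex = {p: ("woman" if p % 2 == 0 else "man") for p in range(N_PICTURES)}

# Long format: one row per (picture, reviewer) rating.
rows = [
    {
        "picture": p,
        "reviewer": r,
        "picture_sex": picture_sex[p],
        "reviewer_gender": reviewer_gender[r],
    }
    for p, r in product(range(N_PICTURES), range(N_REVIEWERS))
]


def is_nested(rows, inner, outer):
    """True if every level of `inner` occurs with exactly one level of `outer`."""
    seen = {}
    for row in rows:
        seen.setdefault(row[inner], set()).add(row[outer])
    return all(len(levels) == 1 for levels in seen.values())


n_rows = len(rows)
picture_in_reviewer = is_nested(rows, "picture", "reviewer")
reviewer_in_picture = is_nested(rows, "reviewer", "picture")
reviewer_in_gender = is_nested(rows, "reviewer", "reviewer_gender")
print(n_rows, picture_in_reviewer, reviewer_in_picture, reviewer_in_gender)
```

With a fully crossed design neither ID is nested in the other, while each reviewer does sit inside a single gender. If each reviewer had instead rated a separate set of pictures, the first check would come back true and the nested three-level layout would apply.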
 

ondansetron

TS Contributor
#5
Maybe I'm missing something, but is there any reason you can't include both as random effects? And after I posted, I did read a stackexchange post saying more is better!
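Written out, including both as random effects would look something like the random-intercept logistic model below. The notation is assumed for illustration and is not from the thread: $y_{ij}$ is 1 if reviewer $j$ classified picture $i$ correctly, $u_i$ is a picture-level intercept, and $v_j$ is a reviewer-level intercept.

```latex
\operatorname{logit}\Pr(y_{ij} = 1)
  = \beta_0
  + \beta_1\,\mathrm{PictureSex}_i
  + \beta_2\,\mathrm{ReviewerGender}_j
  + u_i + v_j,
\qquad
u_i \sim \mathcal{N}(0, \sigma_u^2), \quad
v_j \sim \mathcal{N}(0, \sigma_v^2)
```

Here $\sigma_u^2$ and $\sigma_v^2$ capture the picture and reviewer clustering respectively, while the fixed effects $\beta_1$ and $\beta_2$ carry the cluster-level characteristics of interest.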
 

ondansetron

TS Contributor
#7
The post said if you have more clusters this is better than fewer, generally speaking.

Also, I'm very new to multilevel modeling so if 3 levels means including both as random effects, then yes?
 
#8
I don’t think I understand the setup. But it seems to me to be an example of a “crossed random effect”, like what @Jake used to talk about. So images are a random effect, and reviewers (nested within gender) are also a random effect.

I have heard a rule of thumb that there should be at least 6 levels in a random effects model for it to be meaningful. Otherwise it would just be a few fixed effects.
 

hlsmith

Not a robit
#11
@Jake if I move forward I may need your help ensuring I interpret the estimates correctly. I saw @Lazar posted on doing this in Bayesian modeling; that would be interesting too, but it is currently beyond my skillset.
 

Jake

Cookie Scientist
#12
Like others already mentioned, this is a crossed random effects model, which can easily be fit in most (but not all) stats packages, including lme4 in R, SAS PROC MIXED/GLIMMIX, and others. The syntax is package-specific of course, but usually it's as simple as just adding separate random effect terms for the two crossed random factors.

I've written several papers using and/or describing these. Here's the most recent one, which people seem to like:
https://www.researchgate.net/profil...gns-Analytic-Models-and-Statistical-Power.pdf
It discusses not just crossed random effects, but also nested and partially crossed random effects. And an accompanying R Shiny app (link in the paper) gives R, SAS, and SPSS syntax.
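To make the structure concrete, here is a small simulation sketch of data with crossed random intercepts for pictures and reviewers. It is in Python rather than the R, SAS, or SPSS syntax from the app, and every parameter value and variable name is made up for illustration. It only generates data in long format and does not fit the model.

```python
import math
import random

random.seed(1)

N_PICTURES, N_REVIEWERS = 50, 10

# Hypothetical parameter values on the log-odds scale (assumptions, not estimates).
INTERCEPT = 1.0
BETA_PICTURE_IS_WOMAN = 0.3
BETA_REVIEWER_IS_WOMAN = -0.2
SD_PICTURE = 0.8
SD_REVIEWER = 0.5

# Arbitrary cluster-level covariates: alternate 0/1 across IDs.
picture_is_woman = [p % 2 for p in range(N_PICTURES)]
reviewer_is_woman = [r % 2 for r in range(N_REVIEWERS)]

# One random intercept per picture and one per reviewer (the two crossed factors).
u_picture = [random.gauss(0.0, SD_PICTURE) for _ in range(N_PICTURES)]
v_reviewer = [random.gauss(0.0, SD_REVIEWER) for _ in range(N_REVIEWERS)]

long_data = []
for p in range(N_PICTURES):
    for r in range(N_REVIEWERS):
        # Both random intercepts enter the same linear predictor additively.
        log_odds = (
            INTERCEPT
            + BETA_PICTURE_IS_WOMAN * picture_is_woman[p]
            + BETA_REVIEWER_IS_WOMAN * reviewer_is_woman[r]
            + u_picture[p]
            + v_reviewer[r]
        )
        prob_correct = 1.0 / (1.0 + math.exp(-log_odds))
        correct = 1 if random.random() < prob_correct else 0
        long_data.append(
            {
                "picture": p,
                "reviewer": r,
                "picture_is_woman": picture_is_woman[p],
                "reviewer_is_woman": reviewer_is_woman[r],
                "correct": correct,
            }
        )

overall_accuracy = sum(row["correct"] for row in long_data) / len(long_data)
print(len(long_data), round(overall_accuracy, 3))
```

Each row carries both a picture ID and a reviewer ID, so a fitting routine can be given one random intercept term per factor. In lme4 that would typically be written along the lines of `glmer(correct ~ picture_is_woman + reviewer_is_woman + (1 | picture) + (1 | reviewer), family = binomial)`, with the column names here being the hypothetical ones from the sketch.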
 

hlsmith

Not a robit
#13
@Jake - yes, the code seems simple enough (at least for me in SAS, since I have done nearly comparable analyses). As mentioned, I will probably pick your brain on the articulation of results. Though this is a preliminary analysis for a small study at this point, so there is a chance I may not run one of these models.

THANKS.