# Multilevel Structure & Random Effects

We have the following data structure, and I'd like some feedback on how to correctly specify a multilevel model for our research.

Patients (i), Care-Givers (j), Hospital (k)

Our structure is such that:
Patients (i) are nested within Hospitals (k). Care-Givers (j) can also work at several different Hospitals (k).

Our outcome is a single binary (yes/no) variable and we're interested in including patient level covariates, as well as some hospital level (size, # of beds) and care-giver characteristics (sex, specialty).

Our ultimate goal is to tease apart the variation in the outcome and attribute it to Care-Givers and Hospital.

I believe this is a cross-classified (crossed random effects) structure. Does anyone have any pointers to get started in SAS?

I think you could go the route of PROC GLIMMIX. It seems like you may have 3 levels; I'm not sure how you get that third one into SAS. Two variables listed in strata?

You only have 1 observation per patient, right? So really we can only estimate 2 random effects (hospitals and caregivers), not 3.

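The GLIMMIX syntax might look something like this:
proc glimmix;
/* grouping IDs: j = care-giver, k = hospital */
class j k;
model y = patientCov1 patientCov2 size beds sex specialty / dist = binary solution;
/* random intercept and slopes across hospitals (patient and care-giver covariates) */
random intercept patientCov1 patientCov2 sex specialty / sub=k type=un;
/* random intercept and slopes across care-givers (patient and hospital covariates) */
random intercept patientCov1 patientCov2 size beds / sub=j type=un;
run;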
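
For the goal of splitting the outcome variation between care-givers and hospitals, a simpler starting point might be a model with random intercepts only. The sketch below assumes a dataset named `mydata` (a hypothetical name), an outcome `y` coded 0/1, and the same variable names as above; any categorical covariates (for example `specialty`, if it is not numeric) would also need to go on the CLASS statement.

```sas
/* Sketch only: mydata is a hypothetical dataset name, y is assumed coded 0/1 */
proc glimmix data=mydata;
  class j k;                          /* j = care-giver ID, k = hospital ID */
  model y(event='1') = patientCov1 patientCov2 size beds sex specialty
        / dist=binary link=logit solution;
  random intercept / subject=k;       /* hospital variance component */
  random intercept / subject=j;       /* care-giver variance component */
run;
```

The two RANDOM statements give one variance estimate per grouping factor in the covariance parameter table, which is the form an ICC calculation usually starts from.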

Thanks Jake, that is what I meant.

Thank you for the quick responses! <3

Can you kindly explain the logic behind having the patient level characteristics on the random intercept lines? Also, some of the physician factors (sex) are fixed, so this part is throwing me off too.

That syntax specifies random slopes. The caregiver covariates are fixed for a caregiver, so those can't be random across caregivers, but they can be random across hospitals, because a hospital is observed with multiple caregivers and thus multiple values of the caregiver covariates. Likewise for the hospital covariates, if caregivers are observed with multiple hospitals then we can estimate random slopes across caregivers for the hospital covariates. And the patient-level covariates vary at the observation level so random slopes for those can be estimated across both grouping factors.
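
Written out, the model behind that syntax looks roughly like the following. This is a sketch with one covariate per level standing in for the full lists; the symbols are illustrative and not from the thread: $x_i$ is a patient covariate, $z_j$ a care-giver covariate, $w_k$ a hospital covariate.

```latex
\begin{aligned}
\operatorname{logit}\,\Pr(y_{ijk} = 1)
  &= \beta_0 + \beta_1 x_i + \beta_2 z_j + \beta_3 w_k \\
  &\quad + \underbrace{u_{0k} + u_{1k}\, x_i + u_{2k}\, z_j}_{\text{random across hospitals } (k)} \\
  &\quad + \underbrace{v_{0j} + v_{1j}\, x_i + v_{2j}\, w_k}_{\text{random across care-givers } (j)}, \\
(u_{0k}, u_{1k}, u_{2k})^{\top} &\sim N(\mathbf{0}, \Sigma_u), \qquad
(v_{0j}, v_{1j}, v_{2j})^{\top} \sim N(\mathbf{0}, \Sigma_v).
\end{aligned}
```

There is no $u_{k}\, w_k$ term and no $v_{j}\, z_j$ term, because a covariate that is constant within a group has no slope to vary across that group. The unstructured $\Sigma_u$ and $\Sigma_v$ correspond to `type=un`.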

Edit: I see that I accidentally switched around the "j" and "k" grouping factors in my syntax. Will fix now.

Thanks for the clarification Jake - makes sense! And thanks to everyone else too for such quick responses <3, appreciate it

If I'm interested in calculating the ICC from this model, my guess is it would be something like this:

ICC for Care-giver = variance of care-giver random effect / (variance of care-giver random effect + variance of hospital random effect + level 1 variance in responses (persons)).

Likewise, I can switch the numerator if I want the ICC for hospital. My question is about specifying the level 1 variance in responses. Can I take it to be just pi^2/3? This is what I've seen done; however, after reading some more, it seems that this is only appropriate if you take the latent variable approach to logistic regression. Is that much different from what we've done in GLIMMIX? Can I use the output from GLIMMIX and just substitute in pi^2/3, or do I have a bit of work cut out for me using NLMIXED and/or possibly other tools? Any advice would be helpful!
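
For concreteness, here is the arithmetic I have in mind, as a sketch under the latent variable assumption (logit link, level 1 variance fixed at $\pi^2/3 \approx 3.29$) and for a model with random intercepts only. The variance values 0.50 and 0.30 are made-up numbers for illustration, not output from any fitted model.

```latex
\begin{aligned}
\mathrm{ICC}_{\text{care-giver}}
  &= \frac{\sigma^2_{j}}{\sigma^2_{j} + \sigma^2_{k} + \pi^2/3}
   = \frac{0.50}{0.50 + 0.30 + 3.29} \approx 0.122, \\
\mathrm{ICC}_{\text{hospital}}
  &= \frac{\sigma^2_{k}}{\sigma^2_{j} + \sigma^2_{k} + \pi^2/3}
   = \frac{0.30}{0.50 + 0.30 + 3.29} \approx 0.073.
\end{aligned}
```

Here $\sigma^2_{j}$ and $\sigma^2_{k}$ are the care-giver and hospital intercept variances. With random slopes in the model, the group-level variance depends on the covariate values, so a single ICC of this form would not apply directly.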

