Mixed model analysis for group differences when some, but not all, participants are family members of each other

#1
Hi! I have a question. I'm very new to mixed model analysis and I'm trying to find out how I can use it to solve my problem.

In my research project, I have seen two groups of children (cases and controls) who performed all sorts of psychological tests and filled out questionnaires. Now I want to investigate whether the performance of my cases differs from that of the controls. Normally this would be no problem: just use a t-test, ANOVA, etc. However, several children in my groups are related to each other. I would therefore like to know whether cases differ from controls on my outcome measures while in some way 'correcting for' or 'incorporating' the dependency of the data within and between my groups.

The group of cases consists of 161 children, of whom 43 have a family relationship with another child in the cases group (19 twin pairs, one set of triplets and one 'regular' brother-sister pair).
The group of controls consists of 44 children, of whom 10 have a family relationship with another control child (5 'regular' brother-sister pairs).
Furthermore, there are 5 children in the cases group who are related to a child in the control group (4 brother-sister pairs and one nephew-niece pair).

My supervisor told me to use mixed model analysis to deal with this dependency in the data. I have tried to work out how to do this in my specific case, but I still do not fully understand how.

I did find something about analyzing nested data, and I understand that I could give every child a cluster ID, which I enter in the "subjects" box of the first SPSS dialog. That would give me 142 separate clusters containing one child (all children not related to any other child in either group), 29 clusters containing 2 children (all twins, brother-sister pairs and the nephew-niece pair) and 1 cluster containing 3 children (the triplets). However, I'm not sure whether this is the right approach in my case (it sounds like so many different clusters).

If this is what I should do, should I then enter, for example, math performance as my dependent variable and group (0 or 1, case or control) as a factor? Would finding that group is a significant predictor of math performance then mean that there is a significant difference in math performance between my cases and controls, and would finding that group is not a significant predictor mean that there is no significant difference? (I could possibly also correct for variables such as age at assessment by entering them as covariates.)
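To make the cluster ID idea concrete, here is a minimal sketch in Python of how I imagine the IDs could be assigned from a list of related pairs. It is only an illustration under my own assumptions: the child IDs and pairs are made-up toy data, and `assign_cluster_ids` is a hypothetical helper name, not something from SPSS. Children who are linked directly or indirectly (including a case linked to a control) end up in the same cluster, and unrelated children each get a cluster of their own.

```python
from collections import Counter


def assign_cluster_ids(child_ids, related_pairs):
    """Give every child a family cluster ID; related children share one ID."""
    # Each child starts as the representative of its own cluster.
    parent = {c: c for c in child_ids}

    def find(c):
        # Follow parent links to the cluster representative (with path halving).
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    # Merge the clusters of every related pair.
    for a, b in related_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Number the clusters 1, 2, 3, ... in order of first appearance.
    numbering = {}
    cluster_of = {}
    for c in child_ids:
        root = find(c)
        if root not in numbering:
            numbering[root] = len(numbering) + 1
        cluster_of[c] = numbering[root]
    return cluster_of


# Hypothetical toy data: "c.." are cases, "k.." are controls.
children = ["c01", "c02", "c03", "c04", "c05", "c06", "k01", "k02"]
pairs = [
    ("c01", "c02"),                  # a twin pair
    ("c03", "c04"), ("c04", "c05"),  # triplets, listed as two links
    ("c06", "k01"),                  # a case related to a control
]

clusters = assign_cluster_ids(children, pairs)

# How many clusters there are of each size (cluster size -> number of clusters).
size_counts = Counter(Counter(clusters.values()).values())
```

The resulting cluster ID column is what I would then enter as the "subjects" variable, so that the model can treat children from the same family as correlated.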

Or does this make no sense at all and should I do something completely different?

I would greatly appreciate some help! Links to articles, YouTube videos or books that explain this more clearly are also very welcome.
 