# comparing means when the control group has 2 subsets of data

#### keamano

##### New Member
Hi there
I have a prospective study looking at functionality after an injury; let's say a big toe injury. I have an injured group (N=44) in which each person has injured one big toe (the other foot is "non-injured"), and a control group with healthy toes (N=15). I have a functionality score for each foot, expressed as a continuous variable. I want to compare the mean scores of 1) injured feet vs non-injured feet, 2) injured feet vs control feet, and 3) non-injured feet vs control feet. I was going to run some simple t-tests treating the controls as 30 individual feet, but found out that in the controls, the right and left sides are highly correlated. I've been told I cannot treat the controls as 30 separate feet for this reason. So what tests should I run?

Thanks so much!

#### Disvengeance

##### New Member
You need to consider the unit of analysis, which depends on your experimental design. It sounds to me like the unit of analysis should be a person, not a person's toes, and thus the number of possible injured big toes can be 0, 1, or 2. Then you can perform an analysis of variance followed by a multiple comparisons method, such as Scheffé's or Tukey's, for all pairwise comparisons.

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Do any exogenous variables affect functionality that you would need to control for or did you match your controls to the cases?

Yes, this ends up being a fairly complicated issue, or more so than you originally assumed. Side note: your data need to meet the assumptions of the test (such as normality) for you to use parametric procedures.

Disvengeance was noting that if you make multiple comparisons you need to control for multiplicity (the familywise error rate), typically by adjusting your alpha (significance level).
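To make the alpha adjustment concrete, here is a minimal sketch of a Bonferroni correction for the three comparisons. Bonferroni is only one common way to adjust, chosen here for simplicity; the function name `bonferroni` and the p-values are hypothetical placeholders, not results from the study.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the per-comparison threshold and which p-values fall at or below it."""
    m = len(p_values)
    threshold = alpha / m  # split the overall alpha evenly across m comparisons
    return threshold, [p <= threshold for p in p_values]


# Hypothetical p-values for illustration only (NOT from the study)
p_values = {
    "injured vs non-injured": 0.004,
    "injured vs control": 0.030,
    "non-injured vs control": 0.400,
}

threshold, decisions = bonferroni(list(p_values.values()))
for name, significant in zip(p_values, decisions):
    print(f"{name}: below adjusted alpha {threshold:.4f}? {significant}")
```

With three comparisons, each one is judged against 0.05 / 3 rather than 0.05, so a p-value that would pass on its own may no longer pass.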

I am not sure what approach may be best, since, as noted, you also need to account for the covariance between measurements from the same person. A simple approach may be running two-sample t-tests and also paired t-tests, provided the normality assumption is met. You would need to be transparent in presenting this to your audience that multiple procedures were used. I am unsure whether anyone would have issues with you adjusting the significance level when making multiple comparisons using two different approaches.
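A rough sketch of what that could look like, using only the Python standard library. Two things here are my assumptions rather than anything settled above: each control's left and right scores are averaged into one value per person (one simple way to avoid counting correlated feet twice), and the two-sample test is the Welch variant, which does not assume equal variances. All scores and helper names are made up for illustration.

```python
import math
import statistics as st


def paired_t(x, y):
    """Paired t statistic and degrees of freedom (two measurements per person)."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = st.mean(diffs) / (st.stdev(diffs) / math.sqrt(n))
    return t, n - 1


def welch_t(x, y):
    """Welch two-sample t statistic and approximate degrees of freedom."""
    vx = st.variance(x) / len(x)
    vy = st.variance(y) / len(y)
    t = (st.mean(x) - st.mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Hypothetical functionality scores (NOT real data), one entry per person
injured = [62, 70, 58, 66, 74]        # injured foot of each patient
non_injured = [80, 85, 79, 82, 88]    # other foot of the same patients
control_left = [84, 90, 81, 87]
control_right = [86, 89, 83, 85]

# Assumption: collapse each control to one score so the person is the unit
control_person = [(l + r) / 2 for l, r in zip(control_left, control_right)]

t_paired, df_paired = paired_t(injured, non_injured)    # comparison 1
t_inj, df_inj = welch_t(injured, control_person)        # comparison 2
t_non, df_non = welch_t(non_injured, control_person)    # comparison 3

print(f"injured vs non-injured (paired): t = {t_paired:.2f}, df = {df_paired}")
print(f"injured vs control (Welch):      t = {t_inj:.2f}, df = {df_inj:.1f}")
print(f"non-injured vs control (Welch):  t = {t_non:.2f}, df = {df_non:.1f}")
```

The p-values would then come from a t distribution with the reported degrees of freedom (for example via `scipy.stats.ttest_rel` and `scipy.stats.ttest_ind` with `equal_var=False`, if SciPy is available) and be judged against the adjusted significance level.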