**My question regards the best way of comparing two groups for differences in the levels of several quantitative parameters. Here, for only one of the groups there are 4 repeated measurements at different intervals.**

**The data**

-We have two groups we want to compare: one consists of ~200 patients at 1 timepoint and the other of 50 controls that we have measurements for at ~4 timepoints.

-The measurements of the 200 patients and 50 controls were done in about the same period of time.

-The 50 controls were measured at various times, usually about 3 months apart. However, *for each control, this timing was different.* This means that the first measurement of one control can be closer in time to the second measurement of another control than to that other control's first measurement.

-For all these people we measured several quantitative parameters.

-These quantitative parameters are not normally distributed, but it is not a problem to do something like a rank-based inverse normal transform. I know this type of transform is highly debated and that opinions vary, but that is not the purpose of my question on this forum.

-We would like to be able to correct for age, gender and BMI.
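For reference, the kind of rank-based transform I mean can be sketched as below. This is only a minimal illustration: the function name `inverseRankTransform` is made up, the 0.5 offset is one common choice among several, and the input values are simulated.

```r
# Rank-based inverse normal transform (a sketch; the 0.5 offset is an assumption).
inverseRankTransform = function(x) {
  n = sum(!is.na(x))                                      # number of observed values
  r = rank(x, na.last = "keep", ties.method = "average")  # NAs stay NA
  qnorm((r - 0.5) / n)                                    # map ranks to normal quantiles
}

# Simulated, right-skewed toy values purely for illustration
set.seed(1)
skewed = rexp(100)
transformed = inverseRankTransform(skewed)
```

The transform only keeps the ordering of the values, so any question about effect sizes on the original scale is lost after applying it.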

**The goal**

We want to check whether these quantitative parameters are higher or lower in the patient group than in the control group.

**My “simple” approaches so far**

I have been using an ANCOVA, correcting for age, gender and BMI, in which each of the 50 controls is represented by the median over their timepoints and compared against the single measurement of each of the 200 patients. I use the following bit of R code:

Code:

```
# "cohortNames" is a factor indicating whether an individual is a patient or a control,
# and "quantitativeParam" is our quantitative parameter of interest
formulaSel = as.formula('quantitativeParam ~ cohortNames + age + BMI + gender')
results = lm(formulaSel, data = data)
# ANOVA table for the fitted model (sequential tests, so the term order matters)
anovaRes = anova(results)
```

Code:

```
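# Linear model: same formula, but extract the coefficient table
formulaSel = as.formula('quantitativeParam ~ cohortNames + age + BMI + gender')
results = lm(formulaSel, data = data)
# Estimates, standard errors, t values and p values per term
valuesAll = summary(results)$coefficients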
```
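The median step that comes before the models above could look roughly like this. It is a sketch on made-up toy data: the column name `subjectID` and the data frame `longData` are hypothetical and stand in for however the repeated measurements are actually stored.

```r
# Toy long-format data; all values and the column name subjectID are made up.
longData = data.frame(
  subjectID = c("c1", "c1", "c1", "c1", "c2", "c2", "c2", "p1", "p2"),
  cohortNames = factor(c(rep("control", 7), rep("patient", 2))),
  quantitativeParam = c(1.0, 2.0, 3.0, 10.0, 4.0, 5.0, 6.0, 7.0, 8.0)
)

# One row per individual: the median over however many timepoints that individual has.
# Patients have a single measurement, so their value is unchanged.
perSubject = aggregate(quantitativeParam ~ subjectID + cohortNames,
                       data = longData, FUN = median)
```

Covariates such as age and BMI would still have to be merged back in per individual, and for the controls one would have to decide which timepoint's age and BMI to use.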

**The questions**

(1) Should I:

-take the mean/median over all 4 timepoints?

-use the timepoints separately?

-use a technique that can take into account the fact that we have repeated measurements for just one of the groups?
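As an illustration of the third option, one candidate would be a mixed model with a random intercept per individual. The sketch below uses simulated data and the `nlme` package; the column `subjectID`, the group sizes and all effect sizes are made up, only age is included as a covariate for brevity (BMI and gender would enter the same way), and I am not claiming this is the right model for our data.

```r
library(nlme)  # ships with standard R installations

# Simulated toy data; subjectID and all values are made up for illustration.
set.seed(42)
nPatients = 40
nControls = 20
subjects = data.frame(
  subjectID = c(paste0("p", 1:nPatients), paste0("c", 1:nControls)),
  cohortNames = factor(c(rep("patient", nPatients), rep("control", nControls))),
  age = round(runif(nPatients + nControls, 20, 70)),
  subjectEffect = rnorm(nPatients + nControls),        # stable per-person offset
  nVisits = c(rep(1, nPatients), rep(4, nControls))    # 1 visit vs 4 visits
)

# Expand to one row per measurement
longData = subjects[rep(seq_len(nrow(subjects)), subjects$nVisits), ]
longData$quantitativeParam = 0.5 * (longData$cohortNames == "patient") +
  0.01 * longData$age + longData$subjectEffect +
  rnorm(nrow(longData), sd = 0.5)                      # within-person noise

# Random intercept per individual; patients contribute a single row each
fit = lme(quantitativeParam ~ cohortNames + age,
          random = ~ 1 | subjectID, data = longData)
cohortEffect = summary(fit)$tTable
```

In this setup the within-person variance is estimated from the controls only, since the patients have one measurement each, which is part of why I am unsure whether such a model is appropriate.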

(2) Is what I am doing valid?

(3) What is the best technique to compare these groups?

Thanks in advance for the input!

Cheers,

Rob

PS. I also posted my question on stackexchange last week, but haven't received any feedback, hence my slightly adjusted post here.

https://stats.stackexchange.com/que...th-only-one-of-these-having-repeated-measures