R

I have a dataset consisting of four continuous independent variables (IVs), each with a value between 0 and 1, and four continuous dependent variables (DVs), each with a score between -3 and 3. The IVs are values representing the same biomarker, but in different regions of the brain. The DVs are values representing four separate neuropsychological performance measures. For all variables, higher values are "good", and lower values are "bad". I am interested in seeing whether changes in the IVs can predict changes in the DVs. I also have two continuous covariates which I would like to partial out of the regression.

I have a total of 38 data points for each variable: 19 patients, each assessed before and after treatment. Except for the covariates, all data are normally distributed. A correlation matrix I made in R is included below.

As the matrix shows, the four IVs are highly collinear (though somewhat less so for IV4). This makes sense experimentally, because it simply means the effect is not region-specific.

I would like to do two things with these data: 1) create a composite score of the four IVs (potentially excluding IV4, if appropriate), and 2) regress each of the four DVs on this composite score, with the two covariates partialled out. What is the best approach to this? Principal component analysis? Multivariate regression? I am rather new to multivariate statistics, and just started learning R yesterday, so some detail would be appreciated.
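To make the two steps concrete, here is a rough sketch of what I have in mind, run on simulated data rather than my real dataset. Everything in it is an assumption for illustration: the column names (`IV1` to `IV4`, `DV1`, `cov1`, `cov2`), the simulated values, and the choice of the first principal component as the composite. I am not sure this is the right approach, which is why I am asking.

```r
# Minimal sketch on simulated data; all column names are placeholders,
# not the real dataset.
set.seed(1)
n <- 38
shared <- rnorm(n)  # common signal, so the simulated IVs come out collinear
dat <- data.frame(
  IV1  = shared + rnorm(n, sd = 0.4),
  IV2  = shared + rnorm(n, sd = 0.4),
  IV3  = shared + rnorm(n, sd = 0.4),
  IV4  = shared + rnorm(n, sd = 0.8),
  cov1 = rnorm(n),
  cov2 = rnorm(n)
)
dat$DV1 <- 0.5 * shared + rnorm(n)

# Step 1: composite score = first principal component of the standardized IVs
pca <- prcomp(dat[, c("IV1", "IV2", "IV3", "IV4")], center = TRUE, scale. = TRUE)
print(summary(pca))  # proportion of variance explained by each component
dat$composite <- pca$x[, 1]

# The sign of a principal component is arbitrary; flip it if needed so that
# a higher composite goes with higher IV values.
if (cor(dat$composite, dat$IV1) < 0) dat$composite <- -dat$composite

# Step 2: regress one DV on the composite with both covariates in the model,
# so the composite's coefficient is adjusted for them.
fit <- lm(DV1 ~ composite + cov1 + cov2, data = dat)
print(summary(fit))
```

The same `lm()` call would be repeated for the other three DVs. Note that this sketch treats the 38 rows as independent observations, which may not hold given that each patient contributes two of them.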

Thanks!

Code:

```
     DV1     DV2     DV3     DV4     IV1     IV2     IV3
DV1
DV2  0.60***
DV3  0.30    0.33*
DV4  0.42**  0.25    0.60***
IV1  0.28    0.27    0.47**  0.39*
IV2  0.20    0.20    0.59*** 0.31    0.82***
IV3  0.19    0.27    0.55*** 0.37*   0.85*** 0.92***
IV4  0.23    0.33*   0.33*   0.19    0.60*** 0.71*** 0.69***
```
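As a rough check on how much the four IVs overlap, the IV block of the matrix above can be passed to `eigen()`. This is only a sketch: it assumes the rounded correlations are accurate enough to work from, and `R_iv` and `prop_var` are names made up for the example.

```r
# Correlations among IV1..IV4, copied from the matrix above
R_iv <- matrix(c(
  1.00, 0.82, 0.85, 0.60,
  0.82, 1.00, 0.92, 0.71,
  0.85, 0.92, 1.00, 0.69,
  0.60, 0.71, 0.69, 1.00
), nrow = 4, byrow = TRUE,
dimnames = list(paste0("IV", 1:4), paste0("IV", 1:4)))

# Eigen decomposition of a correlation matrix is equivalent to a PCA
# on the standardized variables.
e <- eigen(R_iv, symmetric = TRUE)
prop_var <- e$values / sum(e$values)  # share of total variance per component
print(round(prop_var, 2))
print(round(e$vectors[, 1], 2))       # weight of each IV on the first component
```

If the first entry of `prop_var` is large and IV4's weight on the first component is similar to the others, that would argue for keeping IV4 in the composite. Dropping the fourth row and column of `R_iv` gives the same check without IV4.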
