# Multivariate regression with paired samples?

#### mnd

##### New Member
Hello,

First time poster here. I have data from two groups of people recorded during two different conditions. I would like to study how the condition modulates the differences between these two populations while also correcting for some nuisance variables (age, sex).

I'm familiar with MATLAB and was previously using glmfit to compare two populations (only one condition) while removing the effects of nuisance variables. I have used this to compare the two groups under each condition separately, and eyeballed those results. Then I did a paired t-test on the data within each group under the two conditions. I feel there should be a single test that captures what I am interested in, but I don't know what that test is (or how to implement it in MATLAB). Any help would be much appreciated.

Thanks,
mnd

#### threestars

##### New Member
Do you know R? You could do the following:

data.condition1$cond2 <- 0 # This creates a dummy variable indicating data from condition2
data.condition2 <- read.csv(file-path)
data.condition2$cond2 <- 1

combined.data <- rbind(data.condition1,data.condition2)

m0 <- lm(dv ~ cond2 + age + gender, data = combined.data)
summary(m0)

The coefficient for cond2 would tell you the difference between your conditions while controlling for age and gender.

#### mnd

##### New Member
Thanks for your response. I'm somewhat familiar with R (used to use it a lot, but not in quite some time). Does this code compare data across the two groups (in addition to the 2 conditions)? Sorry if I was unclear -- there are 2 groups of different people and two test conditions.

Another thought I had after posting which perhaps somebody could comment on is whether I could simply calculate the difference for each subject between the two conditions (these variables should be normally distributed) and then put those differences into my GLM to see the difference across groups taking age and sex into account.
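
The difference-score idea described above could be sketched in R roughly as follows. This is only a minimal illustration on simulated data: the data frame `dat` and the column names `group`, `age`, `sex`, `dv_cond1`, `dv_cond2` and `diff` are hypothetical and do not come from the original post.

```r
# Hypothetical wide-format data: one row per subject (all names are made up).
set.seed(1)
n <- 40
dat <- data.frame(
  group    = factor(rep(c("A", "B"), each = n / 2)),
  age      = round(runif(n, 20, 60)),
  sex      = factor(sample(c("F", "M"), n, replace = TRUE)),
  dv_cond1 = rnorm(n, mean = 10, sd = 2)
)
# Simulated condition effect: +1 in group A, +3 in group B, plus noise.
dat$dv_cond2 <- dat$dv_cond1 + ifelse(dat$group == "B", 3, 1) + rnorm(n, sd = 1)

# Within-subject difference between the two conditions.
dat$diff <- dat$dv_cond2 - dat$dv_cond1

# Regress the difference on group, adjusting for age and sex.
fit_diff <- lm(diff ~ group + age + sex, data = dat)
summary(fit_diff)
```

In this sketch the `groupB` coefficient estimates how much the condition effect differs between the two groups after adjusting for age and sex.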

#### victorxstc

##### Pirate
I think it is possible (actually, I have used this method once in a research project). But it can hide variation within the control or treatment side that may be of interest. For example, age can increase some trait in both the control and treatment groups, but if both increase by a similar amount per year of age, their difference stays almost constant, so the effect of age does not show up. Therefore, I used it as a complementary method alongside other tests on the control and treated groups.
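
The age example can be written out as a small derivation. This is a sketch under the assumption of a simple linear model with the same age slope in both conditions; the symbols are introduced here for illustration only. For subject $i$, with outcomes $y_{i1}$ and $y_{i2}$ under the two conditions:

$$
\begin{aligned}
y_{i1} &= \alpha_1 + \beta\,\text{age}_i + \varepsilon_{i1} \\
y_{i2} &= \alpha_2 + \beta\,\text{age}_i + \varepsilon_{i2} \\
d_i = y_{i2} - y_{i1} &= (\alpha_2 - \alpha_1) + (\varepsilon_{i2} - \varepsilon_{i1})
\end{aligned}
$$

The age term cancels in $d_i$. If the slopes differed between conditions ($\beta_1 \neq \beta_2$), the difference would instead keep a term $(\beta_2 - \beta_1)\,\text{age}_i$, so a regression on the differences can only detect the part of the age effect that changes with condition.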

#### threestars

##### New Member
So you have 2 observations per individual?

#### mnd

##### New Member
I understand what you're saying, but let's say we are not interested in the effect that age (or sex) has on our dependent variable, so we would simply like to "remove" their effects -- would studying the difference still be valid?

#### Felbalazard

##### New Member
Hey!
I was thinking about the same kind of problem, i.e. doing multivariate analysis of paired data, and thought of the same procedure (doing a regression on the differences). I wondered whether there is any literature on the subject. For example, victorxstc, did you publish your use of this method, and if so, can you give me a reference?

#### rogojel

##### TS Contributor
Hi,
I think it would be better to just analyse the data with condition as one of the IVs (in R you would just need to declare it a factor using the as.factor() function, IMO). The interaction between "condition" and the other IVs could be of interest, so I think a simple regression could yield a lot more information than regressing on the difference.
regards
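
A rough sketch of this suggestion in R, on simulated data, might look like the following. The data frame `long` and the columns `id`, `group`, `condition`, `age`, `sex` and `dv` are hypothetical names chosen for illustration. Note that a plain `lm()` treats the two rows per subject as independent observations, so this only illustrates the model formula and is not a full treatment of the pairing.

```r
# Hypothetical subject-level data (all names are made up for illustration).
set.seed(2)
n <- 40
subj <- data.frame(
  id    = seq_len(n),
  group = factor(rep(c("A", "B"), each = n / 2)),
  age   = round(runif(n, 20, 60)),
  sex   = factor(sample(c("F", "M"), n, replace = TRUE))
)

# Long format: one row per subject per condition.
long <- rbind(transform(subj, condition = 1), transform(subj, condition = 2))
long$condition <- as.factor(long$condition)

# Simulated outcome: condition 2 raises dv by 2, but only in group B.
long$dv <- 10 + 0.05 * long$age +
  2 * (long$group == "B" & long$condition == "2") +
  rnorm(nrow(long))

# Condition as a factor IV, with a group-by-condition interaction.
fit_long <- lm(dv ~ group * condition + age + sex, data = long)
summary(fit_long)
```

Here the `groupB:condition2` coefficient plays the role of the "does the condition effect differ between groups" question.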

#### Mai Mehanna

##### New Member
Hi everybody,

I have a question related to this topic:

I want to analyze paired data using regression analysis to adjust for covariates. I used PROC MIXED in SAS and modeled the difference as the Y, with my covariates as the Xs. I have a problem interpreting the output. Basically, I want to know the difference in response adjusted for my covariates. However, the output gives me the parameter estimate for each predictor (for example, the difference in response between males and females for the gender covariate). How can I get the value of the actual difference adjusted for the covariates in the model?

With independent data, I had a "Drug" column in my data to compare the difference in response between the 2 drugs (some patients took drug A and others took drug B). I did multiple linear regression using PROC GLM in SAS, with the difference in response as Y and "Drug" as one of the Xs in my model alongside the other covariates.

Regarding the paired data, I don't have the "Drug" column because every patient took the 2 drugs. I want to compare the difference in response between the 2 drugs in each patient. So, here is my SAS code:

PROC MIXED DATA=PRA.PAIRED_1;
MODEL DIFF_PAIRED = SEX pre_CSBP_BB pre_CSBP_TD / SOLUTION CLPARM;
RUN;
QUIT;

DIFF_PAIRED is the difference in response between the 2 drugs.

The output:

Covariance Parameter Estimates

| Cov Parm | Estimate |
|---|---|
| Residual | 323.59 |

Fit Statistics

| Statistic | Value |
|---|---|
| -2 Res Log Likelihood | 1018.5 |
| AIC (Smaller is Better) | 1020.5 |
| AICC (Smaller is Better) | 1020.5 |
| BIC (Smaller is Better) | 1023.3 |

Solution for Fixed Effects

| Effect | Estimate | Standard Error | DF | t Value | Pr > \|t\| | Alpha | Lower | Upper |
|---|---|---|---|---|---|---|---|---|
| Intercept | -59.6842 | 25.3976 | 115 | -2.35 | 0.0205 | 0.05 | -109.99 | -9.3765 |
| SEX | -7.7764 | 3.2996 | 115 | -2.36 | 0.0201 | 0.05 | -14.3122 | -1.2406 |
| pre_CSBP_BB | -0.8725 | 0.1466 | 115 | -5.95 | <.0001 | 0.05 | -1.1628 | -0.5821 |
| pre_CSBP_TD | 1.3641 | 0.1476 | 115 | 9.24 | <.0001 | 0.05 | 1.0717 | 1.6566 |

Type 3 Tests of Fixed Effects

| Effect | Num DF | Den DF | F Value | Pr > F |
|---|---|---|---|---|
| SEX | 1 | 115 | 5.55 | 0.0201 |
| pre_CSBP_BB | 1 | 115 | 35.42 | <.0001 |
| pre_CSBP_TD | 1 | 115 | 85.38 | <.0001 |

So, how can I get the difference between the 2 drugs adjusted for the covariates in the model, as in the case of unpaired data (where I used the "Drug" parameter estimate from the output)?
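
One possible way to read this output, offered only as a sketch: the fixed-effects estimates above correspond to the fitted equation

$$
\widehat{\text{DIFF\_PAIRED}} = -59.6842 - 7.7764\,\text{SEX} - 0.8725\,\text{pre\_CSBP\_BB} + 1.3641\,\text{pre\_CSBP\_TD}
$$

so the intercept is the predicted difference between the 2 drugs when SEX and both pre-treatment values equal zero, which is probably not a meaningful reference point. If each covariate were replaced by its deviation from the sample mean (an assumption here, since the posted code does not center anything), the slopes would be unchanged and the new intercept would be

$$
\beta_0^{*} = \beta_0 + \beta_1\,\overline{\text{SEX}} + \beta_2\,\overline{\text{pre\_CSBP\_BB}} + \beta_3\,\overline{\text{pre\_CSBP\_TD}}
$$

that is, the predicted difference at the average covariate values. Under that assumption, the intercept would play the role that the "Drug" estimate played in the unpaired analysis.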