Can the PCA scores be used as variables in multiple linear regression?

#1
Hello,

I used PCA to reduce a data set of 10 variables to 2 principal components (together describing 82% of the variance in those 10 variables) and calculated the scores on both PCs for all the samples (n=85).

Can I use these PCs (Scores) (separately or together) as explanatory variables along with a few other explanatory variables (X1, X2, X3, X4) in a multiple linear regression with a response variable 'Y'?

My goal is mainly to know which explanatory variables are more influential rather than deriving an accurate formula.

Thank you,
Warm regards,
 
#2
Can I use these PCs (Scores) (separately or together) as explanatory variables along with a few other explanatory variables (X1, X2, X3, X4) in a multiple linear regression with a response variable 'Y'?
Yes, you can.
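
For anyone who wants to try this, here is a minimal sketch of the setup described in the question, using Python with NumPy only. Everything in it is an assumption made for illustration: the data are synthetic, the names `indices`, `others` and `zscore` are hypothetical, and it fits a plain least-squares regression rather than a mixed model. All predictors and the response are standardized so that the coefficients are on a comparable scale, which is one common way to judge relative influence.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 85  # sample size mentioned in the question

# Synthetic stand-ins (assumed): 10 correlated indices driven by 2 latent
# factors, plus 4 unrelated predictors playing the role of X1..X4.
latent = rng.normal(size=(n, 2))
mixing = rng.normal(size=(2, 10))
indices = latent @ mixing + 0.5 * rng.normal(size=(n, 10))
others = rng.normal(size=(n, 4))

# Hypothetical response, only here so the regression has something to fit.
y = 1.5 * latent[:, 0] + 0.8 * others[:, 1] + rng.normal(size=n)


def zscore(a):
    """Center each column and scale it to unit sample variance."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)


# PCA on the standardized indices via SVD; rows of Vt are the loadings.
Z = zscore(indices)
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
explained = S**2 / np.sum(S**2)  # share of variance per component
scores = Z @ Vt[:2].T            # scores on the first two PCs, shape (n, 2)

# Regress Y on the two PC scores together with X1..X4.
X = np.column_stack([np.ones(n), zscore(scores), zscore(others)])
beta, *_ = np.linalg.lstsq(X, zscore(y), rcond=None)

names = ["intercept", "PC1", "PC2", "X1", "X2", "X3", "X4"]
for name, b in zip(names, beta):
    print(f"{name:>9}: {b:+.3f}")
print(f"variance explained by PC1+PC2: {explained[:2].sum():.2f}")
```

The two score columns are uncorrelated with each other by construction, but they can still be correlated with X1..X4, so the usual collinearity checks apply to the combined model.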
 
#3
Thanks GretaGarbo, I did come across a few studies that have used only PCs as independent variables. I am not sure if you can use a couple of PCs (say X1 and X2) as independent variables along with a few other variables (say X3, X4, X5, X6) as independent variables in the same MLR.
Thank you!
 
#5
Hi,
Why not include the extra variables in the PCA model first?

regards
rogojel
Hi rogojel,

It is going to be a mixed-model analysis, and a few of the variables are categorical.
Among the numeric/random variables, there is a set of 10 variables that are all indices measuring one particular type of characteristic, each capturing a different aspect of it. I need to combine the information from these 10 indices. The other numeric variables measure different characteristics.
Thank you!
 

#6
Hi,
I guess your best approach would still be to include all the variables in the PCA. If the two sets of variables are completely unrelated, the PCA will still preserve this separation; if they are not, you get a better model.

regards
rogojel
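
A rough way to see the separation point is to simulate it. The sketch below is an illustration under assumed conditions, not a general proof: the data are synthetic, the two sets (`set_a` with 6 variables, `set_b` with 4) are hypothetical and driven by independent factors, and the sample size is set larger than the n=85 in this thread so that sampling noise stays small. It runs one PCA on all 10 variables and reports how much of each leading component's squared loadings falls on each set.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500  # assumed; larger than the thread's n=85 to limit sampling noise

# Two unrelated sets of variables, each driven by its own independent factor.
f_a = rng.normal(size=(n, 1))
f_b = rng.normal(size=(n, 1))
set_a = f_a + 0.5 * rng.normal(size=(n, 6))  # 6 variables in set A
set_b = f_b + 0.5 * rng.normal(size=(n, 4))  # 4 variables in set B

# Standardize all 10 variables and run a single PCA via SVD.
X = np.hstack([set_a, set_b])
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
_, _, Vt = np.linalg.svd(Z, full_matrices=False)

# Each row of Vt is a unit-length loading vector, so the squared loadings
# of a component sum to 1 and can be split between the two sets.
share_a = (Vt[:2, :6] ** 2).sum(axis=1)
for k, s in enumerate(share_a, start=1):
    print(f"PC{k}: {s:.2f} of squared loadings on set A, {1 - s:.2f} on set B")
```

The sets are given different sizes on purpose so that the two leading eigenvalues are clearly distinct. When two unrelated sets carry nearly equal variance, the leading components can come out as mixtures of both, so the separation is not guaranteed in every data set.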
 
#7
Hi,
I guess your best approach would still be to include all the variables in the PCA. If the two sets of variables are completely unrelated, the PCA will still preserve this separation; if they are not, you get a better model.

regards
rogojel
Hi rogojel,
Yes, it would have been best to put them all into the PCA. But some of them are categorical, and among the remaining numeric variables (those not currently included in the PCA), some are of a very different character from those included and would make interpretation very difficult or impossible. Hence it would be better to keep them separate.
But I will still try putting them all together and see what happens. The categorical variables would still remain in the model.
Thanks