# Thread: Multiple regression analysis: multicollinearity problem

1. ## Multiple regression analysis: multicollinearity problem

Dear all,

For my research I'm using interaction terms in a multiple regression analysis to test the influence of the IVs on the DV with moderators.
My hypotheses are the following:
1. The relation between X1 and Y strengthens at higher degrees.
2. The relation between X2 and Y strengthens at higher degrees.
3. X2 has a greater influence on Y than X1.

The model for this is:
Y = constant + β1X1 + β2X2 + β3X3 + β4X4 + β5X1X3 + β6X1X4 + β7X2X3 + β8X2X4

X1 and X2 are the IVs and X3 and X4 are the moderators (degrees, with X4 the highest degree), which are dummy variables.
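For concreteness, the design matrix for a model of this shape can be sketched as follows (a minimal sketch with simulated placeholder data; dummy-coding two degree levels against a baseline is an assumption, not something stated in the thread):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 150
x1, x2 = rng.normal(size=(2, n))    # continuous IVs
degree = rng.integers(0, 3, n)      # 0 = baseline, 1 and 2 = the two degree dummies
x3 = (degree == 1).astype(float)    # moderator dummy
x4 = (degree == 2).astype(float)    # moderator dummy, highest degree

# Columns match the stated model:
# Y = const + b1*X1 + b2*X2 + b3*X3 + b4*X4
#     + b5*X1X3 + b6*X1X4 + b7*X2X3 + b8*X2X4
X = np.column_stack([np.ones(n), x1, x2, x3, x4,
                     x1 * x3, x1 * x4, x2 * x3, x2 * x4])
y = rng.normal(size=n)              # placeholder response for the sketch
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(X.shape, beta.shape)          # (150, 9) (9,)
```

Note that because each interaction column is built directly from an IV column, high correlation between them is built in by construction when the IVs have non-zero means.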

However, with this model I get very high multicollinearity between my independent variables. The VIF values were over 10 (some even at 60), which shows it's problematic.
I managed to lower the VIF value to 5.132 by centering the X1 and X2 variables, but then none of the coefficients are significant anymore.
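As an illustration of why centering helps here, this is a minimal sketch (simulated data, numpy only; variable names are placeholders, not the poster's data) that computes VIFs by hand for a continuous IV and its interaction with a dummy, before and after centering:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(5.0, 1.0, n)               # continuous IV with a non-zero mean
x3 = (rng.random(n) < 0.5).astype(float)   # dummy moderator

def vif(X):
    """VIF of each column: 1 / (1 - R^2) from regressing that column
    on the remaining columns plus an intercept."""
    out = []
    for j in range(X.shape[1]):
        y = X[:, j]
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        r2 = 1 - ((y - A @ beta) ** 2).sum() / ((y - y.mean()) ** 2).sum()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

raw = np.column_stack([x1, x3, x1 * x3])
x1c = x1 - x1.mean()
centered = np.column_stack([x1c, x3, x1c * x3])

print(vif(raw))        # the interaction column is nearly collinear with x1
print(vif(centered))   # centering brings the VIFs down sharply
```

The uncentered interaction x1·x3 equals x1 whenever x3 = 1, so with a mean of 5 it tracks x1 closely; after centering, the interaction is roughly uncorrelated with the main effect and the VIFs drop.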

How can I solve this problem?

2. ## Re: Multiple regression analysis: multicollinearity problem

hi,
I would try to find an acceptable model, e.g. by stepwise elimination. The centering is a good idea; I would keep it.

regards

3. ## Re: Multiple regression analysis: multicollinearity problem

Originally Posted by rogojel
hi,
I would try to find an acceptable model, e.g. by stepwise elimination. The centering is a good idea, I would keep it.

regards
Dear all,

By stepwise elimination I found that in Y = constant + β1X1 + β3X3 + β4X4 + β5X1X3 + β6X1X4 + β7X2X3 + β8X2X4, the interaction term is significant (p=0.001 for β7X2X3).
Does this show that both X2 and X3 are significant? Is this correct? The VIFs also dropped below 5.

For Y = constant + β1X1 + β3X3 + β4X4 + β5X1X3 + β6X1X4 + β8X2X4 I found that X2 is significant (p=0.005).

Kind regards

4. ## Re: Multiple regression analysis: multicollinearity problem

Originally Posted by Hypnoz
Dear

By stepwise elimination I found that in Y = constant + β1X1 + β3X3 + β4X4 + β5X1X3 + β6X1X4 + β7X2X3 + β8X2X4, the interaction term is significant (p=0.001 for β7X2X3).
Does this show that both X2 and X3 are significant? Is this correct? The VIFs also dropped below 5.
No, it only means that the interaction could be significant. If you want to have a hierarchical model, you are advised to KEEP both X2 and X3 in the model but this does not mean they are significant.

Originally Posted by Hypnoz
For Y = constant + β1X1 + β3X3 + β4X4 + β5X1X3 + β6X1X4 + β8X2X4 I found that X2 is significant (p=0.005).

Kind regards
X2 must be a typo, right? It is not in the model.

How did you end up with two different models? Or am I missing something? Stepwise elimination means that you eliminate, one by one, all terms that are not significant, interactions first and highest p-value first. What do you get if you follow this through?

regards
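The elimination loop described above can be sketched like this (simulated data; a simplified version that always drops the highest-p term, without the interactions-first hierarchy refinement):

```python
import numpy as np
from scipy import stats

def ols_pvalues(X, y):
    """OLS via the normal equations; returns coefficients and two-sided p-values."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    s2 = resid @ resid / (n - k)
    se = np.sqrt(np.diag(XtX_inv) * s2)
    p = 2 * stats.t.sf(np.abs(beta / se), df=n - k)
    return beta, p

def backward_eliminate(X, names, y, alpha=0.05):
    """Drop the least significant term (highest p) one at a time,
    keeping the intercept, until all remaining terms are significant."""
    X = X.copy()
    names = list(names)
    while True:
        _, p = ols_pvalues(X, y)
        candidates = [(p[j], j) for j in range(len(names)) if names[j] != "const"]
        if not candidates:
            return names
        worst_p, worst_j = max(candidates)
        if worst_p <= alpha:
            return names
        X = np.delete(X, worst_j, axis=1)
        del names[worst_j]

rng = np.random.default_rng(1)
n = 300
x1, x2 = rng.normal(size=(2, n))
y = 1.0 + 2.0 * x2 + rng.normal(size=n)   # only x2 truly matters here
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
kept = backward_eliminate(X, ["const", "x1", "x2", "x1:x2"], y)
print(kept)
```

A fuller version would also enforce hierarchy, i.e. never drop a main effect while one of its interactions is still in the model.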

5. ## Re: Multiple regression analysis: multicollinearity problem

It is extremely difficult to tell whether X1 has a greater impact than X2 on Y. You can show the slope change, but that is not impact per se (the scale of X is a significant problem, but hardly the only one). The closest I have ever come for linear regression is to generate standardized slopes (in terms of standard deviations), but many argue this won't work if your variables are dummy variables.

I have been told that regression is not really set up with relative impact in mind, which seems really strange to me. It's probably the most important question most non-academic analyses are interested in.
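The standardized-slopes idea mentioned above can be sketched as follows (simulated data, an editor's illustration rather than the poster's own method; it deliberately ignores the dummy-variable caveat by using two continuous predictors):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
x1 = rng.normal(0, 2.0, n)   # wide-scale predictor, true raw slope 1.0
x2 = rng.normal(0, 0.5, n)   # narrow-scale predictor, true raw slope 3.0
y = 1.0 + 1.0 * x1 + 3.0 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

# Standardized slope: beta_j * sd(x_j) / sd(y), i.e. the expected change in y
# (in SDs of y) per one-SD change in x_j.  Contested for dummy variables,
# whose "standard deviation" is an artifact of the group split.
std_b1 = b[1] * x1.std() / y.std()
std_b2 = b[2] * x2.std() / y.std()

print(b[1] < b[2])        # raw slopes: x2 looks bigger
print(std_b1 > std_b2)    # standardized: x1 actually moves y more per SD
```

This shows why the question of "biggest influence" is slippery: the raw slopes and the standardized slopes can rank the predictors in opposite orders, purely because of the predictors' scales.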

6. ## Re: Multiple regression analysis: multicollinearity problem

Originally Posted by rogojel
X2 must be a typo, right? It is not in the model.

How did you end up with two different models? Or am I missing something? Stepwise elimination means that you eliminate, one by one, all terms that are not significant, interactions first and highest p-value first. What do you get if you follow this through?

regards
Sorry for the post before. I had made a mistake, as you noticed...
However, using stepwise elimination as you described, I came to the following model: Y = constant + β1X1 + β2X2 + β3X3 + β4X4, where X2 is significant (p=0.004).
If there is even one interaction in the model, none of the variables are significant anymore...

7. ## Re: Multiple regression analysis: multicollinearity problem

Originally Posted by noetsi
It is extremely difficult to tell whether X1 has a greater impact than X2 on Y. You can show the slope change, but that is not impact per se (the scale of X is a significant problem, but hardly the only one). The closest I have ever come for linear regression is to generate standardized slopes (in terms of standard deviations), but many argue this won't work if your variables are dummy variables.

I have been told that regression is not really set up with relative impact in mind, which seems really strange to me. It's probably the most important question most non-academic analyses are interested in.
How would you do it?

Kind regards

8. ## Re: Multiple regression analysis: multicollinearity problem

Originally Posted by Hypnoz
However, using stepwise elimination as you described, I came to the following model: Y = constant + β1X1 + β2X2 + β3X3 + β4X4, where X2 is significant (p=0.004).
If there is even one interaction in the model, none of the variables are significant anymore...
How about X1, X3, and X4? Why don't you eliminate them if they are not significant?

9. ## Re: Multiple regression analysis: multicollinearity problem

Originally Posted by rogojel
How about X1, X3, and X4? Why don't you eliminate them if they are not significant?
Then I would have the following model: Y = constant + β2X2, where X2 is significant (p=0.007).
However, I need to see which X has the biggest influence on Y, so I think X1 and X2 both need to be in the model.

Besides this, is it possible to test the first and second hypotheses with correlations? And if so, how can I add the degrees to that?

kind regards

10. ## Re: Multiple regression analysis: multicollinearity problem

Did you eliminate the others one at a time?

Not significant basically means that the data are compatible with the hypothesis that the influence is exactly zero, so by eliminating non-significant terms you also eliminate factors that have very small or no influence.

regards

11. ## Re: Multiple regression analysis: multicollinearity problem

Originally Posted by rogojel
Did you eliminate the others one at a time?

Not significant basically means that the data are compatible with the hypothesis that the influence is exactly zero, so by eliminating non-significant terms you also eliminate factors that have very small or no influence.

regards
Yes, I did eliminate the others one at a time. So the data I'm using are compatible with the hypothesis that the influence is exactly zero?

Kind regards

12. ## Re: Multiple regression analysis: multicollinearity problem

If the term was not significant, yes.

13. ## Re: Multiple regression analysis: multicollinearity problem

Hello!

I have a question regarding multiple regression analysis: for comparing numerical variables I should use linear regression, and for comparing dichotomous variables I should use logistic regression. But if I want to compare numerical with dichotomous variables, what kind of test should I use? Thank you!
