
Thread: Multiple regression analysis: multicollinearity problem

  1. #1
    Hypnoz

    Multiple regression analysis: multicollinearity problem




    Dear

    For my research I'm using interactions in a multiple regression analysis to test the influence of the IVs on the DV with moderators.
    My hypotheses are the following:
    1. The relation between X1 and Y increases with higher degrees.
    2. The relation between X2 and Y increases with higher degrees.
    3. X2 has a greater influence on Y than X1.

    The model for this is:
    Y = constant + β1X1 + β2X2 + β3X3 + β4X4 + β5X1X3 + β6X1X4 + β7X2X3 + β8X2X4

    X1 and X2 are the IV and X3 and X4 are the moderators (degrees, X4 being the highest degree), which are dummy variables.

    However, with this model I have very high multicollinearity between my independent variables. The VIF values were over 10 (some even at 60), which shows it's problematic.
    I managed to lower the highest VIF to 5.132 by centering the X1 and X2 variables, but then none of the terms are significant anymore.

    How can I solve this problem?
    Last edited by Hypnoz; 03-12-2017 at 12:23 PM.
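    The centering-and-VIF check described above can be sketched in Python with statsmodels. This is a minimal illustration on synthetic data (all variable names, sizes, and distributions are made up, not the poster's data); it shows why uncentered interactions of a continuous IV with dummies produce huge VIFs, and why centering the continuous IVs brings them down.

    ```python
    # Synthetic illustration of the centering/VIF step; names are made up.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    rng = np.random.default_rng(0)
    n = 200
    df = pd.DataFrame({
        "X1": rng.normal(5.0, 1.0, n),   # continuous IV with mean far from 0
        "X2": rng.normal(5.0, 1.0, n),   # continuous IV
        "X3": rng.integers(0, 2, n),     # dummy moderator
        "X4": rng.integers(0, 2, n),     # dummy moderator
    })

    def vifs(d):
        """VIF for each column of d, fitting with an intercept."""
        X = sm.add_constant(d.astype(float))
        return pd.Series(
            [variance_inflation_factor(X.values, i) for i in range(1, X.shape[1])],
            index=d.columns,
        )

    def with_interactions(d):
        return d.assign(X1X3=d.X1 * d.X3, X1X4=d.X1 * d.X4,
                        X2X3=d.X2 * d.X3, X2X4=d.X2 * d.X4)

    raw = with_interactions(df)  # uncentered: X1X3 is nearly 5*X3, etc.
    centered = with_interactions(df.assign(X1=df.X1 - df.X1.mean(),
                                           X2=df.X2 - df.X2.mean()))

    print(vifs(raw))       # interaction and dummy VIFs are inflated
    print(vifs(centered))  # centering X1, X2 shrinks those VIFs sharply
    ```

    Note that centering mainly deflates the VIFs of the interaction terms and the dummies; it changes the interpretation of the main effects (slope at the mean of X) but not the model's fit.
    
    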

  2. #2
    TS Contributor
    rogojel

    Re: Multiple regression analysis: multicollinearity problem

    hi,
    I would try to find an acceptable model, e.g. by stepwise elimination. The centering is a good idea; I would keep it.

    regards

  3. #3
    Hypnoz

    Re: Multiple regression analysis: multicollinearity problem

    Quote Originally Posted by rogojel View Post
    hi,
    I would try to find an acceptable model, e.g. by stepwise elimination. The centering is a good idea, I would keep it.

    regards
    Dear rogojel,

    By stepwise elimination I arrived at Y = constant + β1X1 + β3X3 + β4X4 + β5X1X3 + β6X1X4 + β7X2X3 + β8X2X4, with p = 0.001 for the X2X3 term.
    Does this show both X2 and X3 are significant? Is this correct? Also, the VIFs dropped below 5.

    For Y = constant + β1X1 + β3X3 + β4X4 + β5X1X3 + β6X1X4 + β8X2X4 I found that X2 is significant (p=0.005).

    What are your thoughts about this?

    Kind regards

  4. #4
    TS Contributor
    rogojel

    Re: Multiple regression analysis: multicollinearity problem

    Quote Originally Posted by Hypnoz View Post
    Dear rogojel,

    By stepwise elimination I arrived at Y = constant + β1X1 + β3X3 + β4X4 + β5X1X3 + β6X1X4 + β7X2X3 + β8X2X4, with p = 0.001 for the X2X3 term.
    Does this show both X2 and X3 are significant? Is this correct? Also, the VIFs dropped below 5.
    No, it only means that the interaction could be significant. If you want to have a hierarchical model, you are advised to KEEP both X2 and X3 in the model, but this does not mean they are significant.

    Quote Originally Posted by Hypnoz View Post
    For Y = constant + β1X1 + β3X3 + β4X4 + β5X1X3 + β6X1X4 + β8X2X4 I found that X2 is significant (p=0.005).

    What are your thoughts about this?

    Kind regards
    X2 must be a typo, right? It is not in the model.

    How did you end up with two different models? Or am I missing something? Stepwise elimination means that you eliminate, one by one, all terms that are not significant: interactions first, highest p-value first. What do you get if you follow this through?

    regards
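    The backward elimination rogojel describes (refit, drop the non-significant term with the highest p-value, interactions before main effects, repeat) can be sketched like this. The data, the response, and the 0.05 threshold are all illustrative assumptions; in this toy example only X2 has a real effect on Y.

    ```python
    # Sketch of backward stepwise elimination on synthetic data.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(1)
    n = 200
    df = pd.DataFrame({
        "X1": rng.normal(size=n), "X2": rng.normal(size=n),
        "X3": rng.integers(0, 2, n), "X4": rng.integers(0, 2, n),
    })
    df["Y"] = 2 * df["X2"] + rng.normal(size=n)  # only X2 truly matters

    terms = ["X1", "X2", "X3", "X4", "X1:X3", "X1:X4", "X2:X3", "X2:X4"]
    alpha = 0.05
    while True:
        fit = smf.ols("Y ~ " + " + ".join(terms), data=df).fit()
        # Hierarchy: interactions are candidates for removal before main effects.
        candidates = [t for t in terms if ":" in t] or terms
        worst = max(candidates, key=lambda t: fit.pvalues[t])
        if fit.pvalues[worst] <= alpha:
            break  # everything still eligible for removal is significant
        terms.remove(worst)

    print("Retained terms:", terms)
    ```

    Run on this toy data, the loop strips the noise terms and keeps X2, mirroring what the thread arrives at. (Stepwise selection is convenient but data-dependent; the p-values of the final model are optimistic because of the selection.)
    
    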

  5. #5
    Fortran must die
    noetsi

    Re: Multiple regression analysis: multicollinearity problem

    It is extremely difficult to tell whether X1 has a greater impact than X2 on Y. You can show the slope change, but that is not impact per se (the scale of X is a significant problem, but hardly the only one). The closest I have ever come for linear regression is to generate standardized slopes (in terms of standard deviations), but many argue this won't work if your variables are dummy variables.

    I have been told that regression is not really set up with relative impact in mind, which seems really strange to me. It's probably the most important question most non-academic analyses are interested in.
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995
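    The standardized slopes noetsi mentions can be obtained by z-scoring Y and the predictors before fitting; the slopes are then in standard-deviation units and comparable across predictors. A sketch on made-up data (the variables and effect sizes are invented; as noted above, this is contentious when the predictors are dummies):

    ```python
    # Raw vs standardized slopes on synthetic data; X2 is on a larger scale.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    n = 500
    X1 = rng.normal(0.0, 1.0, n)
    X2 = rng.normal(0.0, 10.0, n)
    Y = 1.0 * X1 + 0.2 * X2 + rng.normal(size=n)

    # Raw slopes suggest X1 matters more (1.0 vs 0.2) -- an artifact of scale.
    raw = sm.OLS(Y, sm.add_constant(np.column_stack([X1, X2]))).fit()

    # After z-scoring, slopes are per-sd effects and X2's is the larger one.
    z = lambda v: (v - v.mean()) / v.std()
    std = sm.OLS(z(Y), sm.add_constant(np.column_stack([z(X1), z(X2)]))).fit()

    print("raw slopes:        ", raw.params[1:])
    print("standardized slopes:", std.params[1:])
    ```

    Here a one-sd change in X2 moves Y by roughly twice as many sds as a one-sd change in X1, even though X2's raw slope is the smaller one.
    
    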

  6. #6
    Hypnoz

    Re: Multiple regression analysis: multicollinearity problem

    Quote Originally Posted by rogojel
    X2 must be a typo, right? It is not in the model.

    How did you end up with two different models? Or am I missing something? Stepwise elimination means that you eliminate, one by one, all terms that are not significant: interactions first, highest p-value first. What do you get if you follow this through?

    regards
    Sorry for the earlier post. I had made a mistake, as you noticed...
    However, using stepwise elimination as you described, I arrived at the following model: Y = constant + β1X1 + β2X2 + β3X3 + β4X4, where X2 is significant (p = 0.004).
    If even one interaction is included in the model, none of the variables is significant anymore...

  7. #7
    Hypnoz

    Re: Multiple regression analysis: multicollinearity problem

    Quote Originally Posted by noetsi View Post
    It is extremely difficult to tell whether X1 has a greater impact than X2 on Y. You can show the slope change, but that is not impact per se (the scale of X is a significant problem, but hardly the only one). The closest I have ever come for linear regression is to generate standardized slopes (in terms of standard deviations), but many argue this won't work if your variables are dummy variables.

    I have been told that regression is not really set up with relative impact in mind, which seems really strange to me. It's probably the most important question most non-academic analyses are interested in.
    How would you do it?

    Kind regards

  8. #8
    TS Contributor
    rogojel

    Re: Multiple regression analysis: multicollinearity problem

    Quote Originally Posted by Hypnoz View Post
    However, using stepwise elimination as you described, I arrived at the following model: Y = constant + β1X1 + β2X2 + β3X3 + β4X4, where X2 is significant (p = 0.004).
    If even one interaction is included in the model, none of the variables is significant anymore...
    How about X1, X3, X4? Why don't you eliminate them if they are not significant?

  9. #9
    Hypnoz

    Re: Multiple regression analysis: multicollinearity problem

    Quote Originally Posted by rogojel View Post
    How about X1, X3, X4? Why don't you eliminate them if they are not significant?
    Then I would end up with the following model: Y = constant + β2X2, where X2 is significant (p = 0.007).
    However, I need to see which X has the biggest influence on Y, so I think X1 and X2 need to stay in the model.

    Besides this, is it possible to test the first and second hypotheses with correlations? And if so, how can I add the degrees to that?

    Kind regards

  10. #10
    TS Contributor
    rogojel

    Re: Multiple regression analysis: multicollinearity problem

    Did you eliminate the others one at a time?

    Not significant basically means that the data is compatible with the hypothesis that the influence is exactly zero, so by eliminating non-significant terms you also eliminate factors that have very small or no influence.

    regards

  11. #11
    Hypnoz

    Re: Multiple regression analysis: multicollinearity problem

    Quote Originally Posted by rogojel View Post
    Did you eliminate the others one at a time?

    Not significant basically means that the data is compatible with the hypothesis that the influence is exactly zero, so by eliminating non-significant terms you also eliminate factors that have very small or no influence.

    regards
    Yes, I did eliminate the others one at a time. So the data I'm using is compatible with the hypothesis that the influence is exactly zero?

    Kind regards

  12. #12
    TS Contributor
    rogojel

    Re: Multiple regression analysis: multicollinearity problem

    If the term was not significant, yes.

  13. #13

    Re: Multiple regression analysis: multicollinearity problem


    Hello!

    I have a question regarding multiple regression analysis: for comparing numerical variables I should use linear regression, and for comparing dichotomous variables I should use logistic regression. But if I want to compare numerical with dichotomous variables, what kind of test should I use? Thank you!
