
Thread: Why does the coefficient change sign when another variable is added to the OLS model?

  1. #1
    Why does the coefficient change sign when another variable is added to the OLS model?




    Dear all,

    I am running an OLS regression in Stata 13 with log per capita calories as the dependent variable and the age and years of education of the household head, plus log per capita expenditure, as independent variables (other controls will be added eventually). When I include only age and education, both coefficients are positive and significant. However, as soon as I add log per capita expenditure, the education coefficient becomes negative and significant. I am puzzled by this result, since the literature on calorie consumption argues that education of the household head has a positive impact. I understand that education of the household head might reflect a "wealth" effect, but the correlation between education and expenditure is not that large. I have posted my regression results below, along with summary statistics, and was wondering if someone could help me understand what is going on. I realize that this sort of problem might (or might not) be overcome using techniques other than OLS, but I have just started learning OLS and would like to understand how to deal with this in OLS, or at least know why OLS cannot deal with it.

    Thanks,

    Monzur


    Code: 
    .  regress log_pccal  age_hhhead eduy_hhhead [pw=hhweight], r
     
    
    Linear regression                                      Number of obs =    3355
                                                           F(  2,  3352) =  105.40
                                                           Prob > F      =  0.0000
                                                           R-squared     =  0.0692
                                                           Root MSE      =  .25583
    
    ------------------------------------------------------------------------------
                 |               Robust
       log_pccal |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
      age_hhhead |   .0049182   .0003602    13.65   0.000      .004212    .0056244
     eduy_hhhead |   .0075136   .0011997     6.26   0.000     .0051613    .0098659
           _cons |   7.537586   .0171067   440.62   0.000     7.504045    7.571126
    ------------------------------------------------------------------------------
    
    .  regress log_pccal age_hhhead eduy_hhhead log_pcexp [pw=hhweight], r
    
    
    Linear regression                                      Number of obs =    3355
                                                           F(  3,  3351) =  601.38
                                                           Prob > F      =  0.0000
                                                           R-squared     =  0.4123
                                                           Root MSE      =  .20332
    
    ------------------------------------------------------------------------------
                 |               Robust
       log_pccal |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
      age_hhhead |    .001919   .0002945     6.52   0.000     .0013415    .0024964
     eduy_hhhead |  -.0082508    .001044    -7.90   0.000    -.0102977   -.0062039
       log_pcexp |   .3777407   .0100402    37.62   0.000     .3580552    .3974262
           _cons |   4.795607   .0730719    65.63   0.000     4.652337    4.938877
    ------------------------------------------------------------------------------
    
    .  estat vif
    
        Variable |       VIF       1/VIF  
    -------------+----------------------
       log_pcexp |      1.20    0.832228
     eduy_hhhead |      1.16    0.863121
      age_hhhead |      1.07    0.930743
    -------------+----------------------
        Mean VIF |      1.14
    
    
    .  su log_pccal eduy_hhhead log_pcexp, d
    
                              log_pccal
    -------------------------------------------------------------
    
    Obs                3698
    Mean           7.783589
    Std. Dev.       .276406
    Variance       .0764003
    Skewness       .0350145
    Kurtosis       3.511389
    
                years of education of household head
    -------------------------------------------------------------
    
    Obs                3698
    Sum of Wgt.        3698
    Mean           2.984857
    Std. Dev.      3.776812
    
    Variance       14.26431
    Skewness       .9461994
    Kurtosis       2.751041
    
                  log of hh per capita expenditure
    -------------------------------------------------------------
    
    Obs                3698
    Sum of Wgt.        3698
    
    Mean           7.762185
    Std. Dev.      .4636838
    
    Variance       .2150027
    Skewness       .4395734
    Kurtosis       3.433132
    
    . pwcorr log_pccal age_hhhead eduy_hhhead log_pcexp, sig
    
                 | log~ccal age_hh~d eduy_h~d log_pc~p
    -------------+------------------------------------
       log_pccal |   1.0000
                 |
                 |
      age_hhhead |   0.2282   1.0000
                 |   0.0000
                 |
     eduy_hhhead |   0.0855  -0.1133   1.0000
                 |   0.0000   0.0000
                 |
       log_pcexp |   0.6401   0.1796   0.3254   1.0000
                 |   0.0000   0.0000   0.0000
                 |

  2. #2
    spunky

    Re: Why does the coefficient change sign when another variable is added to the OLS mo

    Quote Originally Posted by monzur View Post
    When I run the regression with just age and education as control, they are significant and positive. However, as soon as I add log per capita expenditure, education becomes negative and significant.
    perhaps you're dealing with a suppressor effect?
    for all your psychometric needs! https://psychometroscar.wordpress.com/about/

  3. #3
    noetsi

    Re: Why does the coefficient change sign when another variable is added to the OLS mo

    You might have multicollinearity, or possibly a moderator effect (where one IV influences the impact of another IV on the DV). I do not know how to test for moderator effects (I don't generally work with moderators), but you can check for multicollinearity by running a VIF test. If memory serves, a change in sign when you add a variable is often a sign of one of these effects. This is an example of how multivariate and univariate relationships can be very different.
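    As a rough check on the multicollinearity idea, here is a minimal Python sketch (not Stata, and not part of the original output) that recomputes VIFs from the pairwise correlations posted in #1. It assumes the unweighted pwcorr values are a fair stand-in for the pweighted estimation sample, so the numbers should only approximately match the estat vif table. The helper names invert and vifs are made up for illustration.

```python
# Pairwise correlations among the three predictors, copied from the
# pwcorr output in post #1 (unweighted, so only approximately comparable
# to the pweighted regression).
NAMES = ["age_hhhead", "eduy_hhhead", "log_pcexp"]
R_XX = [
    [1.0, -0.1133, 0.1796],   # age_hhhead
    [-0.1133, 1.0, 0.3254],   # eduy_hhhead
    [0.1796, 0.3254, 1.0],    # log_pcexp
]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I] -> [I | A^-1].
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vifs(corr):
    """VIF_j is the j-th diagonal element of the inverse correlation matrix."""
    inverse = invert(corr)
    return [inverse[j][j] for j in range(len(corr))]


if __name__ == "__main__":
    for name, vif in zip(NAMES, vifs(R_XX)):
        print(f"{name}: VIF = {vif:.2f}")
```

    VIFs this close to 1, in line with the estat vif table in #1, suggest that collinearity among the predictors is mild, so it is unlikely to explain the sign change on its own.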
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  4. #4
    noetsi

    Re: Why does the coefficient change sign when another variable is added to the OLS mo

    Quote Originally Posted by spunky View Post
    perhaps you're dealing with a suppressor effect?
    Are suppressor and moderator effects essentially the same thing (or is a suppressor effect perhaps one example of a moderator effect)?

  5. #5
    spunky

    Re: Why does the coefficient change sign when another variable is added to the OLS mo

    Quote Originally Posted by noetsi View Post
    Are suppressor and moderator effects essentially the same thing (or is a suppressor effect perhaps one example of a moderator effect)?
    they're different but related things.... a moderator could be a suppressor but not all suppressors are moderators. these people do a pretty good job at untangling the whole thing:

    http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2819361/

  7. #6
    spunky

    Re: Why does the coefficient change sign when another variable is added to the OLS mo

    JEEBEZUZ! just look at the change in the fit of the model!

    without the suppressor variable (log_pcexp) your R-squared is 0.0692.... so basically zero. but adding the suppressor variable makes the R-squared jump to 0.4123!!!

    my money's on the suppressor effect
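    One way to see the algebra behind the sign flip is the usual omitted-variable decomposition, sketched below. The auxiliary regression of log_pcexp on age_hhhead and eduy_hhhead is not in the posted output, so the value of \hat\delta is only what the two reported education coefficients imply, assuming the same sample and weights are used throughout.

```latex
% Short regression: log_pccal on age_hhhead, eduy_hhhead      -> \tilde\beta
% Long regression:  adds log_pcexp                             -> \hat\beta
% Auxiliary regression: log_pcexp on age_hhhead, eduy_hhhead   -> \hat\delta
\begin{align*}
\tilde\beta_{\text{edu}} &= \hat\beta_{\text{edu}}
    + \hat\beta_{\text{exp}}\,\hat\delta_{\text{edu}} \\
0.0075136 &= -0.0082508 + 0.3777407\,\hat\delta_{\text{edu}} \\
\hat\delta_{\text{edu}} &= \frac{0.0075136 + 0.0082508}{0.3777407} \approx 0.0417
\end{align*}
```

    Read this way, the positive coefficient in the short model is a negative partial association (about -0.0083) plus a larger positive part that runs through expenditure (about 0.378 x 0.042 = 0.016). Whether that partial association has a causal reading is a separate question that this identity cannot settle.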

  8. #7
    noetsi

    Re: Why does the coefficient change sign when another variable is added to the OLS mo

    Quote Originally Posted by spunky View Post
    JEEBEZUZ! just look at the change in the fit of the model!

    without the suppressor variable (log_pcexp) your R-squared is 0.0692.... so basically zero. but adding the suppressor variable makes the R-squared jump to 0.4123!!!

    my money's on the suppressor effect
    Or very few cases

    Seriously, with 3355 cases that won't be occurring. With a very small sample size you can significantly increase R-squared simply by adding more variables, especially if you already have a lot in the model.
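    To put a number on the sample-size point, here is a small Python sketch of the standard adjusted R-squared formula, 1 - (1 - R2)(n - 1)/(n - k - 1), applied to the two R-squared values reported in #1. The n = 20 case is purely hypothetical and is there only to show how much harsher the penalty is in a tiny sample.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k regressors plus a constant."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


if __name__ == "__main__":
    # R-squared values and n taken from the two regressions in post #1.
    print(f"2 regressors, n=3355: {adjusted_r2(0.0692, 3355, 2):.4f}")
    print(f"3 regressors, n=3355: {adjusted_r2(0.4123, 3355, 3):.4f}")
    # Hypothetical tiny sample with the same raw R-squared, for contrast.
    print(f"3 regressors, n=20:   {adjusted_r2(0.4123, 20, 3):.4f}")
```

    With 3355 observations the adjustment barely moves either figure, so the jump in fit is not an artifact of simply adding a regressor.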

  9. #8
    spunky

    Re: Why does the coefficient change sign when another variable is added to the OLS mo


    Quote Originally Posted by noetsi View Post
    Or very few cases
    nope, it's definitely suppression. towards the end the OP provides the correlation matrix among the variables. education and log_pccal are positively correlated (0.0855), but the regression weight for education changes to the opposite sign in the presence of the suppressor (log_pcexp)
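    A minimal Python sketch of that argument, using only the correlation matrix from #1: standardized slopes solve R_xx b = r_xy, so the sign flip can be reproduced without the raw data. It assumes the unweighted pairwise correlations approximate the pweighted estimation sample, so the values will be close to, not identical with, the Stata output. The function names are made up for illustration.

```python
# Pairwise correlations from the pwcorr output in post #1.
# Predictor order: age_hhhead, eduy_hhhead, log_pcexp.
R_XX = [
    [1.0, -0.1133, 0.1796],
    [-0.1133, 1.0, 0.3254],
    [0.1796, 0.3254, 1.0],
]
R_XY = [0.2282, 0.0855, 0.6401]  # correlation of each predictor with log_pccal


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def standardized_betas(keep):
    """Standardized OLS slopes using only the predictors indexed by keep."""
    sub = [[R_XX[i][j] for j in keep] for i in keep]
    rhs = [R_XY[i] for i in keep]
    return solve(sub, rhs)


def r_squared(keep):
    """Model R-squared: sum of each standardized slope times its r with y."""
    betas = standardized_betas(keep)
    return sum(b * R_XY[i] for b, i in zip(betas, keep))


if __name__ == "__main__":
    short = standardized_betas([0, 1])       # age, education
    full = standardized_betas([0, 1, 2])     # age, education, expenditure
    print(f"education beta without log_pcexp: {short[1]:+.3f}")
    print(f"education beta with log_pcexp:    {full[1]:+.3f}")
    print(f"R-squared without / with: "
          f"{r_squared([0, 1]):.3f} / {r_squared([0, 1, 2]):.3f}")
```

    The education slope comes out positive in the two-predictor model and negative once log_pcexp enters, which matches the pattern in the posted regressions.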
