
Thread: Help on interpreting linear regression

  1. #1
    Lukan27's Avatar
    Location
    Denmark
    Posts
    26
    Thanks
    7
    Thanked 1 Time in 1 Post

    Help on interpreting linear regression estimates




    I'm currently working on a larger assignment, and I need some input on how to interpret the results of my linear regression. I'm pretty sure I've got it right, but as a precaution some input would be lovely.

    I've searched the web and thought a lot about this, but I can't seem to reach a final take on it.

    This is a part of my regression (which is enough to illustrate the point):

    Code: 
    > summary(fittest1)
    
    Call:
    lm(formula = homicides_any_method ~ gdp * education, data = raw_data)
    
    Residuals:
       Min     1Q Median     3Q    Max 
     -3000  -2082  -1399   -124  42265 
    
    Coefficients:
                    Estimate Std. Error t value Pr(>|t|)   
    (Intercept)    3.442e+03  1.234e+03   2.790  0.00593 **
    gdp           -6.125e-02  1.686e-01  -0.363  0.71692   
    education     -1.201e+02  1.690e+02  -0.711  0.47828   
    gdp:education  1.845e-03  1.528e-02   0.121  0.90404   
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 5372 on 158 degrees of freedom
      (15 observations deleted due to missingness)
    Multiple R-squared:  0.03071,	Adjusted R-squared:  0.0123 
    F-statistic: 1.669 on 3 and 158 DF,  p-value: 0.176
    
    > summary(fittest2)
    
    Call:
    lm(formula = log_any_homicide_rate ~ log_gdp * log_education, 
        data = raw_data)
    
    Residuals:
        Min      1Q  Median      3Q     Max 
    -2.2483 -0.6268 -0.0109  0.4907  2.7438 
    
    Coefficients:
                          Estimate Std. Error t value Pr(>|t|)   
    (Intercept)             1.3101     1.8372   0.713  0.47684   
    log_gdp                 0.4819     0.2856   1.687  0.09355 . 
    log_education           2.6292     0.8446   3.113  0.00220 **
    log_gdp:log_education  -0.3949     0.1245  -3.173  0.00181 **
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 0.998 on 158 degrees of freedom
      (15 observations deleted due to missingness)
    Multiple R-squared:  0.3181,	Adjusted R-squared:  0.3052 
    F-statistic: 24.57 on 3 and 158 DF,  p-value: 4.201e-13

    Well, the first model uses the raw numbers, and the second has the logarithm applied to all variables. I was quick to see that not transforming the data would make everything quite useless. As we can see, model 1 is far from significant, neither overall nor for any individual variable. Model 2 is the opposite, with the exception of log_gdp, which only comes close (p = 0.094). So there's really no doubt that model 2 is way better. But I'm confused regarding the estimates of the individual variables and their interaction.

    See, as I'm interpreting it, in model 1 we have an estimate of -6.125e-02 on gdp, -1.201e+02 on education and (positive) 1.845e-03 on their interaction. This is where I'm unsure: for every one-unit increase in gdp, homicides change by -6.125e-02, and for every one-unit increase in education they change by -1.201e+02. This makes sense, since we should assume that wealth and education mean less tendency to commit homicide. But what about the interaction? Does gdp:education mean a 1.845e-03 increase in homicides? So both of these in combination mean an increase in homicides? This makes no sense, at least regarding our assumption/theory that these two factors should reduce crime/homicides.
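
    One way to read the interaction in model 1: with a gdp * education term, the gdp estimate is the slope of gdp only where education is 0, and the slope at any other education level is the gdp estimate plus the interaction estimate times education. A minimal sketch using the estimates printed above; the education values are made up for illustration, since the thread does not show the variable's scale.

    ```r
    # Estimates copied from summary(fittest1) above
    b_gdp <- -6.125e-02   # slope of gdp when education == 0
    b_int <-  1.845e-03   # change in the gdp slope per unit of education

    # Hypothetical education levels (the real scale is not shown in the thread)
    education <- c(0, 5, 10, 15)

    # Conditional slope of gdp at each education level
    gdp_slope <- b_gdp + b_int * education
    print(data.frame(education, gdp_slope))
    ```

    Read this way, a positive interaction estimate does not by itself say that gdp and education together increase homicides; it says the (here negative) gdp slope becomes less negative as education rises. In model 1 none of these estimates is distinguishable from zero anyway.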

    It's essentially the same problem in model 2, just inverted: now log_gdp:log_education is negative, but log_gdp and log_education are positive. So in model 2 both log_gdp and log_education mean an x% increase in homicides, but their interaction means an x% decrease?

    And why does the log transformation seemingly cause this inversion? Is it because that's the true/real interaction/effect, or something else?
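
    For the log-log model the same reading applies in percentage terms: the approximate % change in the homicide rate for a 1% change in GDP is the log_gdp estimate plus the interaction estimate times log_education. A sketch, assuming natural logs were used (the thread does not say) and using made-up education levels:

    ```r
    # Estimates copied from summary(fittest2) above
    b_lgdp <-  0.4819
    b_int  <- -0.3949

    # Hypothetical education levels on the original scale; natural log assumed
    education <- c(1, 2, 5, 10)

    # Approximate % change in the homicide rate per 1% change in GDP
    elasticity_gdp <- b_lgdp + b_int * log(education)
    print(data.frame(education, elasticity_gdp))

    # Education level at which the GDP elasticity changes sign
    crossover <- exp(-b_lgdp / b_int)
    print(crossover)
    ```

    Under these assumptions the GDP elasticity is positive at low education levels and negative at higher ones. The positive log_gdp estimate only describes the case where log_education is 0, so the sign flip between the two models need not be a reversal of the effect. Note also that the two formulas use different response variables (homicides_any_method versus log_any_homicide_rate).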

    Any help appreciated.
    Last edited by Lukan27; 01-18-2015 at 08:41 AM.

  2. #2
    Lukan27's Avatar

    Re: Help on interpreting linear regression


    You know what, I think I figured it out. Since model 2 is log-transformed, to talk about an x% increase in a variable you have to reverse the log transformation. For example, say a 1% increase in a variable means an x% change in the dependent variable: the log_gdp estimate is 0.4819, so 1.01^0.4819 = 1.0048, which I read as 0.481%, and then 1/0.481 = 2.08%, so a 1% increase in GDP means a 2.08% increase in homicides, everything else controlled for. However, the log_gdp and log_education interaction results in a 2.55% decrease, so in total 2.08% + 0.38% - 2.55% = -0.09%! So a 1% increase in both GDP and education (on average, I guess) means a 0.09% decrease in homicides!

    Correct me if I'm wrong..
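
    A quick check of the arithmetic, as a sketch rather than a definitive answer. In a log-log model, a 1% increase in a predictor multiplies the response by 1.01^estimate, so the percentage change is (1.01^estimate - 1) * 100 and no reciprocal is needed. With the interaction in the model, the combined effect also depends on the levels of both variables; the levels below are hypothetical and natural logs are assumed.

    ```r
    # Estimates copied from summary(fittest2) in post #1
    b_lgdp <-  0.4819
    b_ledu <-  2.6292
    b_int  <- -0.3949

    # % change in the response when a log-scale term with coefficient b
    # rises by log(1.01). For log_gdp alone this is the effect only where
    # log_education is 0, because of the interaction.
    pct_change <- function(b) (1.01^b - 1) * 100
    print(pct_change(b_lgdp))   # about 0.48, not 2.08

    # Predicted log homicide rate without the intercept (it cancels below)
    log_rate <- function(gdp, edu) {
      b_lgdp * log(gdp) + b_ledu * log(edu) + b_int * log(gdp) * log(edu)
    }

    # Hypothetical starting levels; both predictors then rise by 1%
    gdp <- 10000
    edu <- 8
    combined_pct <- (exp(log_rate(1.01 * gdp, 1.01 * edu) - log_rate(gdp, edu)) - 1) * 100
    print(combined_pct)
    ```

    With these made-up levels the combined change comes out negative, but at sufficiently low levels of GDP and education it is positive, so the model does not give one single "total" percentage.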
