Thread: 1 out of 5 predictors is non-linear - can I proceed with OLS estimation?

  1. #1
    kiton

    Hello dear forum members!

    My univariate multiple regression model includes 5 predictors + 2 interaction terms. However, examination of the fitted-values plots revealed that one predictor has a curvilinear relationship with the DV (that predictor is also a theory-based moderator). The Ramsey regression specification-error test (RESET) rejected the null that there are no omitted variables, so I included a squared term for the non-linear predictor in the equation.

    The model then passed the RESET test. Moreover, the residual Q-Q plot (and the JB normality test) improved greatly after the inclusion of the squared term.

    My question is: can I proceed with OLS estimation of the coefficients with these non-linear terms in the equation? What would be the proper way of addressing this issue (considering that all other predictors have a linear relationship with the DV)?

    Thank you in advance for comments and suggestions.
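
For readers unfamiliar with the RESET step described in this post, here is a minimal sketch in Python with NumPy. It is illustrative only: the data are simulated, the names `fit_rss` and `reset_f` are made up for this example, and it augments the model with the squared and cubed fitted values, which is one common variant and not necessarily what the poster's software does.

```python
import numpy as np

def fit_rss(X, y):
    """OLS fit; return fitted values and the residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    return fitted, float(np.sum((y - fitted) ** 2))

def reset_f(X, y):
    """RESET-style F statistic: do powers of the fitted values improve the fit?"""
    n, k = X.shape
    fitted, rss_restricted = fit_rss(X, y)
    # Standardise the fitted values before taking powers. Because X holds an
    # intercept, this spans the same space as the raw powers but is better conditioned.
    z = (fitted - fitted.mean()) / fitted.std()
    X_aug = np.column_stack([X, z**2, z**3])
    _, rss_full = fit_rss(X_aug, y)
    q = 2  # number of added terms
    return ((rss_restricted - rss_full) / q) / (rss_full / (n - k - q))

# Simulated data (an assumption for illustration): y is truly quadratic in x.
rng = np.random.default_rng(42)
n = 300
x = rng.uniform(0.0, 4.0, n)
y = 1.0 + 2.0 * x + 0.5 * x**2 + rng.normal(0.0, 0.5, n)

f_linear = reset_f(np.column_stack([np.ones(n), x]), y)            # omits x^2
f_quadratic = reset_f(np.column_stack([np.ones(n), x, x**2]), y)   # includes x^2
print(f_linear, f_quadratic)
```

A large F for the linear specification and a small one after adding the squared term mirrors the pattern described in the post; a p-value would come from the F distribution with (2, n - k - 2) degrees of freedom.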

  2. #2
    noetsi

    If you can transform the variable to make it linear, you certainly can. Or model the variable as you did [adding a non-linear term, a quadratic if I understood what you did]. It is done all the time.

    Ultimately the answer depends on whether the variable that has a non-linear relationship is inherently non-linear or can be made linear [transformed to be linear]. Commonly this depends (I believe) on whether the X or the slope is non-linear.
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  4. #3
    kiton

    Thank you for the prompt reply, noetsi. I got your point.

    Would you consider this graph to show a linear relationship? The variable is log-transformed; see attachment.

    Thank you.
    Attached Images  

  5. #4
    hlsmith

    Yes, it seems you can move forward. Is your purpose to define the Y variable or attempt to predict it?

    Take noetsi's comments into consideration. If you keep the current term in the model, you just need to make sure you explain it accurately. The linearity in the model refers to the linear combination of model terms (vector spaces), and, as you seem to already know, the normality assumption is on the model residuals.

    Did you also keep the original non-squared version of the variable in the model as well?
    Stop cowardice, ban guns!
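
To make the "linear combination of model terms" point concrete, here is a small sketch in Python with NumPy. The data are simulated and the variable names are hypothetical. The fitted curve is non-linear in x, yet the model is linear in its coefficients, so ordinary least squares applies unchanged once x^2 is added as a column.

```python
import numpy as np

# Simulated data (an assumption for illustration): a curved relationship in x.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0.0, 4.0, n)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(0.0, 0.3, n)

# Design matrix with intercept, x, and x^2. The original x stays in the model.
X = np.column_stack([np.ones(n), x, x**2])

# Plain OLS: the squared term is just another column.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # estimates of the intercept, the x slope, and the x^2 coefficient
```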

  7. #5
    noetsi

    I would say that the graph reflects a monotonic curve [which is formally non-linear, I believe]. I would think a quadratic term would model that, although a lowess is by definition non-parametric, so I am unsure of its usage here.

    One way to know if an equation is non-linear is to specify a non-linear term and see if it is statistically significant. If it is not, then that supports the view that the model is linear [although it could be that you added the wrong non-linear term]. One fairly simple way to check for non-linearity is Box-Tidwell. This helps determine if non-linearity is suggested by the data. I tried to find a link; my experience with it is from books.

    I played around with Generalized Additive Models for a while to address non-linearity. It appears to me to be an excellent approach to this, if for nothing else than the diagnostic elements it adds. But it is far from simple, and in the end there were elements I simply failed to grasp.
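
A rough sketch of the Box-Tidwell idea mentioned in post #5, as commonly described: add the term x*ln(x) to the linear model and look at its t statistic. This is Python with NumPy on simulated data; the helper `ols` and all variable names are assumptions made for this example, and the predictor must be strictly positive for ln(x) to exist.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and their conventional standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se

# Simulated data (an assumption): y depends on ln(x), so it is curved in x.
rng = np.random.default_rng(1)
n = 300
x = rng.uniform(1.0, 10.0, n)  # strictly positive predictor
y = 2.0 + 3.0 * np.log(x) + rng.normal(0.0, 0.3, n)

# Linear model augmented with the Box-Tidwell term x * ln(x).
X_bt = np.column_stack([np.ones(n), x, x * np.log(x)])
beta_bt, se_bt = ols(X_bt, y)
t_xlnx = beta_bt[2] / se_bt[2]
print(t_xlnx)  # a large |t| suggests x should not enter linearly
```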

  9. #6
    hlsmith

    Yeah, but if the model fits fine apart from one variable not having a straight-line relationship, and that can be addressed by adding a squared term, you are probably fine moving forward, as mentioned earlier.

  11. #7
    noetsi

    I agree with that. To me the simplest thing is to do Box-Tidwell and try to transform the variables. Then see if the transformation or adding non-linear terms works [by seeing if the new term is significant].

    I am not sure how to determine if a model with a linear term is better than one that has been transformed to be linear or one with an added non-linear term such as a quadratic. Because of the nature of R squared you cannot use that, I would think [since it only looks at linearly explained variance]?

    AIC maybe?
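
A minimal sketch of the AIC comparison suggested in post #7, in Python with NumPy. The data are simulated and the function name `gaussian_aic` is made up for this example. It assumes normal errors and the same untransformed Y in every model; AIC values are not comparable if the response itself is transformed.

```python
import numpy as np

def gaussian_aic(X, y):
    """AIC of an OLS fit under normal errors (error variance counted as a parameter)."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    # Gaussian log-likelihood evaluated at the maximum-likelihood estimates.
    loglik = -0.5 * n * (np.log(2.0 * np.pi) + np.log(rss / n) + 1.0)
    return 2.0 * (k + 1) - 2.0 * loglik

# Simulated data (an assumption for illustration): truly quadratic in x.
rng = np.random.default_rng(7)
n = 250
x = rng.uniform(1.0, 10.0, n)
y = 1.0 + 2.0 * x - 0.2 * x**2 + rng.normal(0.0, 0.5, n)

ones = np.ones(n)
aic = {
    "linear": gaussian_aic(np.column_stack([ones, x]), y),
    "quadratic": gaussian_aic(np.column_stack([ones, x, x**2]), y),
    "log": gaussian_aic(np.column_stack([ones, np.log(x)]), y),
}
best = min(aic, key=aic.get)  # the lowest AIC is preferred
print(aic, best)
```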

  13. #8
    kiton

    Quote: Originally Posted by hlsmith
    Yes, it seems you can move forward. Is your purpose to define the Y variable or attempt to predict it?

    Take noetsi's comments into consideration. If you keep the current term in the model, you just need to make sure you explain it accurately. The linearity in the model refers to the linear combination of model terms (vector spaces), and, as you seem to already know, the normality assumption is on the model residuals.

    Did you also keep the original non-squared version of the variable in the model as well?

    The purpose of the study is to predict Y. One of the key problems is that the variables are not normally distributed (a log transformation does not solve the problem), so I am building my argument on the paper by Williams, Grajales, and Kurkiewicz (2013) and choosing the model specification whose residuals behave best.

    I surely did keep the original non-squared version of the variable in the model.

    I wonder, though, in terms of proper justification, which is better: (a) modeling the curvilinear relationship as X + X^2, or (b) as ln(X)?

    Thank you very much for the feedback, hlsmith

  15. #9
    kiton

    Quote: Originally Posted by noetsi
    I agree with that. To me the simplest thing is to do Box-Tidwell and try to transform the variables. Then see if the transformation or adding non-linear terms works [by seeing if the new term is significant].

    I am not sure how to determine if a model with a linear term is better than one that has been transformed to be linear or one with an added non-linear term such as a quadratic. Because of the nature of R squared you cannot use that, I would think [since it only looks at linearly explained variance]?

    AIC maybe?

    I will surely explore the suggested Box-Tidwell test; thank you for the suggestion.

    I did run the model comparison using a global F test and also the R squared difference (as suggested by Aiken and West, 1991). Both tests favor the model with X + X^2.
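
For reference, the R squared difference test for nested models can be sketched as follows (Python with NumPy; simulated data; the function names `r_squared` and `f_change` are hypothetical). With q added terms, n observations, and k_full columns in the larger model including the intercept, the statistic is F = ((R2_full - R2_reduced) / q) / ((1 - R2_full) / (n - k_full)).

```python
import numpy as np

def r_squared(X, y):
    """R squared of an OLS fit with an intercept column in X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - float(resid @ resid) / float(np.sum((y - y.mean()) ** 2))

def f_change(r2_reduced, r2_full, n, k_full, q):
    """F statistic for q added terms; k_full counts all columns incl. the intercept."""
    return ((r2_full - r2_reduced) / q) / ((1.0 - r2_full) / (n - k_full))

# Simulated data (an assumption for illustration): truly quadratic in x.
rng = np.random.default_rng(21)
n = 200
x = rng.uniform(0.0, 4.0, n)
y = 1.0 + 2.0 * x + 0.5 * x**2 + rng.normal(0.0, 0.5, n)

r2_reduced = r_squared(np.column_stack([np.ones(n), x]), y)
r2_full = r_squared(np.column_stack([np.ones(n), x, x**2]), y)
f_stat = f_change(r2_reduced, r2_full, n, k_full=3, q=1)
print(r2_reduced, r2_full, f_stat)
```

With one added term and a few hundred observations, the 5% critical value of F is roughly 3.9, so a statistic well above that favors the model with the squared term.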

    Also, attached is a residual Q-Q plot that I include to justify the final model specification. The saved residuals passed the Shapiro-Wilk and Shapiro-Francia normality tests. However, they narrowly failed the Jarque-Bera test (p=.044), which I have heard is the most robust of the three.
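
As a reminder of what the Jarque-Bera test measures, here is a short sketch in Python with NumPy (the function name `jarque_bera` and the simulated sample are assumptions for this example). The statistic is JB = n/6 * (S^2 + (K - 3)^2 / 4), where S is the sample skewness and K the sample kurtosis of the residuals. The p-value below uses the asymptotic chi-square distribution with 2 degrees of freedom, whose upper tail is exp(-JB/2); that approximation can be poor in small samples.

```python
import numpy as np

def jarque_bera(resid):
    """Return the JB statistic and its asymptotic chi-square(2) p-value."""
    r = np.asarray(resid, dtype=float)
    n = r.size
    d = r - r.mean()
    m2, m3, m4 = np.mean(d**2), np.mean(d**3), np.mean(d**4)
    skew = m3 / m2**1.5
    kurt = m4 / m2**2
    jb = n / 6.0 * (skew**2 + (kurt - 3.0) ** 2 / 4.0)
    p_value = float(np.exp(-jb / 2.0))  # chi-square(2) survival function
    return float(jb), p_value

# Simulated, clearly skewed "residuals" (an assumption for illustration).
rng = np.random.default_rng(5)
skewed = rng.exponential(1.0, 500)
jb_skewed, p_skewed = jarque_bera(skewed)
print(jb_skewed, p_skewed)
```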

    Other assumptions:

    - Multicollinearity - NO;
    - Endogeneity - NO;
    - Heteroskedasticity - YES, addressing that by using heteroskedasticity-robust SEs (vce(hc3) in Stata);
    - Link test - OK;
    - RESET - OK.

    I sincerely appreciate your feedback, Sir.
    Attached Images  

  16. #10
    noetsi

    You do have some outliers in the upper tail. I would suggest a skewness and kurtosis test. If you have extra time on your hands, you can run one of the many tests of influence, such as Cook's D or DFBETA, for the impact of outliers on your estimates.

    Which robust SE did you use? White's?

    In comparing models I think the one most recommended is AIC.

  18. #11
    Dason

    Quote: Originally Posted by kiton
    so I am building the argument on the paper by Williams, Grajales, and Kurkiewicz (2013)

    I hear that's a good paper.
    I don't have emotions and sometimes that makes me very sad.

  19. #12
    hlsmith

    That is a pretty Q-Q plot. What did you do for the exogeneity and link test?

  20. #13
    kiton

    Quote: Originally Posted by noetsi
    You do have some outliers in the upper tail. I would suggest a skewness and kurtosis test. If you have extra time on your hands, you can run one of the many tests of influence, such as Cook's D or DFBETA, for the impact of outliers on your estimates.

    Which robust SE did you use? White's?

    In comparing models I think the one most recommended is AIC.

    That is correct, I do have a number of outliers. It was a considered decision to retain them, since they are "the story tellers". I am planning on mentioning that in the limitations section. I did examine the Cook's distances as well. Depending on the threshold, the number flagged varies: D > 1 flags none, and D > 4/N flags approximately 5%.
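
The two Cook's distance thresholds mentioned in post #13 can be illustrated with a short sketch (Python with NumPy; simulated data with one planted outlier; the function name `cooks_distance` is made up for this example). It uses the standard formula D_i = e_i^2 * h_i / (p * s^2 * (1 - h_i)^2), where h_i is the leverage and p the number of coefficients.

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's distance for every observation of an OLS fit."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Leverage h_i: the diagonal of the hat matrix X (X'X)^-1 X'.
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    s2 = resid @ resid / (n - p)
    return resid**2 * h / (p * s2 * (1.0 - h) ** 2)

# Simulated data (an assumption for illustration) with one planted influential point.
rng = np.random.default_rng(3)
n = 100
x = rng.uniform(0.0, 10.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, n)
x[0], y[0] = 25.0, 21.0  # far from the other x values and far below the line

d = cooks_distance(np.column_stack([np.ones(n), x]), y)
n_over_1 = int(np.sum(d > 1.0))       # strict threshold
n_over_4n = int(np.sum(d > 4.0 / n))  # lenient threshold
print(n_over_1, n_over_4n)
```

The lenient 4/N cutoff always flags at least as many points as D > 1, which is why the count depends so strongly on the threshold chosen.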

    As for the SEs, I am using the HC3 form, which rescales each residual as u/(1-h) (Davidson and MacKinnon, 1993).
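
A minimal sketch of the u/(1-h) correction, assuming it refers to the HC3 sandwich estimator named earlier in the thread (Python with NumPy; simulated heteroskedastic data; the function name `hc3_se` is made up for this example). The covariance is (X'X)^-1 X' diag(u_i^2 / (1 - h_i)^2) X (X'X)^-1.

```python
import numpy as np

def hc3_se(X, y):
    """OLS coefficients with HC3 heteroskedasticity-robust standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)  # leverages
    u_adj = u / (1.0 - h)                        # the u/(1-h) rescaling
    meat = X.T @ (X * (u_adj**2)[:, None])       # X' diag(u_adj^2) X
    cov = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))

# Simulated data (an assumption): the error spread grows with x.
rng = np.random.default_rng(11)
n = 1000
x = rng.uniform(0.0, 10.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, n) * 0.1 * x**2
X = np.column_stack([np.ones(n), x])

beta, se_hc3 = hc3_se(X, y)

# Conventional (homoskedastic) standard errors for comparison.
resid = y - X @ beta
s2 = resid @ resid / (n - X.shape[1])
se_classical = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))
print(se_classical, se_hc3)
```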

  21. #14
    kiton

    Quote: Originally Posted by hlsmith
    That is a pretty Q-Q plot. What did you do for the exogeneity and link test?

    Link test - just followed the guidelines suggested by the Stata guide (-linktest- command).

    For exogeneity: (a) examined the correlation between predictors and residuals (must be zero), and (b) conducted a Hausman chi-square test.
    Last edited by kiton; 01-22-2015 at 05:36 PM.

  22. #15
    noetsi


    Quote: Originally Posted by kiton
    It was a considered decision to retain them, since they are "the story tellers".

    One of the most common comments on outliers is that you should always wonder why they exist. They can be your best learning experience about the data. And the common recommendation is that you should not remove non-clerical outliers, although when they badly distort the regression line I personally always had problems with that advice.
