
Thread: Is R square useful at all?

  1. #1
    noetsi

    Is R square useful at all?




    I know it cannot be used in non-linear regression at all [and the pseudo R squares are in dispute in logistic regression], but I found this comment a bit shocking given how commonly R square appears in the literature.

    http://data.library.virginia.edu/is-r-squared-useless/
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  2. The Following User Says Thank You to noetsi For This Useful Post:

    Jake (03-20-2017)

  3. #2
    hlsmith

    Re: Is R square useful at all?

    That is a nice link. The take home message is as long as you know the limitations and interpretations of R^2, it is a fine parameter. And yes, it doesn't tell you about goodness of fit or whether the model is valid, but it is not trying to do that; people are trying to make it do that. An example is that you can easily overfit a model and get a very high R^2, but in some circumstances you want a model that is generalizable, and then the R^2 of the overfit model is not as good a metric as, say, MSE with cross-validation.
    Stop cowardice, ban guns!
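
    A minimal sketch of the overfitting point above, assuming simulated data and polynomial fits (none of this comes from the linked article; the sample size, noise level, and degrees are arbitrary choices). In-sample R^2 can only go up when terms are added to a least-squares fit, while the cross-validated MSE will typically get worse once the model starts fitting noise.

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

def kfold_mse(x, y, degree, k=5):
    """Mean squared prediction error from k-fold cross-validation."""
    idx = np.arange(len(x))
    errors = []
    for test_idx in np.array_split(idx, k):
        train_idx = np.setdiff1d(idx, test_idx)
        coefs = np.polyfit(x[train_idx], y[train_idx], degree)
        pred = np.polyval(coefs, x[test_idx])
        errors.append((y[test_idx] - pred) ** 2)
    return float(np.mean(np.concatenate(errors)))

# Simulated data (an assumption for illustration): the truth is a straight line.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 30)          # unsorted, so contiguous folds are random subsets
y = 2.0 * x + rng.normal(0, 0.5, 30)

results = {}
for degree in (1, 10):
    fitted = np.polyval(np.polyfit(x, y, degree), x)
    results[degree] = (r_squared(y, fitted), kfold_mse(x, y, degree))
    print(f"degree {degree}: in-sample R^2 = {results[degree][0]:.3f}, "
          f"5-fold CV MSE = {results[degree][1]:.3f}")
```

    The in-sample R^2 of the degree-10 fit is never below that of the straight line; whether its cross-validated error is worse depends on the data, which is the reason to check it.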

  4. #3
    noetsi

    Re: Is R square useful at all?

    The take home message is as long as you know the limitations and interpretations of R^2, it is a fine parameter.
    I didn't get that sense from the article at all. To me they were saying it was pretty much useless. The author they are citing argues that R square does not really tell you about explained variance in Y given X; it just as logically tells you about the explained variance in X given Y [the results are the same]. That is very surprising to me. I assume you have to know which way the influence is going, and accept that the influence does not go both ways, for the statistic to work.

    What is a good measure of fit for linear regression? I know what they are for logistic regression, but not for linear regression. The truth is I don't pay much attention to overall model fit when I run models; I focus on the effect size of variables as long as the various model indicators show the model meets the minimum requirements, and R square is not one of those indicators.

  5. The Following User Says Thank You to noetsi For This Useful Post:

    trinker (03-20-2017)

  6. #4
    Miner

    Re: Is R square useful at all?

    I never use R-squared by itself. At a minimum, use R-squared (adjusted), or even better, use R-squared (predicted). See The Minitab Blog. I have seen regression analyses with R-squared and R-squared (adjusted) in the 80% range, with an R-squared (predicted) of 0.
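
    For readers who have not met the three variants, here is a rough sketch of how they are commonly computed for ordinary least squares. It assumes the usual definitions: adjusted R^2 penalizes the number of predictors, and predicted R^2 is 1 - PRESS / SS_total, where PRESS is the sum of squared leave-one-out prediction errors. The function name and the pure-noise data are made up for illustration.

```python
import numpy as np

def r2_variants(X, y):
    """Return (R^2, adjusted R^2, predicted R^2) for OLS with an intercept.

    X is an (n, p) array of predictors without the intercept column.
    """
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = resid @ resid
    ss_tot = np.sum((y - y.mean()) ** 2)

    r2 = 1 - ss_res / ss_tot
    r2_adj = 1 - (1 - r2) * (n - 1) / (n - p - 1)

    # Leverages are the diagonal of the hat matrix; with A = QR they are
    # the row sums of Q**2. Leave-one-out residual = resid / (1 - leverage).
    Q, _ = np.linalg.qr(A)
    leverage = np.sum(Q ** 2, axis=1)
    press = np.sum((resid / (1 - leverage)) ** 2)
    r2_pred = 1 - press / ss_tot
    return r2, r2_adj, r2_pred

# Hypothetical worst case: 10 predictors that are pure noise, only 20 rows.
rng = np.random.default_rng(2)
X = rng.normal(size=(20, 10))
y = rng.normal(size=20)

r2, r2_adj, r2_pred = r2_variants(X, y)
print(f"R^2 = {r2:.3f}, adjusted = {r2_adj:.3f}, predicted = {r2_pred:.3f}")
```

    Predicted R^2 can never exceed plain R^2, and with many predictors and few rows it can drop far below it, which is the pattern described above.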

  7. The Following User Says Thank You to Miner For This Useful Post:

    trinker (03-20-2017)

  8. #5
    noetsi

    Re: Is R square useful at all?

    To the extent that I use R square at all, which is rarely, I use adjusted R square. R square predicted is something I have not heard of before.

    I find AIC much more useful for what I do than R square. Beyond the limitations mentioned above, it is rarely clear what a "good" R square should be. You can look in the literature, if you have access to it (which I commonly don't) and if it exists (which it commonly does not in my field). Other than that, too many things can influence it to know whether what you found is a good result or not.

    My own theory, which I have not seen addressed, is that for complex phenomena you will have a lower R square than for less complex ones, simply because so many things can influence the results with complex realities. So you will be less likely to have all the useful variables in the model.

  9. #6
    Miner

    Re: Is R square useful at all?

    For me, it boils down to the prediction limits. It doesn't matter if R-squared (adjusted) is 98% if the prediction limits are extremely wide. If R-squared is lower than I am comfortable with, I can compare the standard deviation of the model to the measurement variation. If the former is larger than the measurement variation, I suspect a missing factor. However, I do have the luxury of being able to quantify my measurement variation. I know this is difficult to impossible in many disciplines.

  10. #7
    noetsi

    Re: Is R square useful at all?

    What is strange to me is that most of the regression I see is not tied to prediction or relative impact. It is tied to determining whether some factor is or is not statistically valid. I think this is because most analysis I see is academic, and prediction is commonly not what they are interested in. Theory building is, or discovering that some variable is important, which is commonly the same thing. Unfortunately, practitioner results are not easy to find (or I have not found them, anyway).

    What is a good way to measure fit in a linear regression (or a non-linear regression) model?

    I explain the importance of confidence intervals over effect size by saying I can predict something as five plus or minus a million with great confidence.

  11. #8
    GretaGarbo

    Re: Is R square useful at all?

    Yes, it is pretty meaningless. But you know that I have been saying that for some time.

    The usefulness of it is that you can brag about it to your friends if it is big (males especially are interested in this). If it is small, don't talk about it.

  12. #9
    trinker

    Re: Is R square useful at all?

    I was taught to use it in hierarchical multiple regression (not HLM), with delta R^2 as a measure of whether new variable blocks are contributing more explained variance. http://stats.stackexchange.com/a/55616/7482 You use the delta to talk about percent additional explained variance in the outcome. Is this wrong?
    "If you torture the data long enough it will eventually confess."
    -Ronald Harry Coase -

  13. #10
    GretaGarbo

    Re: Is R square useful at all?

    Quote Originally Posted by trinker View Post
    You use the delta to talk about percent additional explained variance in the outcome. Is this wrong?
    No, it is not wrong. It is just another way of rewriting the t-test (or F-test) for the extra included parameter. So it doesn't add any information.
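
    A small numerical check of that equivalence, as a sketch on simulated data (the variable names and coefficients are assumptions for illustration). For one added predictor, the F statistic built from delta R^2 is F = delta R^2 / ((1 - R^2_full) / df_resid), and it should equal the square of the t statistic for the added coefficient.

```python
import numpy as np

def ols(A, y):
    """Least-squares coefficients and residuals for design matrix A."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta, y - A @ beta

rng = np.random.default_rng(1)
n = 50
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)            # correlated with x1
y = 1.0 + 2.0 * x1 + 0.7 * x2 + rng.normal(size=n)

A_reduced = np.column_stack([np.ones(n), x1])
A_full = np.column_stack([np.ones(n), x1, x2])
ss_tot = np.sum((y - y.mean()) ** 2)

_, res_reduced = ols(A_reduced, y)
beta_full, res_full = ols(A_full, y)
r2_reduced = 1 - res_reduced @ res_reduced / ss_tot
r2_full = 1 - res_full @ res_full / ss_tot
delta_r2 = r2_full - r2_reduced

# F statistic for the one added parameter, written in terms of delta R^2.
df_resid = n - A_full.shape[1]
f_stat = delta_r2 / ((1 - r2_full) / df_resid)

# t statistic for the x2 coefficient in the full model.
sigma2 = res_full @ res_full / df_resid
cov_beta = sigma2 * np.linalg.inv(A_full.T @ A_full)
t_stat = beta_full[2] / np.sqrt(cov_beta[2, 2])

print(f"delta R^2 = {delta_r2:.4f}, F = {f_stat:.4f}, t^2 = {t_stat ** 2:.4f}")
```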

    But is the delta-R^2 more meaningful? It can be high or low depending on the multicollinearity, even if the model beta parameters are the same (in two hypothetical layouts). Remember that multicollinearity is a problem in the sample, not the population.

    The R^2 is quite meaningless. But what influences it is not meaningless - the standard deviation in the residuals and the "spread" in the X-values. They matter.
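
    To make the role of the "spread" concrete, here is a sketch on simulated data (slope, noise level, and the factor of 3 are arbitrary assumptions). The same true slope and the very same noise values are used twice, once with narrow X-values and once with X-values spread three times wider. The fitted slopes stay close to each other, while R^2 changes a lot.

```python
import numpy as np

def r2_and_slope(x, y):
    """R^2 and fitted slope of a simple linear regression of y on x."""
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (intercept + slope * x)
    r2 = 1 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    return r2, slope

rng = np.random.default_rng(3)
n = 200
z = rng.normal(size=n)
noise = rng.normal(0, 2.0, n)      # identical noise in both layouts

x_narrow = z                       # sd(x) about 1
x_wide = 3.0 * z                   # sd(x) about 3
y_narrow = 1.0 * x_narrow + noise  # true slope 1 in both cases
y_wide = 1.0 * x_wide + noise

r2_narrow, slope_narrow = r2_and_slope(x_narrow, y_narrow)
r2_wide, slope_wide = r2_and_slope(x_wide, y_wide)
print(f"narrow x: slope = {slope_narrow:.2f}, R^2 = {r2_narrow:.2f}")
print(f"wide x:   slope = {slope_wide:.2f}, R^2 = {r2_wide:.2f}")
```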

    By the way, it is not strange that you will get the same R^2 when regressing y on x as when regressing x on y, since R^2 is the correlation (r) squared and:

    r = Cov(x,y)/(sd(x)*sd(y))

    where Cov() is the covariance and sd() is the standard deviation. They stay the same.
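
    The symmetry is easy to verify numerically. A short sketch on simulated data (the data are an assumption; any x and y would do): fit the simple regression in both directions and compare each R^2 with the squared correlation.

```python
import numpy as np

def r2_simple(x, y):
    """R^2 from a simple linear regression of y on x."""
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (intercept + slope * x)
    return 1 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(5)
x = rng.normal(size=100)
y = 0.8 * x + rng.normal(size=100)

r = np.corrcoef(x, y)[0, 1]
r2_y_on_x = r2_simple(x, y)
r2_x_on_y = r2_simple(y, x)
print(f"r^2 = {r ** 2:.4f}, y on x = {r2_y_on_x:.4f}, x on y = {r2_x_on_y:.4f}")
```

    The slopes of the two regressions differ; only R^2 is shared.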

  14. #11
    rogojel

    Re: Is R square useful at all?

    hi,
    I think the biggest problem with R-sq(adj) is that it leads to overfitting. It can easily happen that one gets a fine R-sq value on the training set and abysmal performance on a test set. R-sq(pred) alleviates this somewhat, but it still has a pretty weak link to prediction performance imho.

    IIRC BIC has a proven link to the test set performance, so, theoretically it would be a better measure (so, AIC also?)

    regards
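
    For reference, a minimal sketch of how AIC and BIC are often computed for a least-squares fit, assuming Gaussian errors and dropping additive constants that are the same for every model fitted to the same data. Counting the error variance as one extra parameter is a convention, not the only one in use. The data and polynomial degrees are made up for illustration; lower values are better for both criteria.

```python
import numpy as np

def aic_bic(y, y_hat, n_coefs):
    """AIC and BIC of a Gaussian least-squares fit, up to a shared constant."""
    n = len(y)
    ss_res = np.sum((y - y_hat) ** 2)
    k = n_coefs + 1                      # regression coefficients + error variance
    base = n * np.log(ss_res / n)        # -2 * max log-likelihood, minus constants
    return base + 2 * k, base + k * np.log(n)

# Simulated data (an assumption): the truth is a straight line.
rng = np.random.default_rng(4)
n = 40
x = rng.uniform(-1, 1, n)
y = 1.0 + 2.0 * x + rng.normal(0, 0.5, n)

scores = {}
for degree in (1, 3, 8):
    fitted = np.polyval(np.polyfit(x, y, degree), x)
    scores[degree] = aic_bic(y, fitted, degree + 1)
    print(f"degree {degree}: AIC = {scores[degree][0]:.2f}, BIC = {scores[degree][1]:.2f}")
```

    The two criteria differ only in the penalty per parameter (2 for AIC, log(n) for BIC), so BIC penalizes extra terms more heavily once n is larger than about 7.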

  15. #12

    Re: Is R square useful at all?


    AIC, BIC, log likelihood, etc. all make better fit measures than R-squared. What R-squared (or its derivatives) is useful for is determining simple effect size, in which case it does exactly what it claims to do. Now, does that mean that you specified your model correctly? No. It's just a useful piece of evidence along with fit measures, p-values, etc.

    I really wonder who is publishing without reporting the whole suite of statistics: p-values, effect sizes, fit indices, random effect correlations, confidence intervals, etc. What are these journals and are they indexed?

  16. The Following User Says Thank You to the42up For This Useful Post:

    hlsmith (03-26-2017)
