how to calculate the significance of an R2 increase (based only on results in paper)

#1
Recently I have come across several articles that report OLS results of models with increasing numbers of variables. In the articles the R2 and F-statistics are reported (and, of course, t or p values for the coefficients).
In these articles, I note that the R2 (or adjusted R2) doesn't increase very much and expect that the increase in R2 (or adjusted R2) may not be significant. However, the authors do not provide the F-test of the significance of the increase. Therefore, I would like to calculate this by hand. Can anyone tell me what the formula for this is?

For example, one study has two models with adjR2=.125 and .168. These two models differ by only 1 variable (which is included in model 2 but not in model 1). The accompanying F-stats are 3.113 and 4.848, the numbers of coefficients are 6 and 7 respectively, and the number of observations is 140.
The authors claim that this additional variable is very important (it is statistically significant), but I wonder whether the increase in adjR2 is high enough to even continue with the larger model. How would I calculate that based on the reported results table only? I don't have the actual data, so I can only work with what is reported in the article.
If possible, I'd like to know how to do this both for R2 and for adjR2, given that some authors report R2 and others report adjR2.

Thanks, Peter
 
#2

I don't know how to do this with R2, but if you have t and p-values for the coefficients, then all variables with p>0.05 are insignificant (if the p-values are calculated correctly). Generally, these adjR2 values are too small for a usable model. From my experience I can say: 1. there are redundant variables in this model; 2. this model is garbage even if it is statistically significant (it is not applicable). A tip for the authors of the article: identify the most significant predictor and say that there is a weak correlation. That would be honest. Anything else is pseudoscience.
 
#3

Correct me if I'm wrong, but I don't think you can compare 2 R squared results and say that one is statistically significant and the other not, because they are a measure of effect size rather than statistical significance. You would assess statistical significance on the .05 p-value criterion and on whether the F test met the appropriate critical value in the F distribution.
 

Dragan

Super Moderator
#4

Peter said:
Recently I have come across several articles that report OLS results of models with increasing numbers of variables. In the articles the R2 and F-statistics are reported (and, of course, t or p values for the coefficients).
In these articles, I note that the R2 (or adjusted R2) doesn't increase very much and expect that the increase in R2 (or adjusted R2) may not be significant. However, the authors do not provide the F-test of the significance of the increase. Therefore, I would like to calculate this by hand. Can anyone tell me what the formula for this is?

Thanks, Peter

What the authors are reporting is all that is needed. That is, additional information would be redundant.

You only need to know the t and p-values associated with the regression weights. That's the information that is addressing the question you're asking.

A t statistic (associated with any of the regression weights) is testing the unique contribution of a specific independent variable i.e. as if it were entered last into the model.
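
For a single added predictor, the point above can be made concrete: the F-change statistic equals the square of that predictor's t statistic in the larger model, on 1 and n - k - 1 degrees of freedom (k = number of predictors in the larger model, not counting the intercept). A minimal Python sketch, assuming the t value is reported; the function names and the example numbers are hypothetical and not taken from any article:

```python
def f_change_from_t(t_stat: float) -> float:
    """F-change for ONE added predictor: the square of its t statistic."""
    return t_stat ** 2


def r2_reduced_from_t(r2_full: float, t_stat: float, n: int, k_full: int) -> float:
    """Back out the R^2 of the model without the added predictor.

    Rearranges F = (R2_full - R2_reduced) / (1 - R2_full) * (n - k_full - 1)
    with F = t^2. k_full counts predictors only (no intercept).
    """
    df_resid = n - k_full - 1
    return r2_full - t_stat ** 2 * (1.0 - r2_full) / df_resid


# Hypothetical illustration: t = 2.5 for the added variable,
# full-model R^2 = 0.20, n = 140, 7 predictors in the full model.
f = f_change_from_t(2.5)
r2_red = r2_reduced_from_t(0.20, 2.5, n=140, k_full=7)
print(f, round(r2_red, 4))
```

This only works when exactly one variable is added; for a block of several variables the individual t statistics are not enough and the R2 values of both models are needed.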
 

trinker

ggplot2orBust
#5

Ventures said:
Correct me if I'm wrong but I don't think you can compare 2 R squared results and say that one is statistically significant and the other not because they are a measure of effect size rather than statistical significance.
The article "Correlations Redux" by Olkin and Finn attempts to compare 2 [TEX]{R}^2[/TEX]. Cohen, Cohen, Aiken and West recommend this procedure as outlined by Olkin and Finn. The procedure is recommended for larger samples, though. But in this case we're dealing with [TEX]\Delta {R}^2[/TEX], for which such procedures wouldn't really be appropriate, as they are strength of effect measures (as you pointed out).
 

noetsi

Fortran must die
#6

You can do an F change test to determine if new predictors added significant additional predictive power. In this case variables are added hierarchically rather than all at once. But I think you would have to have the raw data to do this.
 
#7

Dragan said:
What the authors are reporting is all that is needed. That is, additional information would be redundant.

You only need to know the t and p-values associated with the regression weights. That's the information that is addressing the question you're asking.

A t statistic (associated with any of the regression weights) is testing the unique contribution of a specific independent variable i.e. as if it were entered last into the model.
I see that. Maybe my point wasn't entirely clear. When adding one or more variables to a regression model, these new variables may or may not be significant. At the same time, the coefficients and significance of the "previous" variables are affected.
Even when the newly added variable reaches statistical significance, it is not necessarily the case that the larger model as a whole is preferable. Although R2 increases, the increase in R2 (as an indicator of the fit of the whole model) can be so small that the more parsimonious model is to be preferred.
As far as I know, it is common to calculate an F-statistic to test whether the fit of the larger model is statistically significantly better than that of the smaller model. Basically, this is an ANOVA. I know how to calculate this in standard statistical software when I have the data available. But my question is how to calculate this F-statistic (comparing two models with one nested in the other) when I only have the regression results available.
 
#8

noetsi said:
You can do an F change test to determine if new predictors added significant additional predictive power. In this case variables are added hierarchically rather than all at once. But I think you would have to have the raw data to do this.
this is exactly what I am looking for, except for the "I think you would have to have the raw data to do this" part :)
Would there be no way to do this based on the info reported in the regression results in an article?
 

noetsi

Fortran must die
#9

None that I am familiar with. You could look up F change test and aggregate data and see if there is a way, but I doubt there is. Perhaps a meta-analysis technique could be adapted to address this, but again I doubt that very much.
 

trinker

ggplot2orBust
#10

This is likely covered by a meta analysis technique. As noetsi points out this problem would be easy to conduct with raw data but with a few stats from the article your job becomes harder. I personally don't know the answer to your question but here is a link about comparing models that may be of use (LINK). It appears to give formulas to do what you want. Sum of squares can be figured out from the anova table [usually] provided in the article from df and the f stat.
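
The remark about recovering sums of squares from df and the F stat can be sketched as follows: the overall F of a model with k predictors fixes the ratio of explained to residual variation, so it also fixes R2. This is a minimal Python sketch; the function name is hypothetical, and reading "6 coefficients" as 6 predictors plus an intercept is an assumption about how the article counts:

```python
def r2_from_overall_f(f_stat: float, n: int, k: int) -> float:
    """R^2 implied by the overall F test of a model with k predictors.

    Inverts F = (R2 / k) / ((1 - R2) / (n - k - 1)); k excludes the intercept.
    """
    df_resid = n - k - 1
    return f_stat * k / (f_stat * k + df_resid)


# Figures from the first post, reading "6 coefficients" as 6 predictors
# plus an intercept (an assumption; the article may count differently).
r2_model1 = r2_from_overall_f(3.113, n=140, k=6)
print(round(r2_model1, 3))
```

If the R2 recovered this way disagrees with the R2 implied by a reported adjusted R2, the degrees of freedom have probably been read differently from how the article counted them, which is worth checking before going further.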

EDIT: funny both noetsi and I recommended a meta analysis technique at the same time
 

noetsi

Fortran must die
#11

You could also contact the author(s) and ask them for the raw data. I have gotten data files from authors to conduct research (they usually are happy to give it to you).
 

Link

Ninja say what!?!
#12

noetsi said:
You could also contact the author(s) and ask them for the raw data. I have gotten data files from authors to conduct research (they usually are happy to give it to you).
Wow. Really noetsi?!?!? Every researcher I've come across has been overly safe and frugal when it's come to data sharing.
 

spunky

Super Moderator
#13

well... i guess i am gonna have to assume i am the only one who has ever been exposed to this equation but here we go. to test whether the increase in the [math]R^{2}[/math] when adding another variable (or sets of variables like in hierarchical regression) into the model is significant we use the following F-statistic:

[math]F = \frac{R_{full}^{2}-R_{reduced}^{2}}{1-R_{full}^{2}}\cdot \frac{n-p_{full}-p_{reduced}-1}{p_{full}}[/math]

Where:

[math]R_{full}^{2}[/math] is the [math]R^{2}[/math] of the regression equation with the LARGER number of predictors

[math]R_{reduced}^{2}[/math] is the [math]R^{2}[/math] of the regression equation with the SMALLER number of predictors (which is nested in the larger model)

[math]n[/math] is your sample size

[math]p_{full}[/math] is the number of predictors in your full model (also the degrees of freedom for your numerator in the F-ratio)

[math]p_{reduced}[/math] is the number of predictors in your smaller model

and [math]n-p_{full}-p_{reduced}-1[/math] is the degrees of freedom in the denominator of your F-ratio, for when you go and check for significance.

this is the F-test that any software does to test whether the increment in [math]R^{2}[/math] is statistically significant or not.

from what i read on your original post, you should be able to calculate the F statistic and check its significance with the data at hand.

have fun!
 

Dragan

Super Moderator
#14

spunky said:
well... i guess i am gonna have to assume i am the only one who has ever been exposed to this equation but here we go. to test whether the increase in the [math]R^{2}[/math] when adding another variable (or sets of variables like in hierarchical regression) into the model is significant we use the following F-statistic:

[math]F = \frac{R_{full}^{2}-R_{reduced}^{2}}{1-R_{full}^{2}}\cdot \frac{n-p_{full}-p_{reduced}-1}{p_{full}}[/math]
Ahem, check your degrees of freedom in the numerator and denominator, Spunky.
 

trinker

ggplot2orBust
#15

Oh yeah I had to calculate that in a linear regression class it's in Cohen Cohen Aiken and West. duh. That's a pretty common formula. Thanks for making me feel like a real ding a ling spunky :) If my regression prof were to see this he'd not be happy.
 

trinker

ggplot2orBust
#16

I prefer this rearrangement of the [TEX]\Delta F[/TEX] equation:

[TEX]\Delta F = \frac{(n-k-1)\cdot (\hat{R}^2_{Full Model}-\hat{R}^2_{Reduced Model})}{(q_{Full Model}-q_{Reduced Model})\cdot(1-\hat{R}^2_{Full Model})}[/TEX]

where n = sample size; q = number of predictors in the given model; k = number of predictors in the full model (so k = [TEX]q_{Full Model}[/TEX])
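
The rearranged formula can be tried on the figures from the first post. This is a minimal Python sketch, not a definitive calculation: the function names are hypothetical, the adjusted R2 values are first converted back to R2, and the "6 and 7 coefficients" are read as 6 and 7 predictors plus an intercept, which is an assumption about how the article counts.

```python
def r2_from_adjusted(adj_r2: float, n: int, k: int) -> float:
    """Undo the adjustment: adjR2 = 1 - (1 - R2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - adj_r2) * (n - k - 1) / (n - 1)


def f_change(r2_full: float, r2_reduced: float, n: int,
             k_full: int, k_reduced: int) -> tuple[float, int, int]:
    """F test for the R^2 increase between two nested OLS models.

    Numerator df = number of added predictors,
    denominator df = n - k_full - 1 (k counts predictors, no intercept).
    """
    df_num = k_full - k_reduced
    df_den = n - k_full - 1
    f = ((r2_full - r2_reduced) / df_num) / ((1.0 - r2_full) / df_den)
    return f, df_num, df_den


n = 140
# Assumption: 6 and 7 predictors, intercept not counted.
r2_reduced = r2_from_adjusted(0.125, n, k=6)
r2_full = r2_from_adjusted(0.168, n, k=7)
f, df1, df2 = f_change(r2_full, r2_reduced, n, k_full=7, k_reduced=6)
print(round(r2_reduced, 4), round(r2_full, 4), round(f, 2), df1, df2)
```

Under these assumptions F comes out at roughly 7.9 on (1, 132) degrees of freedom, which can be compared with the .05 critical value of about 3.9 from an F table. If the article counts the intercept among its coefficients, use k = 5 and 6 instead; the result changes only slightly.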
 

spunky

Super Moderator
#17

Dragan said:
Ahem, check your degrees of freedom in the numerator and denominator, Spunky.
i wasn't fully convinced either, Dragan, because the # of predictors as degrees of freedom just sounds plain wrong to me... nevertheless, this is an exact copy-paste from my (1970s) version of the Cohen et al. blue book, in which i know they worked some sort of transformation (they call this a "computational formula") on the original version. although it does seem wrong, i can't really change it, because i literally just translated what they had in their chapter into LaTeX and placed it here...
 

spunky

Super Moderator
#18

trinker said:
Oh yeah I had to calculate that in a linear regression class it's in Cohen Cohen Aiken and West. duh. That's a pretty common formula. Thanks for making me feel like a real ding a ling spunky :) If my regression prof were to see this he'd not be happy.
MAJOR lol @ trinker for not remembering his regression basics! :p
 

trinker

ggplot2orBust
#19

Funny...

Cohen, Cohen, Aiken and West 2003 still use the formula spunky gives, whereas Aiken and West 1991 use the formula [TEX]{df}=n-k-1[/TEX], where k is the # of predictors of the full regression model.
 

noetsi

Fortran must die
#20

Link said:
Wow. Really noetsi?!?!? Every researcher I've come across has been overly safe and frugal when it's come to data sharing.
What field do you work in? Admittedly the people I worked with had tenure so that might be an issue here.