What do R, R-squared, adjusted R-squared, and the standard error of the estimate tell you?

#1
Hello

After I run a regression analysis, can someone enlighten me as to what I can learn from R, R-squared, the standard error of the estimate, and the adjusted R-squared?

P.S.: I am not a professional in stats; I have only taken one course at university, and that was four years back, so I don't have any knowledge beyond watching a couple of videos on YouTube.
 

Dragan

Super Moderator
#2
Well, I'll help you with R. That is, R is the Pearson correlation between the actual values of the dependent variable (Y) and the predicted values of Y (the Y-hats), regardless of the number of predictors (X's) in the regression model. As such, R is an index of the strength of the linear association between Y and the Y-hats. Note that R is bounded between 0 and 1.
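To make that concrete, here is a minimal sketch in Python (assuming numpy is available; the data are made up purely for illustration) showing that R is just the Pearson correlation between the observed Y and the fitted Y-hats:

Code:

import numpy as np

rng = np.random.default_rng(0)

# Made-up data: two predictors plus noise, purely illustrative
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

# Ordinary least squares fit
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat

# R is the Pearson correlation between the observed y and the fitted y-hats
R = np.corrcoef(y, y_hat)[0, 1]
print(R)  # non-negative and at most 1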
 

hlsmith

Omega Contributor
#6
R^2 is the R value squared, so it too is bounded between 0 and 1. It is typically interpreted as the proportion of the variance in Y explained by the predictor (or predictors, if there is more than one in the model).
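A minimal sketch (again assuming numpy, with made-up data) showing the two equivalent readings of R^2 for a model with an intercept: the squared correlation between Y and the Y-hats, and the proportion of variance in Y explained:

Code:

import numpy as np

rng = np.random.default_rng(1)

# Made-up data: one predictor plus noise, model fit with an intercept
x = rng.normal(size=50)
y = 3.0 + 1.5 * x + rng.normal(size=50)

# Simple OLS fit; np.polyfit returns the slope first, then the intercept
slope, intercept = np.polyfit(x, y, 1)
y_hat = slope * x + intercept

# Two equivalent routes to R^2 (they agree when the model has an intercept)
r = np.corrcoef(y, y_hat)[0, 1]
ss_res = np.sum((y - y_hat) ** 2)     # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)  # total sum of squares
print(r ** 2, 1 - ss_res / ss_tot)    # same number: proportion of variance explained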
 

bruin

New Member
#7
Based on this...

R is the Pearson correlation between the actual values of the dependent variable (Y) and the predicted values of Y (Y-hats)
...wouldn't a negative R only occur if your regression line had the wrong slope? How could a least-squares line have the wrong slope?
 

Dragan

Super Moderator
#8
...wouldn't a negative R only occur if your regression line had the wrong slope? How could a least-squares line have the wrong slope?
Technically speaking, it is conceivable to obtain a negative value of R^2. This (unusual) case can occur when one conducts a regression without an intercept term, i.e., regressing through the origin. However, the OLS estimate of the slope coefficient is still unbiased, unless you force the error terms to sum to zero.
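A quick numerical illustration of that corner case (a sketch assuming numpy; the data are deliberately chosen so that a line through the origin fits badly). When the intercept is dropped but R^2 is still computed as 1 - SS_res/SS_tot with a centered total sum of squares, the result can go negative:

Code:

import numpy as np

# Data for which a line through the origin is a poor model (illustrative values)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([10.0, 9.5, 9.0, 8.5, 8.0])  # large intercept, slight downward trend

# OLS slope for regression through the origin: b = sum(x*y) / sum(x^2)
b = np.sum(x * y) / np.sum(x ** 2)
y_hat = b * x

ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)  # centered total sum of squares
# Negative: the no-intercept fit is worse than just predicting the mean of y
print(1 - ss_res / ss_tot)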
 

spunky

Smelly poop man with doo doo pants.
#10
Well, no... it is not... rather, the correlation between Y and the Y-hats is bounded between 0 and 1, as I stated above.
I thought this could be provable, so I decided to give it a try. I haven't done one of these in a while, so please point out potential mistakes:

to show that \(0\leq Corr(\mathbf{Y},\mathbf{\widehat{Y}})\leq 1\) we could show that \(Cov(\mathbf{Y},\mathbf{\widehat{Y}})\geq 0\); the upper bound is automatic, since the Cauchy-Schwarz inequality already gives \(Corr(\mathbf{Y},\mathbf{\widehat{Y}})\leq 1\).

to do this, it is convenient to use \(\mathbf{H}\), the 'hat matrix', which we know is \(\mathbf{H}=\mathbf{X(X^{t}X)^{-1}X^{t}}\), so that:

\(\mathbf{\widehat{Y}}=\mathbf{X\widehat{\beta}}=\mathbf{X(X^{t}X)^{-1}X^{t}Y}=\mathbf{HY}\)

so since we've established \(\mathbf{\widehat{Y}}=\mathbf{HY}\)

we can do:

\(Cov(\mathbf{Y,\widehat{Y}})=Cov(\mathbf{Y,HY})=Cov(\mathbf{Y,Y})\mathbf{H^{t}}\)

(using the rule \(Cov(\mathbf{AY},\mathbf{BY})=\mathbf{A}Cov(\mathbf{Y},\mathbf{Y})\mathbf{B^{t}}\) with \(\mathbf{A}=\mathbf{I}\) and \(\mathbf{B}=\mathbf{H}\))

now, we know: \(Cov(\mathbf{Y,Y})=\sigma^{2}\mathbf{I}\)

by the usual assumption of homoskedastic, uncorrelated errors (otherwise we'd introduce heteroskedasticity or autocorrelation and violate the typical assumptions of OLS regression)

so we substitute that back in:

\(Cov(\mathbf{Y,Y})\mathbf{H^{t}}=\sigma^{2}\mathbf{I}\mathbf{H^{t}}=\sigma^{2}\mathbf{H^{t}}=\sigma^{2}\mathbf{H}\)

(the last step because \(\mathbf{H}\) is symmetric)

which is positive semidefinite, because \(\sigma^{2}\) is positive (it's a variance) and \(\mathbf{H}\) is symmetric and idempotent, so \(\mathbf{H}=\mathbf{HH^{t}}\) is positive semidefinite; in particular \(Cov(y_{i},\widehat{y}_{i})=\sigma^{2}h_{ii}\geq 0\) for every observation.

so that (I think) shows \(Cov(\mathbf{Y},\mathbf{\widehat{Y}})\geq 0\), which, once standardized and combined with the Cauchy-Schwarz bound, keeps \(Corr(\mathbf{Y},\mathbf{\widehat{Y}})\) between 0 and 1.
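for what it's worth, here is a quick numerical sanity check of the algebra above, a sketch assuming numpy, with data simulated from a model that satisfies the usual OLS assumptions:

Code:

import numpy as np

rng = np.random.default_rng(42)

# Simulate Y = X*beta + e with homoskedastic, uncorrelated errors
n, p = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = rng.normal(size=p)

# Hat matrix: H = X (X'X)^{-1} X'
H = X @ np.linalg.inv(X.T @ X) @ X.T

# H is symmetric and idempotent, so H H^t = H
print(np.allclose(H @ H.T, H))  # True

# Sample correlations between y and y-hat = Hy stay in [0, 1]
corrs = []
for _ in range(1000):
    y = X @ beta + rng.normal(size=n)
    y_hat = H @ y
    corrs.append(np.corrcoef(y, y_hat)[0, 1])
print(min(corrs) >= 0, max(corrs) <= 1)  # True True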

comments? :)