Today I was going to attempt to figure out and write some SAS code to compare nested linear models (variance explained). But I need some basics first.
In R you can do this very easily: fit the reduced model, then the saturated model, and then run anova() on both models. It kicks out an F-stat; if it is significant, the saturated model explains significantly more variance.
I know how to do it with -2LL: you subtract the two values of -2LL and of the DF, then look up the result on the chi-sq distribution. I have seen some formulas describing the process for basic multiple linear regression, I think, but I want to make sure I am doing it right. What in particular should I be comparing from a linear model output, the F-stats?
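For reference, the -2LL comparison described above can be written out as follows. This is only a sketch of the usual likelihood ratio test for nested models; the subscripts r (reduced) and s (saturated) and the symbol k for the number of estimated parameters are notation assumed here, not taken from any SAS output.

```latex
\chi^2 = (-2LL_r) - (-2LL_s), \qquad
df = k_s - k_r, \qquad
p = P\left(\chi^2_{df} \ge \chi^2\right)
```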
Stop cowardice, ban guns!
I guess this is the formula as long as the sample sizes are the same:
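The formula itself did not come through with this post. Judging from the SAS code posted further down the thread, it is presumably the usual R^2-based F-test for nested models, where s is the saturated model, r is the reduced model, DF is the model degrees of freedom, and N is the common sample size:

```latex
F = \frac{\left(R_s^2 - R_r^2\right) / \left(DF_s - DF_r\right)}
         {\left(1 - R_s^2\right) / \left(N - DF_s - 1\right)},
\qquad
F \sim F\left(DF_s - DF_r,\; N - DF_s - 1\right)
```

Both R^2 values are proportions between 0 and 1, and both models have to be fit to the same N observations.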
Alright, what am I missing here? My p-value seems way too small...
Code:
/* s = saturated; r = reduced */
data r_sqed;
  input Rs Rr DFs DFr N;
  ndf = DFs - DFr;
  ddf = N - DFs - 1;
  F = ((Rs - Rr) / ndf) / ((1 - Rs) / ddf);
  p = 1 - probf(F, ndf, ddf);
  datalines;
37.26 12.08 6 3 24
;
proc print data=r_sqed;
  var F p;
  format p pvalue12.11;
run;
Don't have SAS at the moment... what is it giving you?
But it looks like you're using 37.26 instead of .3726 and 12.08 instead of .1208.
I don't have emotions and sometimes that makes me very sad.
Attitude. It is also going slow because I need a reboot.
I think I am on the trail, was getting garbage because I had the directory open and it was just kicking out old results.
I wasn't giving attitude. Just letting you know that I (and probably others) can't run your code so would appreciate the current output.
The formula you have posted assumes that the R-squared values are between 0 and 1 (not 0 and 100), so I was pointing out that if you're having issues, that might be a cause.
Final code. In the version above I was mistakenly using the F-stats instead of the R^2 values.
Code:
data r_sqed;
  input Rs Rr DFs DFr N;
  ndf = DFs - DFr;
  ddf = N - DFs - 1;
  F = ((Rs - Rr) / ndf) / ((1 - Rs) / ddf);
  p = 1 - probf(F, ndf, ddf);
  datalines;
0.929339 0.644415 6 3 24
;
proc print data=r_sqed;
  var F p;
  format p pvalue12.11;
run;
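For anyone without SAS handy, here is a rough cross-check of the same calculation in plain Python (standard library only). It is a sketch, not a replacement for the SAS step: the function names nested_f_test, f_sf and betainc are made up for this example, the upper-tail F probability is computed through the regularized incomplete beta function (a textbook continued-fraction evaluation) instead of probf, and the range check on R^2 is an added guard against entering percentages or F-stats by mistake.

```python
import math


def _betacf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    ln_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(ln_front)
    # Use the symmetry relation where the continued fraction converges faster
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def f_sf(f, ndf, ddf):
    """Upper-tail probability P(F > f) for an F(ndf, ddf) distribution."""
    if f <= 0.0:
        return 1.0
    return betainc(ddf / 2.0, ndf / 2.0, ddf / (ddf + ndf * f))


def nested_f_test(rs, rr, dfs, dfr, n):
    """F and p for saturated (s) vs. reduced (r) model, from unadjusted R^2."""
    if not (0.0 <= rr <= rs < 1.0):
        raise ValueError("R^2 values must be proportions with 0 <= Rr <= Rs < 1")
    ndf = dfs - dfr
    ddf = n - dfs - 1
    f = ((rs - rr) / ndf) / ((1.0 - rs) / ddf)
    return f, f_sf(f, ndf, ddf)


if __name__ == "__main__":
    # Same inputs as the datalines row in the final SAS code
    f_stat, p_value = nested_f_test(0.929339, 0.644415, 6, 3, 24)
    print(f"F = {f_stat:.4f}, p = {p_value:.3e}")
```

Feeding it the first attempt's inputs (37.26 and 12.08) raises a ValueError instead of quietly returning a nonsense F, which is the problem pointed out earlier in the thread.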
Do you think it matters that I used unadjusted R^2?
I went back and attempted to use Adjrsq, but I am not sure whether I am just belaboring it. The p-value calculated from the adjusted values was about 10^4 times larger, but still highly significant. With a difference of only 3 variables in the example, it doesn't seem to be a big deal.