Greta, I think the argument is that in some cases (e.g. when the variables are on unfamiliar scales) the standardized betas are more interpretable than the raw coefficients. They let you say that a 1 standard deviation increase in x yields an n standard deviation change in y. This is the crude design I used in my linear regression course (it now throws a warning) for simple additive models:
PS: no guarantees that this is correct. It was over a year and a half ago, and I made it to match what my classmates who used SPSS would get. I think standardizing the coefficients first makes more sense, but I haven't really given this much thought since my linear regression class.
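For anyone who wants to check the arithmetic, here is a minimal sketch in Python (not the code from my course, and the data are made up). It assumes a single predictor and compares two routes to a standardized beta: rescaling the raw slope by sd(x)/sd(y), and z-scoring both variables before fitting. The helper names `slope` and `zscore` are just for illustration.

```python
from statistics import mean, stdev

def slope(x, y):
    """Ordinary least-squares slope for a single predictor."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def zscore(v):
    """Center to mean 0 and scale to sample standard deviation 1."""
    m, s = mean(v), stdev(v)
    return [(a - m) / s for a in v]

# Made-up illustration data (an assumption, not from the thread).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 4.8, 6.3, 6.9]

raw_b = slope(x, y)

# Route 1: fit on the raw scale, then rescale the coefficient.
beta_rescaled = raw_b * stdev(x) / stdev(y)

# Route 2: z-score both variables first, then fit.
beta_zfirst = slope(zscore(x), zscore(y))

print(raw_b, beta_rescaled, beta_zfirst)
```

With one predictor the two routes agree up to floating-point error, and the result is just the correlation between x and y.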
"If you torture the data long enough it will eventually confess."
-Ronald Harry Coase -
^This is true. It is also a measure of effect size, and some journals require effect size estimates. In addition, many scales in social science are arbitrary and/or have range restriction, so standardized estimates are often easier to understand.
It seems to me that using a standardized estimate implicitly assumes that the x-variables are randomly sampled, and that the standard deviation is some kind of “natural spread” of the x-variables. But of course there is no such assumption in regression; we are just conditioning on the observed x-values. I think that this can lead the investigator astray. (I wish I could dig up that critique!)
No, I would definitely report the original estimates. But if they want the rescaled values, OK, I would give them those too.
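To make the worry about the “natural spread” concrete, here is a small sketch in Python with made-up numbers. The residuals are chosen by hand so that the raw slope is exactly 2 under both designs; the only thing that changes is how widely the x-values happen to be spread. The data and the helper names `slope` and `fit` are assumptions for illustration.

```python
from statistics import mean, stdev

def slope(x, y):
    """Ordinary least-squares slope for a single predictor."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Fixed residuals, constructed to be uncorrelated with both x designs.
e = [1.0, -2.0, 1.0, -1.0, 2.0, -1.0]

# Two designs: same relationship, different spread of the observed x-values.
x_narrow = [4.0, 5.0, 6.0, 4.0, 5.0, 6.0]
x_wide = [1.0, 5.0, 9.0, 1.0, 5.0, 9.0]

def fit(x):
    """Return (raw slope, standardized beta) for y = 3 + 2x + e."""
    y = [3.0 + 2.0 * xi + ei for xi, ei in zip(x, e)]
    b = slope(x, y)
    return b, b * stdev(x) / stdev(y)

b_narrow, beta_narrow = fit(x_narrow)
b_wide, beta_wide = fit(x_wide)

print("raw slopes:        ", b_narrow, b_wide)
print("standardized betas:", beta_narrow, beta_wide)
```

The raw slope is the same in both designs, but the standardized beta is larger when the x-values are more spread out, so it reflects the choice of x-values as well as the relationship itself.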
"shouldn't as long as they're coded as factors in the dataframe"
But then the software takes the categorical factors and codes them as dummy variables (in most programs). A dummy variable seems to me to be a very simple thing.
Then why on earth should one rescale it yet again (to the standardized beta model)? Doesn’t that make it extra complicated?
I agree. One of the most frustrating things is when people report standardized estimates for things like gender and then say "the difference between males and females was...". There may be some value, however, in standardizing the outcome variable, so that the estimate gives the difference between the group coded 0 and the group coded 1 in standard deviation units of the outcome variable. Many programs will give estimates for unstandardized, std x and y, std x only, and std y only.
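A small sketch of the difference, in Python with made-up scores (the variable names and data are assumptions for illustration). With a 0/1 dummy as the only predictor, the raw slope is the difference in group means; dividing by sd(y) keeps the "difference between groups" reading, while also multiplying by sd(x) does not.

```python
from statistics import mean, stdev

def slope(x, y):
    """Ordinary least-squares slope for a single predictor."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def zscore(v):
    """Center to mean 0 and scale to sample standard deviation 1."""
    m, s = mean(v), stdev(v)
    return [(a - m) / s for a in v]

# Made-up data: a 0/1 dummy and an outcome on an arbitrary scale.
group = [0, 0, 0, 0, 1, 1, 1, 1]
score = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0, 16.0, 17.0]

# Raw slope on the dummy = mean(group 1) - mean(group 0).
raw_diff = slope(group, score)

# Standardize y only: group difference in SD units of the outcome.
std_y_diff = slope(group, zscore(score))

# Standardize x and y: change in SDs of y per one SD of the dummy,
# which no longer corresponds to moving from one group to the other.
fully_std = raw_diff * stdev(group) / stdev(score)

print(raw_diff, std_y_diff, fully_std)
```

Note that sd(y) here is the overall standard deviation of the outcome, not a pooled within-group standard deviation, so the y-standardized estimate is not the same thing as Cohen's d.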
“They're equal. I'd heard this but never tried it.”
Aha, but is this also true for the classical sum-to-zero parameterization or the strange parameterization that is used in Splus or any other parameterization that the software happens to use?
(Nice code but the dummy function didn’t work for me.)
Anyway, there seems to be room for extra confusion here because of standardization.
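On the parameterization question, here is a sketch in Python for the simplest case only: a single two-level factor, with made-up data. It compares 0/1 treatment coding against one common -1/+1 sum-to-zero convention (the sign flips if the levels are coded the other way round). It says nothing about factors with more than two levels, where the codings are not simple rescalings of each other.

```python
from statistics import mean, stdev

def slope(x, y):
    """Ordinary least-squares slope for a single predictor."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Made-up data: one two-level factor and an outcome.
treatment = [0, 0, 0, 0, 1, 1, 1, 1]                  # 0/1 dummy coding
sum_to_zero = [-1 if g == 0 else 1 for g in treatment]  # -1/+1 coding
y = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0, 16.0, 17.0]

# Raw coefficients differ: the -1/+1 slope is half the 0/1 slope.
b_treat = slope(treatment, y)
b_sum = slope(sum_to_zero, y)

# Standardized coefficients: rescale each raw slope by sd(x) / sd(y).
beta_treat = b_treat * stdev(treatment) / stdev(y)
beta_sum = b_sum * stdev(sum_to_zero) / stdev(y)

print("raw:         ", b_treat, b_sum)
print("standardized:", beta_treat, beta_sum)
```

Here the raw coefficients depend on the coding, but the standardized ones coincide, because the -1/+1 column is just 2 * (0/1 column) - 1 and z-scoring removes that shift and scale.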