In a simple model, x is a continuous (normally distributed) variable predicting y. Since the y values are proportions ranging from 0 to 1 (0%-100%), simple linear regression may give out-of-bounds estimates for some predicted values (i.e., lower than 0 or higher than 1).
Therefore, I have decided to use beta regression with boundaries from 0 to 1 (I used the betareg() command in the betareg R package; the software is, however, not important). While it is easy to interpret the unstandardized regression parameter from a linear model (see the linear model output below: B = 0.126, indicating an increase of 12.6 percentage points in y if x rises by 1), I am not sure how to understand, transform, or use the parameters from the betareg model to get a meaningful interpretation of the coefficient (see the beta regression output below).
Output for linear regression model: lmMod = lm(formula = y ~ x)
Code:
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) -0.57936    0.10849  -5.340 9.57e-07 ***
    x            0.12591    0.01354   9.296 4.07e-14 ***
Output for beta regression model: betaMod = betareg(formula = y ~ x)
Code:
    Coefficients (mean model with logit link):
                Estimate Std. Error z value Pr(>|z|)
    (Intercept) -4.85712    0.52580  -9.238   <2e-16 ***
    x            0.56796    0.06498   8.740   <2e-16 ***

    Phi coefficients (precision model with identity link):
          Estimate Std. Error z value Pr(>|z|)
    (phi)    7.686      1.184   6.491 8.54e-11 ***
How can I interpret the parameter 0.568 in the beta regression output (together with the intercept)? Is there a way to use 0.568 to get the absolute increase in y (i.e., if x increases by 1, y increases by XX; since y is in %, the interpretation would be easy)?
Thank you! M.
Last edited by Martin Marko; 03-05-2017 at 09:34 AM.
MM
I had a similar pursuit about 6 months ago. I believe I came across a SAS tech paper from a sas user group that gave a good description. Sorry I am not at my computer right now. Though I will see if I can find it.
Stop cowardice, ban guns!
It might have been Paper: 335:2011. It looks like they take a logistic-style interpretation.
Martin Marko (03-05-2017)
Thank you a lot for helping,
A logistic interpretation means B1 is on the log-odds scale, right? So I can use exp(coefficientB1_value) to get an "odds ratio" (= 1.765), which I don't understand at all.
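One way to read the exponentiated slope, sketched in R under the assumption that the mean model uses the logit link reported in the beta regression output: exp(B1) is the factor by which the odds mu/(1 - mu) of the expected proportion are multiplied when x rises by 1. The values x = 8 and x = 9 are arbitrary illustration points, and the variable names are made up for this sketch.

```r
# Estimates copied from the beta regression output (mean model, logit link)
alpha <- -4.85712
beta  <- 0.56796

# Odds ratio for a one-unit increase in x
odds_ratio <- exp(beta)

# Expected proportions at two illustrative x values (plogis is the inverse logit)
p8 <- plogis(alpha + beta * 8)
p9 <- plogis(alpha + beta * 9)

# Odds of the expected proportion at each x
odds8 <- p8 / (1 - p8)
odds9 <- p9 / (1 - p9)

# The ratio of the two odds equals exp(beta) for any pair of x values one unit apart
print(c(odds_ratio = odds_ratio, ratio_of_odds = odds9 / odds8))
```

So the coefficient describes a constant multiplicative change in the odds, not a constant additive change in y; the additive change depends on where on the curve x sits.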
Perhaps another way to go: I am considering using the above-mentioned simple linear regression and then defining the "meaningful" range of its application (e.g., use the linear regression equation to compute the value of x that would predict y = 0, and then estimate the upper-bound meaningful value of x that would predict y = 1). Does this make any sense? I am not sure, but I really need to know how an increase in X changes the value of Y (in %).
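As a rough sketch of that idea in R, using only the two linear regression estimates posted above (the test values x = 4, 8, 13 are hypothetical and chosen just to show one point below, inside, and above the range):

```r
# Estimates copied from the linear regression output
a <- -0.57936   # intercept
b <- 0.12591    # slope

# x values at which the fitted line reaches the bounds of a proportion
x_at_0 <- (0 - a) / b
x_at_1 <- (1 - a) / b
print(c(x_at_0 = x_at_0, x_at_1 = x_at_1))

# Fitted values for x below, inside, and above that interval
x_grid <- c(4, 8, 13)
y_hat  <- a + b * x_grid
print(data.frame(x = x_grid, y_hat = y_hat))
```

Outside the interval between x_at_0 and x_at_1 the fitted line leaves [0, 1], so the constant slope can only be read as a change in y inside that interval.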
BTW, the relationship Y~X can be seen as linear:
Thank you,
MM
The logit model:
log(p/(1-p)) = alpha + beta*x
can be solved to:
p = exp(alpha + beta*x)/(1 + exp(alpha + beta*x))
or
p = 1/(1 + exp(-(alpha + beta*x)))
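For completeness, the algebra behind that inversion can be written out step by step. Here the linear predictor is written as \eta = \alpha + \beta x; the intercept \alpha is included as an assumption, because the beta regression output reports one.

```latex
\begin{align*}
\log\frac{p}{1-p} &= \eta, \qquad \eta = \alpha + \beta x \\
\frac{p}{1-p} &= e^{\eta} \\
p &= e^{\eta}\,(1-p) \\
p\,(1 + e^{\eta}) &= e^{\eta} \\
p &= \frac{e^{\eta}}{1+e^{\eta}} = \frac{1}{1+e^{-\eta}}
\end{align*}
```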
With the posted estimates, it gives these numbers:
Code:
    # the linear regression model parameter estimates
    a <- -0.57936
    b <- 0.12591
    a + b*8
    # [1] 0.42792   # seems reasonable
    a + b*9
    # [1] 0.55383

    # the beta regression model with logit link:
    alpha <- -4.85712
    beta <- 0.56796
    # log(p/(1-p)) = alpha + beta*x gives
    # p = 1/(1+exp(-(alpha + beta*x)))
    p0 <- 1/(1+exp(-(alpha + beta*8)))
    p0
    # [1] 0.4222753
    p1 <- 1/(1+exp(-(alpha + beta*9)))
    p1
    # [1] 0.5632887
    p1 - p0
    # [1] 0.1410134   # changing from x=8 to x=9

    # compare with the above linear model
    0.55383 - 0.42792
    # [1] 0.12591
    # they are two different models, so they don't give exactly the same result,
    # but the results are similar
But if your original data were 0/1 success/failure, then maybe it would be more natural to fit the usual logit model.
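To see how the change in y per unit of x varies along the curve, a small sketch in R may help. It assumes the logit-link estimates posted above; the grid 4:12 is a hypothetical range of x, and the helper names mu and slope are made up for illustration. Under the logit link the instantaneous slope is beta * mu * (1 - mu), which is largest (beta/4) where the expected proportion is 0.5.

```r
# Estimates copied from the beta regression output (mean model, logit link)
alpha <- -4.85712
beta  <- 0.56796

# Expected proportion at x (plogis is the inverse logit)
mu <- function(x) plogis(alpha + beta * x)

# Instantaneous slope d mu / d x under the logit link
slope <- function(x) beta * mu(x) * (1 - mu(x))

x_grid <- 4:12   # hypothetical range of x, for illustration only
effects <- data.frame(
  x     = x_grid,
  mu    = mu(x_grid),                   # expected proportion at x
  slope = slope(x_grid),                # slope of the curve at x
  step  = mu(x_grid + 1) - mu(x_grid)   # change in mu from x to x + 1
)
print(round(effects, 3))

# Upper bound on the slope, reached where mu = 0.5
print(beta / 4)
```

The step column is the "if x increases by 1, y increases by XX" quantity; unlike the linear model, it is not one number but depends on the starting value of x.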
Martin Marko (03-05-2017)
Many thanks for the transformation,
it was very helpful,
Best regards,
MM
Can you post a histogram of your dependent variable values? Linear regression is acceptable if the bulk of the values land near 0.5 with minimal dispersion.
Sure,
just to mention that each data point represents the difficulty parameter of a test item, which was estimated from measurements of ~200 individuals.
The point of the linear/beta regression was to model how the theoretical complexity of an item (given by its construction) relates to its empirical difficulty.
M
Last edited by Martin Marko; 03-06-2017 at 01:38 PM.
MM