Logistic Regression Probability Estimation

#1
Hi, I am a learner trying to build my first statistical model, which aims to predict the chance of winning business quotes.

The model I use is logistic regression. After going through the variable selection process, it contains 4 categorical variables plus 2 numeric variables. The tool we have here is SAS 9.3.

My latest issue concerns the estimated probabilities produced in the SAS output.
It seems I can't replicate the probability calculation correctly whenever the omitted dummy variable is involved (in this case, I suspect acg=3 is causing the problem). All the details are in the attached spreadsheet. I have highlighted the estimated probabilities that match in green and those that do not in red. I hope this is a simple problem caused by my misunderstanding. I would appreciate any suggestions that might help.

Thanks!
 

hlsmith

Less is more. Stay pure. Stay poor.
#2
I'm not following how you got the beta coefficient values for log_psi and log_ppr.

How are these calculated?
 
#3
Hi, the coefficients are the maximum likelihood estimates from the SAS output.
I'm not sure if this is what you were asking about?

Analysis of Maximum Likelihood Estimates

Parameter  Level  DF  Estimate  Std Error  Wald Chi-Square  Pr > ChiSq

Intercept          1    1.9248     0.4052          22.5605      <.0001
acg        1       1    0.1234     0.0867           2.0253      0.1547
acg        2       1    0.4959     0.0979          25.6711      <.0001
stc        1       1   -0.1282     1.1474           0.0125      0.9110
stc        2       1    0.2773     0.3460           0.6425      0.4228
stc        3       1    0.5234     0.6479           0.6525      0.4192
stc        4       1   -0.7399     0.3227           5.2560      0.0219
tgb        0       1   -0.2950     0.0717          16.9503      <.0001
tgb        1       1    0.2441     0.0643          14.4204      0.0001
tgb        2       1    0.1252     0.0587           4.5587      0.0328
tgb        3       1    0.0823     0.0595           1.9128      0.1667
ybc        1       1    0.3485     0.0921          14.3018      0.0002
ybc        2       1    0.1665     0.1705           0.9539      0.3287
ybc        3       1    0.0237     0.1017           0.0542      0.8158
ybc        4       1   -0.1061     0.0809           1.7226      0.1894
ybc        5       1   -0.0646     0.0754           0.7326      0.3920
ybc        6       1   -0.1712     0.0776           4.8610      0.0275
ybc        7       1   -0.1513     0.0900           2.8235      0.0929
log_psi            1   -0.0953     0.0194          24.0875      <.0001
log_ppr            1   -0.5856     0.0999          34.3342      <.0001
 
#4
Hi,
I have resolved this issue myself. The reason is actually quite simple.
SAS PROC LOGISTIC uses effect coding as its default parameterization for categorical (CLASS) variables. Under effect coding, the coefficient of the omitted level is not zero: it equals the sum of the estimated coefficients for all the other levels of that variable, multiplied by -1. Once I use that value for the omitted level, I get the same estimated probabilities as SAS.
More information is available here:
Effect coding: http://www.ats.ucla.edu/stat/mult_pkg/faq/general/effect.htm
Reference (dummy) coding: http://www.ats.ucla.edu/stat/mult_pkg/faq/general/dummy.htm
Plus a very brief but well-explained SAS article: http://www.nesug.org/proceedings/nesug07/sa/sa11.pdf
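
For anyone who wants to check the arithmetic outside SAS, here is a minimal Python sketch of the calculation described above. It is only an illustration under some assumptions: the coefficients are copied from the output in post #3, any level that does not appear in that output is treated as the omitted level of its variable, and the function names and the example input values are hypothetical (they are not taken from the spreadsheet).

```python
import math

# Effect-coded estimates copied from the PROC LOGISTIC output in post #3.
INTERCEPT = 1.9248
EFFECTS = {
    "acg": {1: 0.1234, 2: 0.4959},
    "stc": {1: -0.1282, 2: 0.2773, 3: 0.5234, 4: -0.7399},
    "tgb": {0: -0.2950, 1: 0.2441, 2: 0.1252, 3: 0.0823},
    "ybc": {1: 0.3485, 2: 0.1665, 3: 0.0237, 4: -0.1061,
            5: -0.0646, 6: -0.1712, 7: -0.1513},
}
SLOPES = {"log_psi": -0.0953, "log_ppr": -0.5856}


def level_coef(var, level):
    """Coefficient for one level of an effect-coded variable.

    Assumption: a level missing from the output is the omitted level,
    whose coefficient is minus the sum of the listed coefficients.
    """
    coefs = EFFECTS[var]
    if level in coefs:
        return coefs[level]
    return -sum(coefs.values())


def estimated_probability(levels, numerics):
    """Inverse-logit of the linear predictor for one observation."""
    eta = INTERCEPT
    eta += sum(level_coef(var, lvl) for var, lvl in levels.items())
    eta += sum(SLOPES[name] * x for name, x in numerics.items())
    return 1.0 / (1.0 + math.exp(-eta))


# Omitted level acg=3: -(0.1234 + 0.4959) = -0.6193
acg3 = level_coef("acg", 3)

# Hypothetical observation, for illustration only.
p = estimated_probability(
    {"acg": 3, "stc": 1, "tgb": 0, "ybc": 1},
    {"log_psi": 2.0, "log_ppr": 0.5},
)
print(round(acg3, 4), p)
```

The same rule applies to every CLASS variable in the model, so a row that uses the omitted level of stc, tgb, or ybc needs the same treatment as acg=3.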