I fitted a logistic regression model on a dataset of 5513 observations, and then I tested its performance with ROC curves. How can I now test the calibration of the model? Is there a command in Stata that does that?
thanks
You could start with -estat gof-
Even better, start by reading -help logit postestimation- and the corresponding manual entry
I want to test calibration with LOWESS smoothing curves. My problem is that I have to test the calibration of a model fitted on one dataset using another dataset imported afterwards. What are my independent and dependent variables?
Your dependent variable is the same, your independent variable is the predicted probability from your model.
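As a minimal sketch of that setup: the example below stands in for your two datasets with a random split of Stata's lbw example data. The split variable dev and the name phat are made up for illustration, and the covariates are just the ones that happen to be in that dataset.

```stata
webuse lbw, clear
set seed 2012
generate byte dev = runiform() < 0.5   // hypothetical split: 1 = development, 0 = validation

* Fit the model on the development part only
logit low age lwt smoke ptl ht ui if dev

* predict applies the stored coefficients, so it also works on
* observations that were not used to fit the model
predict phat if !dev

* Dependent variable = observed outcome, independent variable = predicted
* probability; the added 45-degree line marks perfect calibration
lowess low phat if !dev, addplot(function y = x, range(0 1))
```

With two separate files the idea should be the same: fit the model, load the second dataset with -use-, and run -predict- there. The estimation results stay in memory, provided the variables have the same names in both files.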
Lore88x (11-19-2012)
So my independent variable is the probability obtained with the "predict" command, using the coefficients estimated in the first model?
Thanks a lot. It works.
But which command best assesses the calibration of the model? Is there an index of calibration? Is the Brier score one of them, or is it something totally different? Thanks
Yes to your first question.
There are various ways of testing the calibration and discrimination of a model. As far as I know the most common method for checking calibration is the Hosmer-Lemeshow goodness-of-fit test, which is implemented in Stata using -estat gof- with the group() option. I am sure there are lots of other options. I don't know much about the Brier score but it's easily implemented in Stata using the -brier- command, which has a nice manual entry explaining what all the results mean.
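A short sketch of both commands, again using Stata's lbw example data rather than your own (phat is an arbitrary variable name):

```stata
webuse lbw, clear
logit low age lwt i.race smoke ptl ht ui

* Hosmer-Lemeshow test on 10 quantile groups; the table option lists
* observed and expected counts per group
estat gof, group(10) table

* Brier score and its decomposition, computed from the predicted probabilities
predict phat
brier low phat, group(10)
```

The Brier score is the mean squared difference between the predicted probability and the observed 0/1 outcome, so lower is better. It reflects both calibration and discrimination rather than calibration alone. If you run -estat gof- on data other than the estimation sample, look at its outsample option, which adjusts the degrees of freedom.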
I am quite partial to collapsing the data into 10 groups (or more if you have a big dataset) and plotting predicted vs observed probability. For example:
Code:
sysuse auto, clear
logit foreign price weight headroom mpg
predict prob
egen probgroup=cut(prob), at(0(0.1)1)
tab probgroup foreign
preserve
collapse (mean) foreign, by(probgroup)
replace probgroup=probgroup + 0.05 // since it's 0-10%, better to graph at 5% than at 0%
scatter foreign probgroup || line probgroup probgroup, legend(order(1 "Observed" 2 "Predicted"))
restore
The "problem" is that for a logistic regression the plot is not so meaningful, because the predicted probabilities range from 0 to 1 but the observed outcomes are only 0 and 1. Maybe I'll try the "gof". Anyway, thanks for now
Yes, that's why you collapse it into groups.
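Within each group the mean of the 0/1 outcome is the observed proportion, which you can compare directly with the mean predicted probability. A variation on the same idea, sketched here with Stata's lbw example data and deciles of predicted probability instead of fixed-width bins (the names phat, decile, observed and predicted are arbitrary):

```stata
webuse lbw, clear
logit low age lwt i.race smoke ptl ht ui
predict phat

* Ten groups of roughly equal size, by decile of predicted probability
xtile decile = phat, nquantiles(10)

preserve
* One row per group: observed proportion and mean predicted probability
collapse (mean) observed=low predicted=phat, by(decile)

* Points close to the 45-degree line indicate good calibration
twoway (scatter observed predicted) ///
       (function y = x, range(0 1)), ///
       legend(order(1 "Observed" 2 "Perfect calibration")) ///
       xtitle("Mean predicted probability") ytitle("Observed proportion")
restore
```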