# ROC curve from logistic regression Bootstrap analysis in Stata 9.2

#### MRH

##### New Member
Hello, I am doing an analysis to predict an outcome (death) from a database. The model is supposed to be used to predict which children need immediate care. As I have only 44 deaths among 948 children, I am running a bootstrap logistic regression in Stata 9.2. I want to see how well my prediction model performs with my five predictors, but I have no idea how to get the AUC and an ROC curve from the model I fitted.

#### terzi

##### TS Contributor
After fitting a model with the logit command you should usually type lroc to get the ROC curve.

As far as I remember, it also works after a bootstrap, doesn't it?
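
For a plain (non-bootstrap) fit, the sequence would look roughly like the sketch below. The variable names y, x1 and x2 are placeholders, not names from any real dataset.

```stata
* Sketch only: y, x1 and x2 are placeholder variable names
logit y x1 x2

* Plot the ROC curve and report the area under it (AUC)
lroc

* Report the AUC without drawing the graph
lroc, nograph
```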

#### MRH

##### New Member
Sorry, it does not work; apparently the function is not allowed.

#### terzi

##### TS Contributor
Can't you get the results with the roc or roctab commands either?

Which commands are you using exactly? What's your input?
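
In case it helps, this is roughly what I mean by the roctab route: save the predicted probabilities yourself and hand them to roctab together with the outcome. It is only a sketch, and y, x1, x2 and phat are placeholder names.

```stata
* Sketch only: y, x1, x2 and phat are placeholder names
logit y x1 x2

* Save the predicted probability of a positive outcome (the default statistic)
predict phat

* Nonparametric ROC analysis: outcome first, classifier second
roctab y phat, graph summary
```

Since roctab works from the saved variable rather than from the stored estimation results, it may still run in situations where lroc refuses to.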

#### MRH

##### New Member
my code

Thanks for your help! Here is my code:

. logistic earlydea TreamentArm ElevatedPulse Leucytosis Hypotensive MeanO2Sat
> Age3group, vce(bootstrap, reps(1000))
(running logistic on estimation sample)

Bootstrap replications (1000)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
.................................................. 50
.................................................. 100
.................................................. 150
.................................................. 200
.................................................. 250
.................................................. 300
.................................................. 350
.................................................. 400
.................................................. 450
.................................................. 500
.................................................. 550
.................................................. 600
.................................................. 650
.................................................. 700
.................................................. 750
.................................................. 800
.................................................. 850
.................................................. 900
.................................................. 950
.................................................. 1000

Logistic regression Number of obs = 932
Replications = 1000
Wald chi2(6) = 44.78
Prob > chi2 = 0.0000
Log likelihood = -115.92168 Pseudo R2 = 0.2552

------------------------------------------------------------------------------
| Observed Bootstrap Normal-based
earlydea | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
TreamentArm | 3.144032 1.304644 2.76 0.006 1.394034 7.090889
ElevatedPu~e | 3.786335 1.752647 2.88 0.004 1.5283 9.380576
Leucytosis | 5.702793 2.418744 4.10 0.000 2.483503 13.09515
Hypotensive | 3.434464 1.722635 2.46 0.014 1.285047 9.179073
MeanO2Sat | 1.880585 .266983 4.45 0.000 1.423801 2.483913
Age3group | 1.324033 .5473046 0.68 0.497 .5889042 2.976823
------------------------------------------------------------------------------

. estat gof
no observations
r(2000);

. lroc
no observations
r(2000);

#### terzi

##### TS Contributor
??

Well, everything seems fine; this should work.

Have you tried reducing the number of replications? Since you have very few positive outcomes, I was thinking that at some point the resampling might draw only 0's. I really don't know how Stata handles these cases, so this could be a problem. Some literature claims that 200 or 300 replications can be enough to estimate this type of variance.
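
Something along these lines, reusing your variable names. The 200 replications and the seed value are just illustrative choices on my part; the seed only makes the run reproducible.

```stata
* Sketch only: reps(200) and seed(12345) are illustrative values
logistic earlydea TreamentArm ElevatedPulse Leucytosis Hypotensive MeanO2Sat Age3group, vce(bootstrap, reps(200) seed(12345))
```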

The other alternative I can think of is using the bootstrap: prefix. This is usually not recommended, since the vce(bootstrap) option may be a better alternative, but in your case you could try it to see whether it solves the problem.
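
A rough sketch of the prefix form follows, together with one way the prefix could be used to bootstrap the AUC itself. The program name aucboot is a hypothetical helper I made up for illustration, and the replication count and seed are arbitrary. I have not run this on your data.

```stata
* Prefix form of the same bootstrap (reps and seed are illustrative values)
bootstrap, reps(200) seed(12345): logistic earlydea TreamentArm ElevatedPulse Leucytosis Hypotensive MeanO2Sat Age3group

* Hypothetical helper: refit the model and return the AUC reported by lroc
capture program drop aucboot
program define aucboot, rclass
    logistic earlydea TreamentArm ElevatedPulse Leucytosis Hypotensive MeanO2Sat Age3group
    lroc, nograph
    return scalar auc = r(area)
end

* Bootstrap the AUC: each replication refits the model on a resample
bootstrap auc = r(auc), reps(200) seed(12345): aucboot
```

Note that the second bootstrap describes the sampling variability of the apparent AUC (model fitted and evaluated on the same resample); it is not an optimism-corrected estimate.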

Do you have missing data or something else in your dataset that may be causing trouble? Have you subset the data? I assume your dependent variable is correctly coded as 0 and 1, isn't it? Other than that, I'm blank.
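
If you want to check these things quickly, a sketch like the following might help. It also refits the model without vce(bootstrap): as the "Observed" column in your output suggests, the bootstrap only changes the standard errors, not the odds ratios, so the predicted probabilities and hence the ROC curve from the plain fit should be the same. Whether this actually gets around the r(2000) error is an assumption on my part.

```stata
* Check the coding of the outcome, including any missing values
tabulate earlydea, missing

* Compare the Obs column across variables to spot missing predictor values
summarize earlydea TreamentArm ElevatedPulse Leucytosis Hypotensive MeanO2Sat Age3group

* Refit without vce(bootstrap); the coefficients are the same as before
logistic earlydea TreamentArm ElevatedPulse Leucytosis Hypotensive MeanO2Sat Age3group

* How many observations are flagged as the estimation sample?
count if e(sample)

* Postestimation commands on the plain fit
estat gof
lroc
```

If count if e(sample) reports 0 after the bootstrap version but 932 after the plain fit, that would point to the estimation-sample flag as the reason for the "no observations" message.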