Question about ROC analysis


I'm sorry if this question is obvious to some of you. I have a test that I would like to analyse using a Receiver Operating Characteristic (ROC) curve, and I'm not sure how to proceed.

Below a certain threshold (80%) the test results are basically random (whether the test comes out true or false is pretty meaningless). Above the threshold, the results become meaningful.

I thought that what I should do is delete the results below the 80% threshold and work only with the meaningful results. If I delete everything below the threshold, I have 635 positive test results and 13 negative. This means that above the 80% threshold, where the test starts to become meaningful, the test is true 98% of the time.
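For reference, the 98% figure is simply the share of true results among everything kept above the threshold. A quick check in Python, assuming "positive" means a true (correct) result and "negative" a false one, which is how the post reads; the variable names are made up for illustration:

```python
# Counts reported in the post for results above the 80% threshold.
true_results = 635
false_results = 13

total = true_results + false_results
proportion_true = true_results / total  # share of kept results that are true

print(f"{true_results}/{total} = {proportion_true:.1%}")
```

Note that this is a single proportion at one cut-off, not a ROC analysis, which looks at every possible cut-off.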

However, I plotted the data above 80% in SPSS and I get a strange looking ROC curve. Here is what it looks like...
View attachment 3713

What I have done is use the test variable values above 80% (the test variable is a percentage) and ignore the values below 80%, which are pretty random. I have used a state variable of 1 for a true test result and 0 for a false test result.

I was hoping that someone could shed some light on this problem. Why does the curve look the way it does, and how do I use ROC with this particular test?

Many thanks for any help...

Don't delete anything. Try to understand what the ROC curve is really about: you probably have a model with a dependent variable y that takes the values 0 and 1. Once you have fitted the model and calculated predicted values, you obtain values between 0 and 1, but never exactly 0 or 1. The ROC curve should help you decide which threshold to use when choosing between 0 and 1 for a prediction. For example, you try to predict yi and get 0.7 as a result. Should you conclude that yi = 1 or yi = 0? That question is what the ROC curve is for.
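To make the threshold idea concrete, here is a minimal sketch in plain Python. The data are made up and the function name roc_points is hypothetical; this is not what SPSS does internally. It computes the false positive rate and true positive rate at every candidate cut-off, then picks one using Youden's J (TPR minus FPR), which is only one of several common criteria.

```python
def roc_points(labels, scores):
    """Return (threshold, FPR, TPR) for each distinct score used as a cut-off.

    A case is predicted 1 when its score is >= the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points


# Made-up predicted values and true states, for illustration only.
labels = [0, 0, 1, 0, 0, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.9]

points = roc_points(labels, scores)

# Youden's J = TPR - FPR; the cut-off with the largest J is one candidate.
best_threshold, best_fpr, best_tpr = max(points, key=lambda p: p[2] - p[1])

for threshold, fpr, tpr in points:
    print(f"cut-off {threshold:.1f}: FPR={fpr:.2f}, TPR={tpr:.2f}")
print("chosen cut-off:", best_threshold)
```

Plotting FPR against TPR for all of these points gives the ROC curve. With the full range of scores the curve runs from near (0, 0) to (1, 1); if part of the range is deleted, the curve is built from a truncated sample and can look odd.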


Less is more. Stay pure. Stay poor.
I agree that you should not delete any data. Besides, it would be difficult to tell people: I am giving you a test, but if you score low on it (which you would not know ahead of time), then I shouldn't have given you the test. It may help us if you provide some more information. You don't have to give away your secrets, but is this a bioassay where low levels indicate health issues and high levels also indicate issues, while the middle ground is not predictive? Knowing this may help us make suggestions. For example, persons with extremely low or extremely high body mass indexes are probably at greater risk of death, but those in the normal zone may or may not be at risk, depending on their other risk factors.
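A small sketch of why that kind of pattern matters for ROC analysis, using invented numbers (the marker values, the outcomes, and the midpoint of 50 are all assumptions for illustration). When the outcome occurs at both extremes of a marker, the raw marker can have an AUC of 0.5 even though it carries a lot of information, and the distance from the middle can separate the groups perfectly.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Tied pairs count as half. This equals the area under the ROC curve.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up marker: the outcome (1) occurs at both extremes, not in the middle.
marker = [10, 15, 20, 40, 45, 50, 55, 60, 80, 85, 90]
outcome = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1]

raw_auc = auc(outcome, marker)

# Hypothetical midpoint; in practice it would have to come from subject knowledge.
midpoint = 50
folded_auc = auc(outcome, [abs(m - midpoint) for m in marker])

print("AUC of raw marker:", raw_auc)
print("AUC of distance from midpoint:", folded_auc)
```

Whether anything like this applies to your test depends on what the test actually measures, which is why more information would help.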


Plus, you don't have confidence intervals on this curve. Perhaps low and high values are very predictive of the outcome and middle values are very predictive of not having the outcome. Without confidence intervals we don't know whether the curve (or the area under it) really excludes 0.5, i.e. chance, or whether any of this is really relevant.
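One way to get such an interval, sketched here in plain Python with made-up data, is a percentile bootstrap of the area under the curve. This is an illustrative approach, not the method SPSS uses for its reported interval; the function names, the 2000 resamples, and the seed are arbitrary choices.

```python
import random


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (resampling cases with replacement)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined when a resample contains only one class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lower, upper


# Made-up data, for illustration only.
labels = [0, 0, 1, 0, 0, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.9]

point_estimate = auc(labels, scores)
lower, upper = bootstrap_auc_ci(labels, scores)

print(f"AUC = {point_estimate:.3f}, 95% bootstrap interval = ({lower:.3f}, {upper:.3f})")
```

If the interval includes 0.5, the data do not rule out that the test is no better than chance. With only 13 results in one of the groups, as in the truncated data, an interval can be quite wide.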