ROC curve analysis - Implications of unequal numbers of observations in each group


I'm using ROC curve analysis to identify thresholds in raw accelerometer (activity monitor) count data, so that counts can be classified by physical activity intensity (e.g. a count value >600 would indicate moderate activity).

Because of methodological limitations I have 18 observations for light activity and 30 observations for moderate activity. I plotted an ROC curve and derived a threshold with its accompanying sensitivity, specificity and AUC values. I suspect that having an unequal number of observations in each group has some effect on the sensitivity, specificity and AUC values (i.e. with equal numbers in each group the values would be higher or lower), but I am not sure what the effect is or why.

I'd appreciate anyone's thoughts on this matter. Thanks in advance.


Omega Contributor
I believe it may be more appropriate for you to say you have continuous data from accelerometers, not count data. Count data has a specific meaning. Please correct me if I am wrong about your data.

Also, what is your gold standard? That is not clear. You are taking your accelerometer data and saying that a value >600 is moderate and anything lower is light. How do you test whether the activity really is light or moderate? Also, what is the true prevalence of light and moderate activity in your sample based on the gold standard?

Once you answer these questions it should be easy to address your concern, which I am guessing will not be an issue.

Thanks for your reply. You are right to say that I have continuous data from accelerometers. The unit of the data is counts per minute. Apologies for the lack of clarity.

The gold standard is oxygen consumption (VO2) (measured with an indirect calorimeter). Based on the gold standard I have 18 bouts of light activity and 30 bouts of moderate activity.

Based on the threshold derived from the ROC analysis I have 26 true positives and 4 false negatives, and 15 true negatives and 3 false positives.
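
For anyone who wants to check the arithmetic, here is a minimal Python sketch using the counts above. It assumes moderate activity is the "positive" class (which matches 26 + 4 = 30 moderate bouts and 15 + 3 = 18 light bouts). The `rates` helper and the "balanced" scenario are hypothetical illustrations, not part of the original analysis: the light group is simply rescaled from 18 to 30 bouts while keeping its true negative and false positive proportions fixed.

```python
def rates(tp, fn, tn, fp):
    """Return sensitivity, specificity, PPV and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # uses only the moderate (positive) group
        "specificity": tn / (tn + fp),   # uses only the light (negative) group
        "ppv": tp / (tp + fp),           # mixes both groups
        "accuracy": (tp + tn) / (tp + fn + tn + fp),  # mixes both groups
    }

# Counts reported in the thread: 30 moderate bouts, 18 light bouts.
observed = rates(tp=26, fn=4, tn=15, fp=3)

# Hypothetical scenario (an assumption for illustration only): the light group
# is scaled up to 30 bouts with the same proportion classified correctly.
scale = 30 / 18
balanced = rates(tp=26, fn=4, tn=15 * scale, fp=3 * scale)

for name in observed:
    print(f"{name:12s} observed={observed[name]:.3f}  balanced={balanced[name]:.3f}")
```

With the reported counts, sensitivity is 26/30 (about 0.87) and specificity is 15/18 (about 0.83). Because each of these is computed within a single group, rescaling one group leaves them unchanged in this sketch, whereas PPV and overall accuracy do shift with the group proportions. The smaller group does still give a less precise estimate of specificity, since it rests on only 18 bouts.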