cross-validated AUC


TS Contributor
Hello All,
I hope the thread's title makes sense to you. I need to perform internal Cross-Validation using k-fold CV (needless to say, to assess how well a model behaves in relation to 'unknown' data).

What I am after is the distribution of AUC values across the different folds. So far, I have not found a viable option. I mean, there are some packages that perform different sorts of CV, but none of them (to the best of my understanding) returns what I want.

One that I found quite easy to use is the DAAG package, whose CVbinary() function performs k-fold CV and returns the cross-validation estimate of accuracy. The latter, as far as I understand, is the average of the accuracy across the k folds (using 0.5 as the cutoff point on probabilities).

What I would like to have is something similar, but with the averaged AUCs instead of the averaged accuracy values.

Long story short: do you know of any package that does something like that, or can you help me write some code to implement what I am after from scratch?

Thank you for any guidance you will provide.



Super Moderator
Have you tried the function cv.glm() in the boot package?

It does k-fold cross-validation and may have some of the arguments that you require.

Failing that, have you tried the package "cvAUC"? (I have not used that one).


TS Contributor
Thanks for pointing out cv.glm from the boot package. I was wondering what the interpretation of the returned delta values is.
As for cvAUC, I did not manage to get it to work properly: I can't get the AUC for the various (say, 10) folds. I keep getting the AUC for just one fold :-(

I did not get your question, sorry.


Omega Contributor
You said averaged accuracy and averaged AUC, but those terms are usually interchangeable. I was confused by your statement based on that.


TS Contributor
When I used 'accuracy' I was referring to the output of the DAAG package (command: CVbinary): it returns the accuracy, which is the percentage of correctly classified cases out of the total number of cases. This can be easily calculated from a confusion matrix. In this case, the accuracy depends on the cutoff threshold on the probabilities. As far as I understand, AUC does not depend on a specific cutoff value and, indeed, in the dataset I was playing with, accuracy (50% cutoff point) was 85% while AUC was 0.917.
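
To illustrate the distinction, here is a minimal base-R sketch on made-up numbers (the vectors p and y and the helper functions accuracy() and auc() are hypothetical, not taken from DAAG or any other package). Accuracy changes when the cutoff moves, while the rank-based (Mann-Whitney) AUC never uses a cutoff at all.

```r
# Hypothetical toy data: predicted probabilities and true 0/1 labels
p <- c(0.10, 0.30, 0.45, 0.55, 0.60, 0.90)
y <- c(0,    0,    1,    0,    1,    1)

# Accuracy: share of cases classified correctly at a given cutoff
accuracy <- function(p, y, cutoff = 0.5) mean((p > cutoff) == (y == 1))

# AUC via the rank (Mann-Whitney) formula: no cutoff involved
auc <- function(p, y) {
  r  <- rank(p)        # average ranks, so ties are handled
  n1 <- sum(y == 1)    # number of positives
  n0 <- sum(y == 0)    # number of negatives
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

acc_50    <- accuracy(p, y, cutoff = 0.5)  # 4 of 6 correct
acc_40    <- accuracy(p, y, cutoff = 0.4)  # 5 of 6 correct
auc_value <- auc(p, y)                     # 8 of 9 positive/negative pairs ranked correctly
```

The same predictions give two different accuracies but a single AUC, which is why the two measures need not agree.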


New Member
Hi, I don't know if I can help you, but I am interested in the topic and I am trying to understand more about it myself.

If I understood correctly, you have the 10 groups into which you split the dataset and you would like the average AUC of the groups, which I think is then the AUC of the cross-validated model. Is that right?


TS Contributor
Yes Greta, that would be a possibility. The package cvAUC provides the option to calculate the AUC in the context of k-fold CV (i.e., getting 10 AUCs and their average), but I did not manage to get it to work.
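
For what it is worth, getting 10 AUCs and their average can also be sketched from scratch in base R, without cvAUC. The following is only a rough sketch under stated assumptions: the data are simulated, the model is a plain logistic regression fitted with glm(), and the names d, auc() and cv_auc() are hypothetical helpers invented for this example.

```r
set.seed(1)  # reproducible simulated data and fold assignment

# Hypothetical simulated data standing in for the real dataset
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- rbinom(n, 1, plogis(1.5 * d$x1 - d$x2))

# Rank-based (Mann-Whitney) AUC
auc <- function(p, y) {
  r  <- rank(p)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Fit on k-1 folds, score the held-out fold, return one AUC per fold
cv_auc <- function(formula, data, k = 10) {
  fold     <- sample(rep(seq_len(k), length.out = nrow(data)))
  response <- all.vars(formula)[1]
  sapply(seq_len(k), function(i) {
    fit <- glm(formula, data = data[fold != i, ], family = binomial)
    p   <- predict(fit, newdata = data[fold == i, ], type = "response")
    auc(p, data[[response]][fold == i])
  })
}

fold_auc <- cv_auc(y ~ x1 + x2, d, k = 10)
summary(fold_auc)  # distribution of the per-fold AUCs
mean(fold_auc)     # averaged AUC across the folds
```

One caveat: the folds here are drawn at random, so with a small or very unbalanced dataset a held-out fold could contain only one class, and its AUC would then be undefined. Stratifying the fold assignment by outcome would be one way around that.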