
Thread: P-values for leave-one-out accuracy (on imbalanced label set)

  1. #1


    Dear board members,

    I am happy to have found this very nice community :-) I have a question and I hope that someone can help me.

    I want to evaluate a threshold-based classifier in the leave-one-out setting. The classifier assigns samples to two different classes, depending on whether or not a certain numerical feature of the sample exceeds a previously learned threshold.

    Our aim is to distinguish between organisms that have a certain ability (phenotype+ class) and those that do not (phenotype- class). Our labeled data set consists of 150 organisms, split into 50 phenotype+ and 100 phenotype- organisms. The set is therefore imbalanced.

    Let's say the classifier made 140 correct classifications for the 150 held-out test samples in LOO. I wanted to use a binomial test for computing a p-value for observing this result, assuming a random choice behaviour of the classifier, i.e. a success probability of 0.5.

    I used the R function binom.test(#successes, #trials, probability) as follows:

    Code:
    binom.test(140, 150, p = 0.5, alternative = "greater")

    But I am not sure whether this is correct, because of the imbalance of the label set. Does the difference in size between the two classes matter? Obviously, a classifier that prefers to predict the phenotype- class would achieve a better success rate: one that always predicts phenotype- would already be correct for 100 of the 150 samples.
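To make the concern about the null probability concrete, here is a minimal R sketch that runs the same one-sided binomial test under two reference values: the coin-flip probability 0.5 from the post, and the majority-class rate 100/150. Treating the majority-class rate as a binomial success probability is an assumption made only for comparison, not an established answer to the question, and the variable names are illustrative.

```r
# Counts taken from the post; variable names are illustrative.
n_total   <- 150   # held-out samples in leave-one-out
n_correct <- 140   # correct classifications
n_neg     <- 100   # phenotype- organisms (majority class)

# Reference 1: coin-flip classifier, success probability 0.5
p_coin <- binom.test(n_correct, n_total, p = 0.5,
                     alternative = "greater")$p.value

# Reference 2 (assumption): success probability equal to the
# majority-class rate, i.e. the accuracy of always predicting phenotype-
p_major <- binom.test(n_correct, n_total, p = n_neg / n_total,
                      alternative = "greater")$p.value

print(c(coin = p_coin, majority = p_major))
```

With these counts both p-values are very small, but the one under the majority-class rate is larger, so the choice of reference value does change the result.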

    An alternative would be to define the success probability as 50 / 150 ≈ 0.333, i.e. the fraction of phenotype+ samples in the set. However, this also seems wrong, because correctly identifying a phenotype- organism counts as a success too.
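The candidate values 0.5 and 0.333 can be placed side by side with a small R sketch. It assumes a hypothetical guesser that ignores the feature and predicts phenotype+ with a fixed probability q, and computes its expected accuracy on the 50/100 split from the post. The function name and the parameter q are illustrative, not part of the original setup.

```r
# Class counts from the post
n_pos   <- 50
n_neg   <- 100
n_total <- n_pos + n_neg

# Expected accuracy of a guesser that predicts phenotype+ with
# probability q, independently of the sample (hypothetical model)
expected_accuracy <- function(q) {
  (q * n_pos + (1 - q) * n_neg) / n_total
}

q   <- c(0, 1/3, 0.5, 1)
acc <- expected_accuracy(q)
print(data.frame(q_predict_pos = q, expected_accuracy = acc))
```

Under this assumption, 0.333 is the expected accuracy of always predicting phenotype+ (q = 1), 0.5 corresponds to a fair coin (q = 0.5), and the highest value, 100/150, is reached by always predicting phenotype- (q = 0).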

    How can I improve the approach? I hope someone can help me. If anything is unclear, please ask.

    Best regards
    Bastian
    Last edited by polynom; 02-19-2014 at 01:31 PM.
