Consider two classifiers, A and B, assigning binary labels to a large set of candidates: say a million cats and a million dogs. Assume the two classifiers make independent predictions. Classifier A returns a list of the 500,000 candidates it considers most likely to be dogs, and classifier B likewise returns its top 500,000. Define the precision of A's list (the proportion of candidates in it that are actually dogs) as P_a, and the precision of B's list as P_b. My question is: what will the precision of the intersection of the two lists be (say 40,000 candidates in common)? Intuitively, are the candidates in the intersection more likely to be dogs, or more likely to be wrongly labelled, i.e., actually cats?
(My guess is that the precision of the intersection, P_i, will satisfy (1 - P_i) = (1 - P_a) * (1 - P_b), but I am not sure if this is right.)
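One way to check the guess is a quick Monte Carlo sketch. The model below is an assumption, not something stated in the question: each classifier's list is filled with the right number of true dogs and false cats chosen uniformly at random and independently of the other classifier. Under that model, a Bayes-odds argument predicts P_i = P_a*P_b / (P_a*P_b + (1-P_a)*(1-P_b)), which the simulation can be compared against, along with the guessed formula above. The values of P_a and P_b here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_dogs = n_cats = 1_000_000   # population sizes from the question
list_size = 500_000
P_a, P_b = 0.8, 0.7           # hypothetical precisions for illustration

def pick_list(precision):
    """A list of `list_size` candidates with the given precision.

    Dogs are indexed 0..n_dogs-1, cats n_dogs..n_dogs+n_cats-1; the
    true and false members are drawn uniformly and independently.
    """
    n_true = int(precision * list_size)
    dogs = rng.choice(n_dogs, size=n_true, replace=False)
    cats = n_dogs + rng.choice(n_cats, size=list_size - n_true, replace=False)
    return set(dogs.tolist()) | set(cats.tolist())

list_a = pick_list(P_a)
list_b = pick_list(P_b)
both = list_a & list_b

# Precision of the intersection: fraction of its members that are dogs.
P_i = sum(1 for i in both if i < n_dogs) / len(both)

# Bayes-odds prediction under the independence model above.
pred = P_a * P_b / (P_a * P_b + (1 - P_a) * (1 - P_b))
# The formula guessed in the question.
guess = 1 - (1 - P_a) * (1 - P_b)

print(f"simulated P_i = {P_i:.3f}")
print(f"Bayes prediction = {pred:.3f}, guessed formula = {guess:.3f}")
```

With these numbers the simulated P_i comes out close to the Bayes-odds value (about 0.90), which is higher than either P_a or P_b but lower than the guessed formula's 0.94, suggesting the intersection is indeed more likely to be dogs, though not quite as strongly as the guess implies.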
In pattern recognition and information retrieval with binary classification, precision (also called positive predictive value) is the fraction of retrieved instances that are relevant.
http://en.wikipedia.org/wiki/Precision_and_recall