How to consider false positive rate when there is only one true positive?

#1
I'm new to this forum and also not too well versed in statistics. We are working on an algorithm to predict the most likely causal gene from a group of genes of size n that only contains one true positive. If we rank all the genes in this group, we have X% probability of predicting the true causal gene in the top 10% of the ranked genes. In this situation where there is only one known true positive, how do we consider false positive rate? Or is there another metric we should consider using?
 

obh

Active Member
#2
What is the meaning of "have X% probability of predicting the true causal gene in the top 10% of the ranked genes"?
Does the false positive rate depend on the true positive rate? If yes, how?
 

hlsmith

Not a robit
#3
Sue, I am also not following your problem. Please be more specific and provide data. How do you know there is only one true positive?
 
#4
OK, sorry for not being clear. Let me try to use a toy example. Let's say we have a bag with 1 red ball and 200 white balls. The problem is to find the red ball without using our vision (or can't use the color to pick the red ball out). Each ball has many other characteristics (e.g. shape, size, writing on it, etc.), and there are other red and white balls outside of the bag that we can train with. After we've built an algorithm using the red and white balls and their features, we try to rank the balls in the bag based on their likelihood of being red. What do you think is the best way of showing performance?

The way we tried to demonstrate performance was to say we find the red ball in Y% of the bags within the top X% of the ranked balls, either based on cross-validation or on several new bags we haven't trained with. Each bag has only one red ball but can have a different number of white balls. For example, we applied the algorithm to 10 bags, each bag having 1 red ball and 200 white balls. When we ranked the balls in each bag and took the top 10% (20 balls), we got the red ball 4 out of 5 times. So we say we predicted the red ball correctly 80% of the time within the top 10% of the ranked balls. Is there a better way to measure or describe performance? Considering we only have one true positive in each case, is the concept of false positive rate relevant?
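
To make the calculation concrete, here is a minimal sketch in Python of the "red ball in the top 10%" measure, together with the false positive rate at that same cut-off. The function names (`top_fraction_result`, `make_bag`) and the list of red-ball ranks are made up for illustration only; they are not real results from the algorithm.

```python
import math

def top_fraction_result(ranked_labels, fraction=0.10):
    """Evaluate one bag at a top-`fraction` cut-off.

    ranked_labels: 0/1 labels ordered from most to least likely red,
                   with exactly one 1 (the red ball).
    Returns (hit, false_positive_rate).
    """
    n = len(ranked_labels)
    k = math.floor(n * fraction)          # 201 balls at 10% -> 20 balls
    selected = ranked_labels[:k]
    true_positives = sum(selected)        # 1 if the red ball was selected
    false_positives = k - true_positives  # white balls that were selected
    negatives = n - sum(ranked_labels)    # all white balls in the bag
    return true_positives == 1, false_positives / negatives

def make_bag(red_rank, n_white=200):
    """Hypothetical helper: a ranked bag with the red ball at `red_rank` (1-based)."""
    labels = [0] * (n_white + 1)
    labels[red_rank - 1] = 1
    return labels

# Made-up ranks of the red ball in 10 bags (illustrative values only).
red_ranks = [3, 15, 1, 48, 7, 20, 11, 150, 2, 9]
results = [top_fraction_result(make_bag(r)) for r in red_ranks]

hit_rate = sum(hit for hit, _ in results) / len(results)
fpr_values = sorted({fpr for _, fpr in results})

print(f"hit rate in top 10%: {hit_rate:.0%}")
print(f"false positive rates seen: {fpr_values}")
```

Under these assumptions, the false positive rate in a bag can only be 19/200 (red ball selected) or 20/200 (red ball missed), so it is almost entirely fixed by the choice of cut-off and adds little beyond the hit rate itself.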