For the past few days, I have been trying to compute a p-value for an experiment I ran recently, and I simply cannot figure out how to do it.

The experiment is very simple:

- Given a population of 2000 individuals, each of which belongs to one of N different classes

- Given a distance metric between them

- I use a leave-one-out method to compute the prediction accuracy I can obtain using a nearest-neighbor method (i.e. given one individual, I predict its class to be the class of the closest other individual)

- This gives me a % of correct predictions.
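To make the procedure concrete, here is a minimal sketch of what I am doing, assuming the distances are stored in a square matrix `dist` where `dist[i][j]` is the distance between individuals `i` and `j`. The function name and the toy data are made up for illustration; my real data has 2000 individuals.

```python
def loo_nn_accuracy(dist, labels):
    """Leave-one-out 1-nearest-neighbor accuracy.

    dist   -- square matrix (list of lists), dist[i][j] = distance between i and j
    labels -- class label of each individual, in the same order as dist
    """
    n = len(labels)
    correct = 0
    for i in range(n):
        # Closest individual other than i itself (ties go to the lowest index).
        nearest = min((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        if labels[nearest] == labels[i]:
            correct += 1
    return correct / n


# Toy data (hypothetical): four points on a line, two classes.
points = [0.0, 1.0, 10.0, 11.0]
labels = ["a", "a", "b", "b"]
dist = [[abs(p - q) for q in points] for p in points]

accuracy = loo_nn_accuracy(dist, labels)
print(f"{100 * accuracy:.0f}% correct predictions")
```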

Now, what I can't figure out is whether I should compute a p-value for each individual prediction or one for the whole experiment.

My first guess is that:

- my "null hypothesis" is that the distance metric and the class of the individuals are not related

- my "test statistic" is the % of correct predictions

- so the p-value is the probability, under the null hypothesis, of getting a % of correct predictions at least as high as the one I observed, by pure chance.
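If this reasoning holds, I suppose one way to estimate that probability would be a permutation test: shuffle the class labels (which breaks any link between distance and class), recompute the accuracy, and count how often the shuffled accuracy is at least as high as the observed one. The sketch below is only my guess at how this would look. It assumes the labels are exchangeable under the null hypothesis, and the function names, `n_perm`, `seed`, and the toy data are all made up for illustration.

```python
import random


def nearest_neighbors(dist):
    """Index of the closest other individual for each individual."""
    n = len(dist)
    return [
        min((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        for i in range(n)
    ]


def accuracy(nearest, labels):
    """Fraction of individuals whose nearest neighbor has the same class."""
    return sum(labels[j] == labels[i] for i, j in enumerate(nearest)) / len(labels)


def permutation_p_value(dist, labels, n_perm=999, seed=0):
    """One p-value for the whole experiment, estimated by shuffling labels."""
    rng = random.Random(seed)
    # The neighbors depend only on the distances, so compute them once.
    nearest = nearest_neighbors(dist)
    observed = accuracy(nearest, labels)

    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # labels no longer related to distances
        if accuracy(nearest, shuffled) >= observed:
            hits += 1
    # Adding 1 to both counts treats the observed labelling as one of the
    # permutations, so the estimate can never be exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): six points on a line, two well-separated classes.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
labels = ["a", "a", "a", "b", "b", "b"]
dist = [[abs(p - q) for q in points] for p in points]

observed, p_value = permutation_p_value(dist, labels)
print(f"observed accuracy = {observed:.2f}, estimated p-value = {p_value:.3f}")
```

This gives a single p-value for the whole experiment rather than one per prediction, which is what my test statistic above seems to call for.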

Is this correct?

Thanks in advance for any help I can get on this!