background:

i have 40 individuals from a population. for each individual i have presence/absence data for 47 potential 'peaks' from RAPD analysis (using 4 different short primers), so each individual essentially ends up with a binary code of 47 digits. i want to know how similar every pair of individuals is (e.g. individuals 1 and 2, individuals 1 and 3, individuals 1 and 4, and so on up to individuals 39 and 40), and to put a probability on whether each pair is significantly similar or significantly different.

reason for using Dice: from reading through a fair few papers, it seems that Dice's coefficient is the most suitable for data gathered in this manner, as it reduces the potential errors incurred from the RAPD runs.
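
for reference, here is a minimal sketch of the pairwise comparison described above, assuming the usual definition of Dice's coefficient, 2a / (2a + b + c), where a is the number of peaks present in both individuals and b and c are the peaks present in only one of them. the function names (`dice`, `pairwise_dice`) and the three short profiles are made up for illustration; the real data would have 40 profiles of 47 digits each, giving 780 pairs. it only computes the similarities and does not address the significance question.

```python
from itertools import combinations

def dice(x, y):
    """Dice similarity between two equal-length 0/1 sequences."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same number of peaks")
    shared = sum(1 for p, q in zip(x, y) if p and q)  # a: peaks present in both
    total = sum(x) + sum(y)                           # 2a + b + c
    # Undefined when neither individual shows any peak; return None in that case
    return 2 * shared / total if total else None

def pairwise_dice(profiles):
    """Map each pair of individual labels to its Dice similarity."""
    return {(i, j): dice(profiles[i], profiles[j])
            for i, j in combinations(sorted(profiles), 2)}

# Hypothetical profiles, shortened to 6 peaks for illustration
profiles = {
    "ind1": [1, 1, 0, 1, 0, 0],
    "ind2": [1, 0, 0, 1, 1, 0],
    "ind3": [0, 0, 1, 0, 0, 1],
}

for pair, value in pairwise_dice(profiles).items():
    print(pair, round(value, 3))
```

shared absences (0 in both individuals) do not enter the calculation at all, so two individuals are never scored as similar just because both lack a peak.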

any advice at all, or information on where else to look, would be really helpful! i've been at this for days now and am starting to worry that either (a) there is no way to calculate the probability that they are significantly similar/different, or (b) i've got too involved in this hunt and am missing something really obvious!

Many thanks