I work in a biomedical lab, but I haven't had a statistics course in quite some time. It's important that our data have a p < 0.05. The question can be simplified by replacing biological components with 'particles', which I've done:

A tube contains 20,000,000 particles in total. Of these particles, 40,000 are red and 19,960,000 are blue. So, for every 500 particles, 1 is red. If I were to remove 40,000 particles at random, what is the probability that I will obtain 80 red particles and 39,920 blue particles (the 'true' ratio of the population in the sample tube)?

How many times would I need to repeat this measurement, so that the averaged ratios (red:blue particles) represented the true ratio, with a p < 0.05?

If possible, please show me how you calculated this, so I can repeat it with different values (for instance, if I were to take 100,000 particles per sampling).

hi,
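in theory you have a binomial distribution, where p is the probability of picking a red particle and N is the number of particles you draw (strictly the draw is without replacement, but the sample is a tiny fraction of the tube, so the binomial is a good model). Because p is really small and N is very large, you can use the Poisson approximation with mean N x p. This web page will give you the numbers: http://faculty.vassar.edu/lowry/poisson.html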
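For anyone who would rather compute this than use the web page, here is a minimal Python sketch of the Poisson approximation next to the exact binomial. The function names are made up for illustration, and the numbers assume the rare particles (called red in the replies) make up 40,000 of 20,000,000, with a draw of 40,000 particles so that the expected red count is 80.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam, computed in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial variable with n draws and success probability p."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_comb + k * math.log(p) + (n - k) * math.log1p(-p))

def poisson_range(lo, hi, lam):
    """P(lo <= X <= hi) for a Poisson variable with mean lam."""
    return sum(poisson_pmf(k, lam) for k in range(lo, hi + 1))

# Assumed example values: 1 particle in 500 is red, 40,000 particles drawn.
p_red = 40_000 / 20_000_000
n_drawn = 40_000
lam = n_drawn * p_red  # expected number of red particles in the draw

print("expected red count      :", lam)
print("P(exactly 80), Poisson  :", poisson_pmf(80, lam))
print("P(exactly 80), binomial :", binomial_pmf(80, n_drawn, p_red))
print("P(72 to 88), Poisson    :", poisson_range(72, 88, lam))
```

The chance of hitting exactly 80 is only about 4 to 5 percent, so the probability of landing in a range around the expected count is usually the more useful number. To repeat this with a different sampling size, change n_drawn.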

As for the sample size, use this page: http://www.select-statistics.co.uk/s...tor-proportion You will need to decide with what precision you need to measure the true proportion, i.e. if your original estimate is 2.5 percent then you might want to measure it with a precision of +/- 0.1%. Once you input this, it gives you the number of particles to test.
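A rough way to check the calculator by hand is the standard normal-approximation formula for estimating a proportion, n = z^2 p (1 - p) / E^2, where E is the margin of error. Whether the linked page uses exactly this formula is an assumption, and the function name below is only illustrative.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(p, margin, confidence=0.95):
    """Number of particles needed to estimate a proportion p to within
    +/- margin (absolute, as a fraction) at the given confidence level."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return math.ceil(z * z * p * (1 - p) / margin ** 2)

# The example from the reply: estimate 2.5%, precision +/- 0.1%.
n = sample_size_for_proportion(0.025, 0.001)
print("particles to test:", n)
```

For the 2.5% +/- 0.1% example this comes out a bit under 94,000 particles. Note that both p and the margin are entered as fractions here (0.025 and 0.001), not as percentages.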

Thanks Rogohel, this was very helpful. I have one question, though: on the sample size calculator, the recommended sample size decreases with decreasing sample proportion.

I'm not sure I understand this. Why are fewer measurements needed when a rare event is being measured? I assumed sample proportion would be the number of 'red particles' / 'total particles' x 100. Or in my example, 0.2%. The calculator says I would only need to make 4 measurements to accurately measure this. If I decrease it to 0.1%, I only need to make 2 measurements. This seems unlikely to me - if we look at the Poisson distribution, aren't I more likely to pipette a number of red particles that does not represent the true population proportion?

Furthermore, why doesn't this calculator take into account the fact that I will be sampling 10,000 particles at a time? Does this not affect the calculations?

hi,
I think some settings might not be right in the way you used the program. It should give you the number of particles you test, something on the order of 100 000. Then, if you know you will test batches of 10 000, you will have to translate the raw number into the number of test batches.
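To make this concrete, here is a small Python sketch using the standard normal-approximation sample size formula (assumed, not confirmed, to be what the calculator does). One setting that would reproduce the 4 and 2 reported above is a margin of error of 5 percentage points, though the thread does not say which settings were actually used. With that margin held fixed, p (1 - p) shrinks as p gets smaller, which is why the answer drops. The tighter margin of +/- 10% of the proportion itself is just an example choice.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(p, margin, confidence=0.95):
    """Particles needed to estimate proportion p to within +/- margin (absolute)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z * z * p * (1 - p) / margin ** 2)

# Hypothetical setting: margin left at 5 percentage points (0.05).
loose_02 = sample_size_for_proportion(0.002, 0.05)  # proportion 0.2%
loose_01 = sample_size_for_proportion(0.001, 0.05)  # proportion 0.1%
print("margin 5 points, p = 0.2%:", loose_02)
print("margin 5 points, p = 0.1%:", loose_01)

# Example of a margin scaled to the rare proportion: +/- 10% of 0.2%.
p = 0.002
particles = sample_size_for_proportion(p, 0.10 * p)

# Translate the raw particle count into batches of 10,000 particles.
batch_size = 10_000
batches = math.ceil(particles / batch_size)
print("particles to test:", particles)
print("batches of 10,000:", batches)
```

The result counts particles, not measurements. A margin of +/- 5 points is far too loose for a 0.2% proportion, so only a handful of particles is requested; with a margin of +/- 0.02 points the count rises to roughly 190,000 particles, or 20 batches of 10,000.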