beating the rules at discrete sampling


TS Contributor
I am looking at the question of how to sample rare occurrences, like a rare defect in a manufacturing line. If I use the standard sample size formula, the sample size goes into the hundreds, which is unreasonable due to the long waiting times.

My idea is to measure the time intervals between occurrences. This being continuous, I can get a much smaller sample size to get the average waiting time between occurrences. Knowing the production rate, I can even estimate the percentage of occurrences, thus beating the odds.

Does this make sense?


Less is more. Stay pure. Stay poor.
No, I don't get it. You want to get a random sample of just rare events or a random sample of all observations with a set number of the rare events?

I also did not understand your time approach, but would like to understand it!


TS Contributor
Imagine we have a line producing cars. Once in a while a car comes off the line with a serious defect. I can measure the percentage of defective cars, and for a given precision I can get a sample size. But because the percentage is small I will need a high accuracy, and this will lead to a large sample. If I need 3000 samples and a defect happens once a day, I would need about 10 years to get the sample.
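
To put rough numbers on this, here is a minimal Python sketch. It assumes the "standard sample size formula" means the usual normal-approximation formula for a proportion, n = z^2 p (1 - p) / E^2; the defect rate of 0.1% and the margin of 0.05 percentage points are made-up values for illustration, and the function names are hypothetical. Only the 3000 samples at one per day come from the example above.

```python
import math

def proportion_sample_size(p, margin, z=1.96):
    """Units to inspect so the estimate of proportion p has half-width `margin`.

    Assumes the normal approximation; z=1.96 corresponds to about 95% confidence.
    """
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

def years_to_collect(n_samples, samples_per_day):
    """Calendar time in years needed to collect n_samples at a steady daily rate."""
    return n_samples / samples_per_day / 365.0

# Illustrative (assumed) numbers: 0.1% defective, margin of 0.05 percentage points.
n_units = proportion_sample_size(p=0.001, margin=0.0005)
print("units to inspect:", n_units)

# Numbers from the post: 3000 samples at one per day.
print("years needed:", round(years_to_collect(3000, 1), 1))
```

At one sample per day, 3000 samples take 3000 days, which is a little over 8 years, so the same order of magnitude as the figure above.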

The alternative is to measure the waiting time between defects. I do not need such a high accuracy and it is a continuous variable, so I might get a sample size of about a hundred, say, and be done in 4 months.
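
A sketch of the waiting-time calculation, in Python, might look like the following. It rests on assumptions that are not stated above: that the gaps between defects are roughly exponentially distributed (defects arriving at a constant random rate), that the line produces a known, steady number of cars per day (1000 here is invented), and the function name `estimate_from_waits` is hypothetical. The waiting times are simulated, not real data.

```python
import math
import random

def estimate_from_waits(waits_days, units_per_day):
    """Estimate mean waiting time and defect fraction from gaps between defects.

    waits_days: observed times between consecutive defects, in days.
    units_per_day: production rate (assumed known and constant).
    """
    n = len(waits_days)
    mean_wait = sum(waits_days) / n
    # Under the exponential assumption the standard deviation equals the mean,
    # so the relative standard error of the mean is about 1 / sqrt(n).
    rel_se = 1 / math.sqrt(n)
    # One defect per (mean_wait * units_per_day) units produced.
    defect_fraction = 1 / (mean_wait * units_per_day)
    return mean_wait, rel_se, defect_fraction

# Simulated data: 100 gaps with a true mean of 1 day (assumed, for illustration).
random.seed(0)
waits = [random.expovariate(1.0) for _ in range(100)]

mean_wait, rel_se, frac = estimate_from_waits(waits, units_per_day=1000)
print("mean wait (days):", round(mean_wait, 2))
print("relative standard error:", rel_se)
print("estimated defect fraction:", frac)
```

With about a hundred intervals the relative standard error under this assumption is roughly 10%, and at one defect a day collecting them takes a bit over three months.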