Rare event probability

#1
Hello,
I am stuck with a probability problem regarding rare events:

Let's say I want to automatically detect cars passing by with a camera. I assume that already works pretty well, meaning that an error is rare (let's say 1 in 100).
I want to find out how long I have to drive around (or better: how many cars I have to detect) before I can make statements about my error rate. I would also like to comment on the significance of those statements. Simply put: how many samples (cars) do I need to claim a given error rate at a given significance level?

When I looked at the literature, I kept coming across hypothesis tests such as t-tests and the like. However, I think I need another approach, since I am not looking for a mean. If I do not find an error in 100 events, surely my error rate does not have to be 0.
I came across this paper (https://www.ling.upenn.edu/courses/cogs502/GoodTuring1953.pdf), which clearly states that r/N (r = events, N = total samples) doesn't make much sense if the event is very unlikely.

I hope someone can help me out with ideas on how to tackle the problem.
Thank you.
 

hlsmith

Less is more. Stay pure. Stay poor.
#4
There are two types of possible errors: false positives (slow cars ticketed) and false negatives (fast cars not ticketed). Have you thought about how this impacts your analyses?
 
#5
Thank you for the replies.
I will have a look at Fisher's exact test; it is new to me, thanks for the hint.
The accepted margin of error is not fixed. Ideally I want to learn the approach and then be able to determine, for example, "for a certain confidence level (let's say 90%) I need to detect X cars".

I have taken the two types of error into account. I would say the general problem applies to both kinds of error. However, counting "cars" rather than time (or distance) makes more sense for false negatives, right? There were X cars and I missed Y of those.
 

katxt

Active Member
#6
This may be useful. There is a statistical rule of thumb called the rule of three, which says that if you have N successes without a failure, you can be 95% sure that the error rate is less than 1 in N/3 (that is, below 3/N). So, if you observe, say, 600 cars without a miss, you can be 95% sure that the error rate is less than 1 in 200. Conversely, if you want to be 95% sure that you are missing fewer than 1 car in 500, you need 1500 successes without a failure.
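
To make the arithmetic behind the rule of three concrete, here is a minimal Python sketch. It assumes that every car is missed independently with the same probability p (a binomial model, which is an assumption and not something stated in the thread), and it only covers the case of zero observed errors. Under that model the chance of N clean detections in a row is (1 - p)^N, so the one-sided upper bound at a given confidence solves (1 - p)^N = 1 - confidence. The function names are made up for illustration.

```python
import math


def rule_of_three_bound(n_clean):
    """Approximate 95% upper bound on the error rate after n_clean error-free cars."""
    return 3.0 / n_clean


def exact_upper_bound(n_clean, confidence=0.95):
    """Exact one-sided upper bound for zero errors in n_clean independent trials.

    Solves (1 - p)**n_clean = 1 - confidence for p.
    """
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n_clean)


def required_samples(max_error_rate, confidence=0.95):
    """Smallest number of error-free cars needed to claim, at the given
    confidence, that the error rate is below max_error_rate."""
    alpha = 1.0 - confidence
    return math.ceil(math.log(alpha) / math.log(1.0 - max_error_rate))


if __name__ == "__main__":
    print(rule_of_three_bound(600))        # rule-of-thumb bound for 600 clean cars
    print(exact_upper_bound(600))          # exact bound, slightly tighter
    print(required_samples(1 / 500))       # cars needed for "< 1 in 500" at 95%
    print(required_samples(0.01, 0.90))    # cars needed for "< 1 in 100" at 90%
```

The exact bound comes out slightly below 3/N, so the rule of three errs on the cautious side. If one or more errors are observed, this zero-error shortcut no longer applies and a binomial confidence interval would be needed instead.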