# Sample size for rare events

#### NN_STAT

##### New Member
Hello

I need to calculate the sample size for a study whose primary objective is to test the efficacy of a new treatment. The failure rate of the existing treatment is only 5%...

The developer claims that the new treatment will have a failure rate of no more than 2%, perhaps even 1%. However, if I use a difference of proportions, the difference is only 3 to 4 percentage points, which yields a sample size of hundreds, if not thousands, of patients. This is not practical.
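To put rough numbers on this, here is a minimal sketch of the textbook normal-approximation sample size for comparing two independent proportions. The two-sided 5% significance level and the 80% power are assumptions (the post does not state them), and `n_per_group` is just an illustrative helper name.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate patients per arm for a two-sided test of p1 vs p2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    # Standard deviation terms under the null (pooled) and the alternative.
    null_term = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return ceil(((null_term + alt_term) / (p1 - p2)) ** 2)


n_2pct = n_per_group(0.05, 0.02)  # new failure rate 2%
n_1pct = n_per_group(0.05, 0.01)  # new failure rate 1%
print(n_2pct, n_1pct)
```

Under these assumptions both scenarios come out at several hundred patients per arm, which is in line with the "hundreds if not thousands" estimate.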

I read that there are ways of reducing the sample size by changing the question, for example by using ratios or odds ratios instead of differences. However, I am not sure I understand why.

If I use ratios, then not only do I need the sample size formula, I also need to know how to analyze the data later (the usual hypothesis test for a difference of proportions won't work).

thanks...

#### ichbin

##### New Member
I agree that the naive estimate for the required sample size is on the order of 1000, since error bars shrink like $1/\sqrt{N}$ and you want to detect a 3% difference, which gives $(1/0.03)^2 \approx 1111$.
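The back-of-the-envelope scaling can be written out as a quick sketch. It deliberately ignores every order-1 factor (the variance $p(1-p)$, the significance level, the power), so it only illustrates the $1/\delta^2$ scaling; `naive_n` is an illustrative name.

```python
def naive_n(delta):
    """Naive sample size: error bars ~ 1/sqrt(N) must shrink to about delta."""
    return (1 / delta) ** 2


for delta in (0.03, 0.04):
    print(f"difference {delta:.2f}: N ~ {naive_n(delta):.0f}")
```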

I went to the odds ratio calculator at http://www.meta-numerics.net/Samples/ContingencyCalculator.aspx and ran a few scenarios, trying to see how small I could make the sample and still get a 95% confidence interval on the log odds ratio that excluded zero, assuming the claimed failure rates of 5% and 2%. A sample size just over 600 is as small as I could go. That's less than the naive estimate, but I think the difference is just because the naive estimate misses all sorts of order 1 factors, not because measuring an odds ratio fundamentally requires a smaller sample than measuring a difference of proportions.
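A rough way to reproduce this kind of search, assuming the usual Wald (Woolf) interval for the log odds ratio and two equal-sized arms filled with the expected counts. The linked calculator's exact method is not stated in the post, so this is only a sketch, and the function names are illustrative.

```python
from math import log, sqrt
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)


def log_or_ci(n_per_arm, p_old=0.05, p_new=0.02):
    """95% Wald interval for log(OR), using expected counts in a 2x2 table."""
    a, b = n_per_arm * p_old, n_per_arm * (1 - p_old)  # old: failures, successes
    c, d = n_per_arm * p_new, n_per_arm * (1 - p_new)  # new: failures, successes
    log_or = log((a * d) / (b * c))
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or - Z95 * se, log_or + Z95 * se


def smallest_total_n():
    """Smallest total sample size whose interval for log(OR) excludes zero."""
    n = 10
    while log_or_ci(n)[0] <= 0:
        n += 1
    return 2 * n


total_n = smallest_total_n()
print(total_n)
```

With these assumptions the search stops a little above 600 patients in total, consistent with the figure quoted above.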

With a sample size of 600 (300 per arm), you are looking at about 15 failures of the old treatment and 6 failures of the new treatment. It's hard to imagine being able to confidently say anything about relative failure rates with fewer than about 20 total failures.
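As a sketch of how thin that evidence is, assume the 600 patients are split 300 per arm and that exactly the expected number of failures is observed. The pooled two-proportion z-test used here is an assumption chosen for illustration, not something specified above.

```python
from math import sqrt
from statistics import NormalDist

n_arm = 300
fail_old = round(n_arm * 0.05)  # expected failures, old treatment
fail_new = round(n_arm * 0.02)  # expected failures, new treatment

# Pooled failure proportion and standard error of the difference.
p_pool = (fail_old + fail_new) / (2 * n_arm)
se = sqrt(p_pool * (1 - p_pool) * 2 / n_arm)
z = (fail_old - fail_new) / n_arm / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(fail_old, fail_new, round(z, 2), round(p_value, 3))
```

The two-sided p-value lands only just under 0.05, so a shift of two or three failures in either arm would change the conclusion.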

#### NN_STAT

##### New Member
First of all, let me thank you for the help and the link; this calculator is pretty helpful!

I used some sample size software, and its help says to use the odds ratio (OR) when the event is rare. When I compare the proportions 0.02 and 0.05, I get a difference of 0.03, which is very small, but an OR of 2.58. I ran a sample size analysis and got this output:

A sample size of 106 achieves 80% power to detect an odds ratio (odds1/odds0) of 2.5 using a
one-sided binomial test. The target significance level is 0.05. The actual significance level
achieved by this test is 0.0398. These results assume that the population proportion under the
null hypothesis is 0.05.
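For reference, the 2.58 figure mentioned before the output can be checked with a few lines. This is a minimal sketch; `odds` is an illustrative helper name.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)


p_old, p_new = 0.05, 0.02
risk_difference = p_old - p_new
risk_ratio = p_old / p_new
odds_ratio = odds(p_old) / odds(p_new)
print(f"{risk_difference:.2f} {risk_ratio:.2f} {odds_ratio:.2f}")
```

For rare events the odds ratio (about 2.58) stays close to the ratio of the proportions (2.5), because $1 - p$ is close to 1 in both groups. The effect looks larger on the ratio scale, but the underlying data are the same.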

I am not too familiar with the binomial test. How is it related to the OR?
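One possible reading, offered as a sketch rather than a description of what the software actually does: the output looks like a one-sample exact binomial test of a single group of 106 against a fixed null proportion of 0.05, where the odds ratio is only used to turn the null proportion into the alternative one. The helper names below are illustrative.

```python
from math import comb


def upper_tail(n, p, c):
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c, n + 1))


n, p0, odds_ratio, target_alpha = 106, 0.05, 2.5, 0.05

# Alternative proportion implied by odds1 = odds_ratio * odds0.
odds1 = odds_ratio * p0 / (1 - p0)
p1 = odds1 / (1 + odds1)

# Smallest rejection threshold whose one-sided size does not exceed the target.
crit = next(c for c in range(n + 1) if upper_tail(n, p0, c) <= target_alpha)
actual_alpha = upper_tail(n, p0, crit)
power = upper_tail(n, p1, crit)
print(crit, round(p1, 3), round(actual_alpha, 4), round(power, 3))
```

Under this reading the sketch gives an actual significance level of about 0.0398 and a power of about 0.80, matching the output. Note that an odds ratio of 2.5 applied to a null of 0.05 corresponds to an alternative proportion near 0.116, not 0.02, and that this is a single-group design, which is part of why 106 is so much smaller than the two-group estimates.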