# Thread: Confidence intervals for proportions: approximating a discrete distribution with a co

1. ## Confidence intervals for proportions: approximating a discrete distribution with a co

I saw on this website http://onlinestatbook.com/2/estimati...ortion_ci.html

The following quote about calculating a CI for a proportion:

To correct for the fact that we are approximating a discrete distribution with a continuous distribution (the normal distribution), we subtract 0.5/N from the lower limit and add 0.5/N to the upper limit of the interval.
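Giving:

$$\hat{p} \pm \left( z \sqrt{\frac{\hat{p}(1-\hat{p})}{N}} + \frac{0.5}{N} \right)$$

Where $N$ (it appears) is the sample size, $\hat{p}$ the sample proportion, and $z$ the normal critical value.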

Another website:

http://stattrek.com/estimation/confi...px?Tutorial=AP

has the approximation simply as:

$$\hat{p} \pm z \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Where $n$ is the sample size.
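To make the difference between the two versions concrete, here is a minimal R sketch of both. The function name `prop_ci`, its arguments, and the example data are made up for illustration; neither website supplies this code.

```r
# Normal-approximation interval for a proportion, with an optional
# continuity correction of 0.5/n (hypothetical helper, not from either site).
prop_ci <- function(x, conf = 0.95, correct = FALSE) {
  n <- length(x)
  p_hat <- mean(x)                           # sample proportion of 1s
  z <- qnorm(1 - (1 - conf) / 2)             # normal critical value
  half <- z * sqrt(p_hat * (1 - p_hat) / n)  # half-width without correction
  if (correct) half <- half + 0.5 / n        # widen both sides by 0.5/n
  c(lower = p_hat - half, upper = p_hat + half)
}

x <- c(rep(1, 60), rep(0, 40))  # made-up data: 60 successes in 100 trials
prop_ci(x)                      # second website's version
prop_ci(x, correct = TRUE)      # first website's version
```

The corrected interval is always exactly 1/n wider than the uncorrected one, which is why it can step outside [0, 1] when the sample proportion sits at or near a boundary.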

The first approach can give limits larger than 1 or smaller than 0. Here the vector is all 1s, though it was possible to have gotten a zero. The first formula above gives the following CI:

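Code:
set.seed(10)
x <- sample(0:1, 100, TRUE, c(.001, .999))
# first formula: normal interval widened by 0.5/N on each side
p <- mean(x); N <- length(x)
p + c(-1, 1) * (qnorm(.975) * sqrt(p * (1 - p) / N) + .5 / N)

[0.995, 1.005]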
This seems bad (> 1). Do we indeed need to:

correct for the fact that we are approximating a discrete distribution with a continuous distribution
With...

2. ## Re: Confidence intervals for proportions: approximating a discrete distribution with

In this instance you have specifically chosen a probability that is extreme relative to the sample size of 100: p = 0.999 for a one (0.001 for a zero). The expected number of successes is 99.9, which rounds up to 100. When approximating a binomial, the Poisson distribution is better for "very small" p, and the normal works best for p around 0.5. The way I see it, by choosing such an extreme p you are creating a problem for the normal approximation that has more to do with approximating a bounded distribution (taking values in 0, 1, ..., 100) with an unbounded one (on -inf, inf) than with approximating something discrete with a continuous distribution. The continuity correction does not solve the first type of problem, and the fact that this problem exists is not news.
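A rough way to see the bounded-versus-unbounded issue is to ask how much of the approximating normal's mass falls outside 0, ..., n. The sketch below does that for n = 100, and also compares the exact binomial with a Poisson approximation applied to the count of zeros; the helper name `mass_outside` is invented for this illustration.

```r
n <- 100

# Share of the approximating normal's probability mass outside [0, n]
# (hypothetical helper for illustration)
mass_outside <- function(p) {
  mu <- n * p
  sigma <- sqrt(n * p * (1 - p))
  pnorm(0, mu, sigma) + pnorm(n, mu, sigma, lower.tail = FALSE)
}

mass_outside(0.5)    # negligible for p around 0.5
mass_outside(0.999)  # a large share lies above n for the extreme p

# Probability of seeing no zeros at all when P(zero) = 0.001
exact   <- dbinom(0, n, 0.001)  # exact binomial
poisson <- dpois(0, n * 0.001)  # Poisson with the same mean
c(exact = exact, poisson = poisson)
```

With p = 0.999 a sizeable part of the normal curve sits above 100, a region the binomial cannot reach, while the Poisson approximation to the number of zeros stays very close to the exact value.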

So to answer the question, I guess you could do a simulation study on samples where you would actually use a normal approximation in the first place, that is, where the approximation problem is the one the correction is intended to fix.
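Such a simulation could look roughly like the sketch below, which estimates how often the interval covers the true p with and without the 0.5/n correction. The function name `coverage`, the seed, the number of replications, and the choice p = 0.5, n = 100 are all assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Estimated coverage of the normal-approximation interval, with and without
# the 0.5/n continuity correction, computed on the same simulated samples.
coverage <- function(p, n, reps = 10000, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)
  p_hat <- rbinom(reps, n, p) / n            # simulated sample proportions
  half <- z * sqrt(p_hat * (1 - p_hat) / n)  # uncorrected half-widths
  covers <- function(h) mean(abs(p_hat - p) <= h)
  c(plain = covers(half), corrected = covers(half + 0.5 / n))
}

res <- coverage(p = 0.5, n = 100)
res
```

Because both versions are evaluated on the same draws, the corrected interval can never cover less often than the plain one; the question the simulation answers is whether the extra width brings coverage closer to the nominal level.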

As an alternative you could simply invent the Trinker continuity approximation: if you get a limit below 0 you round it up to 0, and if you get a limit above 1 you round it down to 1.
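In R that truncation is a one-liner. The name `clamp_ci` is made up here, and the inputs are just example limits, one pair being the interval from the original question.

```r
# Truncate interval limits to the valid range [0, 1] (hypothetical helper)
clamp_ci <- function(ci) pmin(pmax(ci, 0), 1)

clamp_ci(c(0.995, 1.005))   # upper limit pulled down to 1
clamp_ci(c(-0.005, 0.005))  # lower limit pulled up to 0
```

This only tidies up the reported limits; it does not change how well the underlying normal approximation works near 0 or 1.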

3. ## The Following User Says Thank You to JesperHP For This Useful Post:

trinker (02-16-2016)

4. ## Re: Confidence intervals for proportions: approximating a discrete distribution with

I believe I recall seeing this correction before, but I think Jesper hit the nail on the head.

5. ## Re: Confidence intervals for proportions: approximating a discrete distribution with

Not sure if this would be of interest:

http://www.r-bloggers.com/confidence...r-proportions/
