# Computing Confidence Intervals

#### JohnM

##### TS Contributor
Before we get into examples of the confidence interval, a short comment about interpreting what a confidence interval is: it is not to be confused with a probability interval. In other words, if we have a sample mean of 10.2, and we compute the 95% confidence interval to be (9.65, 10.75), we CANNOT say that there is a .95 probability that the true population mean is within (9.65, 10.75). The true mean is a fixed (if unknown) number, so it is either inside that particular interval or it is not.

The proper interpretation is that, if we draw many, many random samples of sample size n from a population, and compute the confidence interval around x-bar (the sample mean), then in the long run, 95% of the confidence intervals will contain the true population mean.
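To make the long-run interpretation concrete, here is a minimal simulation sketch. The population (normal, mean 10, standard deviation 3), the sample size of 50, the number of trials, and the function name `coverage_rate` are all made up for illustration; the interval itself is the large-sample formula from section 1 below.

```python
import random
import statistics

def coverage_rate(true_mean=10.0, true_sd=3.0, n=50, trials=2000, seed=1):
    """Fraction of 95% intervals (x-bar +/- z*s/sqrt(n)) that contain true_mean."""
    rng = random.Random(seed)                    # fixed seed so the run is repeatable
    z = statistics.NormalDist().inv_cdf(0.975)   # two-tailed 95% z score, about 1.96
    hits = 0
    for _ in range(trials):
        # Draw one random sample of size n from the hypothetical population.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        x_bar = statistics.mean(sample)
        s = statistics.stdev(sample)             # sample standard deviation (n-1 divisor)
        margin = z * s / n ** 0.5
        if x_bar - margin <= true_mean <= x_bar + margin:
            hits += 1
    return hits / trials

rate = coverage_rate()
print(f"share of intervals containing the true mean: {rate:.3f}")
```

The printed share should land close to 0.95. Any single interval either contains the true mean or it does not; the 95% describes the procedure, not one interval.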

========================================================

1. Confidence Interval for the Mean, Large Samples

When we use the term "large samples," we are saying that the sample size is "large enough" to use the normal distribution as a model - in other words, the distribution of sample means will very closely follow a normal distribution. Traditionally, if the sample size is >= 30, then we consider it "large enough."

The formula for a large sample confidence interval around a mean is:

x-bar +/- [ z * s/sqrt(n) ]

where:
x-bar = sample mean
s = sample standard deviation
n = sample size
z = two-tailed z score for the desired level of confidence (for example, z = 1.96 for 95% confidence)
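As a rough sketch of the formula above in Python (standard library only): the function name `mean_ci_large` and the input values s = 2.8 and n = 100 are hypothetical, picked so that the result comes out close to the (9.65, 10.75) interval quoted at the top of this post.

```python
from statistics import NormalDist

def mean_ci_large(x_bar, s, n, confidence=0.95):
    """Large-sample CI for the mean: x-bar +/- z * s / sqrt(n)."""
    # Two-tailed z score: leave (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * s / n ** 0.5
    return x_bar - margin, x_bar + margin

# Hypothetical summary statistics, for illustration only.
low, high = mean_ci_large(x_bar=10.2, s=2.8, n=100)
print(f"95% CI: ({low:.2f}, {high:.2f})")  # prints 95% CI: (9.65, 10.75)
```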

2. Confidence Interval for the Mean, Small Samples

When we use the term "small samples," we are saying that the sample size is "too small" to use the normal distribution as a model - in other words, because s is only a rough estimate of the population standard deviation when n is small, the standardized sample mean will not closely follow a normal distribution.

In this situation, provided the population itself is roughly normal, the standardized sample mean, (x-bar - mu) / (s/sqrt(n)), will follow a t distribution, which changes shape (becomes more "normal") as the sample size increases. It starts out with a lower peak and heavier tails, and as the sample size increases, the peak becomes taller and the tails thin out.

The formula for a small sample confidence interval around a mean is:

x-bar +/- [ t * s/sqrt(n) ]

where:
x-bar = sample mean
s = sample standard deviation
n = sample size
t = two-tailed t-distribution score for the desired level of confidence AND the number of degrees of freedom (n-1)
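A small sketch of the same calculation in Python. The standard library has no t distribution, so the critical value is typed in from a standard t table (2.262 for 95% confidence with 9 degrees of freedom); the function name `mean_ci_small` and the sample values are hypothetical.

```python
from statistics import NormalDist

def mean_ci_small(x_bar, s, n, t_crit):
    """Small-sample CI for the mean: x-bar +/- t * s / sqrt(n).

    t_crit must be looked up for the desired confidence and n-1 degrees of freedom.
    """
    margin = t_crit * s / n ** 0.5
    return x_bar - margin, x_bar + margin

# Table value: two-tailed 95% confidence, 10 - 1 = 9 degrees of freedom.
T_95_DF9 = 2.262

# Hypothetical summary statistics, for illustration only.
low, high = mean_ci_small(x_bar=10.2, s=2.8, n=10, t_crit=T_95_DF9)
print(f"95% CI with t: ({low:.2f}, {high:.2f})")

# For comparison, the z score for the same confidence level is smaller,
# so a z-based interval would be too narrow for a sample this small.
z = NormalDist().inv_cdf(0.975)
print(f"t = {T_95_DF9}, z = {z:.3f}")
```

If SciPy happens to be installed, `scipy.stats.t.ppf(0.975, 9)` should give the same critical value without a table lookup.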

3. Confidence Interval for the Proportion, Large Samples

As long as n*p > 5 and n*q > 5, the sampling distribution of the proportion is approximately normal, and the confidence interval is:

p +/- [ z * sp ]
where p = sample proportion
q = 1-p
sp = standard error of the proportion = sqrt(pq/n)
z = two-tailed z score for the desired level of confidence
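Here is a minimal sketch of the proportion interval, including the n*p and n*q check. The function name `proportion_ci` and the counts (120 successes out of 400) are made up for illustration.

```python
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Large-sample CI for a proportion: p +/- z * sqrt(p*q/n)."""
    p = successes / n          # sample proportion
    q = 1 - p
    # Rule of thumb from the text: the normal model needs n*p > 5 and n*q > 5.
    if n * p <= 5 or n * q <= 5:
        raise ValueError("n*p and n*q must both exceed 5 for the normal approximation")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    sp = (p * q / n) ** 0.5    # standard error of the proportion
    return p - z * sp, p + z * sp

# Hypothetical data: 120 "successes" in a sample of 400.
low, high = proportion_ci(successes=120, n=400)
print(f"95% CI for the proportion: ({low:.3f}, {high:.3f})")
```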

4. Confidence Interval for the Standard Deviation

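For a sample drawn from a normally distributed population, the quantity (n-1)*s^2 / sigma^2 follows a chi-square distribution with n-1 degrees of freedom (sigma = population standard deviation). This gives the following limits for sigma: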

lower confidence limit = sqrt[ (n-1)*s^2 / X^2(a/2) ]
upper confidence limit = sqrt[ (n-1)*s^2 / X^2(1-(a/2)) ]

where n = sample size
s = sample standard deviation
a = alpha level
X^2(a/2) = chi-square value with an area of a/2 to its right (upper tail), with n-1 degrees of freedom
X^2(1-(a/2)) = chi-square value with an area of 1-(a/2) to its right, with n-1 degrees of freedom
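A short sketch of these two limits in Python. The standard library has no chi-square distribution, so the two critical values are typed in from a standard chi-square table for 9 degrees of freedom (19.023 with an area of .025 to its right, 2.700 with an area of .975 to its right); the function name `sd_ci` and the sample values are hypothetical.

```python
from math import sqrt

def sd_ci(s, n, chi2_right_half_alpha, chi2_right_one_minus_half_alpha):
    """CI for the population standard deviation.

    chi2_right_half_alpha:           chi-square value with area a/2 to its right
    chi2_right_one_minus_half_alpha: chi-square value with area 1-(a/2) to its right
    Both are looked up with n-1 degrees of freedom.
    """
    ss = (n - 1) * s ** 2
    lower = sqrt(ss / chi2_right_half_alpha)            # larger divisor -> lower limit
    upper = sqrt(ss / chi2_right_one_minus_half_alpha)  # smaller divisor -> upper limit
    return lower, upper

# Table values for alpha = .05 and 10 - 1 = 9 degrees of freedom.
CHI2_RIGHT_025_DF9 = 19.023
CHI2_RIGHT_975_DF9 = 2.700

# Hypothetical sample: n = 10, s = 2.8.
lower, upper = sd_ci(2.8, 10, CHI2_RIGHT_025_DF9, CHI2_RIGHT_975_DF9)
print(f"95% CI for the standard deviation: ({lower:.2f}, {upper:.2f})")
```

Note that the interval is not symmetric around s, because the chi-square distribution is skewed.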