A bit confused on confidence intervals

trinker

ggplot2orBust
#2
I assume two things: (1) you're comparing the t distribution to the normal distribution, and (2) you meant a 95% confidence interval (1.96 is the critical value for a two-sided 95% CI under a normal distribution). The sample size determines these values. Because the population variance is unknown, it is estimated from the sample. This causes the tails of the t distribution to be heavier than those of a normal distribution at lower sample sizes. At larger sample sizes (say 150+) the critical value at the 95% level (I think this is what you meant) gets close to the normal one (it pretty much approximates it), as seen below.

Code:
> sample_size <- 150
> df <- sample_size - 2
> df
[1] 148
> round(qt(.975, df), 3)  # two-sided 95% CI: 2.5% in each tail
[1] 1.976
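
To see how the t critical value closes in on the normal one as the sample grows, here is a small sketch. It assumes a two-sided 95% interval for a single mean, so the degrees of freedom are n - 1 (the post above uses n - 2), and the sample sizes are arbitrary picks for illustration.

```r
# Two-sided 95% critical values: t distribution vs. the normal.
# Assumption: one-sample setting, so df = n - 1.
n <- c(5, 10, 30, 150, 1000)      # illustrative sample sizes
t_crit <- qt(0.975, df = n - 1)   # 2.5% in each tail
z_crit <- qnorm(0.975)            # about 1.96, does not depend on n

comparison <- data.frame(
  n      = n,
  t_crit = round(t_crit, 3),
  z_crit = round(z_crit, 3)
)
print(comparison)
```

The t_crit column should shrink toward z_crit as n increases while always staying a little above it.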
 
#3
Thanks for your response trinker. So you are saying that the sample size determines these values. Why do most people use a 95% confidence interval versus a 99% confidence interval? Wouldn't it make sense to be 99% confident? Sorry for the dumb questions. Statistics is so confusing to me.
 

trinker

ggplot2orBust
#4
The confidence level is 1 minus alpha, where alpha is the chance of a type I error you're willing to accept; that is, rejecting the null hypothesis when it is true. Generally, when you raise the level to 99% you're less likely to make a type I error, but you increase the likelihood of a type II error; that is, failing to reject the null hypothesis when the alternative is true. As far as 95% and 99% go, these are more conventions that were used early on by Fisher and seem to have stuck. There's no hard and fast reasoning as to why we use them, other than that these were the values the statisticians of those days were able to publish in books, given the limited space. Modern computers make this a moot point, yet we hold fast to alpha = .05 or .01 though we have no real reason for doing so. This is where confidence intervals are useful over p-values: they give us more information about the variability of the model and where the true value may lie.
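
The type I versus type II trade-off can be made concrete with base R's power.t.test. This is only a sketch: the sample size, effect size, and standard deviation below are made-up numbers, not values from this thread.

```r
# Same hypothetical one-sample study analysed at two alpha levels.
# n, delta and sd are invented for illustration.
power_05 <- power.t.test(n = 30, delta = 0.5, sd = 1,
                         sig.level = 0.05, type = "one.sample")$power
power_01 <- power.t.test(n = 30, delta = 0.5, sd = 1,
                         sig.level = 0.01, type = "one.sample")$power

# The type II error rate (beta) is 1 - power.
beta <- c(alpha_05 = 1 - power_05, alpha_01 = 1 - power_01)
print(round(beta, 3))

# The 99% interval is also wider: compare the critical values for df = 29.
crit <- c(ci_95 = qt(0.975, df = 29), ci_99 = qt(0.995, df = 29))
print(round(crit, 3))
```

With everything else held fixed, the stricter alpha should show a larger beta and a larger critical value, which is the price paid for the extra confidence.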

As far as sample size determining these values...

I'm not sure which values you're referring to. The probability comes from the researcher (their preset alpha level, or willingness to make a type I error). In a normal distribution, sample size doesn't affect the critical values. In a t, F, or \(\chi^2\) distribution, the sample size (through the degrees of freedom) will affect the critical value at a given confidence level. If we can assume the population is normally distributed and its standard deviation is known, we can use the normal critical values and do not have to take the sample size into account when choosing them. This is often not the case in reality.
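
As a rough illustration of that dependence, the sketch below tabulates 5% critical values at a few degrees of freedom. The df values are arbitrary, and fixing the F numerator df at 2 is an assumption made only to keep the table small.

```r
# 5% critical values at several degrees of freedom.
# The df values are arbitrary; the F numerator df is fixed at 2 for illustration.
df <- c(5, 30, 150)
crit <- data.frame(
  df     = df,
  t      = qt(0.975, df),                # two-sided 5%
  chisq  = qchisq(0.95, df),             # upper 5%
  f_stat = qf(0.95, df1 = 2, df2 = df),  # upper 5%
  normal = qnorm(0.975)                  # two-sided 5%, same in every row
)
print(round(crit, 3))
```

Only the normal column stays constant; the other three move with the degrees of freedom.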