- Thread starter: Marcmancinelli
- Tags: sample size

It depends very much on what you are trying to accomplish. For a hypothesis test (two-sample t-test), detecting a delta of 3× the std dev might require a sample size of only 5, while detecting a delta of 0.1× the std dev might require well over 100. If you only want to estimate the population mean, 30 might be sufficient, depending on how tight a confidence interval is required.
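To put rough numbers on that, here is a minimal sketch using the standard normal-approximation sample-size formula n ≈ 2(z₁₋α/₂ + z_power)² / (Δ/σ)². The helper name `approx_n_per_group` is mine, not from the thread; an exact answer would use an iterative t-based solver (e.g. statsmodels' `TTestIndPower`), which gives slightly larger n for tiny samples.

```python
# Rough per-group sample size for a two-sided two-sample t-test,
# via the normal approximation  n ~ 2 * (z_a + z_b)^2 / (delta/sigma)^2.
# This is a sketch, not an exact power calculation.
from math import ceil
from scipy.stats import norm

def approx_n_per_group(delta_over_sigma, alpha=0.05, power=0.80):
    z_a = norm.ppf(1 - alpha / 2)  # ~1.96 for alpha = 0.05
    z_b = norm.ppf(power)          # ~0.84 for 80% power
    return ceil(2 * (z_a + z_b) ** 2 / delta_over_sigma ** 2)

print(approx_n_per_group(3.0))  # huge effect: a handful of observations
print(approx_n_per_group(0.1))  # tiny effect: well over a thousand
```

For a 0.1σ effect the approximation lands around n ≈ 1570 per group, which is why "over 100" in the post is, if anything, an understatement.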

There is a general rule of thumb some people use for the normality assumption: a sample size of 30. As alluded to, this may not be related to power, just to approximate normality of the residuals. I'm unsure, but I would assume it came about via simulations.
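The "via simulations" guess is easy to illustrate. A minimal sketch (my own example, assuming a strongly skewed exponential population with skewness 2): the skewness of sample means shrinks like 2/√n, so at n = 30 the sampling distribution of the mean is already much closer to symmetric than the population.

```python
# Minimal CLT simulation: means of n = 30 draws from a skewed
# (exponential) population are far less skewed than the population.
# Theoretical skewness of the mean here is 2 / sqrt(30) ~ 0.37,
# versus 2 for the population itself.
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(42)
means = rng.exponential(scale=1.0, size=(10_000, 30)).mean(axis=1)

print(skew(means))  # should sit near 2/sqrt(30), far below 2
```

With a heavily skewed population the residual skewness at n = 30 is still visible, which is one reason the rule is only a rule of thumb.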

This page says "at values of ν [the degrees of freedom] as small as 10 or 12, the graphs of [t pdf] are nearly indistinguishable from graphs of the standard normal probability density function, and by the time ν is as large as 29 or 30, results using the t-distribution agree with results from the standard normal distribution to within a percentage point or two, and so statisticians tend to use the standard normal probability tables in place of t-tables whenever the value of ν is larger than 29 or 30."
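That quoted claim can be checked numerically. A quick sketch comparing the two densities on a grid (my own check, not from the page quoted):

```python
# Compare the t pdf (df = 30) with the standard normal pdf on a grid.
import numpy as np
from scipy.stats import norm, t

x = np.linspace(-4, 4, 801)
abs_diff = np.abs(t.pdf(x, df=30) - norm.pdf(x))

print(abs_diff.max())                # largest pointwise gap: a few thousandths
print(abs_diff.max() / norm.pdf(0))  # about 1% of the peak density
```

So "a percentage point or two" is about right for the pdfs at df = 30, at least relative to the peak density.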

So they use a benchmark of "a percentage point or two" difference in pdf?

At n=30, the maximum difference in CDF between the t and standard normal distributions is about 0.005 (as seen here).
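That 0.005 figure is easy to reproduce. A sketch (my own, assuming df = 30 for the t distribution):

```python
# Maximum gap between the t (df = 30) and standard normal CDFs,
# scanned over a fine grid.
import numpy as np
from scipy.stats import norm, t

x = np.linspace(-6, 6, 2401)
max_cdf_gap = np.abs(t.cdf(x, df=30) - norm.cdf(x)).max()
print(max_cdf_gap)  # about 0.005
```

The gap peaks around |x| ≈ 1.5, i.e. right in the region where typical test statistics land, which is why it matters at all.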

So n=30 is good. Are you wondering why not n=10?

I always thought this was determined empirically rather than theoretically.

Dason posted a joke about this topic once: a professor telling a student that at about 30 samples the data was equivalent to infinity. The problem I have always had with this rule is that it suggests you have 30 separate samples, when in fact you almost never will; you will have one sample with that many cases. As trinker noted, the true issue is commonly not normality but statistical power, although those seem like different issues to me.