Minor question on sample sizes

JoynerCN
06-17-2010, 11:44 AM
Pardon the double-posting -- I'm not sure what this forum's culture is on creating new threads for new topics, but I tend to try to keep each thread to one question. If I should add on to an open thread in the future, let me know :)

As I mentioned, I'm tutoring a student in Probability/Statistics, and there's also one question on their latest assignment that I consider a little ambiguous:

Basically, given a population mean of 11 and a standard deviation of 3 to describe some set of data, the question is, "What size sample would be required to compute probabilities regarding the sample mean using the normal approximation?"

Considering they haven't yet covered confidence intervals or other things, I'm not sure what sample size might be 'required'. The bigger the better, naturally, but I don't see how there is a minimum threshold in this instance. Any suggestions?

BGM
06-17-2010, 12:07 PM
I only have a little knowledge of the CLT. Here are some of my guesses.

The convergence rate is somehow related to the third moment (skewness)
and also to the higher moments.

In a proof of the CLT, we simply Taylor-expand the characteristic function and argue
that the series converges to the standard normal characteristic function.
http://en.wikipedia.org/wiki/Central_limit_theorem#Proof

So if the remaining error terms (mainly dominated by the third-order term) are significant,
the convergence will be slower.
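
For reference, a sketch of how the Berry-Esseen theorem makes this precise, assuming i.i.d. observations X_1, ..., X_n with mean \mu, standard deviation \sigma, and a finite absolute third moment \rho (the symbols \rho and C are notation introduced here, not from the assignment). Note that \rho is the absolute third central moment, which is related to but not the same as the skewness:

```latex
\sup_{x}\left| P\!\left(\frac{\sqrt{n}\,(\bar{X}_n-\mu)}{\sigma}\le x\right)-\Phi(x)\right|
\;\le\; \frac{C\,\rho}{\sigma^{3}\sqrt{n}},
\qquad \rho = E\,|X_1-\mu|^{3}
```

Here C is an absolute constant (known to be below 0.5), so the worst-case error shrinks like 1/\sqrt{n} and is larger when \rho/\sigma^3 is larger.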

For example, for the normal approximation to the binomial distribution, when p = 0.5
the distribution is symmetric, and in this case the convergence is fastest.
If p is close to 0 or 1, the distribution is highly skewed and you will probably need
a larger n to see an acceptable bell shape.

http://en.wikipedia.org/wiki/Berry-Esseen_theorem
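
A minimal sketch of the binomial example, assuming we measure the quality of the approximation by the largest gap between the Binomial(n, p) CDF and the CDF of a normal distribution with the same mean and variance (no continuity correction). The function names are illustrative only:

```python
import math

def binom_pmf(n, k, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def max_cdf_gap(n, p):
    """Largest gap between the Binomial(n, p) CDF and the matching normal CDF.

    The binomial CDF is a step function, so at each integer k we compare the
    normal CDF with both the value just before the jump and the value after it.
    """
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    cdf = 0.0
    worst = 0.0
    for k in range(n + 1):
        before_jump = cdf              # P(X < k)
        cdf += binom_pmf(n, k, p)      # P(X <= k)
        phi = normal_cdf((k - mu) / sigma)
        worst = max(worst, abs(cdf - phi), abs(before_jump - phi))
    return worst

for p in (0.5, 0.05):
    for n in (30, 300):
        print(f"p={p}, n={n}: max CDF gap = {max_cdf_gap(n, p):.3f}")
```

At the same n, the gap for p = 0.05 comes out larger than for p = 0.5, and it shrinks as n grows, which matches the skewness argument above.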

Dason
06-17-2010, 12:13 PM
What level course is this? Is it an intro course or is it more mathematically rigorous?

JoynerCN
06-17-2010, 02:43 PM
As intro as intro can be. Does that alter what you suspect the answer is?

Dason
06-17-2010, 05:04 PM
Well, in a lot of intro stats classes that I've seen, they give the general rule of thumb of having a sample size of n >= 30 before trying to apply the CLT. This is very general and gives really no intuition for why it should be so, but in intro classes sometimes that's how they operate...

In actuality it depends on the underlying distribution, so if you don't know what that is, then the best you can do is use some arbitrary rule of thumb.
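
A rough simulation sketch of that dependence, using the mean of 11 and standard deviation of 3 from the question with n = 30. The two populations (a uniform and a shifted exponential) and the threshold of 12 are assumptions chosen only for illustration; the assignment does not say what the underlying distribution is:

```python
import math
import random

MU, SIGMA = 11.0, 3.0   # population mean and sd from the question
N = 30                  # rule-of-thumb sample size

def normal_tail(x, mean, sd):
    """P(X > x) for X ~ Normal(mean, sd)."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2.0)))

def draw_symmetric(rng):
    # Assumed population 1: uniform on [mu - sqrt(3)*sigma, mu + sqrt(3)*sigma],
    # which has mean mu and sd sigma.
    half_width = math.sqrt(3.0) * SIGMA
    return rng.uniform(MU - half_width, MU + half_width)

def draw_skewed(rng):
    # Assumed population 2: (mu - sigma) + Exponential(mean sigma),
    # which also has mean mu and sd sigma but is strongly right-skewed.
    return (MU - SIGMA) + rng.expovariate(1.0 / SIGMA)

def simulated_tail(draw, threshold, n=N, reps=20000, seed=0):
    """Fraction of simulated sample means (each from n draws) above threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample_mean = sum(draw(rng) for _ in range(n)) / n
        if sample_mean > threshold:
            hits += 1
    return hits / reps

# CLT approximation: sample mean ~ Normal(mu, sigma / sqrt(n))
approx = normal_tail(12.0, MU, SIGMA / math.sqrt(N))
print("normal approximation:", round(approx, 4))
print("uniform population:  ", simulated_tail(draw_symmetric, 12.0))
print("skewed population:   ", simulated_tail(draw_skewed, 12.0))
```

Comparing the three printed values gives a feel for how well n = 30 works for P(sample mean > 12) under each assumed population; changing N shows how the agreement varies with sample size.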

JoynerCN
06-21-2010, 10:33 PM
Yep, that's what it was -- thanks, Dason.