
Thread: When the central limit theorem works...and when it doesn't?

  1. #1
    noetsi

    I had always thought that, given, say, 100 cases, the central limit theorem always worked. But today I read (in a book about multilevel analysis) that...

    The central limit theorem holds in practice... if the individual variances are small compared to the total variance. For example, the heights of women in the United States follow an approximate normal distribution. The central limit theorem applies here because height is affected by many small additive factors. In contrast, the distribution of heights of all adults in the United States is not so close to normality. The central limit theorem does not apply here because there is a single large factor (sex) that represents much of the total variation.
    I actually did not think the Central Limit Theorem ever applied to raw data. I thought it applied to the distribution of the statistic.
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  2. #2
    Dason

    In that example they're considering the single variable to be a sum of a lot of different factors. Like how you might consider the amount of time it takes to drive to work to be the amount of time it takes to drive from home to point A, from point A to point B, from point B to point C, and from point C to work.

    The amount of time it takes me to write this reply can be thought of as the sum of the amount of time it takes to write each word. Sometimes people like to think about their data as derived from other variables in this fashion.
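
    A minimal simulation sketch of that idea, using only the Python standard library. The uniform 0-10 minute segment times, the segment counts, and the helper names `excess_kurtosis` and `total_times` are assumptions chosen for illustration, not anything measured. Excess kurtosis is used as a rough normality check (it is 0 for a normal distribution).

```python
import random
import statistics


def excess_kurtosis(xs):
    """Sample excess kurtosis; roughly 0 for normally distributed data."""
    m = statistics.fmean(xs)
    var = statistics.fmean((x - m) ** 2 for x in xs)
    fourth = statistics.fmean((x - m) ** 4 for x in xs)
    return fourth / var ** 2 - 3.0


def total_times(n_segments, n_trips=5000, seed=1):
    """Simulate trip totals, each the sum of independent segment times."""
    rng = random.Random(seed)
    # Assumption: every segment takes a uniform 0-10 minutes, independently.
    return [sum(rng.uniform(0, 10) for _ in range(n_segments))
            for _ in range(n_trips)]


# Excess kurtosis of the total for 1, 4 and 20 segments.
kurtosis_by_segments = {k: excess_kurtosis(total_times(k)) for k in (1, 4, 20)}
for k, value in kurtosis_by_segments.items():
    print(k, round(value, 2))
```

    In theory the excess kurtosis of a sum of k independent uniform variables is -1.2/k, so the printed values should move towards 0 as more segments are added.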
    I don't have emotions and sometimes that makes me very sad.

  3. The Following User Says Thank You to Dason For This Useful Post:

    noetsi (03-20-2017)

  4. #3
    rogojel

    Hi,
    The CLT is talking about the sum of many independent random variables. This can be raw data, if it results from the additive effects of many small influences, or a statistic such as the mean.
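
    For reference, the classical (Lindeberg-Lévy) form of the theorem can be written as below. It assumes X_1, ..., X_n are independent and identically distributed with mean \mu and finite variance \sigma^2 > 0; the symbols are introduced here for illustration.

```latex
\sqrt{n}\,\frac{\bar{X}_n - \mu}{\sigma} \xrightarrow{\;d\;} \mathcal{N}(0, 1),
\qquad
\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i
```

    Other versions (Lyapunov, Lindeberg) relax the identical-distribution requirement, but roughly speaking they still need that no single term dominates the variance of the sum.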

    I bet the mean height of inhabitants of the US would follow a normal distribution, e.g. if you took random samples of 100 people and calculated the mean height of each group. The individual heights would not, because they are not the result of the small effects of many random variables: there is one variable that has a large effect, sex.
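
    A quick way to check this is a small simulation. The sketch below uses only the Python standard library; the height parameters (means of 162 cm and 176 cm, a standard deviation of 7 cm, a 50/50 split) are hypothetical round numbers, and `draw_height`, `simulate` and `excess_kurtosis` are illustrative names. It compares the excess kurtosis (0 for a normal distribution) of individual heights with that of the means of groups of 100.

```python
import random
import statistics


def excess_kurtosis(xs):
    """Sample excess kurtosis; roughly 0 for normally distributed data."""
    m = statistics.fmean(xs)
    var = statistics.fmean((x - m) ** 2 for x in xs)
    fourth = statistics.fmean((x - m) ** 4 for x in xs)
    return fourth / var ** 2 - 3.0


def draw_height(rng):
    """One adult height in cm from a two-component (sex) mixture."""
    # Hypothetical parameters, chosen only for illustration.
    mean = 162.0 if rng.random() < 0.5 else 176.0
    return rng.gauss(mean, 7.0)


def simulate(seed=1, n_people=20000, n_groups=2000, group_size=100):
    rng = random.Random(seed)
    individuals = [draw_height(rng) for _ in range(n_people)]
    group_means = [
        statistics.fmean(draw_height(rng) for _ in range(group_size))
        for _ in range(n_groups)
    ]
    return individuals, group_means


individuals, group_means = simulate()
print("individual heights:", round(excess_kurtosis(individuals), 2))
print("means of 100 people:", round(excess_kurtosis(group_means), 2))
```

    Under these assumed parameters the mixture has a theoretical excess kurtosis of -0.5 (flatter than a normal), while the group means should come out close to 0.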

    1 minute too late

    regards

  5. The Following User Says Thank You to rogojel For This Useful Post:

    noetsi (03-20-2017)

  6. #4
    noetsi

    Quote Originally Posted by Dason View Post
    In that example they're considering the single variable to be a sum of a lot of different factors. Like how you might consider the amount of time it takes to drive to work to be the amount of time it takes to drive from home to point A, from point A to point B, from point B to point C, and from point C to work.

    The amount of time it takes me to write this reply can be thought of as the sum of the amount of time it takes to write each word. Sometimes people like to think about their data as derived from other variables in this fashion.
    I usually don't think about variables like height this way, although I can see that it makes sense. But I also did not realize that one variable being influenced by many others had anything to do with the central limit theorem.

  7. #5
    noetsi

    Quote Originally Posted by rogojel View Post
    The CLT is talking about the sum of many independent random variables. This can be raw data, if it results from the additive effects of many small influences, or a statistic such as the mean.
    Given that nearly anything is the sum of, or influenced by, many factors, wouldn't this make the CLT apply generally? Yet clearly many variables have highly non-normal distributions.

  8. #6

    Quote Originally Posted by noetsi View Post
    I had always thought that, given, say, 100 cases, the central limit theorem always worked. But today I read (in a book about multilevel analysis) that...

    I actually did not think the Central Limit Theorem ever applied to raw data. I thought it applied to the distribution of the statistic.
    Since when does the central limit theorem not apply to bimodal population distributions?

    I don't understand what the authors of that book are trying to imply. If I were to take samples of size n from a bimodally distributed population, the sampling distribution of the mean would most certainly converge to a normal distribution as n approached infinity.

  9. #7

    Quote Originally Posted by noetsi View Post
    I had always thought that, given, say, 100 cases, the central limit theorem always worked.
    The central limit theorem has various forms and "works" (or not) under varying circumstances. For example, there are some cases where the CLT doesn't hold, irrespective of the sample size, as in the case of a standard Cauchy random variable. The determination of a "large enough" sample will also depend on how much the underlying distribution deviates from normality. Sometimes 20-30 observations are enough, and sometimes thousands of observations are needed.
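
    The Cauchy case can be seen in a small simulation. The standard Cauchy distribution has no defined mean or variance, and the mean of n independent standard Cauchy draws is itself standard Cauchy, so averaging does not tighten the distribution. The sketch below (standard library only; the helper names, sample sizes and replication count are illustrative assumptions) compares the interquartile range of sample means for normal and Cauchy data.

```python
import math
import random
import statistics


def iqr(xs):
    """Interquartile range, a spread measure that tolerates heavy tails."""
    q1, _, q3 = statistics.quantiles(xs, n=4)
    return q3 - q1


def draw_normal(rng):
    return rng.gauss(0.0, 1.0)


def draw_cauchy(rng):
    # Inverse-CDF sampling for the standard Cauchy distribution.
    return math.tan(math.pi * (rng.random() - 0.5))


def sample_means(draw, n, reps=2000, seed=1):
    """Means of `reps` independent samples, each of size n."""
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(reps)]


spread = {
    n: (iqr(sample_means(draw_normal, n)), iqr(sample_means(draw_cauchy, n)))
    for n in (1, 10, 100)
}
for n, (normal_iqr, cauchy_iqr) in spread.items():
    print(n, round(normal_iqr, 3), round(cauchy_iqr, 3))
```

    The normal column should shrink roughly like 1/sqrt(n), while the Cauchy column should stay near 2 (the interquartile range of a standard Cauchy) whatever the sample size.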

  10. The Following User Says Thank You to ondansetron For This Useful Post:

    CowboyBear (03-22-2017)

  11. #8
    CowboyBear

    This idea - of some variables being the result of the small additive effects of many other random variables - underlies the assumption of normally distributed measurement error in classical test theory.
    Matt aka CB | twitter.com/matthewmatix

  12. #9
    CowboyBear

    Quote Originally Posted by ondansetron View Post
    The central limit theorem has various forms and "works" (or not) under varying circumstances. For example, there are some cases where the CLT doesn't hold, irrespective of the sample size, as in the case of a standard Cauchy random variable. The determination of a "large enough" sample will also depend on how much the underlying distribution deviates from normality. Sometimes 20-30 observations are enough, and sometimes thousands of observations are needed.
    Yep. I'd add that the idea of a minimum number of observations isn't about making sure the CLT will "work". Take a case where you are calculating the mean of a set of independent random variables. What the CLT says is that, as the number of independent random variables you're averaging increases, the sampling distribution of the mean will converge towards a normal distribution. When people say you can invoke the CLT with X number of cases, what they mean is that with this number of cases you can be reasonably sure the sampling distribution of the statistic will be approximately normal, due to the CLT. It's not a case of the CLT being invalid with small sample sizes and valid with large ones.
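
    To illustrate the "gradual convergence" point, here is a small sketch using only the Python standard library. It assumes exponentially distributed data with rate 1 (a strongly right-skewed case picked for illustration) and tracks the skewness of the sampling distribution of the mean as the sample size grows; the function names and sample sizes are hypothetical.

```python
import random
import statistics


def skewness(xs):
    """Sample skewness; 0 for a symmetric distribution such as the normal."""
    m = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return statistics.fmean(((x - m) / sd) ** 3 for x in xs)


def mean_of_exponentials(n, reps=4000, seed=1):
    """Means of `reps` independent samples of n Exponential(1) draws."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]


skew_by_n = {n: skewness(mean_of_exponentials(n)) for n in (2, 10, 50)}
for n, value in skew_by_n.items():
    print(n, round(value, 2))
```

    The theoretical skewness of the mean of n exponential draws is 2/sqrt(n), so it never hits exactly 0 at any finite n; it just gets small enough that a normal approximation becomes reasonable.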

  13. #10

    Quote Originally Posted by CowboyBear View Post
    Yep. I'd add that the idea of a minimum number of observations isn't about making sure the CLT will "work". Take a case where you are calculating the mean of a set of independent random variables. What the CLT says is that, as the number of independent random variables you're averaging increases, the sampling distribution of the mean will converge towards a normal distribution. When people say you can invoke the CLT with X number of cases, what they mean is that with this number of cases you can be reasonably sure the sampling distribution of the statistic will be approximately normal, due to the CLT. It's not a case of the CLT being invalid with small sample sizes and valid with large ones.
    Definitely good to mention. That's mainly why I used "works" to imply a fast and loose, but likely more common and less correct, interpretation of it. I think people tend to miss the idea that it's not black and white, but rather has to do with the appropriate use of a normal distribution for inferences. In other words, is the sampling distribution you're working with for that fixed sample size reasonably approximated with a normal distribution? If so, we can use some more familiar approaches. If not, we lose some nicer properties and need to look elsewhere for some support. It seems to me like a lot of this gets lost on many people and they boil it down to black-and-white thinking as they've done with the rest of their stats knowledge (if some of it is lost on me, I'm currently unaware, so feel free to point it out!).

  14. #11
    CowboyBear

    Quote Originally Posted by ondansetron View Post
    It seems to me like a lot of this gets lost on many people and they boil it down to black-and-white thinking as they've done with the rest of their stats knowledge (if some of it is lost on me, I'm currently unaware, so feel free to point it out!).
    Yep - some good "psychology of data analysis" in there!
