
Thread: Normal distribution

  1. #1
    Points: 48, Level: 1
    Level completed: 96%, Points required for next Level: 2

    Posts
    1
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Normal distribution




    Hi guys,

    I'm a newbie here. My question is the following.

    Is the following set of values normally distributed?
    26, 33, 65, 28, 34, 55, 25, 44, 50, 36, 26, 37, 43, 62, 35, 38, 45, 32, 28, 34

    The values above are from the link below:
    https://www.mathsisfun.com/data/stan...tribution.html

    They go on to compute the mean and standard deviation and the corresponding z scores assuming they are normally distributed.

    However, when I plotted the values on a histogram using Excel, I got the following chart (attached image), which shows positive skewness, and we know that a normally distributed set of observations has no skewness at all, i.e., it's perfectly symmetrical.





    Do we need to transform the dataset into normally distributed values before calculating the mean, standard deviation, and z scores? In real-world situations, datasets may not be normally distributed, so how do we go ahead and perform statistical tests on them?
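
    To put a number on the skew seen in the histogram, here is a minimal sketch that computes the mean, standard deviation, and a moment-based skewness for the 20 values. The helper name `describe` is hypothetical, and dividing by n (the population formula) is an assumption about how the linked page computed its SD.

```python
import math

values = [26, 33, 65, 28, 34, 55, 25, 44, 50, 36,
          26, 37, 43, 62, 35, 38, 45, 32, 28, 34]

def describe(xs):
    """Return (mean, population SD, moment-based skewness) of xs."""
    n = len(xs)
    mean = sum(xs) / n
    # Population variance: divide by n, not n - 1 (an assumption, see above).
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    # Third standardized moment; 0 for a perfectly symmetric sample.
    skew = sum((x - mean) ** 3 for x in xs) / n / sd ** 3
    return mean, sd, skew

mean, sd, skew = describe(values)
print(f"mean={mean:.1f}  sd={sd:.1f}  skewness={skew:.2f}")
```

    With these values the mean comes out to 38.8 and the SD to 11.4, and the skewness is positive (roughly 0.9), which agrees with the right tail in the histogram. A sample of 20 drawn from a truly normal population will rarely have a skewness of exactly zero, so a nonzero value alone does not settle the question.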
    [Attached image: Excel histogram of the values above, not shown]

  2. #2
    Omega Contributor
    Points: 38,303, Level: 100
    Level completed: 0%, Points required for next Level: 0
    hlsmith's Avatar
    Location
    Not Ames, IA
    Posts
    6,993
    Thanks
    397
    Thanked 1,185 Times in 1,146 Posts

    Re: Normal distribution

    Way to question the establishment!


    Well, I know what you mean: the histogram is somewhat skewed. I ran normality tests on the data, and these data pass three of the standard tests, and the fourth was very close to being passed. Ideally you want a pretty symmetric shape, but that is never really the case. You could have some fun and try to transform these data if you wanted. The issue, as you mentioned, can come from placing confidence intervals or interpreting dispersion measures, where you say 68% of the data fall within +/- 1 standard deviation, 95% within +/- 2 SD, etc. These data are probably not egregious. You can also look at the QQ plot, which also shows departures from normality.
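
    One quick way to see how far these data stray from the 68%/95% rule is to count the observations inside each band. This is only a sketch: the function name `share_within` is made up for illustration, and the population SD is assumed.

```python
import statistics

values = [26, 33, 65, 28, 34, 55, 25, 44, 50, 36,
          26, 37, 43, 62, 35, 38, 45, 32, 28, 34]

mean = statistics.mean(values)
sd = statistics.pstdev(values)  # population SD (divide by n), an assumption

def share_within(xs, center, spread, k):
    """Fraction of xs lying within k * spread of center."""
    return sum(abs(x - center) <= k * spread for x in xs) / len(xs)

# Compare observed coverage with what a normal distribution would give.
for k, normal_share in [(1, 0.68), (2, 0.95)]:
    observed = share_within(values, mean, sd, k)
    print(f"within {k} SD: observed {observed:.0%}, normal about {normal_share:.0%}")
```

    For these 20 values, 14 (70%) fall within one SD of the mean and 18 (90%) fall within two, so the rule of thumb is close but not exact here.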

  3. #3
    Points: 3,506, Level: 37
    Level completed: 4%, Points required for next Level: 144

    Posts
    129
    Thanks
    3
    Thanked 26 Times in 26 Posts

    Re: Normal distribution

    I agree these data are not terribly skewed (although they have a kind of odd shape, which is uniform for much of the range).

    But if they converted ALL the values to z scores, were they actually interested in the percentiles and such (68% etc.)? Or was the purpose to standardize the distribution to a mean of 0 and an SD of 1? The standardization will always work if you convert a set of data to z scores. It does NOT make them any more normally distributed. But it sure will give them a mean of 0 and an SD of 1, which can be useful for comparison to other datasets that may be in different units, say, even if neither is terribly normal.
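
    The point that z scores change location and scale but not shape can be checked directly. The sketch below is illustrative only; `moments` is a hypothetical helper, and the population SD is assumed.

```python
import math

values = [26, 33, 65, 28, 34, 55, 25, 44, 50, 36,
          26, 37, 43, 62, 35, 38, 45, 32, 28, 34]

def moments(xs):
    """Return (mean, population SD, skewness) of xs."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    skew = sum(((x - mean) / sd) ** 3 for x in xs) / n
    return mean, sd, skew

mean, sd, skew = moments(values)

# Standardize: subtract the mean, divide by the SD.
z_scores = [(x - mean) / sd for x in values]
z_mean, z_sd, z_skew = moments(z_scores)

print(f"raw:      mean={mean:.2f}  sd={sd:.2f}  skewness={skew:.3f}")
print(f"z scores: mean={z_mean:.2f}  sd={z_sd:.2f}  skewness={z_skew:.3f}")
```

    The z scores have mean 0 and SD 1 (up to floating-point rounding), while the skewness is the same as for the raw values, because standardizing is a linear transformation.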

    OK I just clicked the link (should have done that first) and they actually ARE using data like these (but a different example) for individual decisions, like failing a student who gets below 1 SD from the mean. I don't use that rule and would be uncomfortable with it unless the data was fairly normal. The article doesn't say why the first dataset is being standardized -- they just do it.
    Last edited by EdGr; 03-29-2016 at 03:03 PM. Reason: Further thoughts

  4. #4
    Points: 3,927, Level: 39
    Level completed: 85%, Points required for next Level: 23

    Posts
    85
    Thanks
    1
    Thanked 1 Time in 1 Post

    Re: Normal distribution

    Quote Originally Posted by hlsmith View Post
    Way to question the establishment!


    Well, I know what you mean: the histogram is somewhat skewed. I ran normality tests on the data, and these data pass three of the standard tests, and the fourth was very close to being passed. Ideally you want a pretty symmetric shape, but that is never really the case. You could have some fun and try to transform these data if you wanted. The issue, as you mentioned, can come from placing confidence intervals or interpreting dispersion measures, where you say 68% of the data fall within +/- 1 standard deviation, 95% within +/- 2 SD, etc. These data are probably not egregious. You can also look at the QQ plot, which also shows departures from normality.
    For my information, can you tell me which tests you used to determine normality?

  5. #5
    Fortran must die
    Points: 58,790, Level: 100
    Level completed: 0%, Points required for next Level: 0
    noetsi's Avatar
    Posts
    6,532
    Thanks
    692
    Thanked 915 Times in 874 Posts

    Re: Normal distribution

    There are formal tests for skew. Plus, it would probably help to run a QQ plot.

    None of the formal tests for normality are very good. They all have power issues. A QQ plot is the best tool I have found to assess this.
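
    For anyone who wants to build the QQ plot by hand, here is a rough sketch of the coordinates using only the Python standard library. The function name `qq_points` is hypothetical, and the plotting positions (i - 0.5) / n are one common convention among several, chosen here as an assumption.

```python
from statistics import NormalDist

values = [26, 33, 65, 28, 34, 55, 25, 44, 50, 36,
          26, 37, 43, 62, 35, 38, 45, 32, 28, 34]

def qq_points(xs):
    """Pair each sorted observation with a standard normal quantile."""
    n = len(xs)
    std_normal = NormalDist()  # mean 0, SD 1
    # Plotting positions (i - 0.5) / n keep the probabilities strictly in (0, 1).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sorted(xs)))

points = qq_points(values)
for quantile, observed in points:
    print(f"{quantile:6.2f}  {observed}")
```

    Plotting the second column against the first gives the QQ plot. Points that fall close to a straight line suggest approximate normality, while a curve that bends upward at the high end is what a right skew typically looks like.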
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  6. #6
    Points: 3,006, Level: 33
    Level completed: 71%, Points required for next Level: 44

    Posts
    177
    Thanks
    1
    Thanked 29 Times in 29 Posts

    Re: Normal distribution

    Quote Originally Posted by noetsi View Post
    There are formal tests for skew. Plus, it would probably help to run a QQ plot.

    None of the formal tests for normality are very good. They all have power issues. A QQ plot is the best tool I have found to assess this.
    Ah yes, the fun of the theoretical Gaussian distribution...

  7. #7
    Fortran must die
    Points: 58,790, Level: 100
    Level completed: 0%, Points required for next Level: 0
    noetsi's Avatar
    Posts
    6,532
    Thanks
    692
    Thanked 915 Times in 874 Posts

    Re: Normal distribution

    The fun of coming up with a test you get credit for that does not in fact work, but gets used all the time in both statistics and the workplace. Over the years, as a data analyst rather than a statistician, I have grown increasingly concerned about just how often well-known tests have serious flaws, and these flaws are commonly known only to the statistical community (which does not include many who use statistics).

    The Durbin-Watson test (in its most common form) is another example of this. It's used all the time and it has serious issues, which I suspect many who use it have never heard of.

  8. #8
    Points: 3,006, Level: 33
    Level completed: 71%, Points required for next Level: 44

    Posts
    177
    Thanks
    1
    Thanked 29 Times in 29 Posts

    Re: Normal distribution


    This issue is compounded further by the fact that widely used statistical programs use generalized estimators due to computing efficiency. You sacrifice some measure of accuracy but save computing power.

    But yeah, it's a well-known issue within the field of statistics: the limitations of quite a few of the current common modeling procedures.

    The truth is, though, modern statistics has become fairly sophisticated in dealing with those issues. The concern is that there is a lag between what is going on in the world of academic statistics and what is going on in industry. In general, my estimate is that there is about a 20-year gap between the stats department of an R1 university and other departments (or even greater in some cases). This gap is even greater between academic statistics and widespread industry. My guess is probably about 50 years for the vast majority of companies using statistics.

    The other issue is that, due to this gap, a lot of the underlying "why we do it" gets lost as it becomes "how we do it" on the way from academia to industry.
