Thread: Generating a dataset with normal distribution acc. to K-S-test and S-W-test in SPSS

  1. #1
    Points: 22, Level: 1
    Level completed: 43%, Points required for next Level: 28

    Posts
    5
    Thanks
    3
    Thanked 0 Times in 0 Posts

    I'm writing a paper on regression analysis in SPSS, and the outcome variable is assumed to be normally distributed. It's a constructed data set (N=200?) with possible values 0, 1, 2, 3, 4 and 5. The mean is 2.5 and the standard deviation is 1.3 (?). Is it possible to construct a data set that is normally distributed according to the Kolmogorov-Smirnov and Shapiro-Wilk tests? What frequencies of the values 0, 1, 2, 3, 4 and 5 would achieve such normality? I'd be grateful if anyone can help me.

  2. #2
    Omega Contributor
    Points: 38,334, Level: 100
    Level completed: 0%, Points required for next Level: 0
    hlsmith's Avatar
    Location
    Not Ames, IA
    Posts
    6,998
    Thanks
    398
    Thanked 1,186 Times in 1,147 Posts

    So the response variable is an integer from 0 to 5, with n=200. Now, do you want to know whether it is normal, or whether you can simulate a normal sample given those moments?
    Stop cowardice, ban guns!

  3. The Following User Says Thank You to hlsmith For This Useful Post:

    Morten67 (12-01-2016)

  4. #3

    Thanks hlsmith! I want to simulate (generate, create) a normal sample given those moments that is also normally distributed according to the K-S test and S-W test in SPSS. I have tried different approaches, including simulation in Excel (the average of 2, 3, 5 and 10 randomly generated integers from 0 to 5) and constructing the set by hand. But none of these sets seems to be normally distributed according to the two tests in SPSS.

  5. #4
    hlsmith

    I wonder if you could use a uniform distribution, or just simulate from a normal distribution and then round the values to the nearest integer?

  6. The Following User Says Thank You to hlsmith For This Useful Post:

    Morten67 (12-02-2016)

  7. #5

    I have tried both these approaches. For example
    0 - 0.052 → 52
    1 - 0.159 → 159
    2 - 0.279 → 279
    3 - 0.279 → 279
    4 - 0.159 → 159
    5 - 0.052 → 52
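
The proportions listed above appear to be close to the probabilities that a normal distribution with mean 2.5 and standard deviation 1.3 assigns to the intervals k ± 0.5. That reading is an inference, not something the post states. A minimal Python sketch of that calculation follows, assuming those parameters; the function names and the `fold_tails` option are illustrative only. With `fold_tails=False` the mass outside -0.5 to 5.5 is dropped, so the proportions sum to less than 1; with `fold_tails=True` it is added to the categories 0 and 5.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal with the given mean and SD."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def bin_probabilities(mean=2.5, sd=1.3, low=0, high=5, fold_tails=False):
    """Probability that a normal draw rounds to each integer low..high.

    fold_tails=False drops the mass below low - 0.5 and above high + 0.5;
    fold_tails=True adds that mass to the two end categories.
    """
    probs = []
    for k in range(low, high + 1):
        lower = k - 0.5
        upper = k + 0.5
        if fold_tails and k == low:
            lower = -math.inf
        if fold_tails and k == high:
            upper = math.inf
        probs.append(normal_cdf(upper, mean, sd) - normal_cdf(lower, mean, sd))
    return probs


raw = bin_probabilities()
folded = bin_probabilities(fold_tails=True)
print("value  raw_prob  folded_prob  count_for_N=200")
for k, (p, q) in enumerate(zip(raw, folded)):
    print(k, round(p, 3), round(q, 3), round(200 * q))
```

Rounded counts need not add up to exactly N, so a final manual adjustment of one or two cases may be needed.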


  8. #6


  9. #7
    TS Contributor
    Points: 22,399, Level: 93
    Level completed: 5%, Points required for next Level: 951
    spunky's Avatar
    Location
    vancouver, canada
    Posts
    2,135
    Thanks
    166
    Thanked 537 Times in 431 Posts

    A normal distribution is a continuous distribution. You're forcing data to follow a discrete distribution. I guess it shouldn't be too much of a surprise when you test a discrete (and, therefore, non-normal) distribution for normality and you don't find it.
    for all your psychometric needs! https://psychometroscar.wordpress.com/about/

  10. The Following User Says Thank You to spunky For This Useful Post:

    Morten67 (12-02-2016)

  11. #8

    Thanks Spunky, your answer addresses the core of the problem. So the K-S and S-W test statistics do not apply to discrete data sets. Should I then rely on a visual examination in combination with skewness and kurtosis to support the assumption of a normal distribution?

  12. #9
    Human
    Points: 12,676, Level: 73
    Level completed: 57%, Points required for next Level: 174
    Awards:
    Master Tagger
    GretaGarbo's Avatar
    Posts
    1,362
    Thanks
    455
    Thanked 462 Times in 402 Posts

    I just generated normal random numbers and rounded them to integers. Then I recoded the negative values to 0 and the values larger than 5 to 5.

    Then I did the Kolmogorov-Smirnov test.
    I was surprised when the test gave a warning. (I don't usually use the Kolmogorov-Smirnov test.) But of course, as Spunky said, the normal distribution is continuous and the data are discrete.

    For the discrete data with zero decimals, normality was rejected (but of course it was not rejected for the original data).

    With 1 decimal, normality was not rejected, but the p-value deviated from the correct value given by the original data. The more decimals that were allowed, the closer the p-value was to the "correct" value.



    The code is in R. You can download R and RStudio for free.
    Spoiler:
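
The R code itself is not shown here. As a stand-in, the following is a minimal Python sketch of the experiment described in the post; it is not the poster's code. It assumes n = 200, mean 2.5 and SD 1.3 (taken from the first post), tests against a normal with those parameters treated as known (SPSS instead applies a Lilliefors correction with estimated parameters), and uses the asymptotic Kolmogorov formula for the p-value. The recoding to 0..5 is applied only in the integer case. All function names are illustrative.

```python
import math
import random


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal with the given mean and SD."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def ks_statistic(data, mean, sd):
    """One-sample K-S statistic D against a fully specified normal."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x, mean, sd)
        # Compare the CDF with the empirical CDF just after and just before x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def ks_pvalue(d, n, terms=100):
    """Asymptotic two-sided p-value from the Kolmogorov distribution."""
    lam = math.sqrt(n) * d
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))


rng = random.Random(2016)
n, mean, sd = 200, 2.5, 1.3
raw = [rng.gauss(mean, sd) for _ in range(n)]

results = {}
for decimals in (None, 2, 1, 0):
    if decimals is None:
        sample = raw  # original, unrounded draws
    elif decimals == 0:
        # Round to integers, then recode values outside 0..5 to the limits.
        sample = [min(5.0, max(0.0, round(x, 0))) for x in raw]
    else:
        sample = [round(x, decimals) for x in raw]
    d = ks_statistic(sample, mean, sd)
    results[decimals] = (d, ks_pvalue(d, n))
    print("decimals:", decimals, "D:", round(d, 4), "p:", round(results[decimals][1], 4))
```

With integer data the empirical CDF stays flat between 2 and 3 while the normal CDF rises from about 0.35 to about 0.65, so D cannot fall below roughly 0.15 under these parameters, whatever the sample. That is far above the 5% critical value for n = 200 (about 0.096).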


    I prefer to do a histogram and QQ-plot to evaluate the normality of residuals.

    When I was looking for code for Shapiro-Wilk I found this amusing blog post and also this.

    If we round the data to zero decimals, the data will be discrete (not continuous, and thus not normal). If we round to one or two decimals, it is still a discrete variable, just with more levels. Even with 15 decimals it is still discrete. In fact, all of our data are discrete, since we always round. The real question is: does it deviate a lot from normality? A discrete binomial or Poisson variable can be approximated by a normal distribution for certain parameter values. "All models are wrong, but some models are useful," as Box put it.

    Rounding a variable loses information. It causes the variance to increase. So people who give us data and say they have rounded it "because the uncertainty is so large anyway" have not done us a favour.
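
The claim that rounding increases the variance can be checked with a small simulation. The sketch below is Python and assumes normal data with mean 2.5 and SD 1.3, as in the first post. It compares the simulated increase for rounding to integers with the textbook approximation h²/12 for rounding to a grid of width h (the quantity behind Sheppard's correction), and prints that approximation for 1 and 2 decimals as well. The sample size and seed are arbitrary choices.

```python
import random


def variance(values):
    """Population variance computed directly (divides by n)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


rng = random.Random(1)
sd = 1.3
raw = [rng.gauss(2.5, sd) for _ in range(100_000)]

# Simulated increase in variance when rounding to whole numbers (h = 1).
rounded = [round(x) for x in raw]
increase_integers = variance(rounded) - variance(raw)
print("simulated increase, 0 decimals:", round(increase_integers, 4))

# Approximate theoretical increase h^2 / 12 for each rounding width h.
for decimals in (0, 1, 2):
    h = 10.0 ** (-decimals)
    approx = h * h / 12.0
    print("decimals:", decimals,
          "approx increase:", approx,
          "relative to sd^2:", approx / (sd * sd))
```

For two decimals the approximate increase is about 0.0000083 against a variance of 1.69, which is why rounding at that level is harmless here, while rounding to integers adds roughly 0.083.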

    If I read an instrumental value with 15 decimals, then I can enter it as data with 2 decimals, especially if the standard deviation is 1.3 (more than 100 times larger than the last decimal).

    It is not the appropriate question to ask: "are the residuals exactly normally distributed?"

    The important question is: "Do the residuals deviate so much from normality that it will have a destructive influence on the regression parameters (alpha and beta) that you are really interested in?"

    Regression parameter estimates are known to be robust to non-normality (but they are certainly not robust to outliers).

  13. The Following User Says Thank You to GretaGarbo For This Useful Post:

    hlsmith (12-02-2016)

  14. #10
    hlsmith

    Nice synopsis GG. Did the decimal value matter? I am guessing so. Otherwise, if the test just needs a distinction between values, could one round the simulated values to integers and then add the slightest noise back at, say, the thousandths place?
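
The jitter idea can be tried directly. The following Python sketch assumes n = 200, mean 2.5 and SD 1.3 from the first post, tests against a normal with those parameters treated as known, and adds uniform noise of at most ±0.0005 to the rounded values. The seed, noise width and function names are illustrative choices.

```python
import math
import random


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal with the given mean and SD."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def ks_statistic(data, mean, sd):
    """One-sample K-S statistic D against a fully specified normal."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x, mean, sd)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


rng = random.Random(7)
n, mean, sd = 200, 2.5, 1.3

# Round normal draws to integers and recode to the range 0..5.
integers = [min(5, max(0, round(rng.gauss(mean, sd)))) for _ in range(n)]

# Break the ties with tiny uniform noise at the thousandths place.
jittered = [k + rng.uniform(-0.0005, 0.0005) for k in integers]

d_int = ks_statistic(integers, mean, sd)
d_jit = ks_statistic(jittered, mean, sd)
print("distinct values after jitter:", len(set(jittered)))
print("D for integers:", round(d_int, 4))
print("D for jittered:", round(d_jit, 4))
```

Under these assumptions the jitter removes the ties (and so any tie warning), but the values still sit in six tight clusters, so D is essentially unchanged and stays near or above 0.15. The noise would have to be about as wide as the rounding interval itself to make the sample look continuous to the test.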


    Floating points are always a mystery to me, but beyond people rounding data, it seems that in exact scenarios floating point can also be a nuisance. I saw something like your point about rounding data somewhere else a while ago, in something about know and variance. I think it was John Cook or Gelman.

  15. #11
    spunky

    Originally posted by GretaGarbo:
    "When I was looking for code for Shapiro-Wilk I found this amusing blog post and also this."
    These blog posts make a great point which is often overlooked among applied researchers. About 2-3 weeks ago a student sent me an email concerning some analyses he was doing. What caught my attention in what he said was this:

    Specifically, I am worried about using a t-test for such a small sample size (5 participants) even though I tested the Normality of the sample using the Shapiro-Wilk's test

    So, in his mind, because he tested for normality and the null hypothesis was not rejected, n=5 should be appropriate. It took a little while for me to explain to him that what he was looking at was an underpowered test of the assumption of normality.
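
How underpowered a normality test is at n = 5 can be estimated by simulation. The sketch below is Python and makes several assumptions that are not from the thread: it uses a Lilliefors-type test (K-S distance with the mean and SD estimated from the sample, critical value obtained by Monte Carlo) in place of Shapiro-Wilk, it draws the non-normal data from an exponential distribution, and it compares n = 5 with n = 200. The simulation counts and seed are arbitrary.

```python
import math
import random


def lilliefors_statistic(data):
    """K-S distance between the data and a normal with the sample mean and SD."""
    n = len(data)
    m = sum(data) / n
    s = math.sqrt(sum((x - m) ** 2 for x in data) / (n - 1))
    d = 0.0
    for i, x in enumerate(sorted(data)):
        f = 0.5 * (1.0 + math.erf((x - m) / (s * math.sqrt(2.0))))
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def critical_value(n, rng, sims=2000, alpha=0.05):
    """Monte Carlo (1 - alpha) quantile of the statistic when data are normal."""
    null_stats = sorted(
        lilliefors_statistic([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(sims)
    )
    return null_stats[round((1 - alpha) * sims) - 1]


def rejection_rate(n, rng, sims=1000):
    """Share of exponential samples of size n for which normality is rejected."""
    crit = critical_value(n, rng)
    hits = sum(
        lilliefors_statistic([rng.expovariate(1.0) for _ in range(n)]) > crit
        for _ in range(sims)
    )
    return hits / sims


rng = random.Random(5)
power_small = rejection_rate(5, rng)
power_large = rejection_rate(200, rng)
print("rejection rate of exponential data, n=5:  ", power_small)
print("rejection rate of exponential data, n=200:", power_large)
```

The point of the comparison is that strongly skewed data are rarely flagged at n = 5 but almost always flagged at n = 200, so a non-significant result at n = 5 says very little about normality.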

    Sometimes it makes me wonder how many people just go ahead and do these things though, without stopping to think about what they're actually doing.

  16. #12
    hlsmith

    What are you talking about Spunky, look at all of these beautiful normal simulations of n= 5.


    P.S., I thought you were going to say the person also had n=5 integer data, that would be awesome.
    Attached Images  

  17. #13
    hlsmith


    PPS: when I ran these 40 through the normality tests, S-W was more generous (gave greater p-values). All passed S-W and two were rejected by K-S.


    Any guesses which two failed K-S normality test? There will be cake for the winner!

    __________________________________

    GG has argued for no lower limit on sample size for the t-test; the following is for her visual interpretation.
    Attached Images  