
Thread: Testing normality of distribution: Shapiro-Wilk's test/Levene's test?

  1. #1

    I have data from an experiment with 5 conditions. The goal is to compare the impact of the treatment on the dependent variables between the conditions. To decide between ANOVA and a non-parametric equivalent, I ran the Shapiro-Wilk test. It shows that the data are not normally distributed, so I decided to use the Kruskal-Wallis test for the analysis. But then I ran Levene's test to check the homogeneity of variance between the groups, and the result shows that the variances are homogeneous. Could I use ANOVA instead of a non-parametric test? Thank you for your help.

  2. #2
    CowboyBear

    Hi there, sorry for the delay in releasing your post - it was caught in the spam filter for some reason. A couple of thoughts:

    1) ANOVA (like other regression models) does not assume that the marginal ("overall") distribution of the dependent variable is normal. ANOVA does assume that the distribution of the DV is normal within each group. That said, testing this with a Shapiro-Wilk test is virtually pointless: if the sample is small, the normality assumption matters, but the Shapiro-Wilk test will have poor power to detect violations of normality; if the sample is large, the Shapiro-Wilk test will have good power, but the normality assumption probably won't matter (due to the central limit theorem).

    2) A non-significant Levene's test statistic indicates a lack of evidence to reject a null hypothesis that the variances are equal. It doesn't necessarily indicate that the variances are homogenous; a non-significant result might just be due to low power.

    3) A Kruskal-Wallis test is a non-parametric alternative to ANOVA, but it tests a completely different null hypothesis (that the mean ranks are equal across the populations). That might not be what you're interested in testing. If all you're worried about is normality, a simpler alternative would be to use ANOVA, but apply bootstrapping or a permutation test to calculate confidence intervals or p values.
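To make the permutation idea in point 3 concrete, here is a minimal sketch, assuming NumPy and SciPy are available. The function name `permutation_anova` and the simulated data are hypothetical illustrations, not anything from the original poster's experiment. It keeps the usual ANOVA F statistic but gets the p value by reshuffling the group labels instead of relying on the F distribution.

```python
import numpy as np
from scipy import stats


def permutation_anova(groups, n_perm=4999, seed=0):
    """One-way ANOVA F statistic with a permutation p value.

    groups: list of 1-D arrays, one per condition (hypothetical input format).
    """
    rng = np.random.default_rng(seed)
    groups = [np.asarray(g, dtype=float) for g in groups]
    pooled = np.concatenate(groups)
    # Indices at which the pooled vector is cut back into groups
    split_points = np.cumsum([len(g) for g in groups])[:-1]

    f_obs, _ = stats.f_oneway(*groups)

    exceed = 0
    for _ in range(n_perm):
        # Shuffling the pooled data is equivalent to shuffling group labels
        shuffled = rng.permutation(pooled)
        f_perm, _ = stats.f_oneway(*np.split(shuffled, split_points))
        if f_perm >= f_obs:
            exceed += 1

    # Add 1 to numerator and denominator so the p value is never exactly 0
    p_value = (exceed + 1) / (n_perm + 1)
    return f_obs, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Assumed example data: 5 conditions, 30 skewed observations each
    shifts = (0.0, 0.0, 0.0, 0.0, 1.0)
    data = [rng.exponential(1.0, 30) + s for s in shifts]
    f_obs, p = permutation_anova(data)
    print(f"F = {f_obs:.2f}, permutation p = {p:.4f}")
```

The null hypothesis here is exchangeability of the observations across conditions, so the test still compares means via F but does not lean on within-group normality.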

    Hope that helps!
    Matt aka CB | twitter.com/matthewmatix

  3. #3

    Thank you for your response! A couple of thoughts:
    - How do I know if Levene's test gives a non-significant result because of low power, rather than because the variances are equal?
    - Several sources state that the general assumptions of ANOVA include both a normal distribution of the DVs (which is what I have been testing with Shapiro-Wilk) and homogeneity of variances (which I tested with Levene's test). See e.g. here: https://statistics.laerd.com/statist...al-guide-2.php. That is why I'm concerned about the distributions, although ANOVA is fairly robust to non-normality.
    - Non-parametric alternatives are recommended for small-sample (30 participants per condition) Likert-scale data, see e.g. https://dl.acm.org/citation.cfm?id=1...TOKEN=76933434
    - Kruskal-Wallis doesn't use means, it uses medians. See e.g. here: http://www.statisticshowto.com/kruskal-wallis/, https://statistics.laerd.com/spss-tu...statistics.php
    - I use Dunn post-hoc test for pairwise comparisons.
    Thanks a lot

  4. #4
    rogojel

    hi,
    just to build on what CB said: you should run the ANOVA and then do the diagnostics. Check whether the residuals are normal and whether the variances are heterogeneous (e.g. a horn shape in the residual plot), etc. This is generally much easier and more sensible than running checks before the test. If your residuals show patterns and/or non-normality, you might want to either use more advanced techniques (like a data transformation) or move over to a non-parametric test. Non-parametric tests generally have lower power, so you might want to consider gathering more data in that case.
    regards
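A rough sketch of the "fit first, then diagnose" workflow described above, assuming NumPy and SciPy. The helper name `anova_with_diagnostics` and the returned keys are made up for illustration. In a one-way ANOVA the fitted value is just the group mean, so the residuals can be computed by hand.

```python
import numpy as np
from scipy import stats


def anova_with_diagnostics(groups):
    """Run a one-way ANOVA and return simple residual diagnostics."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    f_stat, p_value = stats.f_oneway(*groups)

    # Residual = observation minus its own group mean
    residuals = np.concatenate([g - g.mean() for g in groups])
    fitted = np.concatenate([np.full(len(g), g.mean()) for g in groups])

    # Correlation of the normal Q-Q plot points; values near 1 suggest
    # roughly normal residuals
    (_, _), (_, _, qq_r) = stats.probplot(residuals)
    _, shapiro_p = stats.shapiro(residuals)

    return {
        "F": f_stat,
        "p": p_value,
        "residuals": residuals,
        "fitted": fitted,          # plot residuals against these to spot a horn shape
        "qq_r": qq_r,
        "shapiro_p": shapiro_p,    # subject to the power caveats raised in this thread
        "group_sds": [g.std(ddof=1) for g in groups],
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Assumed example data: 5 conditions with 30 observations each
    data = [rng.normal(m, 1.0, 30) for m in (0.0, 0.2, 0.4, 0.6, 0.8)]
    out = anova_with_diagnostics(data)
    print(f"F = {out['F']:.2f}, p = {out['p']:.4f}")
    print("group SDs:", np.round(out["group_sds"], 2))
```

Plotting `residuals` against `fitted` (with any plotting library) is the graphical check for a horn shape; the numbers returned here are only a starting point.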

  5. #5
    Karabiner

    Originally posted by luckycat:
    - How do I know if Levene's test gives a non-significant result because of low power, rather than because the variances are equal?
    Small sample size. For example, with a total n=40, the power to detect that population SDs of 5 versus 9 differ would be only 40.5%:
    https://ncss-wpengine.netdna-ssl.com...Simulation.pdf page 553-11
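The power of Levene's test for a given design can also be estimated by simulation. The sketch below assumes two normal groups of 20 observations each (total n=40) with SDs of 5 and 9. That layout is a guess and may not match the design behind the figure in the linked document, so the simulated power need not equal 40.5%. Note that `scipy.stats.levene` centers on the median by default (the Brown-Forsythe variant).

```python
import numpy as np
from scipy import stats


def levene_rejection_rate(sds, n_per_group, n_sim=2000, alpha=0.05, seed=0):
    """Share of simulated data sets in which Levene's test rejects at alpha.

    sds: population SD of each (normal, mean-zero) group.
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sim):
        samples = [rng.normal(0.0, sd, n_per_group) for sd in sds]
        _, p = stats.levene(*samples)  # default center='median'
        if p < alpha:
            rejections += 1
    return rejections / n_sim


if __name__ == "__main__":
    power = levene_rejection_rate(sds=(5, 9), n_per_group=20)
    false_alarm = levene_rejection_rate(sds=(5, 5), n_per_group=20)
    print(f"estimated power (SD 5 vs 9): {power:.2f}")
    print(f"rejection rate with equal SDs: {false_alarm:.2f}")
```

Replacing the SDs and group sizes with values plausible for your own study gives a feel for how often a real difference in variances would go undetected.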


    - Several sources outline that the general assumptions of ANOVA assume both normal distribution of the DVs (...). See e.g. here: https://statistics.laerd.com/statist...al-guide-2.php
    But your source does NOT outline what you say. Instead, it states:
    "1. The dependent variable is normally distributed in each group that is being compared in the one-way ANOVA..."
    Careful reading is highly recommended.

    With kind regards

    Karabiner

  6. #6
    GretaGarbo


    My view is that few things have done as much damage to the practice of statistics as the dichotomy of "parametric" and "non-parametric" methods. That division seems to be the story told in elementary statistics books. The division was correct in the 1950s. But in the 1960s the Box-Cox transformation arrived (to transform to approximate normality), and in the 1970s generalized linear models (with many other parametric distributions), which were well established by the 1990s at the latest. Many other non-parametric methods appeared as well.

    To my knowledge CBear is correct that the Wilcoxon-Mann-Whitney/Kruskal-Wallis test is based on the null hypothesis that the MEAN of the ranks is the same. I am sorry, but I am too lazy to search for sources for that.

    I am sorry, but I don't trust the sources that luckycat gives in post 3. We all know that some sources on the internet are not reliable.

    But Fagerland and Sandvik (2009), "Performance of five two-sample location tests for skewed distributions with unequal variances", say that it is not generally true that the Wilcoxon-Mann-Whitney test is a test of medians.
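A small simulation can illustrate why the rank test is not, in general, a test of medians. The construction below is my own assumption and is not taken from the cited paper: two populations with exactly the same median (0) but mirror-image skew. For this pair the probability that a draw from the first exceeds a draw from the second is about 0.60 rather than 0.50, so the Kruskal-Wallis test rejects far more often than the nominal 5% even though the medians are equal.

```python
import numpy as np
from scipy import stats


def kruskal_rejection_rate(n_per_group=200, n_sim=500, alpha=0.05, seed=0):
    """Rejection rate of Kruskal-Wallis for two groups with equal medians."""
    rng = np.random.default_rng(seed)
    shift = np.log(2.0)  # median of an Exponential(1) variable
    rejections = 0
    for _ in range(n_sim):
        right_skewed = rng.exponential(1.0, n_per_group) - shift    # median 0
        left_skewed = -(rng.exponential(1.0, n_per_group) - shift)  # median 0
        _, p = stats.kruskal(right_skewed, left_skewed)
        if p < alpha:
            rejections += 1
    return rejections / n_sim


if __name__ == "__main__":
    rate = kruskal_rejection_rate()
    print(f"rejection rate with equal medians: {rate:.2f}")
```

If the two distributions had the same shape and differed only by a shift, the test could be read as a comparison of medians; the example shows what happens when that extra assumption fails.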

    Have a look at Fagerland, Sandvik and Mowinckel (2015) where the abstract says:

    Results
    The Welch U test (the T test with adjustment for unequal variances) and its associated confidence interval performed well for almost all situations considered. The Brunner-Munzel test also performed well, except for small sample sizes (10 in each group). The ordinary T test, the Wilcoxon-Mann-Whitney test, the percentile bootstrap interval, and the bootstrap-t interval did not perform satisfactorily.

    Conclusions
    The difference between the means is an appropriate effect measure for comparing two independent discrete numerical variables that has both lower and upper bounds. To analyze this problem, we encourage more frequent use of parametric hypothesis tests and confidence intervals.
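For anyone who wants to try the two tests that the abstract reports as performing well, here is a hedged sketch using SciPy's `ttest_ind` (with `equal_var=False`, i.e. the Welch adjustment) and `brunnermunzel`. The helper `compare_two_groups` and the 5-point response data are invented for illustration. Both are two-sample tests, so with 5 conditions they would apply to pairwise comparisons.

```python
import numpy as np
from scipy import stats


def compare_two_groups(a, b):
    """Welch t test and Brunner-Munzel test for two independent samples."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    _, welch_p = stats.ttest_ind(a, b, equal_var=False)  # Welch adjustment
    _, bm_p = stats.brunnermunzel(a, b)
    return {
        "mean_diff": b.mean() - a.mean(),
        "welch_p": welch_p,
        "brunner_munzel_p": bm_p,
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    levels = np.arange(1, 6)  # assumed 5-point Likert-type scale
    group_a = rng.choice(levels, size=30, p=[0.10, 0.20, 0.40, 0.20, 0.10])
    group_b = rng.choice(levels, size=30, p=[0.05, 0.10, 0.25, 0.35, 0.25])
    result = compare_two_groups(group_a, group_b)
    for name, value in result.items():
        print(f"{name}: {value:.4f}")
```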
