Parametric vs nonparametric tests

#1
I'm pretty confused about the conditions in which I should use a parametric or a nonparametric test to compare the means of 2 (or more) populations and would be very grateful if someone would explain it to me.

I know that if the distribution of the sample is not normal, non-parametric (Wilcoxon, Kruskal-Wallis) should be used. It is often assumed that the central limit theorem allows the use of parametric tests if n > 30, which is a widespread mistake, because in any case it would have to be n > 30 for each level of the factor. On the other hand, it now seems that this use of the CLT is being questioned and that 30 cases do not guarantee normality.

When do we have to use one or the other?

If ANOVA requires homoscedasticity, why are there post-hoc tests for unequal variances?

Many thanks
 

CowboyBear

Super Moderator
#2
I know that if the distribution of the sample is not normal, non-parametric (Wilcoxon, Kruskal-Wallis) should be used.
A few quick thoughts on this:

1) It is the distributions of the errors (i.e., the distributions within each group) that are assumed normal, not the marginal distribution across all groups.

2) Unless you have a tiny sample size and the departure from normality is very extreme, a breach of normality is unlikely to cause meaningful problems. I blogged about this recently. Normality is an assumption of these tests, but it is by far the least important of several assumptions.

3) If you are worried about non-normality, bootstrapping or permutation tests allow you to test for mean differences without changing your whole framework of inference (Note: A Wilcoxon or Kruskal-Wallis test is not a test for differences in means, or at least not unless we assume that the distributions have the same shape and spread within each sample; without that assumption these tests have quite different and counterintuitive null hypotheses).
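To make point 3 a bit more concrete, here is a minimal sketch of a two-sample permutation test that uses the difference in means as its test statistic. It uses only the Python standard library. The function name `permutation_test_mean_diff`, the number of permutations, and the data are all made up for illustration. Strictly speaking, the null hypothesis being permuted under is that the two groups are exchangeable (same distribution); the statistic is what keeps the focus on means.

```python
import random

def permutation_test_mean_diff(x, y, n_perm=10_000, seed=0):
    """Two-sided permutation test using the difference in means.

    Returns (observed difference, approximate p-value).
    """
    rng = random.Random(seed)
    n_x = len(x)
    observed = sum(x) / n_x - sum(y) / len(y)
    pooled = list(x) + list(y)
    n_y = len(pooled) - n_x
    extreme = 0
    for _ in range(n_perm):
        # Randomly reassign the group labels, keeping the group sizes fixed.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_x]) / n_x - sum(pooled[n_x:]) / n_y
        if abs(diff) >= abs(observed):
            extreme += 1
    # Adding 1 to numerator and denominator keeps the p-value above 0.
    return observed, (extreme + 1) / (n_perm + 1)

# Hypothetical, made-up skewed data (illustration only).
group_a = [1.2, 0.8, 3.5, 0.4, 9.1, 1.7, 0.9, 2.2]
group_b = [2.9, 4.1, 1.8, 12.3, 3.3, 5.6, 2.4, 7.0]

diff, p = permutation_test_mean_diff(group_a, group_b)
print(f"difference in means = {diff:.2f}, permutation p = {p:.3f}")
```

A bootstrap confidence interval for the mean difference would follow the same pattern, except that each group is resampled with replacement separately instead of shuffling the pooled data.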

PS. I see this is your first post. :Welcome:
 
#3
MANY THANKS CowboyBear for your quick reply !!
Only another question. How could I know if the distributions of both samples are similar?
Thank you again
Best wishes
David
 
#5
Thank you gdaem, but I was asking how to compare the shapes of the two samples so that I can compare the means with a nonparametric test, if I understood CowboyBear's reply correctly. I was not necessarily asking how to check whether they are normally distributed.
Best wishes
David
 

Dragan

Super Moderator
#6
What you are forgetting is the issue of power versus Type I error. That is, nonparametric rank-based tests can be much more powerful than the usual parametric (t or F) tests when the underlying distributions are skewed and/or heavy-tailed. Studies abound.
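A small simulation can illustrate the kind of result those studies report. The sketch below is only one scenario, chosen for illustration and not taken from any particular study: lognormal data, a pure location shift of 0.5, and 50 observations per group. It compares the rejection rates of a pooled-variance t-test and a Wilcoxon rank-sum test (normal approximation), using only the standard library. The critical value 1.984 is the approximate two-sided 5% point of t with 98 degrees of freedom, so it is only valid for n = 50 per group.

```python
import math
import random

T_CRIT = 1.984  # approx. two-sided 5% critical value of t with 98 df (n = 50 per group)
Z_CRIT = 1.960  # two-sided 5% critical value of the standard normal

def t_rejects(x, y):
    """Pooled-variance two-sample t-test at the 5% level."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ssx = sum((v - mx) ** 2 for v in x)
    ssy = sum((v - my) ** 2 for v in y)
    pooled_var = (ssx + ssy) / (nx + ny - 2)
    t = (mx - my) / math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return abs(t) > T_CRIT

def rank_sum_rejects(x, y):
    """Wilcoxon rank-sum test at the 5% level (normal approximation, no tie correction)."""
    nx, ny = len(x), len(y)
    labelled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = sum(r for r, (_, g) in enumerate(labelled, start=1) if g == 0)
    u = rank_sum_x - nx * (nx + 1) / 2
    mu = nx * ny / 2
    sigma = math.sqrt(nx * ny * (nx + ny + 1) / 12)
    return abs((u - mu) / sigma) > Z_CRIT

def rejection_rates(shift, n=50, n_sim=1000, seed=1):
    """Share of simulated data sets in which each test rejects at the 5% level."""
    rng = random.Random(seed)
    t_hits = w_hits = 0
    for _ in range(n_sim):
        x = [rng.lognormvariate(0, 1) for _ in range(n)]
        y = [rng.lognormvariate(0, 1) + shift for _ in range(n)]
        t_hits += t_rejects(x, y)
        w_hits += rank_sum_rejects(x, y)
    return t_hits / n_sim, w_hits / n_sim

power_t, power_w = rejection_rates(shift=0.5)  # power under a real shift
size_t, size_w = rejection_rates(shift=0.0)    # Type I error rate under no difference
print(f"power:        t-test {power_t:.2f}, rank-sum {power_w:.2f}")
print(f"Type I error: t-test {size_t:.2f}, rank-sum {size_w:.2f}")
```

Note that in this setup the two groups differ only by a shift, so they have the same shape and spread. That is the situation in which a rank-based test can reasonably be read as a test of location.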
 

CowboyBear

Super Moderator
#8
MANY THANKS CowboyBear for your quick reply !!
Only another question. How could I know if the distributions of both samples are similar?
Thank you again
Best wishes
David
Shape: You could use histograms or a QQ plot (of one sample against the other, instead of one sample against a normal distribution, as in a qqnorm plot).

Spread: Levene's test or just look at the sample variances.

But making analytic decisions contingent on data can be problematic, so if you do decide to do a different analysis than you had initially planned, do report both.
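For anyone who wants to try the shape and spread checks by hand, here is a rough descriptive sketch using only the Python standard library. It prints the matched deciles that a two-sample QQ plot would display, plus the sample variances. The helper name `qq_pairs` and the data are hypothetical, and this is not Levene's test, just the "look at the sample variances" option.

```python
from statistics import quantiles, variance

def qq_pairs(x, y, n=10):
    """Matched quantiles of two samples: the points a two-sample QQ plot shows."""
    return list(zip(quantiles(x, n=n), quantiles(y, n=n)))

# Hypothetical, made-up data (illustration only).
sample_a = [2.1, 3.4, 1.9, 5.6, 2.8, 4.0, 3.1, 7.2, 2.5, 3.7, 4.4, 2.2]
sample_b = [3.0, 4.6, 2.7, 8.1, 3.9, 5.5, 4.2, 9.8, 3.3, 5.0, 6.1, 2.9]

# Shape: if the pairs fall roughly on a straight line, the shapes are similar.
# If that line is also y = x, location and spread are similar too.
print("quantile of A   quantile of B")
for qa, qb in qq_pairs(sample_a, sample_b):
    print(f"{qa:13.2f}   {qb:13.2f}")

# Spread: compare the sample variances directly.
var_a, var_b = variance(sample_a), variance(sample_b)
ratio = max(var_a, var_b) / min(var_a, var_b)
print(f"variance A = {var_a:.2f}, variance B = {var_b:.2f}, ratio = {ratio:.2f}")
```

Plotting the pairs (for example with a scatter plot) gives the same picture as R's `qqplot` applied to the two samples.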