How can one compare using a simple experimentation or simulation whether a parametric test (e.g. t-test) is more powerful than its non-parametric counterpart (Wilcoxon signed rank test)?

I understand that statistical power is defined as the probability of rejecting the null hypothesis when it is in fact false and should be rejected. The power of parametric tests can be calculated easily from formulae, tables, and graphs based on their underlying distributions.

I have learned that power for non-parametric tests can be calculated using Monte Carlo simulation methods. I am not sure if I have understood the procedure correctly. Please correct me if the procedure below is wrong.

• An alternative hypothesis (Ha) is specified together with a sample size.

• Sample data are generated pseudo-randomly from the probability density function (under Ha) and the test is carried out.

• This process is repeated many times (e.g. 1000 times) and the proportion of 'failures to reject H0' is recorded. Because the data were generated under Ha, every failure to reject H0 is a Type II error, so this proportion estimates β, and its complement 1−β estimates the power.
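The procedure above can be sketched directly in code. The following is a minimal simulation comparing a one-sample t-test against the Wilcoxon signed-rank test; the specific settings (n = 25, a mean shift of 0.5 under Ha, α = 0.05, 1000 replications, normal data) are illustrative assumptions, not given in the question.

```python
# Monte Carlo power comparison: one-sample t-test vs. Wilcoxon signed-rank.
# Settings (n, shift, alpha, n_sim) are illustrative choices.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n, shift, alpha, n_sim = 25, 0.5, 0.05, 1000

t_reject = w_reject = 0
for _ in range(n_sim):
    # Generate data under Ha: normal with mean `shift` (H0: mean = 0).
    x = rng.normal(loc=shift, scale=1.0, size=n)
    # Count rejections at level alpha for each test on the same sample.
    if stats.ttest_1samp(x, popmean=0.0).pvalue < alpha:
        t_reject += 1
    if stats.wilcoxon(x).pvalue < alpha:
        w_reject += 1

power_t = t_reject / n_sim   # estimated power of the t-test
power_w = w_reject / n_sim   # estimated power of the Wilcoxon test
print(f"t-test power: {power_t:.3f}, Wilcoxon power: {power_w:.3f}")
```

Running both tests on the *same* simulated samples makes the comparison fair: the estimated powers can then be compared directly, and under normality the t-test is expected to come out slightly ahead.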

Can we readily compare the calculated power for the parametric test with the calculated power for the non-parametric test? Or should we use the corresponding variances of the t-test statistic and the Wilcoxon test statistic, and conclude that whichever has the smaller variance is the more powerful test?
