t-test vs. Wilcoxon test

Emre91

New Member
Hi there,

I am new to this community. I hope you can help me out.
For my master's thesis, I have a dataset of 6,043 observations. These observations contain the spread differential (delta) of matched securities (I compare a conventional corporate bond with a green corporate bond from the same issuer).
What I want to investigate is whether delta is significantly negative, i.e. H_0: delta >= 0 vs. H_1: delta < 0.

Therefore I ran two tests on these data:
1. a one-sample, one-tailed t-test
2. a Wilcoxon test

The results (from R) show that 1.) the t-test rejects H_0 in favor of H_1, while 2.) the Wilcoxon test fails to reject H_0.

Now I am kind of frustrated, since I have two different outcomes for my hypothesis.

Best Regards,
Emre

hlsmith

Less is more. Stay pure. Stay poor.
Can you provide a histogram of your data, so we can understand what you are working with? Also, see the quick simulation below, in which both tests generate comparable frequentist conclusions. And why do you think the t-test would not be a good fit?

Code:
set.seed(42)  # seed added here for reproducibility
X <- rnorm(6043, -1, 1)  # 6043 draws from a normal distribution with true mean -1
hist(X)
t.test(X, mu = 0, alternative = "less")
wilcox.test(X, mu = 0, alternative = "less")

Dason

A histogram would be nice to see. If we are looking at a symmetric distribution, then the t-test and the Wilcoxon test should give comparable results. But if there is extreme skew, they might disagree. The t-test is a test of the mean; the Wilcoxon test is, roughly speaking, a test of the median (strictly, of the pseudomedian). In the skewed case these can be very different. Here is an example:

Code:
> y <- rexp(10000)
> mean(y)
 1.003803
> median(y)
 0.6841946
> t.test(y, mu = .9, alternative = "greater")

One Sample t-test

data:  y
t = 10.177, df = 9999, p-value < 2.2e-16
alternative hypothesis: true mean is greater than 0.9
95 percent confidence interval:
0.9870244       Inf
sample estimates:
mean of x
1.003803

> wilcox.test(y, mu = .9, alternative = "greater")

Wilcoxon signed rank test with continuity correction

data:  y
V = 22960851, p-value = 1
alternative hypothesis: true location is greater than 0.9

Emre91

New Member

Hi, thanks for the replies @hlsmith and @Dason,
The histogram shows that my data are skewed. That's presumably why the t-test accepts the alternative hypothesis (less) while the Wilcoxon test does not reject the null.
What would you conclude based on my data?

Attachments

• Histogram of the spread differentials (67.9 KB)

obh

Well-Known Member
Hi Emre,

The t-test and the Wilcoxon test don't test the same thing, so you shouldn't expect to get the same result...

1. If you want to check whether the probability of getting a random value smaller than zero equals the probability of getting a random value larger than zero, then choose the Wilcoxon test.

2. If you want to check whether the mean equals zero, then use the t-test.

Despite the above, the Wilcoxon test is often used as a substitute for the t-test, since it has fewer assumptions, and you may say that it checks a similar question.

With symmetrical data, you will usually get similar results.

Asymmetrical example:
[-5, -4, -3, -2, 14]
The average is 0.
The estimated probability of a randomly selected number being smaller than zero is 0.8.

Symmetrical example:
[-3, -2, -1, 1, 2, 3]
The average is 0.
The estimated probability of a randomly selected number being smaller than zero is 0.5.
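A minimal sketch of these two examples in R (base R only; the vectors are fixed, so the numbers are exact):

```r
# Asymmetric example: the mean is 0, but most values fall below zero
x_asym <- c(-5, -4, -3, -2, 14)
mean(x_asym)        # 0   -> a t-test of mu = 0 sees nothing unusual
mean(x_asym < 0)    # 0.8 -> a rank/sign-based view sees a clear imbalance
median(x_asym)      # -3

# Symmetric example: the mean and the "below zero" probability agree
x_sym <- c(-3, -2, -1, 1, 2, 3)
mean(x_sym)         # 0
mean(x_sym < 0)     # 0.5
```

This is the skew situation in miniature: both tests look at "location", but under asymmetry the mean and the median point in different directions.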


fourize

New Member
Hi,
To use a t.test, you must check the normality of the distribution of your data with a shapiro.test.
If yes, you use the t.test, if not, you use wilcoxon.test
F.

Karabiner

TS Contributor
Hi,
To use a t.test, you must check the normality of the distribution of your data with a shapiro.test.
If yes, you use the t.test, if not, you use wilcoxon.test
F.
Well, not really. First of all, the distribution of the sample data is not what matters.
What might be interesting for the one-sample t-test is whether the data are
from a normally distributed population (in the case of a two-sample test: whether
each sample is drawn from a normally distributed population). But with
n > 6000, whether the population is normally distributed is really, really not
relevant for the validity of the test (i.e. for the random sampling distribution
of the mean). Incidentally, R's shapiro.test only accepts sample sizes up to
5000, so it could not even be run on all 6,043 observations.
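A quick simulation sketch of this point (illustrative only; the exact rate varies slightly with the seed, which is my own choice here): draw many strongly skewed samples of size 6000 and run the t-test at the true mean. Despite the non-normal population, the test rejects at roughly the nominal 5% rate.

```r
# Sketch: t-test calibration under strong skew at large n
set.seed(1)                               # arbitrary seed, for reproducibility
n_rep <- 1000
rejected <- replicate(n_rep, {
  y <- rexp(6000)                         # heavily right-skewed, true mean = 1
  t.test(y, mu = 1)$p.value < 0.05        # two-sided test at the true mean
})
mean(rejected)                            # close to the nominal 0.05
```

With n this large, the central limit theorem makes the sampling distribution of the mean very close to normal, which is all the t-test needs.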

With kind regards

Karabiner