# Asymptotic sig. (2-sided test) vs. pairwise comparison adj. sig.

#### mss90

##### New Member
Greetings

I just ran the non-parametric Kruskal-Wallis (KW) test (as independent samples) in SPSS on my data, which has 5 groups and is not normally distributed. The overall asymptotic significance (2-sided test) was 0.025, i.e. significant, yet in the pairwise comparisons the adjusted significance showed no groups as statistically different. How does that work?

Thanks

#### hlsmith

##### Not a robit
The overall KW test was based on alpha = 0.05, while the pairwise comparisons corrected the alpha so that it was no longer 0.05.

#### mss90

##### New Member
The overall KW test was based on alpha = 0.05, while the pairwise comparisons corrected the alpha so that it was no longer 0.05.
Is that what Bonferroni does? How does that work? Please elaborate.

#### hlsmith

##### Not a robit
Bonferroni multiplies each p-value by the number of comparisons. With 3 groups there are 3 comparisons (a vs. b, a vs. c, b vs. c), so all p-values are multiplied by 3; a raw p-value of 0.01 becomes 0.03.
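In R, for example, this multiplication is exactly what base R's `p.adjust()` does with `method = "bonferroni"` (a minimal sketch with made-up p-values, not values from this thread):

```r
# Bonferroni: multiply each raw p-value by the number of
# comparisons (results are capped at 1).
p_raw  <- c(0.01, 0.04, 0.20)   # hypothetical p-values: a vs b, a vs c, b vs c
p_bonf <- p.adjust(p_raw, method = "bonferroni")
print(p_bonf)                   # 0.03 0.12 0.60
```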

#### mss90

##### New Member
Bonferroni multiplies each p-value by the number of comparisons. With 3 groups there are 3 comparisons (a vs. b, a vs. c, b vs. c), so all p-values are multiplied by 3; a raw p-value of 0.01 becomes 0.03.
Okay, I see, but the significance level should still be the same? So I am guessing this does not apply to the asymptotic sig. then? How do I go about explaining this?

Thanks

#### GretaGarbo

##### Human
Okay, I see, but the significance level should still be the same?
The overall significance should remain 0.05. But each of the three tests is evaluated at the nominal significance level of 0.05/3 ≈ 0.0167.

If one of the tests is significant, you can improve on plain Bonferroni with Bonferroni-Holm (search for it): the next test is done at 0.05/2 = 0.025, i.e. at a nominal significance level of 0.025, and you can continue like that as long as the tests remain significant.
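As a sketch of how Holm's step-down procedure works in practice, base R's `p.adjust()` with `method = "holm"` reports the equivalent adjusted p-values (hypothetical raw p-values, not from this thread):

```r
# Holm: sort the p-values, compare the smallest to alpha/m, the next
# to alpha/(m - 1), and so on; p.adjust() reports the equivalent
# adjusted p-values (each then compared to alpha itself).
p_raw  <- c(0.012, 0.030, 0.040)     # hypothetical raw p-values, m = 3
p_holm <- p.adjust(p_raw, method = "holm")
print(p_holm)                        # 0.036 0.060 0.060
```

Holm is never less powerful than plain Bonferroni, and it still controls the familywise error rate at alpha.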

So I am guessing this does not apply to the asymptotic sig. then? How do I go about explaining this?
Of course it does apply to asymptotic tests.

#### mss90

##### New Member
The overall significance should remain 0.05. But each of the three tests is evaluated at the nominal significance level of 0.05/3 ≈ 0.0167.

If one of the tests is significant, you can improve on plain Bonferroni with Bonferroni-Holm (search for it): the next test is done at 0.05/2 = 0.025, i.e. at a nominal significance level of 0.025, and you can continue like that as long as the tests remain significant.

Of course it does apply to asymptotic tests.
Three tests, you mean five, right? So are you saying the significance level for the pairwise comparisons is lower than 0.05, depending on whether Bonferroni is applied? How does that explain that my overall p-value for the 5 groups said that "some of the groups are significantly different", while in the actual pairwise comparisons between each group all p-values are ≥ 0.087? I don't get how the test first says something is different, but when I try to find out what is different, it says nothing is different. Sorry, did that make any sense? Please see attachments.

#### GretaGarbo

##### Human
I did not read the first post carefully enough. (I looked at hlsmith's example of 3 comparisons.) If there are 5 groups, then there will be 10 pairwise comparisons, just as your table shows.

It can happen that the hypothesis of no difference among all 5 groups is rejected, but at the same time it is not possible to reject any of the hypotheses of pairwise no difference. Welcome to the stochastic world and goodbye to the deterministic world!

But with the data shown (the boxplots) I would not trust the Kruskal-Wallis or the Wilcoxon-Mann-Whitney tests. They are sensitive to "spread", so the tests can have a much higher error rate than the nominal 5%.

I would log the data, look at that, and do a parametric test based on the normal distribution, or else look for another distribution.

Why don't you show us the data, in one column for the group and one column for the measurements?

#### mss90

##### New Member
I did not read the first post carefully enough. (I looked at hlsmith's example of 3 comparisons.) If there are 5 groups, then there will be 10 pairwise comparisons, just as your table shows.

It can happen that the hypothesis of no difference among all 5 groups is rejected, but at the same time it is not possible to reject any of the hypotheses of pairwise no difference. Welcome to the stochastic world and goodbye to the deterministic world!

But with the data shown (the boxplots) I would not trust the Kruskal-Wallis or the Wilcoxon-Mann-Whitney tests. They are sensitive to "spread", so the tests can have a much higher error rate than the nominal 5%.

I would log the data, look at that, and do a parametric test based on the normal distribution, or else look for another distribution.

Why don't you show us the data, in one column for the group and one column for the measurements?
Thanks for getting back to me, I really appreciate it! Okay, so this is my data:

EDTA Cd Group
0.3724 1
0.7475 2
0.6028 3
5.7238 4
6.8980 5
0.3866 1
0.7905 2
0.8434 3
1.3017 4
1.5030 5
0.1011 1
0.3307 2
0.8868 3
7.3019 4
15.3473 5
0.1397 1
0.1061 2
0.1023 3
0.3488 4
0.5801 5
0.0486 1
0.0344 2
0.0419 3
0.1188 4
0.1855 5
0.1578 1
0.1295 2
0.3902 3
1.1356 4
1.1093 5

#### GretaGarbo

##### Human
Well, it seems that taking the log (logarithm) of the dependent variable cures the problem. (There is some heteroscedasticity left, so it could be improved a little.)

ANOVA (analysis of variance) says that there is a difference between the groups. But if you base the conclusion on "Bonferroni testing", then it is not possible to conclude which pairs differ.

But I suspect that the groups are the result of an increasingly influential treatment variable. Is that so? Then the groups are definitely significant.

Code:
# install R
# install RStudio
# install the library dplyr

dat <- read.table( header=TRUE, text = "
cd group
0.3724 1
0.7475 2
0.6028 3
5.7238 4
6.8980 5
0.3866 1
0.7905 2
0.8434 3
1.3017 4
1.5030 5
0.1011 1
0.3307 2
0.8868 3
7.3019 4
15.3473 5
0.1397 1
0.1061 2
0.1023 3
0.3488 4
0.5801 5
0.0486 1
0.0344 2
0.0419 3
0.1188 4
0.1855 5
0.1578 1
0.1295 2
0.3902 3
1.1356 4
1.1093 5" )

dat

plot(dat$group, dat$cd)
table(dat$group)
library(dplyr)
dat %>% group_by(group) %>% summarize(mean(cd), median(cd), sd(cd))
dat2 <- dat %>% mutate(lcd = log(cd)) %>% mutate(scd = sqrt(cd))
boxplot( dat2$cd ~ dat2$group )
boxplot( dat2$lcd ~ dat2$group )   # seems OK
boxplot( dat2$scd ~ dat2$group )   # sqrt transformation not enough
dat2 %>% group_by(group) %>% summarize(mean(lcd), median(lcd), sd(lcd))
plot(dat2$lcd ~ dat2$group)
abline(lm(dat2$lcd ~ dat2$group), col = "red")
summary(m1 <- lm(dat2$lcd ~ as.factor(dat2$group)))
anova(m1)
summary(lm(dat2$lcd ~ dat2$group))
class(dat$group)

# below a VERY tedious way of t-testing
# suggest a better one!
t.test(dat2$lcd[dat2$group == 1 | dat2$group == 2] ~ as.factor(dat2$group[dat2$group == 1 | dat2$group == 2]))
t.test(dat2$lcd[dat2$group == 1 | dat2$group == 3] ~ as.factor(dat2$group[dat2$group == 1 | dat2$group == 3]))
t.test(dat2$lcd[dat2$group == 1 | dat2$group == 4] ~ as.factor(dat2$group[dat2$group == 1 | dat2$group == 4]))
t.test(dat2$lcd[dat2$group == 1 | dat2$group == 5] ~ as.factor(dat2$group[dat2$group == 1 | dat2$group == 5]))
t.test(dat2$lcd[dat2$group == 2 | dat2$group == 3] ~ as.factor(dat2$group[dat2$group == 2 | dat2$group == 3]))
t.test(dat2$lcd[dat2$group == 2 | dat2$group == 4] ~ as.factor(dat2$group[dat2$group == 2 | dat2$group == 4]))
t.test(dat2$lcd[dat2$group == 2 | dat2$group == 5] ~ as.factor(dat2$group[dat2$group == 2 | dat2$group == 5]))
t.test(dat2$lcd[dat2$group == 3 | dat2$group == 4] ~ as.factor(dat2$group[dat2$group == 3 | dat2$group == 4]))
t.test(dat2$lcd[dat2$group == 3 | dat2$group == 5] ~ as.factor(dat2$group[dat2$group == 3 | dat2$group == 5]))
t.test(dat2$lcd[dat2$group == 4 | dat2$group == 5] ~ as.factor(dat2$group[dat2$group == 4 | dat2$group == 5]))
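One shorter alternative to the ten `t.test()` calls above is `pairwise.t.test()` (a sketch: with `pool.sd = FALSE` it runs a separate Welch-type test per pair and applies the multiplicity correction in one call; the data are rebuilt here so the snippet runs on its own):

```r
# Rebuild the data from the post (the values cycle through groups 1-5).
cd <- c(0.3724, 0.7475, 0.6028, 5.7238, 6.8980,
        0.3866, 0.7905, 0.8434, 1.3017, 1.5030,
        0.1011, 0.3307, 0.8868, 7.3019, 15.3473,
        0.1397, 0.1061, 0.1023, 0.3488, 0.5801,
        0.0486, 0.0344, 0.0419, 0.1188, 0.1855,
        0.1578, 0.1295, 0.3902, 1.1356, 1.1093)
group <- factor(rep(1:5, times = 6))
lcd   <- log(cd)

# All 10 pairwise Welch t-tests on the log scale, Holm-corrected.
pairwise.t.test(lcd, group, pool.sd = FALSE, p.adjust.method = "holm")
```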

#### mss90

##### New Member
Well, it seems that taking the log (logarithm) of the dependent variable cures the problem. (There is some heteroscedasticity left, so it could be improved a little.)

ANOVA (analysis of variance) says that there is a difference between the groups. But if you base the conclusion on "Bonferroni testing", then it is not possible to conclude which pairs differ.

But I suspect that the groups are the result of an increasingly influential treatment variable. Is that so? Then the groups are definitely significant.

Code:
# install R
# install RStudio
# install the library dplyr

dat <- read.table( header=TRUE, text = "
cd group
0.3724 1
0.7475 2
0.6028 3
5.7238 4
6.8980 5
0.3866 1
0.7905 2
0.8434 3
1.3017 4
1.5030 5
0.1011 1
0.3307 2
0.8868 3
7.3019 4
15.3473 5
0.1397 1
0.1061 2
0.1023 3
0.3488 4
0.5801 5
0.0486 1
0.0344 2
0.0419 3
0.1188 4
0.1855 5
0.1578 1
0.1295 2
0.3902 3
1.1356 4
1.1093 5" )

dat

plot(dat$group, dat$cd)
table(dat$group)
library(dplyr)
dat %>% group_by(group) %>% summarize(mean(cd), median(cd), sd(cd))
dat2 <- dat %>% mutate(lcd = log(cd)) %>% mutate(scd = sqrt(cd))
boxplot( dat2$cd ~ dat2$group )
boxplot( dat2$lcd ~ dat2$group )   # seems OK
boxplot( dat2$scd ~ dat2$group )   # sqrt transformation not enough
dat2 %>% group_by(group) %>% summarize(mean(lcd), median(lcd), sd(lcd))
plot(dat2$lcd ~ dat2$group)
abline(lm(dat2$lcd ~ dat2$group), col = "red")
summary(m1 <- lm(dat2$lcd ~ as.factor(dat2$group)))
anova(m1)
summary(lm(dat2$lcd ~ dat2$group))
class(dat$group)

# below a VERY tedious way of t-testing
# suggest a better one!
t.test(dat2$lcd[dat2$group == 1 | dat2$group == 2] ~ as.factor(dat2$group[dat2$group == 1 | dat2$group == 2]))
t.test(dat2$lcd[dat2$group == 1 | dat2$group == 3] ~ as.factor(dat2$group[dat2$group == 1 | dat2$group == 3]))
t.test(dat2$lcd[dat2$group == 1 | dat2$group == 4] ~ as.factor(dat2$group[dat2$group == 1 | dat2$group == 4]))
t.test(dat2$lcd[dat2$group == 1 | dat2$group == 5] ~ as.factor(dat2$group[dat2$group == 1 | dat2$group == 5]))
t.test(dat2$lcd[dat2$group == 2 | dat2$group == 3] ~ as.factor(dat2$group[dat2$group == 2 | dat2$group == 3]))
t.test(dat2$lcd[dat2$group == 2 | dat2$group == 4] ~ as.factor(dat2$group[dat2$group == 2 | dat2$group == 4]))
t.test(dat2$lcd[dat2$group == 2 | dat2$group == 5] ~ as.factor(dat2$group[dat2$group == 2 | dat2$group == 5]))
t.test(dat2$lcd[dat2$group == 3 | dat2$group == 4] ~ as.factor(dat2$group[dat2$group == 3 | dat2$group == 4]))
t.test(dat2$lcd[dat2$group == 3 | dat2$group == 5] ~ as.factor(dat2$group[dat2$group == 3 | dat2$group == 5]))
t.test(dat2$lcd[dat2$group == 4 | dat2$group == 5] ~ as.factor(dat2$group[dat2$group == 4 | dat2$group == 5]))
Thanks again for looking into this! Ah, alright, so transforming the data using the log reduced the heteroscedasticity, but you still did not manage to decipher which groups were significantly different, right? Did you say you used ANOVA on it? Shouldn't you have used KW, or did the transformation make the data normally distributed? My SPSS trial just expired today, so I can't really attempt this transformation myself at this point.
However, would it make sense if I explained this in my report as a significantly different dataset, but due to heteroscedasticity the KW test does not work properly, and that I therefore chose to use the p-values without the Bonferroni correction as a means to an end? I had a look at the values and it seems to make sense: 1-4, 1-5 and 2-5 are statistically significantly different, and 2-4 is borderline, which could be due to a Type II error? Thoughts?

Thanks

#### GretaGarbo

##### Human
Don't use the Kruskal-Wallis test. It is irrelevant when there is such a big difference in spread.

You could use the code I gave you. Just install R and RStudio and run the code. It is completely free and would take 15 minutes to download and install.

(The t-test there is a "Welch test", so it is a little more robust to heteroscedasticity than the usual t-test.)

However, would it make sense if I explained this in my report as a significantly different dataset, but due to heteroscedasticity the KW test does not work properly, and that I therefore chose to use the p-values without the Bonferroni correction as a means to an end?
I think you have misunderstood something here.

#### mss90

##### New Member
Don't use the Kruskal-Wallis test. It is irrelevant when there is such a big difference in spread.

You could use the code I gave you. Just install R and RStudio and run the code. It is completely free and would take 15 minutes to download and install.

(The t-test there is a "Welch test", so it is a little more robust to heteroscedasticity than the usual t-test.)

I think you have misunderstood something here.
Alright, cheers. What is it I have misunderstood?

Thanks

#### GretaGarbo

##### Human
You can use the code I gave you. Just install R and RStudio and the package dplyr. They are free.


#### GretaGarbo

##### Human
However, would it make sense if I explained this in my report as a significantly different dataset, but due to heteroscedasticity the KW test does not work properly, and that I therefore chose to use the p-values without the Bonferroni correction as a means to an end?
I just don't understand that statement.

The Kruskal-Wallis test is not valid due to the large difference in "spread". Therefore the p-values (based on the Kruskal-Wallis test) are not valid, and the Bonferroni correction based on these p-values is not valid either.

Did you run the code I gave you?