# Interpreting Kruskal-Wallis results

#### glenn.althor

##### New Member
Hi all,

I've just finished my undergrad thesis and the reviewers strongly recommended that I pursue formal publication of my results. I thought 'COOL'.

However, I've just realised that my stats were a bit off, and as such I'm redoing everything. I have Likert scale data (1-7) and an n of 300, so from my reading it appears that the Kruskal-Wallis test is the best choice. However, I'm having issues understanding the results of the test. As the uni umbilical cord has been cut, I need to look to lovely folks such as yourselves for help and advice.

Can anyone help with the following questions, please!

1. How do I know if my data is tied?
I can't tell exactly, but as I understand it, if two (or more) groups report the same median response to a scale, they are tied? (e.g. two medians of 7)
2. What does my p-value truly signify?
As I understand it, a low p (at alpha 0.05) signifies that there is a detectable difference between some of the groups.
3. Do the results of the Dunn test give me a pretty good idea of where my main differences lie?
As I understand it, the Dunn test will tell me which groups have the greatest detectable difference?

Any help, advice, thoughts or questions would be much appreciated.

#### gianmarco

##### TS Contributor
I think you got almost all of it right.

You have ties in the data when an observation gets the same rank as one or more other observations, i.e. when the same value occurs more than once. You should not be bothered by ties, since in general any stats package performing the KW test should apply a correction for ties.
A low p-value for the KW test statistic means that at least one sample shows a significant difference from the others. For this reason you need a follow-up test (i.e., a post-hoc test) that helps you pinpoint which sample is different from the others. The Dunn test would help you locate where the significant difference(s) lie.
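To make the follow-up step concrete, here is a minimal sketch of Dunn's pairwise comparisons in plain Python. The three groups and their responses are invented for illustration, the function names are hypothetical, and the Bonferroni adjustment is just one common choice of multiplicity correction. A stats package would normally do all of this for you.

```python
import math
from collections import Counter
from itertools import combinations

def midranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of 1-based positions pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def dunn_test(groups):
    """Pairwise Dunn z statistics with Bonferroni-adjusted two-sided p-values.

    groups: dict mapping a group label to its list of responses.
    """
    labels = list(groups)
    pooled = [v for g in labels for v in groups[g]]
    n_total = len(pooled)
    ranks = midranks(pooled)

    # Mean rank per group (ranks are in the same order as the pooled data).
    mean_rank, start = {}, 0
    for g in labels:
        n = len(groups[g])
        mean_rank[g] = sum(ranks[start:start + n]) / n
        start += n

    # Variance term, reduced by the tie correction.
    tie_sum = sum(t**3 - t for t in Counter(pooled).values())
    variance = n_total * (n_total + 1) / 12 - tie_sum / (12 * (n_total - 1))

    pairs = list(combinations(labels, 2))
    results = {}
    for a, b in pairs:
        se = math.sqrt(variance * (1 / len(groups[a]) + 1 / len(groups[b])))
        z = (mean_rank[a] - mean_rank[b]) / se
        p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
        results[(a, b)] = (z, min(1.0, p * len(pairs)))  # Bonferroni
    return results

# Hypothetical 1-7 Likert responses for three made-up groups.
responses = {
    "A": [7, 7, 6, 7, 5, 7, 6, 7],
    "B": [4, 5, 3, 5, 4, 6, 4, 5],
    "C": [7, 6, 7, 5, 7, 7, 6, 6],
}
dunn = dunn_test(responses)
for pair, (z, p_adj) in dunn.items():
    print(pair, round(z, 2), round(p_adj, 4))
```

Each pair gets its own z statistic and adjusted p-value, so the output shows which pairs of groups differ rather than only that some difference exists.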

Hope this helps
Gm

#### glenn.althor

##### New Member
Thanks kindly Gm! I'll have another look at my results with more confidence now

#### glenn.althor

##### New Member
Hi again, I'm back!

Now that I (kinda) understand the purpose of this test and what the results indicate, I need a bit of help understanding some of my results (pasted below).

If I have a small enough p-value to reject the null, and yet the median for both of my groups is the same, how am I supposed to interpret this result? Statistically there is a significant difference, but intuitively, the difference between strongly disagree and well, strongly disagree, is not significant.

Kruskal-Wallis Test: INMEDIAN versus PLACE_ID

Kruskal-Wallis Test on INMEDIAN

| PLACE_ID | N | Median | Ave Rank | Z |
|---|---|---|---|---|
| Rural | 90 | 7.000 | 163.3 | 1.59 |
| Urban | 211 | 7.000 | 145.8 | -1.59 |
| Overall | 301 | | 151.0 | |

H = 2.54 DF = 1 P = 0.111
H = 4.20 DF = 1 P = 0.040 (adjusted for ties)
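As a sanity check on how these numbers fit together, the sketch below recomputes the unadjusted H from the group sizes and average ranks in the output, converts both H values to p-values, and backs out the tie correction factor implied by the two reported H values. It uses only the figures printed above. The helper name `chi2_sf_df1` is made up for this example, and the shortcut it uses is only valid for 1 degree of freedom (two groups).

```python
import math

# Summary figures copied from the output above.
n = {"Rural": 90, "Urban": 211}
avg_rank = {"Rural": 163.3, "Urban": 145.8}
h_reported = 2.54           # H before the tie adjustment
h_adjusted_reported = 4.20  # H adjusted for ties

n_total = sum(n.values())
overall_rank = (n_total + 1) / 2  # 151.0, the "Overall" average rank

# Kruskal-Wallis H before any tie correction, from the average ranks.
h = 12 / (n_total * (n_total + 1)) * sum(
    n[g] * (avg_rank[g] - overall_rank) ** 2 for g in n
)

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))

p_unadjusted = chi2_sf_df1(h_reported)
p_adjusted = chi2_sf_df1(h_adjusted_reported)

# The adjustment divides H by C = 1 - sum(t^3 - t) / (N^3 - N), where t is the
# size of each group of tied values, so C can be recovered from the two H values.
implied_c = h_reported / h_adjusted_reported

print(round(h, 2), round(p_unadjusted, 3), round(p_adjusted, 3), round(implied_c, 3))
```

The recomputed H differs slightly from the reported 2.54 because the average ranks in the output are rounded. An implied correction factor this far below 1 indicates very heavy ties, which is what you would expect if most respondents chose 7.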

#### gianmarco

##### TS Contributor
Hello!
The seeming oddity is due to the fact that, like the Mann-Whitney test, KW is not strictly testing the difference in medians but the difference in mean ranks. Samples with the same median can nonetheless have different rank sums, i.e. the ranks of one sample can tend to be lower than the ranks of the other samples.
For the two-sample case, see this link. I believe that the explanation can be generalized to the KW test as well.
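A small made-up example may help here. The two samples below are invented (they are not the poster's data) and were chosen so that both medians are 7 while the lower responses are spread very differently. The sketch computes midranks, the mean rank per group, and the tie-adjusted Kruskal-Wallis H by hand, using only the standard library.

```python
import math
from collections import Counter
from statistics import median

# Hypothetical responses, invented for illustration.
rural = [7] * 80 + [6] * 10
urban = [7] * 110 + [6] * 40 + [5] * 30 + [4] * 20 + [3] * 11

pooled = rural + urban
n_total = len(pooled)
counts = Counter(pooled)

# Midrank of each distinct value: the average of the positions its ties occupy.
rank_of, below = {}, 0
for value in sorted(counts):
    rank_of[value] = below + (counts[value] + 1) / 2
    below += counts[value]

def mean_rank(sample):
    return sum(rank_of[v] for v in sample) / len(sample)

# Unadjusted H, then the tie correction.
h = 12 / (n_total * (n_total + 1)) * sum(
    len(s) * (mean_rank(s) - (n_total + 1) / 2) ** 2 for s in (rural, urban)
)
tie_correction = 1 - sum(t**3 - t for t in counts.values()) / (n_total**3 - n_total)
h_adjusted = h / tie_correction
p_value = math.erfc(math.sqrt(h_adjusted / 2))  # chi-square with 1 df (two groups)

print("medians:", median(rural), median(urban))
print("mean ranks:", round(mean_rank(rural), 1), round(mean_rank(urban), 1))
print("adjusted H:", round(h_adjusted, 2), "p:", p_value)
```

Both medians are 7, yet the urban sample has many more responses below 7, so its mean rank is lower and the test rejects the null. The test is reacting to the whole distribution of ranks, not to the medians alone.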

Besides reporting the test output, it could be useful to plot the distribution of the samples' values by means of, e.g., boxplots. Better still, you could add notches to the boxplots.

Cheers
Gm

#### glenn.althor

##### New Member
Thank you so much friend Gm! That really helps with how I'll conduct my analysis and present it!

#### noetsi

##### Fortran must die
One way to carry out analysis of specific levels for Kruskal-Wallis is to perform a Wilcoxon sign ranked test. You can do either an exact test or an asymptotic test; both yield p-values for the null hypothesis that the location of the two levels (what gianmarco is calling samples and some call populations) is the same. Exact tests are likely better with 300 cases.

I believe you can also run contrasts, which have more power, but only if you have not already seen the results of the Kruskal-Wallis test (which I suspect you have).

An alternative to Kruskal Wallis in some cases would be either ordered or multinomial logistic regression. It provides what I feel are more useful and better known results.

#### glenn.althor

##### New Member
Thanks noetsi! That's handy to know

#### GretaGarbo

##### Human
About the Wilcoxon–Mann–Whitney test as a special case of Kruskal-Wallis test:

Maybe the 2009 paper by Fagerland and Sandvik, "The Wilcoxon–Mann–Whitney test under scrutiny", can help. (You can find a pdf on researchgate.net.)

The paper says among other things that "The Wilcoxon–Mann–Whitney (WMW) test is often used to compare the means or medians of two independent, possibly nonnormal distributions. .... This usage of the WMW test is not in accordance with the original intentions, which is to test the null hypothesis that P(X<Y )=0.5,
where X and Y are random samples from the two populations at interest."
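To see what that null hypothesis looks like in practice, here is a rough sketch that estimates P(X < Y) from two samples, counting ties as half (a common convention for discrete data such as Likert items, and an assumption of this example). The function names and the toy data are hypothetical. The second function gets the same quantity from a mean rank, which allows a rough estimate from the Rural/Urban output posted earlier in the thread.

```python
def prob_x_less_than_y(x, y):
    """Estimate P(X < Y) + 0.5 * P(X = Y) by comparing every (x, y) pair."""
    score = 0.0
    for xi in x:
        for yi in y:
            if xi < yi:
                score += 1.0
            elif xi == yi:
                score += 0.5
    return score / (len(x) * len(y))

def prob_from_mean_rank(mean_rank_y, n_x, n_y):
    """Same quantity, computed from the pooled mean rank (midranks) of sample y."""
    return (mean_rank_y - (n_y + 1) / 2) / n_x

# Toy data, invented for illustration.
x = [5, 4, 7]
y = [7, 7, 6]
p_toy = prob_x_less_than_y(x, y)

# From the thread's output: X = Urban (n = 211), Y = Rural (n = 90, Ave Rank 163.3).
p_thread = prob_from_mean_rank(163.3, n_x=211, n_y=90)

print(round(p_toy, 3), round(p_thread, 3))
```

A value of 0.5 means neither group tends to give higher responses than the other. The further the estimate is from 0.5, the stronger the tendency, whatever the medians happen to be.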

The paper goes on and does a lot of power comparisons and concludes:
"As a test of means or medians, the Wilcoxon–Mann–Whitney (WMW) test can be severely nonrobust
for deviations from the pure shift model. Our simulation study demonstrates that this problem
is more serious than previously thought. We show that a variety of minor deviations from this
model can lead to true significance levels that are alarmingly far from the nominal level."

- - -

I would prefer to go directly to an analysis of variance (anova) or a t-test (which is the same thing if there are just two groups) and skip the non-parametrics.

---
I must admit that I don't understand what noetsi means when he says: "One way to carry out analysis of specific levels for Kruskal-Wallis is to perform a Wilcoxon sign ranked test." I thought that the "Wilcoxon sign ranked test" was a pairwise or matched-pairs test, and as I understood it, that was not the case here. But maybe I was, once again, reading incorrectly.

- - -
Going back to the original poster's text: "I have Likert scale data (1-7)" and "two medians of 7". Is the maximum of the scale equal to 7 and the median also 7? Extremely skewed data? Or a typo?

#### gianmarco

##### TS Contributor
Maybe this article will be of interest, as far as Likert scales and the choice between the t-test and MW are concerned.

#### glenn.althor

##### New Member
Thanks Greta and Gian, I'll do some more reading based on what you suggested.