Bootstrapping - Should/Can I add the bias to the sample median?


New Member
Dear All

I'm trying to analyse a questionnaire with responses to questions ordered on a Likert scale (1 = Disagree strongly to 5 = Agree strongly).

We are comparing question responses between two groups of students: one group (n=25) who saw a video, and one group (n=35) who were given a practical demonstration.

I have summarised the response score for each question by taking the median, and used the Mann-Whitney U test to test for statistically significant differences in response score between the groups.

In addition, one of my supervisors suggested adding the standard error of the median to the tables, so I've bootstrapped the samples to get a standard error and bias.

Now, here comes the question.

Because there are only 5 possible responses (1-5), some of the medians are the same, while the Mann-Whitney U test says that there are statistically significant differences between the groups. I guess this is because the scores are different but the median is not sensitive enough to show this (because there are a lot of tied scores in the middle).

To bring out the differences in the discussion, can I add the bias to the sample median?

Here's an example from question 7 "The video/presentation was audible and understandable":

Raw Data

Group 1 - shown video

3 3 4 2 4 4 3 4 2 3 4 3 4 4 2 3 4 3 4 4 4 3 4 4 3

Group 2 - given presentation

4 4 3 5 4 4 4 4 4 5 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 4 4 4 4 5 5 5 5 5 4

Just scanning the data suggests that the presentation did better than the video. The Mann-Whitney test confirms this with p < 0.0001.

The problem is that the median is 4 for both groups, so it's hard to show in the discussion in which direction the preference went, or what its magnitude is.

Sample median, bias (from bootstrapping), standard error of the median (from bootstrapping) and sample mean for groups 1 and 2 are given below:

Group 1 (video): median 4, bias -0.42, SE 0.49, mean 3.4

Group 2 (presentation): median 4, bias 0, SE 0.03, mean 4.2

So, in the discussion should I write (a) or (b)

(a) The presentation was considered more understandable than the video (mean 4.2 versus 3.4, p < 0.0001) [but this mixes parametric and non-parametric statistics]

or can I say

(b) The presentation was considered more understandable than the video (median + bias, 4.03 versus 3.58, p < 0.0001)

Which of the two is preferable?

Many thanks for your help with this!



New Member
I should probably add that

Median = sample median

Bias = mean of the medians of the bootstrapped samples (100,000 bootstrap samples drawn with replacement from each group) minus the original sample median
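For concreteness, the procedure is roughly this, sketched in Python for illustration (I actually worked in R; the resample count here is reduced from 100,000 to keep the sketch quick, and the group 1 scores are the question-7 raw data from above):

```python
# Bootstrap bias and standard error of the sample median (stdlib only).
import random
import statistics

def bootstrap_median(sample, n_boot=5000, seed=1):
    """Return (bias, standard error) of the sample median via bootstrap."""
    rng = random.Random(seed)
    obs_median = statistics.median(sample)
    # Medians of n_boot resamples drawn with replacement, same size as the sample.
    boot_medians = [
        statistics.median(rng.choices(sample, k=len(sample)))
        for _ in range(n_boot)
    ]
    bias = statistics.mean(boot_medians) - obs_median  # mean of bootstrap medians minus sample median
    se = statistics.stdev(boot_medians)                # spread of the bootstrap medians
    return bias, se

# Group 1 (video) raw scores for question 7.
group1 = [3, 3, 4, 2, 4, 4, 3, 4, 2, 3, 4, 3, 4, 4, 2, 3, 4, 3, 4, 4, 4, 3, 4, 4, 3]
bias1, se1 = bootstrap_median(group1)
print(f"bias = {bias1:.2f}, SE = {se1:.2f}")
```

With these data the bias comes out negative (the resampled medians fall at or below 4), in line with the -0.42 figure in the table.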


TS Contributor
Hi!
Since you want to use bootstrap procedures, have you tried a bootstrap version of the Mann-Whitney test, instead of using M-W and then attaching a CI to the medians?
I have read somewhere that bootstrapped medians are to be treated with caution: maybe this chapter can provide useful info.

As for M-W, in my opinion, even if the medians are the same, you have to take into account what the test is actually verifying. Since it tests "whether one of two samples of independent observations tends to have larger values than the other" (from Wikipedia), in your context it is indicating the very fact that the values of your second sample tend to be higher than those of the first. So I believe you do not have to worry about the equality of the medians (this is my opinion; I would like to hear from users with a deeper theoretical basis than mine).
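One way to sketch such a resampling comparison, for illustration only (a stdlib-Python permutation test on the pooled scores, using the difference in group means as the statistic; a bootstrap of U itself would be organised similarly):

```python
# Permutation test: repeatedly shuffle the pooled scores, split them into
# groups of the original sizes, and count how often the permuted difference
# is at least as extreme as the observed one.
import random
import statistics

group1 = [3, 3, 4, 2, 4, 4, 3, 4, 2, 3, 4, 3, 4, 4, 2, 3, 4, 3, 4, 4, 4, 3, 4, 4, 3]
group2 = [4, 4, 3, 5, 4, 4, 4, 4, 4, 5, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5,
          4, 4, 4, 4, 5, 5, 5, 5, 5, 4]

def perm_test(a, b, n_perm=5000, seed=1):
    rng = random.Random(seed)
    observed = statistics.mean(b) - statistics.mean(a)
    pooled = a + b
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = statistics.mean(pooled[len(a):]) - statistics.mean(pooled[:len(a)])
        if abs(diff) >= abs(observed):
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0

pval = perm_test(group1, group2)
print(pval)
```

On the question-7 data the permutation p-value comes out very small, consistent with the M-W result.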

Besides, you could use the Probability of Superiority "index": PS = U/(n1*n2), where U is the M-W statistic and n1 and n2 are the sizes of your two samples.
In a nutshell, the further PS is from 0.5, the greater the difference between the two samples (in M-W terms: the more the observations tend to be larger in one of the samples).
For PS, see this link.
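PS can also be computed straight from the raw data by counting pairs, with ties counted as half, which gives the same value as U/(n1*n2) with the tie-corrected U. An illustrative Python sketch on the question-7 scores from above:

```python
# Probability of Superiority: the proportion of (group1, group2) pairs
# in which the group-2 score exceeds the group-1 score, ties counted 0.5.
group1 = [3, 3, 4, 2, 4, 4, 3, 4, 2, 3, 4, 3, 4, 4, 2, 3, 4, 3, 4, 4, 4, 3, 4, 4, 3]
group2 = [4, 4, 3, 5, 4, 4, 4, 4, 4, 5, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5,
          4, 4, 4, 4, 5, 5, 5, 5, 5, 4]

wins = sum(1.0 if y > x else 0.5 if y == x else 0.0
           for x in group1 for y in group2)
ps = wins / (len(group1) * len(group2))
print(round(ps, 3))  # → 0.794
```

So in roughly four out of five random video/presentation pairs, the presentation respondent gave the higher score.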

Hope this helps,


New Member
Hi Gianmarco

Thanks for your thoughts. I'm happy that the M-W test is telling me that there is a difference and that it is 'correct'. But I'm interested in the magnitude of the difference. In other words, is it a very minor, trivial, but statistically significant difference, or is it a meaningful one (the old chestnut: statistically significant versus real-world significant)? For this I really need the numbers. Confidence intervals might work for that, but for some of the cases R is not able to compute them, presumably because there are so many ties; in many cases I get an error along the lines of "w out of range or is infinite".

The only two solutions I can think of are to use the means of the original samples (example (a) above), which is technically wrong as Likert responses are ordinal data, or to add the bias to the median (example (b)), but I don't know if I can do that.

Thanks again!

P.S. I can see from the link that PS would give an indication of the magnitude of the difference, but it is quite hard to interpret in terms of 'points' on the Likert scale.
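One possible workaround for the failing classical CI is a percentile bootstrap interval for the difference in medians; a stdlib-Python sketch for illustration (with only five response levels the interval is inevitably coarse, and with these data it will typically include 0, which itself shows how blunt the median is here):

```python
# Percentile bootstrap CI for the difference in medians (group2 - group1).
import random
import statistics

group1 = [3, 3, 4, 2, 4, 4, 3, 4, 2, 3, 4, 3, 4, 4, 2, 3, 4, 3, 4, 4, 4, 3, 4, 4, 3]
group2 = [4, 4, 3, 5, 4, 4, 4, 4, 4, 5, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5,
          4, 4, 4, 4, 5, 5, 5, 5, 5, 4]

n_boot = 5000
rng = random.Random(1)
# Resample each group independently and record the difference in medians.
diffs = sorted(
    statistics.median(rng.choices(group2, k=len(group2)))
    - statistics.median(rng.choices(group1, k=len(group1)))
    for _ in range(n_boot)
)
lo, hi = diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]
print(lo, hi)  # 95% percentile interval
```

Because the resampled medians can only take a handful of values, the endpoints land on those same coarse steps.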



TS Contributor

I think you just want a number other than the p-value or sample mean to represent the difference. This reminds me of an earlier post, which discussed the "interpolated" median (something like a continuity correction). Given the ties at 4, you somehow interpolate a point between 3.5 and 4.5 according to the ranks of the data.

I have never used it before, but you may take a look.
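Roughly, the interpolated median treats each score k as covering the interval [k - 0.5, k + 0.5] and interpolates within the tied class. A quick Python sketch (my own illustration; it assumes the plain median coincides with one of the response levels, as it does for both groups here):

```python
# Interpolated (grouped-data) median for ordinal scores.
import statistics

def interpolated_median(sample, width=1.0):
    n = len(sample)
    m = statistics.median(sample)  # ordinary median, identifies the tied class
    below = sum(1 for x in sample if x < m)   # cumulative count below the class
    at = sum(1 for x in sample if x == m)     # count inside the class
    lower = m - width / 2                      # lower boundary of the class
    return lower + (n / 2 - below) / at * width

group1 = [3, 3, 4, 2, 4, 4, 3, 4, 2, 3, 4, 3, 4, 4, 2, 3, 4, 3, 4, 4, 4, 3, 4, 4, 3]
group2 = [4, 4, 3, 5, 4, 4, 4, 4, 4, 5, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5,
          4, 4, 4, 4, 5, 5, 5, 5, 5, 4]

im1 = interpolated_median(group1)
im2 = interpolated_median(group2)
print(round(im1, 2))  # → 3.54
print(round(im2, 2))  # → 4.16
```

So the two groups, both with plain median 4, separate to about 3.54 versus 4.16 on this measure.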


New Member
Thanks BGM. I took a look at this, and Dason didn't seem to think very much of it as a technique, which worries me somewhat...