# T-Test vs. Mann-Whitney U Test for ordinal data

#### lancearmstrong1313

##### New Member
Hello all:

Imagine people were asked to rate how likely they were to do something on a scale of 1 to 5 (say 1=not at all likely, 2=somewhat unlikely, 3=neutral, 4=somewhat likely, 5=very likely), or any similar scale; the details don't really matter. And let's say you wanted to compare how 2 groups answered this question.

Since this type of data is ordinal and not continuous, it would seem to lend itself to using a Mann-Whitney U-test to compare the medians. However, I know that with a large enough sample size one can get away with using a t-test.

So I'm curious what "large enough" means to people on here? Should it be required that each group have at least 50 people? 100 people? Just curious as to thoughts and opinions.

#### spunky

##### Super Moderator
> So I'm curious what "large enough" means to people on here? Should it be required that each group have at least 50 people? 100 people? Just curious as to thoughts and opinions.

this kind of seems like one of those questions where simulation is in order. we deal with this kind of data in the social sciences all the time (and i mean ALL THE TIME) because all we have are surveys and rating scales. i would say 99.9% of people use t-tests regardless of the nature of the data. but it really depends on lots of things. for instance, if the histograms of your two groups are reasonably symmetric, you can actually get away with small samples (say 15-20 per group) and still get valid-enough results.

or you can have large samples with very skewed data and your t-tests would be off.

so yeah... i don't think there's an easy answer for this one. however, this debate of whether rating scales and likert-type scales should be analysed as continuous VS discrete data has been going on in my field for, say... what, close to 100 yrs already? with no end in sight?

so yeah... do simulations under various conditions. for us in sociology, psychology, education, etc. it starts with the assumption that your variable is truly continuous (in your example, maybe the "tendency to do something") and becomes discretised by the very act of measuring it. so you could vary the discretisation thresholds, the number of them, the sample size, etc. you'll see that under various conditions, the t-test may or may not be valid regardless of the nature of the data.
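A minimal sketch of the kind of simulation described above, in Python with NumPy and SciPy. Everything specific here is an assumption for illustration: the latent variable is standard normal, the threshold values are made up, the t-test is the Welch variant, and `likert_sample` and `rejection_rates` are hypothetical helper names. With `shift=0.0` both groups come from the same population, so the rejection rate estimates the Type I error; a nonzero `shift` gives a rough power estimate instead.

```python
import numpy as np
from scipy import stats


def likert_sample(rng, n, thresholds, shift=0.0):
    """Draw n latent normal scores and cut them into ordered categories 1..k."""
    latent = rng.normal(loc=shift, scale=1.0, size=n)
    # np.digitize returns 0..len(thresholds); add 1 to get categories 1..k.
    return np.digitize(latent, thresholds) + 1


def rejection_rates(n, thresholds, shift=0.0, reps=2000, alpha=0.05, seed=0):
    """Share of simulated data sets in which each test rejects at level alpha."""
    rng = np.random.default_rng(seed)
    t_rejections = 0
    u_rejections = 0
    for _ in range(reps):
        a = likert_sample(rng, n, thresholds)
        b = likert_sample(rng, n, thresholds, shift)
        t_p = stats.ttest_ind(a, b, equal_var=False).pvalue
        u_p = stats.mannwhitneyu(a, b, alternative="two-sided").pvalue
        # A NaN p-value (e.g. both groups constant) counts as a non-rejection.
        t_rejections += bool(t_p < alpha)
        u_rejections += bool(u_p < alpha)
    return t_rejections / reps, u_rejections / reps


if __name__ == "__main__":
    symmetric = [-1.5, -0.5, 0.5, 1.5]  # roughly bell-shaped responses
    skewed = [0.0, 0.5, 1.0, 1.5]       # about half the responses pile up at 1
    for label, cuts in (("symmetric", symmetric), ("skewed", skewed)):
        for n in (15, 30, 100):
            t_rate, u_rate = rejection_rates(n, cuts)
            print(f"{label:9s} n={n:3d}  t-test={t_rate:.3f}  Mann-Whitney={u_rate:.3f}")
```

Rates close to `alpha` under the null suggest the test is holding its nominal level for that combination of thresholds and sample size. Varying the number of thresholds, their positions, and `n` is the whole point of the exercise.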

#### victorxstc

##### Pirate
There was once a complete discussion on this, and I recall Dason quoted his prof as calling infinity "about 30 samples"!! (You can search for that thread and other ones; there are a lot of fine topics.) There is no cutoff for that; they just say "large enough". The larger the better, but at around 30 or more, the CLT starts to kick in!

#### hlsmith

##### Omega Contributor
I use the Mann-Whitney U-test (median) for this type of data. The generic cutoff you normally see for using a t-test is 30 persons in the study; however, that seems risky in this scenario.

What I would do is take the difference between the mean and each observation for the variable (i.e. the residuals), then plot these differences. If they look normal, then you may have a case for further examination and use of a t-test.
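A small sketch of that residual check, using only the Python standard library. The two response lists are made-up illustrative data, `group_residuals` is a hypothetical helper, and centring each group on its own mean is an assumption about what was meant. A text histogram stands in for a proper plot.

```python
from collections import Counter
from statistics import mean


def group_residuals(values):
    """Return each observation minus the mean of its own group."""
    m = mean(values)
    return [v - m for v in values]


# Made-up 1-5 ratings for two groups (illustration only).
group_a = [1, 2, 2, 3, 3, 3, 4, 4, 5]
group_b = [2, 3, 3, 4, 4, 4, 5, 5, 5]

residuals = group_residuals(group_a) + group_residuals(group_b)

# Crude text histogram of the pooled residuals.
counts = Counter(round(r, 2) for r in residuals)
for value, count in sorted(counts.items()):
    print(f"{value:+.2f} {'#' * count}")
```

With only five response options the residuals can take only a handful of distinct values per group, so the plot will always look lumpy; the question is whether it is at least roughly symmetric and unimodal.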

#### CowboyBear

##### Super Moderator
> Since this type of data is ordinal and not continuous

I agree, but I'll just pedantically note that it's also possible for a variable to be continuous and ordinal (e.g., a visual analogue scale). Whether a variable is ordinal vs interval vs ratio is a measurement theory issue; whether it's continuous or discrete is more of a distributional issue. You may need to clarify what you're most worried about here:

1) Are you concerned that the distributional/statistical assumptions of the t-test won't be satisfied? (i.e. because Likert data will by definition result in non-normal errors). If so, a large-ish sample deals with this (how large I don't know - simulations do sound good. It will depend on the number of items for the DV, the number of response options for each item, and the observed distribution of responses).

2) Are you concerned with the old measurement argument that parametric tests are only admissible with interval or ratio data, but not ordinal data? (I.e. the S.S. Stevens argument). This is a measurement theory issue, not a statistical or distributional one, and sample size isn't really relevant to this at all. (Note: Most people ignore this argument nowadays, but the fact that you mention ordinality makes me bring it up)

> it would seem to lend itself to using a Mann-Whitney U-test to compare the medians.

The Mann-Whitney U can only be interpreted as testing a null hypothesis of equal medians under quite restrictive conditions (i.e. identical distributions in each group bar a possible location shift). The more general null hypothesis it tests is "that P(X < Y) = 0.5, where X and Y are random samples from the two populations at interest" (Fagerland 2009, see linked paper).
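To make the P(X < Y) reading concrete, here is a small standard-library sketch. The data are made up and `prob_superiority` is a hypothetical helper name. Because ordinal data are full of ties, it uses the common convention of counting each tied pair as one half, which matches how the U statistic treats ties.

```python
from statistics import median


def prob_superiority(x, y):
    """Estimate P(X < Y) + 0.5 * P(X = Y) over all pairs (one value from x, one from y)."""
    less = sum(xi < yi for xi in x for yi in y)
    ties = sum(xi == yi for xi in x for yi in y)
    return (less + 0.5 * ties) / (len(x) * len(y))


# Made-up ratings: both groups have the same median but different shapes.
x = [1, 1, 3, 3, 3]
y = [3, 3, 3, 5, 5]

print("median x:", median(x), "median y:", median(y))
print("estimated P(X < Y) + 0.5 P(X = Y):", prob_superiority(x, y))
```

Both medians are 3 here, yet the estimate is 0.82, far from 0.5. The two distributions are not location-shifted copies of each other, which is exactly the situation where a significant Mann-Whitney result should not be read as a difference in medians.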

#### lancearmstrong1313

##### New Member
This doesn't actually involve anything I'm working on. I was just kind of curious (hypothetical situation).