Comparing categorical data

I'm comparing two groups in a large dataset. For some of the variables, such as # of arrests, the following 5 choices are available: 1, 2, 3, 4, 5 or more
This seems like ordinal categorical data. Can I just use the 5 choices on their own to compare across groups, like doing a chi squared analysis and regression models? Or should I be doing any sort of dummy coding or watching out for anything else?


Active Member
The default choice for comparing two groups on ordinal data is the Mann-Whitney U test.
You can also use the chi-square test of independence, which is usually used for categorical data without a natural order (not ordinal data), so it ignores the ordering of your 5 choices.

The two tests don't answer exactly the same question.

The chi-square test will show you whether there is any significant difference between the two distributions, while the Mann-Whitney U test compares the ranks.
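
To make the difference concrete, here is a minimal sketch in Python, assuming SciPy is available. The counts are made up purely for illustration, and coding "5 or more" as 5 is my assumption. Both hypothetical groups are symmetric around category 3, so the distributions clearly differ but neither group tends to rank higher than the other.

```python
import numpy as np
from scipy.stats import chi2_contingency, mannwhitneyu

# Hypothetical counts per arrest category: 1, 2, 3, 4, "5 or more" (coded as 5).
counts_group1 = [10, 20, 40, 20, 10]   # concentrated in the middle
counts_group2 = [30, 15, 10, 15, 30]   # concentrated at the extremes

# Chi-square test of independence on the 2 x 5 table (ignores category order).
table = np.array([counts_group1, counts_group2])
chi2, p_chi2, dof, expected = chi2_contingency(table)

# Mann-Whitney U test needs one value per subject, so expand the counts.
categories = np.arange(1, 6)
group1 = np.repeat(categories, counts_group1)
group2 = np.repeat(categories, counts_group2)
mw = mannwhitneyu(group1, group2, alternative="two-sided")
p_mw = mw.pvalue

print(f"chi-square: chi2={chi2:.2f}, dof={dof}, p={p_chi2:.2g}")
print(f"Mann-Whitney: p={p_mw:.2g}")
```

With these made-up counts the chi-square test should flag a difference while the Mann-Whitney test should not, because the groups differ in spread rather than in which one tends to be higher.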

For example, the chi-square test may say that there is a significant difference between the groups, while the Mann-Whitney U test may say: "The randomly selected value of the Group1's population is considered to be greater than the randomly selected value of the Group2's population" (wording taken from an online calculator).
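
The "randomly selected value" wording can be checked directly. The sketch below estimates the chance that a random Group1 value beats a random Group2 value, counting ties as half, which is the quantity the Mann-Whitney U statistic is built on. It is only an illustration: the counts are hypothetical, and "5 or more" is coded as 5 by assumption.

```python
import numpy as np

# Hypothetical counts per arrest category: 1, 2, 3, 4, "5 or more" (coded as 5).
categories = np.arange(1, 6)
group1 = np.repeat(categories, [5, 10, 50, 30, 5])
group2 = np.repeat(categories, [5, 30, 50, 10, 5])

# Compare every Group1 value with every Group2 value.
diff = group1[:, None] - group2[None, :]
p_greater = np.mean(diff > 0)   # share of pairs where Group1 is higher
p_tied = np.mean(diff == 0)     # share of tied pairs

# Probability of superiority: ties count as half a win.
prob_superiority = p_greater + 0.5 * p_tied
print(f"P(Group1 > Group2) + 0.5 * P(tie) = {prob_superiority:.2f}")
```

A value near 0.5 means neither group tends to be higher; the further from 0.5, the stronger the tendency.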

If the two distributions have the same shape, you may interpret the Mann-Whitney test as a comparison of the medians. But with only 5 possible values, the notion of "shape" is limited, so be careful with that interpretation.
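
As a rough illustration of why the median interpretation is fragile with so few categories, here is a sketch with hypothetical counts (again coding "5 or more" as 5, which is an assumption). Both groups have the same median, yet the Mann-Whitney test should still report a difference because the shapes are mirror images of each other.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Hypothetical counts per arrest category: 1, 2, 3, 4, "5 or more" (coded as 5).
categories = np.arange(1, 6)
group1 = np.repeat(categories, [5, 10, 50, 30, 5])   # leans towards 4
group2 = np.repeat(categories, [5, 30, 50, 10, 5])   # leans towards 2

median1 = np.median(group1)
median2 = np.median(group2)

result = mannwhitneyu(group1, group2, alternative="two-sided")
p_value = result.pvalue

print(f"medians: {median1} vs {median2}")
print(f"Mann-Whitney p-value: {p_value:.2g}")
```

So a significant Mann-Whitney result on this kind of data is safer to report as "one group tends to have higher values" than as "the medians differ".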