Shouldn't it be a Mann-Whitney U?

With a small sample you can't check the normality assumption, so you can use a non-parametric test and get a clear-cut result, whereas you don't know how to interpret a borderline t-test result when you aren't sure the assumptions hold.

But since we use ranks, we also lose information, which is critical in **very small** samples. If you enumerate all the orderings with only 4 subjects (1 in group A, 3 in group B), you can't get a p-value below 0.25: of the 4! possible orderings, only 1!*3! put the group A observation at a given edge, so the best one-tailed p-value is 1!*3!/4!, or just ¼.

(If one group has only 1 observation, the other group must have 19 observations to reach a "best" p-value of 0.05. With a group of 2 vs. a group of 4 you can do better than ¼: the "best" p-value is (2!*4!)/6! ≈ 0.067.) With 2 groups of 3, the best p-value is 3!*3!/6! = 0.05.

This is only the **best possible p-value**, for the extreme case.
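The counting argument above can be checked with a few lines of Python. This is a minimal sketch, assuming no ties and a one-tailed exact rank test; the function name `best_one_sided_p` is made up for illustration. It uses the fact that n_a!*n_b!/(n_a+n_b)! is the same as 1 over the number of ways to choose which ranks go to group A.

```python
from math import comb

def best_one_sided_p(n_a: int, n_b: int) -> float:
    """Smallest one-tailed p-value an exact rank test can give (no ties assumed)."""
    # Only one of the C(n_a + n_b, n_a) equally likely rank assignments
    # puts every group-A observation below every group-B observation.
    return 1 / comb(n_a + n_b, n_a)

# Group sizes mentioned in the text above.
for n_a, n_b in [(1, 3), (1, 19), (2, 4), (3, 3)]:
    print(f"{n_a} vs {n_b}: best p = {best_one_sided_p(n_a, n_b):.3f}")
```

No matter how far apart the two groups are, the p-value cannot go below this floor.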

Edge Example (as Greta suggested)

With a **rank test**, if you compare 2 groups of elephants, A: [800kg], B: [802kg, 860kg, 890kg], you will get the same result as comparing a mouse to elephants, A: [0.02kg], B: [802kg, 860kg, 890kg]: a "best" p-value of 0.25.
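To see this concretely, here is a small sketch of an exact one-tailed rank-sum test done by brute-force enumeration. It assumes no ties and tests whether group A ranks lower than group B; the helper `exact_rank_p_lower` is hypothetical, not a library function.

```python
from itertools import combinations

def exact_rank_p_lower(a, b):
    """Exact one-tailed p-value that group a ranks lower than group b (no ties assumed)."""
    pooled = sorted(a + b)
    rank = {value: i + 1 for i, value in enumerate(pooled)}
    observed = sum(rank[value] for value in a)
    # Under the null, every choice of len(a) ranks for group a is equally likely.
    rank_sums = [sum(c) for c in combinations(range(1, len(pooled) + 1), len(a))]
    return sum(s <= observed for s in rank_sums) / len(rank_sums)

elephants = exact_rank_p_lower([800.0], [802.0, 860.0, 890.0])
mouse = exact_rank_p_lower([0.02], [802.0, 860.0, 890.0])
print(elephants, mouse)
```

Both calls see the same ranks (A gets rank 1), so the test cannot tell an 800kg elephant from a 0.02kg mouse.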

But if you run a t-test and get p-value=0.0000001, it may be incorrect because you don't meet the normality assumption, and the p-value should really be only 0.01 (and you need to **know/guess the standard deviation**).
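For contrast, here is a rough sketch of what a t-test does with the same two examples. It is not a recommendation: it assumes a pooled-variance (equal variance) one-tailed t-test, which is my choice for illustration because group A has a single observation and its own standard deviation cannot be estimated. The function name is hypothetical, and the closed-form CDF used here is valid only for 2 degrees of freedom, which is what 1 + 3 observations give.

```python
from math import sqrt
from statistics import mean

def pooled_t_lower_p_df2(a, b):
    """One-tailed pooled-variance t-test p-value (a lower than b); df = 2 only."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    if df != 2:
        raise ValueError("the closed-form CDF below is only valid for df = 2")
    m_a, m_b = mean(a), mean(b)
    # Pooled sum of squares; a single observation contributes nothing.
    ss = sum((x - m_a) ** 2 for x in a) + sum((x - m_b) ** 2 for x in b)
    se = sqrt(ss / df) * sqrt(1 / n_a + 1 / n_b)
    t = (m_a - m_b) / se
    # Student t CDF with 2 degrees of freedom: F(t) = 1/2 + t / (2 * sqrt(2 + t^2))
    return 0.5 + t / (2 * sqrt(2 + t * t))

p_elephant = pooled_t_lower_p_df2([800.0], [802.0, 860.0, 890.0])
p_mouse = pooled_t_lower_p_df2([0.02], [802.0, 860.0, 890.0])
print(p_elephant, p_mouse)
```

Unlike the rank test, the t-test gives the mouse a much smaller p-value than the elephant, but that number is only as trustworthy as the normality and equal-variance assumptions behind it.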