# Simple 4 group comparison of means problem

#### RGrieves

##### New Member
I'll start by saying that I'm a neuroscience PhD student and have searched a long time for a solution to this problem without finding one. If you only have time to give a rough idea or point me in the right direction, I'm grateful for any help.

I am measuring the frequency of the same material under four different conditions, e.g.:

Condition 1 = 1 2 3 1 2 1 2 3 1 3
Condition 2 = 1 3 2 1 1 1 2 2 1 1 2 1
Condition 3 = 1 2 1 2 1 2 1 2 2
Condition 4 = 10 12 11 9 8 11 14 12 11

I have lots of these data sets; the number of measurements in each condition varies and the groups are not necessarily equal in size, although the group sizes don't deviate much from 10.

It is obvious that in condition 4 the frequency was consistently higher; however, I can't find a good statistical test to show this. I can run an ANOVA to see if there is a significant overall effect (at the moment I'm using a permutation F test), but this doesn't tell me specifics. I can run multiple comparisons, but then the corrected significance threshold is driven down, because I have to compare every group to every other group before I can say the frequency was higher in condition 4 than in any of the others.
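The permutation F test mentioned above can be sketched in a few lines (Python/NumPy here purely for illustration, although my pipeline is in Matlab; the function names are my own):

```python
import numpy as np

def f_stat(groups):
    """One-way ANOVA F statistic: between-group over within-group variance."""
    all_vals = np.concatenate(groups)
    grand = all_vals.mean()
    k, n = len(groups), len(all_vals)
    ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def permutation_f_test(groups, n_perm=10000, seed=0):
    """Shuffle group labels and count how often the permuted F
    meets or exceeds the observed F."""
    rng = np.random.default_rng(seed)
    sizes = [len(g) for g in groups]
    pooled = np.concatenate(groups)
    observed = f_stat(groups)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        permuted = np.split(pooled, np.cumsum(sizes)[:-1])
        if f_stat(permuted) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)

# The example data from above:
c1 = np.array([1, 2, 3, 1, 2, 1, 2, 3, 1, 3])
c2 = np.array([1, 3, 2, 1, 1, 1, 2, 2, 1, 1, 2, 1])
c3 = np.array([1, 2, 1, 2, 1, 2, 1, 2, 2])
c4 = np.array([10, 12, 11, 9, 8, 11, 14, 12, 11])
f_obs, p = permutation_f_test([c1, c2, c3, c4])
```

On this example the omnibus test is overwhelmingly significant, which is exactly the problem: it says nothing about *which* condition differs.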

Basically, I am writing a piece of code in Matlab to analyse data sets like the one shown above, and I need a test I can run on the data which will tell me whether the groups are all similar, whether one is significantly higher than the rest, whether two are higher than the other two, and so on. The condition with the highest frequency is not always the same one, and often there are conditions with statistically similar frequencies.

I have looked forever for a good test or method but just can't find one: they all require equal group sizes, or independent groups, or normally distributed data.

Any help would be greatly appreciated,

Roddy.

#### hlsmith

##### Less is more. Stay pure. Stay poor.
This should all depend on what you proposed before you started your project. However, if the residuals are normal you typically run an ANOVA and, if it is significant, pairwise t-tests with alpha corrected for repeated testing. If the data are not normal, then typically a Kruskal-Wallis test followed by corrected Wilcoxon tests.
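For example, the non-parametric branch of that recipe could look like this in Python with scipy (a sketch only; in Matlab the analogous functions would be kruskalwallis and ranksum):

```python
from itertools import combinations
from scipy.stats import kruskal, mannwhitneyu

# The four example conditions from the original post:
groups = {
    1: [1, 2, 3, 1, 2, 1, 2, 3, 1, 3],
    2: [1, 3, 2, 1, 1, 1, 2, 2, 1, 1, 2, 1],
    3: [1, 2, 1, 2, 1, 2, 1, 2, 2],
    4: [10, 12, 11, 9, 8, 11, 14, 12, 11],
}

# Omnibus test: are the four conditions drawn from the same distribution?
h, p_omnibus = kruskal(*groups.values())

# If significant, follow up with pairwise rank-sum (Mann-Whitney) tests,
# Bonferroni-correcting alpha for the number of comparisons.
pairs = list(combinations(groups, 2))
alpha = 0.05 / len(pairs)          # 6 pairwise comparisons
significant = {}
for a, b in pairs:
    _, p = mannwhitneyu(groups[a], groups[b], alternative='two-sided')
    significant[(a, b)] = p < alpha
```

On this example, only the three comparisons involving condition 4 survive the correction, which is the pattern the data visibly show.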

#### RGrieves

##### New Member
We hypothesized that we would see either two conditions with similar values or one condition higher than all the rest; it could have gone either way. Seeing the data now, I know that one condition produced higher frequencies than the others, but of course I need a test that addresses the hypothesis we set out with.

I know I should run an ANOVA with multiple comparisons as a post hoc (or the non-parametric equivalent). However, with that method I find I get many significant ANOVAs followed by non-significant multiple comparisons - not because the data are not significantly different, but because I have to compare all the groups, which with four groups means six pairwise comparisons, which means a very small corrected alpha...
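One standard middle ground here is Holm's step-down procedure: it controls the family-wise error rate just like Bonferroni, but is uniformly more powerful and needs no extra assumptions. A sketch (Python; the raw p-values below are hypothetical, just to illustrate the six-comparison case):

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values: sort ascending, multiply the
    i-th smallest by (m - i), then enforce monotonicity. Controls the
    family-wise error rate but rejects at least as often as Bonferroni."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * p_values[i]))
        adjusted[i] = running_max
    return adjusted

# Hypothetical raw p-values from six pairwise tests (the three
# comparisons against the high condition small, the rest not):
raw = [0.0003, 0.0002, 0.0004, 0.25, 0.60, 0.41]
adj = holm_adjust(raw)
```

An adjusted p-value below 0.05 can then be reported directly, with no further correction.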

One suggestion I have come across before is to run the multiple comparisons with a very liberal correction (e.g. LSD, or no correction at all) and again with a quite conservative correction (e.g. Scheffé or Šidák), and then compare the overall results of these tests once all the data have been analysed - if the overall conclusion remains the same regardless of the correction, then the effect is clearly there. Does that seem reasonable to you?

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Hate to be the one to say it, but that approach always seems fishy. Ideally, what you should probably do is use these data to fuel some sample size calculations and then start over, collecting enough observations to power your test (if possible). Or report your underpowered results.

#### RGrieves

##### New Member
Collecting more data is unfortunately not possible - we have to work with the data we have. The study may be underpowered, but in practice it is often not possible to collect masses of data at a time, depending on the nature of the materials.

It seems odd to me that there are many alternative multiple-comparison corrections but no real alternative to doing multiple comparisons. We can dress it up as much as we want, but in reality we are just running a test that has been around for donkey's years over and over again.