I have 8 experimental datasets with 24 data points each, which I compared in R using ANOVA followed by post hoc Tukey's HSD. The data meet the ANOVA assumptions. Two of the groups differed significantly (at the 0.01 level) from all the others; no other comparisons were significant.

But if I remove the two significantly different datasets and run the same analysis on the remaining 6, there are significant differences between them, and not marginal ones: p < 0.01.

Is this OK, or does it mean I did something wrong in my script? I can see that having more groups means more caution in deciding that differences are significant, but this seems like a big jump, and I thought ANOVA was designed to account for the number of groups.

This is the relevant bit of the script:

strain.aov <- aov(abs ~ strain, data = df)
strain.tukey <- TukeyHSD(strain.aov)

and df contains either 8 or 6 strains, each with 24 absorbance readings.

Thanks for any advice. I'm not trying to slice up my data arbitrarily to get significance; I'm wondering why this happens and whether I've misunderstood something.


TS Contributor
Tukey's HSD reduces the per-comparison (individual) alpha in order to control the family-wise alpha. The size of that adjustment depends on the number of pairwise comparisons, so removing groups removes comparisons and the allowable per-comparison alpha increases.
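To put numbers on this, here is a sketch in Python rather than the original R (using SciPy's studentized range distribution, which underlies Tukey's HSD; the 24-per-group sample size is taken from the post, and a 0.05 family alpha is assumed for illustration):

```python
# How the Tukey critical value grows with the number of groups k.
# Assumes 24 observations per group, one-way ANOVA, family alpha = 0.05.
from scipy.stats import studentized_range

n_per_group = 24
for k in (6, 8):
    df_error = k * n_per_group - k  # error degrees of freedom: N - k
    q_crit = studentized_range.ppf(0.95, k, df_error)
    print(f"k={k}: error df={df_error}, critical q(0.95)={q_crit:.3f}")
```

The critical q grows with k, so every pairwise difference has to clear a higher bar with 8 groups than with 6, even though the same two groups are being compared.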
Thank you,

I tried pairwise.t.test with and without a Bonferroni correction, and the values were very different. I guess the number of groups makes more of a difference than I expected.
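The Bonferroni side of this is easy to check by hand: with a family-wise alpha of 0.05, the per-comparison threshold is 0.05 divided by the number of pairs, and the pair count grows quickly with the number of groups. A quick sketch (Python rather than R; the 0.05 family alpha is an assumption for illustration):

```python
# Per-comparison Bonferroni threshold for k groups at family alpha = 0.05.
from math import comb

alpha_family = 0.05
for k in (6, 8):
    n_pairs = comb(k, 2)  # number of pairwise comparisons: k choose 2
    alpha_per_test = alpha_family / n_pairs
    print(f"k={k}: {n_pairs} pairs, per-test alpha = {alpha_per_test:.5f}")
```

Going from 6 to 8 groups nearly doubles the number of pairs (15 vs 28), which is why the adjusted p-values shift so much.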

But doesn't that mean that if you have a lot of groups, you will miss real differences purely because of the correction?