Significant and not significant depending on the test. What to trust?


I'm a PhD student in Bioengineering (sorry in advance for my poor English, by the way). I'm analyzing some data to be included in a paper and have some concerns about which test would be the proper one for me to use. Basically, I'm studying the effect of increasing the concentration of a given reagent on the time the chemical reaction takes to start (from now on, latency time). When plotting the data (mean +/- SD), I see two particular consecutive concentrations (C1 and C2; N=9) with slightly different means and with SDs that make the values overlap partially in the plot. To the naked eye, one would say both "populations" are not significantly different because of the overlap.

Generally, I like the Kolmogorov-Smirnov test since it takes into account all possible differences between groups, not just the mean. By an unpaired two-sample K-S test, the two populations I mentioned were not significantly different, but then I decided to perform an ANOVA since everybody does it (in most cases without checking its assumptions, though)... and guess what: the two groups were significantly different (and the same happened when performing a t-test).

Then the question arose: should I trust K-S or ANOVA/t-test? To me it would be odd to say two groups that overlap in terms of mean +/- SD are significantly different, but maybe it's just a conceptual problem of mine.

From what I've read, the unpaired two-sample K-S test is a pretty powerful test for small samples, and I thought it would be a good alternative given the heteroscedasticity I found in my data. But I've also read that it's not a decent test, so I don't know what to trust.

I'd really appreciate your help, since my mates and bosses can't seem to reach a consensus. Most of them agree with "seeing the groups as populations (mean +/- SD)" but, in the end, they systematically run ANOVA tests.

Thanks in advance.

First, you should explain more about your groups: how many groups do you have? What is your total sample size? Also, what are your exact p-values for the K-S test and the ANOVA?

Second, K-S compares distributions. The nonparametric alternative to ANOVA is Kruskal-Wallis; the alternative to the t-test is Mann-Whitney. So why did you run a K-S test in the first place? At first I confused K-S with K-W, but then saw you had actually run something seemingly irrelevant. If your design calls for an ANOVA, K-S is not even a test of choice to be compared with ANOVA.

Third, if ANOVA's assumptions are met, it is a better option than the K-W test (K-S is not even a choice).

Fourth: you shouldn't run t-tests alongside ANOVA. Instead, you should run ANOVA's specific post hoc tests, such as Tukey's.

Fifth: visual assessment of standard deviations (looking for overlaps) is not something you can rely on much, especially in an ANOVA setup. Overlapping mean +/- SD bars do not by themselves imply a non-significant difference, so don't invalidate test results based only on some visual overlap.

ANOVA and K-S answer different questions. In your case, ANOVA seems to be the correct one, if its assumptions are met; otherwise, do a K-W test instead.
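To make the "different questions" point concrete, here is a minimal sketch in Python comparing the three tests on two small made-up samples (N=9 each, standing in for the latencies at two concentrations; the numbers are invented, not the original poster's data):

```python
import numpy as np
from scipy import stats

# Hypothetical latency times at two concentrations (made-up data).
rng = np.random.default_rng(0)
c1 = rng.normal(loc=100, scale=15, size=9)  # smaller spread
c2 = rng.normal(loc=130, scale=40, size=9)  # larger spread (heteroscedastic)

# K-S compares the whole empirical distributions...
ks = stats.ks_2samp(c1, c2)
# ...Welch's t-test compares the means without assuming equal variances...
tt = stats.ttest_ind(c1, c2, equal_var=False)
# ...and Kruskal-Wallis is the rank-based analogue of one-way ANOVA.
kw = stats.kruskal(c1, c2)

print(f"K-S: p={ks.pvalue:.3f}  Welch t: p={tt.pvalue:.3f}  K-W: p={kw.pvalue:.3f}")
```

With small, heteroscedastic samples like these, it is entirely possible for the three p-values to fall on different sides of 0.05, which is exactly the disagreement described above.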
Hi, Victor.

Wow, I just realized I'm a fool when it comes to statistics! First of all, thanks for your reply. Let me go through your questions.

1.- I have 7 groups with 9 measures each, making N=63; p=0.037 (Kolmogorov-Smirnov). Regarding ANOVA, p=0.000 assuming the variances of the two groups were equal. I just checked their variances and saw they were 8362.5 and 1709.028. Given these variances, is it feasible to run ANOVA with a post hoc test not assuming homoscedasticity, or should I move to Kruskal-Wallis*? Is there some rule/threshold to know when to consider your variances "equal"? For example, I've read one shouldn't run ANOVA if the SD of one sample is more than 2x the SD of another (at this point, I don't even know if this rule is true).
*When assuming equal variances, p=0.000; with Tamhane's post hoc test, p=0.155. Now I'd trust the latter more, but I still don't know if this p is something I can trust with such a difference in variance.

3.- I chose Kolmogorov-Smirnov since I read it can detect any kind of difference between groups and has good power compared to the t-test for small samples. But I see that's for when one wants to check differences in distribution.

4.- I assume, from what you said, that running multiple t-tests would introduce more error than running ANOVA with a post hoc test. Which one would you recommend? Some mates tell me Tukey and others Bonferroni.

Many thanks for your help.
1. Instead of rules and thresholds, you can use a test. Levene's test checks whether the group variances are different or not. I think SPSS already reports it in its ANOVA tables; check its p-value. If the variances are not OK, the K-W test can still do the job, and it is a rather powerful test too.
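The same decision can be sketched outside SPSS. Below is a hedged Python example using scipy's Levene test on three hypothetical groups (the real study has 7 groups of 9; these numbers are made up just to show the branching logic):

```python
import numpy as np
from scipy import stats

# Three hypothetical groups of N=9 with very unequal SDs (made-up data).
rng = np.random.default_rng(1)
groups = [rng.normal(loc=100, scale=s, size=9) for s in (10, 40, 90)]

# Levene's test: H0 is that all group variances are equal.
lev = stats.levene(*groups)

if lev.pvalue < 0.05:
    # Variances look unequal -> fall back to Kruskal-Wallis.
    result = stats.kruskal(*groups)
    chosen = "Kruskal-Wallis"
else:
    # Homoscedasticity not rejected -> one-way ANOVA is reasonable.
    result = stats.f_oneway(*groups)
    chosen = "one-way ANOVA"

print(f"Levene p={lev.pvalue:.3f} -> {chosen}, p={result.pvalue:.3f}")
```

Note that using Levene's result as a hard gate is itself a convention; with only 9 observations per group, Levene's test has limited power, so a clearly large variance ratio (like the 8362.5 vs 1709 reported above) is worth taking seriously regardless of its p-value.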

1B. I see your K-S and ANOVA are both "significant" (both p-values < 0.05). So why are you worried at all? I remember you told us in your first post that one test showed statistical significance while the other showed the opposite. But seemingly both are significant. Am I missing something?

3. K-S compares the shapes of distributions. It can be powerful, but at detecting something not relevant to your case.

4. Both post hocs are common and good. If you want to compare all the groups with each other, Tukey's is a little better in your case, with its many pairwise comparisons.
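For reference, here is a minimal sketch of Tukey's HSD in Python via `scipy.stats.tukey_hsd` (available in SciPy 1.8+; statsmodels' `pairwise_tukeyhsd` is an alternative). The three groups are again invented, just to show how all pairwise comparisons come out with adjusted p-values:

```python
import numpy as np
from scipy import stats

# Three hypothetical concentration groups, N=9 each (made-up data).
rng = np.random.default_rng(2)
g1 = rng.normal(loc=100, scale=15, size=9)
g2 = rng.normal(loc=110, scale=15, size=9)
g3 = rng.normal(loc=160, scale=15, size=9)

# Tukey's HSD performs all pairwise comparisons with a family-wise
# error-rate correction, unlike running separate t-tests.
res = stats.tukey_hsd(g1, g2, g3)

# res.pvalue[i, j] is the adjusted p-value for comparing group i vs group j.
print(res.pvalue)
```

With 7 groups there are 21 pairwise comparisons, which is exactly the situation where running 21 separate t-tests would badly inflate the overall Type I error rate.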