# Thread: Is it better if it isn't statistically significant?

1. ## Is it better if it isn't statistically significant?

So I am having a discussion with a colleague about a paper we are writing. We have results for one factor with 3 levels (A, B, and C). A Tukey test determined that A and B are both better than C, but that A and B are not different at the 95% confidence level. However, the mean score for A is higher than for B. He is arguing that you can say that A is better than B, you just can't say it is significantly better.

My argument is that you can't even say it is better; the whole point of the test was to determine whether the difference between them was outside the margin of error. The only way you could say it is better would be to state a confidence level at which the difference would be significant. To me, if you say something is better than something else, the implicit assumption is that the difference is statistically significant.

Any thoughts or resources anyone can offer on this subject?
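One way to see the OP's point numerically is to look at the confidence interval for the A minus B difference rather than the point estimates. This is a minimal sketch with made-up scores (the thread does not give the actual data), using a hard-coded t critical value so it needs only the standard library:

```python
import statistics

# Hypothetical scores for treatments A and B (made up for illustration)
a = [82, 85, 79, 88, 84, 81, 86, 83]
b = [80, 84, 78, 85, 82, 79, 83, 81]

n_a, n_b = len(a), len(b)
mean_a, mean_b = statistics.mean(a), statistics.mean(b)
var_a, var_b = statistics.variance(a), statistics.variance(b)

# Pooled-variance two-sample t interval for mean(A) - mean(B)
sp2 = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
se = (sp2 * (1 / n_a + 1 / n_b)) ** 0.5
t_crit = 2.145  # t(0.975, df = 14), hard-coded to avoid a SciPy dependency

diff = mean_a - mean_b
ci = (diff - t_crit * se, diff + t_crit * se)
print(f"mean difference = {diff:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
# The CI contains 0: the data do not support saying A is better than B,
# even though A's sample mean is higher.
```

Here A's sample mean is 2 points higher, yet the interval spans zero, which is exactly the situation the OP describes: the direction of the point estimates alone does not license a "better" claim.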

2. ## Re: Is it better if it isn't statistically significant?

You normally wouldn't say it is "better" at all, since that implies a substantive judgement about what is good and bad that statistics can't really prove. "Higher" or "lower" makes more sense. In journals I have seen, they will at times say A is higher or lower than B but the result is not statistically significant.

I think the key here is how close the result is to the significance level, given that any alpha level is essentially arbitrary. If A is higher than B and the p-value is .049, is that really very different from .051 (assuming an alpha level of .05 is being used)? If someone chooses .05, then an effect with p = .049 is significant, but if they choose .01 it is not. So what does significance really mean? The real point I am making is that declaring an effect real or not (you are really saying whether it is likely to exist in the population, not just the sample) simply because it is above or below a specific p-value is doubtful, however common that is in the world of research.

It is common to note whether something is significant at a certain level in tables where you report statistical tests.
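The point about the cut-off being artificial can be made in two lines of code. With a hypothetical p-value of .049 (made up; the thread never states the actual one), the reject/fail-to-reject decision flips depending purely on which alpha was chosen in advance:

```python
# Hypothetical p-value for the A vs B comparison (made up for illustration)
p_value = 0.049

decisions = {}
for alpha in (0.05, 0.01):
    decisions[alpha] = "reject H0" if p_value < alpha else "fail to reject H0"
    print(f"alpha = {alpha}: {decisions[alpha]}")
# The same data "is significant" under alpha = .05 and "is not" under .01,
# which is the sense in which the cut-off is artificial.
```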


4. ## Re: Is it better if it isn't statistically significant?

You are absolutely right. Your colleague is looking at the point estimate, which is higher for A, but the point estimate is a random variable, so there is no point (pun intended) in basing a decision on it. Confidence intervals are the right choice for a decision.

regards


6. ## Re: Is it better if it isn't statistically significant?

Thank you both for your replies. I think you have given me the information I need in order to correctly formulate my argument.

7. ## Re: Is it better if it isn't statistically significant?

An always-good rule of thumb is to look at past journals that deal with your topic and see how they handle it, understanding that journals often get statistics wrong (including elite ones).

There is supposedly a move (I no longer read academic research the way I used to) to start ignoring p-values entirely and focus on effect size instead. I have my doubts that this will really occur.

8. ## Re: Is it better if it isn't statistically significant?

I would be willing to say the following two things:
1. We were unable to demonstrate that A is different from B at 95% confidence
2. It's more likely than not that A is higher than B (there's a greater than 50% chance that A is higher than B)

I'd like to hear if the stats experts (I'm an engineer, not a statistician) have any quibbles with statement #2.

9. ## Re: Is it better if it isn't statistically significant?

I am not a statistician either (I am a data analyst), but I am pretty sure 2 is not correct. I don't think you can say there is a greater than 50 percent chance (or any percent chance). What I have seen said, although I do not know if it is statistically correct, is that the p-value is close to the level chosen to reject the null (with the implied sense that A might be larger than B). This is a grey area, because it deals with interpretations that social scientists place on p-values which (I think) have nothing to do with the statistics at all.

The history of the interpretation of p-values, particularly outside formal statistical occupations, is long and fascinating, and was mostly settled before 1940 (since then relatively little thought appears to have been given to it, although as I noted before this may be changing).

10. ## Re: Is it better if it isn't statistically significant?

Thanks for the response, noetsi.

Originally Posted by noetsi
I am not a statistician either, I am a data analyst, but I am pretty sure 2 is not correct. I don't think you can say there is a greater than 50 percent chance (or any percent chance).
Ok, so why do you think that?

To me it makes intuitive sense that if I have, say, two groups with 10 data points each, and Group A has a value of 80 while Group B has a value of 90, I can say there is a better chance that Process B has a higher value than Process A than the reverse. I wouldn't attach any particular percentage to it, but if I had to implement one process or the other (and the cost to implement is the same), I would implement Process B, because its sample was higher than Process A's. Assume higher = more desirable.

If possible, I'd like to understand why I'm either right or wrong in my intuition based on statistical principles.

11. ## Re: Is it better if it isn't statistically significant?

Well, I'd say that if your null hypothesis for the A-B comparison is that A = B, and the p-value does not reach the pre-specified level, then you would conclude that there is no evidence, or at least insufficient evidence, that A ≠ B, and therefore fail to reject the null hypothesis that A = B. So in that sense it's not a matter of whether one is bigger than the other, since you essentially aren't able to tell that they are different at all.

My impression is that in some literature hypothesis testing isn't so strict, or is in some cases frowned upon, but in those cases I think the actual p-value (or the bounds of the confidence interval) and the effect size, together with previously published evidence for the comparison of interest, are used for interpretation.

12. ## Re: Is it better if it isn't statistically significant?

My argument is that you can't even say it is better; the whole point of the test was to determine whether the difference between them was outside the margin of error.
If you use the Neyman-Pearson decision-making (hypothesis testing) approach, yes. There, you want to decide between competing hypotheses, and you reject the null hypothesis if the sample statistic falls outside the pre-specified region (most often associated with α = 5%).

But if you use the Fisherian significance testing approach, you'll consider the p-value and contemplate how much evidence there is against the null hypothesis. By the way, what was the actual p-value for the A versus B comparison, and how large was your sample size?

With kind regards

K.
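The Fisherian reading treats the p-value itself as the quantity of interest. Since the thread's actual data aren't shown, here is a sketch with made-up scores, computing a two-sided p-value by a permutation test (pure stdlib, no SciPy):

```python
import random

random.seed(0)

# Hypothetical scores for A and B (made up for illustration)
a = [82, 85, 79, 88, 84, 81, 86, 83]
b = [80, 84, 78, 85, 82, 79, 83, 81]

observed = sum(a) / len(a) - sum(b) / len(b)

# Permutation test: under H0 the group labels are exchangeable,
# so reshuffle the labels and see how often the difference is as extreme
pooled = a + b
count = 0
n_perm = 10_000
for _ in range(n_perm):
    random.shuffle(pooled)
    diff = sum(pooled[:8]) / 8 - sum(pooled[8:]) / 8
    if abs(diff) >= abs(observed):
        count += 1

p_value = count / n_perm
print(f"observed difference = {observed:.2f}, two-sided p ≈ {p_value:.3f}")
# A Fisherian reading reports this p as graded evidence against H0;
# a Neyman-Pearson reading only compares it to a pre-specified alpha.
```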

13. ## Re: Is it better if it isn't statistically significant?

Your colleague is correct. The p-value cut-off of 0.05 is completely arbitrary. If you chose a larger one, the difference would be significant. Or, with a larger sample and the same effect size, it would be significant at 0.05.

The significance level is the chance of falsely rejecting a true null hypothesis. But usually the null is false anyway, so you cannot possibly make a Type I error. The real question is how big the effect is and how precisely it is estimated.
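The "larger sample, same effect size" point follows directly from how the test statistic scales with n. A small sketch with assumed numbers (the effect size and standard deviation here are made up):

```python
# Same hypothetical effect size at growing sample sizes: the t statistic
# grows like sqrt(n), so a fixed effect eventually becomes "significant".
effect = 2.0   # assumed mean difference between A and B
sd = 5.0       # assumed common standard deviation

t_stats = {}
for n in (10, 40, 160):
    se = sd * (2 / n) ** 0.5      # standard error of a difference in means
    t_stats[n] = effect / se
    print(f"n per group = {n:4d}: t = {t_stats[n]:.2f}")
# t doubles each time n quadruples; against a roughly fixed critical value
# (~2 for alpha = .05), the same effect crosses into "significant" as n grows.
```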

14. ## Re: Is it better if it isn't statistically significant?

A point of all of this is that statisticians, and even more so social scientists, disagree on what exactly p means. Different schools have different interpretations; for example, the frequentist and Bayesian takes on what it means are significantly different. I gave the view that is most common in my field: if p is above the set alpha level, you can say nothing at all about the null. Literally, you don't know if the effect size is real or created by random sampling error.

15. ## Re: Is it better if it isn't statistically significant?

Originally Posted by noetsi
A point of all of this is that statisticians, and even more social scientist, disagree on what exactly p means. Different schools have different interpretations.
That isn't correct. There really is only one definition. Any statistician worth anything knows the definition of a p-value and understands it. I think what you were trying to get at is that a lot of people doing statistics don't understand what a p-value is (but just because they are "doing" statistics doesn't mean they are statisticians).

Bayesians don't use p-values, but any good Bayesian has learned the frequentist framework and understands what a p-value is.

16. ## Re: Is it better if it isn't statistically significant?

I was actually confusing confidence intervals with p-values in discussing differences between Bayesians and frequentists. According to my professor in CTT/IRT, the way the two groups viewed the former is very different (although the differences seemed technical, substantively they were not).

Statisticians may agree on what a p-value is. But social scientists, and perhaps other disciplines as well, have significantly different interpretations of what p-values entail. The discussion we have had on this thread suggests some of those disagreements. There is also the issue of whether the null can ever be "true" or is simply not rejected.

You are right that I was talking about those who use statistics in academia who are not statisticians. My experience with true statisticians (those whose degrees are in math) is limited.

17. ## Re: Is it better if it isn't statistically significant?

Originally Posted by noetsi
I was actually confusing confidence intervals with p values in discussing differences between Bayseans and frequentist.
Well, Bayesians have "credible intervals", not confidence intervals, so I would say that Bayesians and frequentists don't disagree on how to interpret confidence intervals either. They just disagree on which kind of interval is actually appropriate.