A Bayesian wouldn't interpret a confidence interval in such a way - they would interpret a credible interval in that way. Bayesians don't use confidence intervals.
It has been too many years to be sure, but the way I remember the professor drawing the distinction was that Bayesians would say that for a 95% CI there was a certain probability that the true effect level lay in the CI, while frequentists had an entirely different interpretation of what the 95% CI meant (one far less intuitively obvious to me).

But as I said, it has been many years. The one thing I do remember is the professor stressing strong substantive differences between the Bayesian members of the department and the frequentists on this issue.
"Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995
I don't have emotions and sometimes that makes me very sad.
Hi all, thanks for the discussion. Let me add some more comments.
I agree that I cannot reject the null hypothesis _at the 0.050 level_ (or whatever level I test at). I wouldn’t dispute that at all. But, that’s not really what I’m saying when I say “there’s at least a 50% chance that population A is greater than population B.” I don’t see p-values even entering into it when sample “a” has a greater mean than sample “b”. That fact by itself seems, to me, to lead to being able to say “it’s more likely than not that A is greater than B.”
I see two similar, but very distinct questions. Or, perhaps, two different claims.
Perhaps it would help to talk in terms of a one-sided t-test. If A is greater than B, then the p-value for the one-sided t-test can only be <0.5. It can never be >0.5, or even exactly equal to it.
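To make the one-sided claim concrete, here is a minimal sketch (my own illustration, not from the thread, using numpy and scipy with made-up data): whenever sample a's mean exceeds sample b's, the one-sided t-test against H1: mean(A) > mean(B) necessarily yields p < 0.5, because the t statistic is positive.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical samples; the population means chosen here are arbitrary.
a = rng.normal(0.2, 1.0, size=30)
b = rng.normal(0.0, 1.0, size=30)
if a.mean() < b.mean():
    a, b = b, a  # ensure sample "a" has the larger observed mean

# One-sided test: H1 is mean(A) > mean(B)
t, p = stats.ttest_ind(a, b, alternative="greater")
print(t > 0, p < 0.5)  # the observed ordering forces t > 0, hence p < 0.5
```

The p-value may still be far from significant (e.g. 0.4), which is exactly the gap between "more likely than not" and "significant at 0.05" being debated here.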
Does that bring clarity, or more confusion?
The problem with that interpretation (that is, your original comment, not the one you just added) is that you are assuming the effect size (the difference between A and B) exists in the population rather than just the sample. Instead, the result you found could be due purely to sampling error and not present in the real population. At its heart that is what the null is testing: does the effect size that I found in my sample actually exist in the population (or one more extreme, I believe)? If you had the population, you could make the statement you did and not even need a statistical test. But you don't, and thus nearly any statement you make about the effect size is uncertain.

I don't think this is changed by whether you use a one- or two-tailed test. It is the nature of statistical tests.
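The sampling-error point above can be illustrated with a small simulation (my own hedged sketch, not from the thread): even when A and B are the *same* population, one sample mean exceeds the other essentially every time, and either direction occurs about half the time.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sims, n = 10_000, 20
wins = 0
for _ in range(n_sims):
    # Both samples drawn from the SAME population: true effect size is zero.
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(0.0, 1.0, n)
    wins += a.mean() > b.mean()

frac = wins / n_sims
print(round(frac, 2))  # close to 0.5: the observed ordering is pure sampling error
```

So observing mean(a) > mean(b) in a sample is, by itself, compatible with no population difference at all, which is why the test is needed.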
Last edited by noetsi; 06-30-2015 at 08:46 AM.
That seems like a misinterpretation. It's entirely possible that you get a significant hypothesis test and the "true" effect size is *much* smaller than the observed effect size. Really, if you're testing against a null hypothesis of a zero effect size, then from the result of the hypothesis test all you can really say is whether you have evidence that a non-zero effect size exists. Your best guess at the true effect size is what you observe in the sample, but there is nothing saying that the effect size in the population needs to be what you saw in the sample *or greater*.
The way the p-value was explained to me was: the probability of getting a result at least as extreme as the one you got, if the null is true. But I am sure now that was wrong.
Ah, that is an important difference. I always thought it meant an effect size at least as extreme, not a test statistic.
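The "at least as extreme test statistic" definition can be checked directly (a sketch of my own, not from the thread, using made-up data): the two-sided p-value from a t-test equals the probability, under the null t distribution, of a statistic with absolute value at least as large as the one observed.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
a = rng.normal(0.3, 1.0, 25)
b = rng.normal(0.0, 1.0, 25)

t_obs, p_scipy = stats.ttest_ind(a, b)  # default: two-sided, equal variances

# Recompute the p-value from its definition: P(|T| >= |t_obs|) under H0,
# where T follows a t distribution with n1 + n2 - 2 degrees of freedom.
df = len(a) + len(b) - 2
p_manual = 2 * stats.t.sf(abs(t_obs), df)

print(np.isclose(p_scipy, p_manual))  # the two computations agree
```

Note the extremeness is measured on the test statistic's scale, not the raw effect size's, which is exactly the distinction being made above.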