# Thread: Pop corn vs Chocolate. Which statistical test?

1. ## Re: Pop corn vs Chocolate. Which statistical test?

You can create a corresponding Bayesian model for any frequentist model.

3. ## Re: Pop corn vs Chocolate. Which statistical test?

Prove it!

5. ## Re: Pop corn vs Chocolate. Which statistical test?

Originally Posted by newbiestat
I have the impression that we say the same thing with different words. The more extreme a calculated test statistic from a sample is, the less probable the sample comes from a population with the characteristics dictated by the null hypothesis is. In effect, the more extreme a calculated test statistic from a sample is, the more confident we are to reject the null hypothesis. It is that simple.
There is some good insight here, but it's not quite that simple.

P(H|D), the probability that an hypothesis is true, given the data observed
P(D|H), the probability of observing the data, given that the hypothesis is true

It is true that, other things being equal, the smaller the probability of the data, P(D|H), the smaller the probability that the hypothesis is true, P(H|D); this follows from Bayes' theorem. But that does not mean that these two probabilities are the same thing. The probability of the hypothesis, P(H|D), is what we generally want to know. But a p value (conceptually, anyway) is the probability of the data, not of the hypothesis.
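To make the distinction concrete, here's a toy calculation in Python (the prior and likelihoods are made-up numbers, purely for illustration) showing that P(D|H) can be tiny while P(H|D) stays high:

```python
# Toy illustration (hypothetical numbers): P(D|H) being small
# does not make it the same quantity as P(H|D).
p_h = 0.5                 # prior probability of the hypothesis
p_d_given_h = 0.01        # probability of the data if H is true
p_d_given_not_h = 0.001   # probability of the data if H is false

# Bayes' theorem: P(H|D) = P(D|H) * P(H) / P(D)
p_d = p_d_given_h * p_h + p_d_given_not_h * (1 - p_h)
p_h_given_d = p_d_given_h * p_h / p_d

print(round(p_d_given_h, 3))   # 0.01  -- the "p-value-like" quantity
print(round(p_h_given_d, 3))   # 0.909 -- probability of the hypothesis
```

The data are improbable under H (1%), yet H ends up 91% probable given the data, because the data are even more improbable under the alternative.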

The other more specific issue here is that a conventional 2-tailed significance test would test a null hypothesis of exactly equal proportions, as mentioned above, but that isn't the hypothesis you're wanting to test (you want to test an hypothesis about the direction of the effect, i.e. that the number of people who like popcorn only is higher than the number who like chocolate only).

So the Bayesian test of the direction of the effect can end up giving a different answer for two reasons:
1) Because P(H|D) is not the same as P(D|H)
2) Because the hypothesis being tested is different.

Now with the data you have, it's got to be admitted: A significance test and a Bayesian test end up suggesting very similar answers. As I mentioned earlier, a Bayesian test implies that we can be very certain (99.8% sure) that the proportion of people in the population who like popcorn only is higher. A conventional binomial test (via binom.test) also provides a really small p value of 0.0066, implying that we can reject the null hypothesis.

But the tests won't always give similar answers. Say the number who liked popcorn only was 31, and the number who liked chocolate only was 19. Then:

Code:
``binom.test(x = 31, n = 50)``
The significance test gives a p value of about 0.12, meaning that we can't reject the null hypothesis with an alpha level of 0.05, while the Bayesian test:

Code:
``bayes.binom.test(x = 31, n = 50)``
Gives a probability of 95.5% that the proportion who like popcorn only is higher. So the significance test is unable to give a firm answer even in a case where there actually is quite strong evidence that the relationship falls in a particular direction.
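For anyone without R handy, both numbers can be reproduced approximately in plain Python. This is just a stdlib sketch, assuming a uniform Beta(1, 1) prior (which I believe is bayes.binom.test's default), and using the standard identity between the Beta CDF and the binomial tail:

```python
from math import comb

x, n = 31, 50  # "popcorn only" successes out of 50

# Exact two-sided binomial test of H0: p = 0.5 (as binom.test does).
# Under p = 0.5 the distribution is symmetric, so double the upper tail.
upper_tail = sum(comb(n, k) for k in range(x, n + 1)) / 2 ** n
p_two_sided = min(1.0, 2 * upper_tail)

# Posterior P(theta > 0.5) under a uniform Beta(1, 1) prior.
# The posterior is Beta(x + 1, n - x + 1), and its upper tail
# P(theta > 0.5) equals P(Y <= x) for Y ~ Binomial(n + 1, 0.5).
posterior = sum(comb(n + 1, k) for k in range(x + 1)) / 2 ** (n + 1)

print(round(p_two_sided, 3))  # roughly 0.12
print(round(posterior, 3))    # roughly 0.95
```

The exact posterior lands just below the 95.5% quoted above; if bayes.binom.test estimates this by simulation, small differences in the second decimal place are expected.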

A big part of the reason that the tests don't give similar answers here is that the usual rules we use for interpreting significance tests carry (imo) an implicit assumption that the null hypothesis is something plausible, which we should only reject when we have strong evidence. But in this case even a skerrick of common sense tells us that for a research question like this one, the null hypothesis, taken literally, is false. The only question to be answered is: what is the direction of the relationship? That question can be answered more directly and easily by abandoning the significance test.

Also, in non-parametric statistics you make no assumptions about distributions. Isn't that correct?
Definitions of non-parametric vary, but one way to think about them is that they are tests that don't assume that the data (or errors) take a particular probability distribution. But even if a test doesn't assume that the data or errors take a particular probability distribution, it may have other assumptions about the variables or data. For example, if you want to use a Mann-Whitney test as a test of a null hypothesis of equal medians, you need to assume that the two distributions being compared have the same shape and spread (even though you don't have to assume that the distributions take some defined probability distribution).

7. ## Re: Pop corn vs Chocolate. Which statistical test?

Just a note that in the last example with the binomial test it's not a completely fair comparison, since binom.test is doing a two-sided test by default but bayes.binom.test is only looking at one side.

9. ## Re: Pop corn vs Chocolate. Which statistical test?

Originally Posted by Dason
Just a note that in the last example with the binomial test it's not a completely fair comparison, since binom.test is doing a two-sided test by default but bayes.binom.test is only looking at one side.
Yep. You could definitely do a 1-tailed significance test. Hell, you could even interpret (1 - one-tailed p value) as the posterior probability that the effect is in the direction found. Assuming a uniform prior, that's going to be a reasonable interpretation.

So I guess the problem isn't necessarily the p value so much as the rules we generally apply for using them (e.g., you have to do 2-sided tests; you have to have a binary cutoff criterion of 0.05 instead of interpreting p values quantitatively; you can't interpret p values in a Bayesian way; etc).
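As a quick check of the "(1 - one-tailed p value) as posterior probability" idea, here's a hypothetical Python comparison on the 31-of-50 example (uniform prior assumed; the two quantities turn out to be close, but not identical):

```python
from math import comb

x, n = 31, 50  # "popcorn only" successes out of 50

# One-tailed p value for H1: p > 0.5 (exact binomial upper tail)
p_one_tailed = sum(comb(n, k) for k in range(x, n + 1)) / 2 ** n

# Exact posterior P(theta > 0.5) under a uniform Beta(1, 1) prior,
# via the identity P(theta > 0.5) = P(Y <= x), Y ~ Binomial(n + 1, 0.5)
posterior = sum(comb(n + 1, k) for k in range(x + 1)) / 2 ** (n + 1)

print(round(1 - p_one_tailed, 3))  # around 0.94
print(round(posterior, 3))         # around 0.95
```

The two numbers agree to within a couple of percentage points here, consistent with "a reasonable interpretation" rather than an exact identity.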

10. ## Re: Pop corn vs Chocolate. Which statistical test?

Originally Posted by CowboyBear
A McNemar test is for when you have two (paired) nominal variables. There is only one nominal variable here. The conventional frequentist test would be a binomial test.
Thanks for the reply, and sorry to keep dragging this on, but aren't there two paired nominal variables? Variable one: like C; variable two: like P, each with possible values yes/no. It appears to be paired as well (each respondent was asked the same question).

11. ## Re: Pop corn vs Chocolate. Which statistical test?

Blubblub,

I agree with you. I think it comes down to whether it is appropriate to turn one question into two, and whether McNemar's still works, since the original question could have been "did you purchase anything?". Though since there are only two foods of interest (no data collection on, say, pretzels or cola), I think it does fit McNemar's.

12. ## Re: Pop corn vs Chocolate. Which statistical test?

Originally Posted by blubblub
Thanks for the reply, and sorry to keep dragging this on, but aren't there two paired nominal variables? Variable one: like C; variable two: like P, each with possible values yes/no. It appears to be paired as well (each respondent was asked the same question).

Originally Posted by hlsmith
Blubblub,

I agree with you. I think it comes down to whether it is appropriate to turn one question into two, and whether McNemar's still works, since the original question could have been "did you purchase anything?". Though since there are only two foods of interest (no data collection on, say, pretzels or cola), I think it does fit McNemar's.
Sorry for not addressing this bit earlier. Yep, it depends on whether or not the OP sticks with their original strategy of just comparing the "only chocolate" and the "only popcorn" responses. My response above assumes that this strategy is used. In that case, the two variables you'd create by having a "like chocolate" variable and a "like popcorn" variable would just be linear transformations of one another. So treating them as separate variables doesn't really make sense, and isn't really necessary (you can just use a simple binomial test).

But if they wanted to include the people who said they liked both or neither, then you could construct two variables that are actually distinct from one another (e.g., because some of the people who liked chocolate would also like popcorn, while some wouldn't). Then a McNemar test would sort of make sense. Though I'm not sure why you'd want to include the both and neither responses, only to not test any hypotheses or calculate any interval estimates for the proportions falling into the both and neither categories.