Thread: Data analysis - which test is best

1. Re: Data analysis - which test is best

Fantastic, thanks guys! You've made me a very happy girl.

2. Re: Data analysis - which test is best

Originally Posted by Karabiner
Beg your pardon, but wouldn't that mean p(Hypothesis|Data), i.e. Bayes statistics?
With the frequentist approach, we achieve p(Data|Hypothesis) .

With kind regards

K.
I was giving a very general comment on what the p value tells you as I understand it. I am not enough of an expert in the theory of statistics to understand the distinctions you are raising. The way I interpret the p value of a test is that you either reject the null or you don't, and if you do not, it means that you cannot be sure that the effect size you found was not due to random error in your sample. That is, the effect might not exist in the population.

3. Re: Data analysis - which test is best

Originally Posted by noetsi
the "p" value which is an assessment of how likely that the results you got were entirely due to random error.
Hi noetsi,

I think Karabiner makes a good point about your comment here. The p value tells you the probability of observing a test statistic as extreme as, or more extreme than, the one you have, if the null hypothesis is true.

Note that the bit on the end of the definition means that the p value is conditional - it's a probability of observing something if the null hypothesis is true. We can fiddle with the wording of the definition above and still get across its essential meaning, but any definition of a p value that leaves off the conditional bit is always going to be wrong.

Some sources do describe the p value as the probability that the results were due to "random error" or "chance", but this interpretation leaves off the conditional part of the p value definition. If we changed it to "the probability of observing the results we have due to chance, if the null hypothesis was true", then it'd be better.

Without that conditional bit, the definition you've used could be interpreted as saying that the p value is the probability that the results were due to the null hypothesis being true; i.e. the probability that the null hypothesis is correct. But a p value absolutely can't tell you that information (unfortunately!)

4. Re: Data analysis - which test is best

While I don't doubt that is correct, CowboyBear, I have no idea at all what it means in practice. That is, I don't know in practical terms what it means to say it is conditional on "if the null hypothesis is true", particularly since I was taught that the null hypothesis could never be determined to be true by statistics. You could (by rejecting the null) show the alternate hypothesis was true but you could never show the null was true. You either rejected it or failed to.

Actually this is one of those areas that has always puzzled me. If the probability is only when the null hypothesis is true, why does the p value have any value at all when the null is rejected? Which is exactly what occurs when p is below a certain value. It seems we are using p to reject the null, then saying the p value only has meaning when the null is true....

I interpret the p value primarily as the probability that you can reject the null, although I also think of it as the chance that the results could be tied to random error. The latter might not be right.

5. Re: Data analysis - which test is best

Originally Posted by noetsi
I interpret the p value primarily as the probability that you can reject the null, although I also think of it as the chance that the results could be tied to random error. The latter might not be right.
Neither of those makes much sense to me as an interpretation of p-values. I don't like the second option because... well, we're assuming that there is random error even if we think the alternative is true, so how does this help us say anything about the null hypothesis? As for the first interpretation you give here... what exactly does "the probability that you can reject the null" mean? That doesn't really make sense.

The p-value is literally just the result of a probability calculation. But to calculate probabilities we need a model to calculate these probabilities under. I can't tell you "Find the probability that X > 32.3" and have you give me a valid probability if you don't know the distribution of X. The p-value is the probability of observing a test statistic as extreme or more extreme than the observed test statistic under the assumption that the null hypothesis is true. The entire hypothesis testing framework is based on the idea that we assume that the null is true, figure out the probability of the result happening (conditioned on the assumption that the null is true) and then use that to decide if the event was sufficiently rare (assuming the null is true) to think that the null isn't true. The p-value can't exist without the assumption that the null is true.
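To make the "you need a model" point concrete, here is a minimal sketch in Python. The coin-flip setup and the numbers (60 heads in 100 flips) are made up for illustration, and the function name is hypothetical. The null hypothesis supplies the distribution (here a binomial), and the p-value is simply a tail probability computed under it.

```python
from math import comb


def binomial_tail_p_value(heads: int, flips: int, p_null: float = 0.5) -> float:
    """One-sided p-value: P(X >= heads) when X ~ Binomial(flips, p_null).

    p_null is the heads probability claimed by the null hypothesis.
    """
    return sum(
        comb(flips, k) * p_null**k * (1 - p_null) ** (flips - k)
        for k in range(heads, flips + 1)
    )


# Hypothetical data: 60 heads in 100 flips.
# Null hypothesis: the coin is fair (p_null = 0.5).
print(f"P(X >= 60 | fair coin)        = {binomial_tail_p_value(60, 100):.4f}")

# Same data, different null (p_null = 0.6): a different probability comes out.
print(f"P(X >= 60 | heads prob. 0.6) = {binomial_tail_p_value(60, 100, 0.6):.4f}")
```

The same data give two different answers depending on which null model is assumed, which is why the number has no meaning without the "assuming the null is true" part.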

6. Re: Data analysis - which test is best

Well, the way it appeared to me in classes and texts is that you were trying to determine (in asking questions about differences in levels on the DV) whether there was a real effect. In other words, did levels 1 and 2 of a variable differ in the population on the DV (that is, was there an effect size in the population between them)? I understood the p value to show how certain you could be that the effect size you discovered existed in the population as compared to just in your sample.

The first interpretation that you mentioned, Dason, means to me literally how reasonable it is to reject the null, that is, how likely it is, if you do reject the null, that this would be an error. As the p value went down I understood this to mean you could reject the null and be more certain that you were not making a type one error.

Although I know what you say is true, Dason (I have read it many times), I don't really understand what it means in practice, in the way that the two definitions I gave above would be understandable. I obviously have a lot of work to do in theory (like most analysts, I suspect, I focus not on that but on "are males more likely than females to do x?": if p is below .05 they are; if not, we don't really know).

Rather than assuming the null is true to determine the distribution, why not simply use a QQ plot to determine what it is? (I am guessing it is because the unknown population distribution, not the sample distribution, is what is critical.)

7. Re: Data analysis - which test is best

Originally Posted by noetsi
I understood the p value to show how certain you could be that the effect size you discovered existed in the population as compared to just in your sample.
It would be nice if p values told us this, but they just don't.

Significance Tests Die Hard: The Amazing Persistence of a Probabilistic Misconception

Very roughly speaking, a p value is the probability that you would observe the data* that you have, if the null hypothesis was true. Again, focus on the "if" bit: We're constructing an imaginary scenario in which the null hypothesis is true. In this imaginary scenario, how probable is it that we'd see the data that we have?
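One way to picture that imaginary scenario is to simulate it. The sketch below is a permutation test, which is just one possible way of building a world where the null is true, not the only one. It assumes the null hypothesis is "group labels make no difference", so shuffling the labels gives us data from the null world. The scores and the function name are invented for illustration.

```python
import random
from statistics import fmean


def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Share of label-shuffled datasets whose absolute mean difference
    is at least as large as the one actually observed."""
    rng = random.Random(seed)
    observed = abs(fmean(group_a) - fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    as_extreme = 0
    for _ in range(n_shuffles):
        # Under the null the labels are arbitrary, so reassign them at random.
        rng.shuffle(pooled)
        diff = abs(fmean(pooled[:n_a]) - fmean(pooled[n_a:]))
        # Small tolerance guards against floating-point rounding in ties.
        if diff >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / n_shuffles


# Hypothetical scores for two groups (made-up numbers).
scores_a = [5.1, 4.8, 6.0, 5.5, 5.9, 6.2]
scores_b = [4.2, 4.9, 4.4, 5.0, 4.1, 4.6]
print("Approximate p value:", permutation_p_value(scores_a, scores_b))
```

The result is the proportion of imaginary null worlds that produce a difference at least as extreme as the real one. It says nothing directly about how probable the null hypothesis itself is.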

When you talk about "how certain you could be that the effect size you discovered existed in the population", it suggests to me that when you are analysing data you are actually interested in how certain you can be that particular hypotheses are true.

For instance, you might want to know the probability that the null hypothesis is true, given the data that you have observed. In fact, this is just another way of saying "how likely that the results you got were entirely due to random error", in your words.

The thing is that

1) the probability of the data, assuming the null hypothesis is true,

is completely different to

2) the probability that the null hypothesis is true, given the data you have observed.

This is a crucial point! Think about these two probabilities until this becomes clear. The problem with significance testing is that it gives us (1) when surely as researchers we are interested in (2).

Worryingly, there are cases where (1) may be very low, but (2) actually quite high. Bem's supposed finding of precognition in university students is an example. Yes, he found some results which would be reasonably unlikely if precognition does not exist. But it's still much more likely that the results did occur due to chance (and the odd questionable statistical decision) than it is that they occurred due to precognition actually existing.

*Obviously by the "data" we technically mean a test statistic as or more extreme than that observed.
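The gap between (1) and (2) can be shown with Bayes' rule. This is only a rough sketch: every number below is invented for illustration (none come from Bem's studies), the function name is hypothetical, and "data" is treated loosely as the event "a result at least this extreme". Getting from (1) to (2) needs two extra ingredients that a p value does not contain: a prior probability for the null, and the probability of the data under the alternative.

```python
def posterior_null(prior_null, p_data_given_null, p_data_given_alt):
    """P(null | data) via Bayes' rule for two competing hypotheses."""
    joint_null = prior_null * p_data_given_null
    joint_alt = (1 - prior_null) * p_data_given_alt
    return joint_null / (joint_null + joint_alt)


# Invented numbers: we start out 99.9% sure the null is true, the data
# have probability 0.01 under the null and 0.5 under the alternative.
sceptical = posterior_null(prior_null=0.999, p_data_given_null=0.01, p_data_given_alt=0.5)
print(f"P(null | data), sceptical prior: {sceptical:.3f}")

# Same data, but a 50/50 prior: the conclusion changes completely.
neutral = posterior_null(prior_null=0.5, p_data_given_null=0.01, p_data_given_alt=0.5)
print(f"P(null | data), 50/50 prior:     {neutral:.3f}")
```

With the sceptical prior, (1) is 0.01 while (2) stays above 0.95, so a small probability of the data under the null does not by itself make the null improbable.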

8. Re: Data analysis - which test is best

Originally Posted by CowboyBear
*Obviously by the "data" we technically mean a test statistic as or more extreme than that observed.
Thank God for this. I thought your post was great, except that I flinched a little bit whenever you referred to the "probability of the data". But you saved it all with this last line.

9. Re: Data analysis - which test is best

Oh this is priceless!!
"You have been banned for the following reason:
First, as user palmer you insulted people who helped you for free, now as Justice! you insulted one of our contributors again. I have no tolerance for these kind of practises. Be nice, or your next ban will be an IP ban!"

I would love to know how exactly Greta 'helped' me? All they did was talk down to me ! Obviously this message and my account will be deleted (again!) because Greta doesn't like it when the tables are turned, nor can they deny the fact that their form of 'help' on this forum involves snide and patronising comments.
I have read many of your comments Greta and it seems you use the same tactic you call 'helping' with people other than just me. Obviously if any of these people have the 'gall' to stand up to you they get deleted so you can get away with acting in this (I must say..) disgusting and shameful manner.
Funny how fast I had my question answered when you were not involved in the thread, and, amazingly, no-one felt the need to talk down to me either!
I have no idea who Mmanuel is, obviously some other poor soul who had the delight of experiencing your own brand of 'help'.
If there is one thing you should take from this (and I doubt you will as you are, like I said before, arrogant), it is that you can help people WITHOUT the need to try and belittle or patronise them. Your colleagues have demonstrated this beautifully and I thank them for their help.
As for you Greta, you are, I'm sorry to say, rather pathetic. I was advised not to use these forums for help on statistics, not because it was not allowed, but because..and I quote, "they are all arrogant tw**s"... shame you have proved them right Greta as everyone else I encountered on this forum has been very helpful.

Now I have had my final say, go ahead and delete my account, I know you will

10. Re: Data analysis - which test is best

Actually, I pride myself on never being arrogant, it helps no-one, I highly doubt your 'expertise' in psychology Greta. Maybe you should look at yourself in the mirror my dear. Maybe I call you arrogant because (SPOILER ALERT) you ARE arrogant??

11. Re: Data analysis - which test is best

By the way, you said it all by starting your response with "I did not read your word as it is worthless"

I on the other hand did read yours because I am not an arrogant tw**