p value is a probability that an event happened by chance.

I know this is a common interpretation, but it's probably best avoided. The p value tells you the probability of observing the "event" by chance *if* the null hypothesis is true. It does not tell you the probability that the data you have observed occurred because of "chance" (as opposed to because the null hypothesis is actually false).

Anyway, criticising is easy but explaining isn't, so here's my go:

----------------

Imagine we are interested in some **parameter** - say, a correlation.

We would like to know whether the value of this parameter in some **population** is zero or not. So we specify a couple of hypotheses about this parameter that we will test:

The null hypothesis: The parameter is *exactly* equal to zero in the population

The alternative hypothesis: The parameter is not equal to zero in the population

Although our interest is in the population, we don't have unlimited time and money, so can't get data from every member of the population. So we draw a sample from the population, and calculate a **test statistic** that is an *estimate* of the population parameter. For example, a Pearson product-moment correlation coefficient.

(Note: Chances are, even if the population parameter is exactly zero, our sample statistic would *not* be exactly zero, due to "chance" / sampling error).
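A quick way to see this sampling error in action is to simulate it. This is just a minimal sketch using NumPy; the seed and sample size here are arbitrary choices, not anything from the discussion above:

```python
import numpy as np

# Minimal sketch: draw two *independent* variables, so the
# population correlation between them is exactly zero, then
# check the sample correlation -- it will almost never come
# out exactly zero.
rng = np.random.default_rng(42)  # arbitrary seed, for reproducibility
x = rng.normal(size=30)
y = rng.normal(size=30)

r = np.corrcoef(x, y)[0, 1]  # sample Pearson correlation
print(r)  # some nonzero value, despite the true correlation being zero
```

Re-running with different seeds gives a different nonzero sample correlation each time; that spread is exactly the sampling error the note refers to.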

We are then able to ask the following question:

IF the null hypothesis is actually true, what is the probability of observing a test statistic as far from zero as the one in our sample, or even further?

This is the p value.
E.g., if the true correlation between two variables in a population is actually zero, the probability of observing a sample correlation at least as far from zero as 0.2 (in either direction) in a sample of 30 people is 0.289.
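For the curious, that 0.289 can be reproduced with the standard t-approximation for a Pearson correlation under the null. A sketch assuming SciPy is available; r = 0.2 and n = 30 are taken from the example above:

```python
import numpy as np
from scipy import stats

r, n = 0.2, 30  # sample correlation and sample size from the example

# Under the null hypothesis (population correlation = 0), the
# statistic t = r * sqrt(n - 2) / sqrt(1 - r^2) follows a t
# distribution with n - 2 degrees of freedom.
t = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)

# Two-tailed p value: the probability of a sample correlation at
# least this far from zero, in either direction.
p = 2 * stats.t.sf(abs(t), df=n - 2)
print(round(p, 3))  # ~ 0.289
```

Feeding actual data with r = 0.2 to `scipy.stats.pearsonr` would give the same p value, since it uses this t-approximation internally.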

If the p value is "small" (usually the cutoff is 0.05) we say that we can *reject* the null hypothesis, and support the alternative hypothesis. If the p value is *above* 0.05, we cannot reject the null hypothesis. Note that a p value larger than the 0.05 cutoff is NOT evidence that the null hypothesis is true; it just means we haven't got enough evidence to reject it yet.

Essentially the logic here is that if the data would be unlikely if the null was true, we therefore think that the null hypothesis itself must be unlikely, and reject it. Formally, this is a logical fallacy known as the probabilistic modus tollens: Just because a set of data would be unlikely if an hypothesis was true does not necessarily mean that the hypothesis itself is unlikely.

Now you may be left with questions such as:

-What good is a method that can only provide evidence in favour of one hypothesis, but not the other?

-A parameter such as a correlation could take any of an infinite number of values; why on earth would it be *exactly* zero? And if such an hypothesis isn't very plausible, why would we bother testing it?

-Why do we worry so much about whether a parameter is zero or not, instead of trying to figure out the most likely range of values for the parameter?

-Even if we did care about whether a parameter is zero or not, why don't we calculate the probability of the hypothesis *given the data observed*? Surely that's more interesting than the probability of the data observed *given the hypothesis*?

-Why is a statistical method that essentially relies on a simple logical fallacy so popular?

If you have concerns like this, you'd be in very good company.