The way I explain the p-value: the probability of observing an effect when, in reality, there is no effect. This translates to the probability of the data given the null (the null = there is no effect!). I was thinking this is the same as saying "the probability of a false positive". But then I was told that's wrong, because the p-value is NOT the probability of an error. This is where I'm confused: a Type I error is a false positive, and alpha is the accepted maximum Type I error rate.

With this definition of the p-value, P(data | null), say our cutoff alpha is .05. We observe p = .04, and so we say there is a 4% probability of observing our data (or something more extreme) when there isn't an effect (the null is true). I can reword this as "there is a 4% probability of a false positive", or "if I run this experiment 100 times, about 4 of those times I will find an effect of this size or larger, supporting my alternative hypothesis".
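To make my setup concrete, here is a minimal simulation sketch (names and parameters are my own choices, not from anywhere authoritative): it repeatedly draws data from a world where the null is true and computes a two-sided z-test p-value for each draw, so the fraction of p-values below alpha shows the long-run rate I'm describing.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)
n, trials = 30, 10_000

# Simulate many experiments in which the null is true:
# each row is one experiment's sample, drawn from N(0, 1) (no effect).
samples = rng.normal(loc=0.0, scale=1.0, size=(trials, n))

# One-sample z statistic for H0: mean = 0, with sigma known to be 1.
z = samples.mean(axis=1) * sqrt(n)

# Two-sided p-value: p = 2 * (1 - Phi(|z|)) = 1 - erf(|z| / sqrt(2)),
# using the standard normal CDF Phi.
p = np.array([1.0 - erf(abs(zi) / sqrt(2.0)) for zi in z])

# Fraction of null experiments that cross the alpha = .05 cutoff.
false_alarm_rate = (p < 0.05).mean()
print(false_alarm_rate)  # close to 0.05
```

Under the null, the p-value is uniformly distributed, so roughly 5% of these null experiments land below .05, which is exactly the "every so many out of 100 experiments" reading I'm using above.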

To me, saying "p = P(data | null)" is exactly the same as saying p is the probability of a false alarm (which by definition is noise incorrectly identified as signal). What is my misunderstanding?