In hypothesis testing, we usually set up two hypotheses: Ho, the null hypothesis, and Ha, the alternative hypothesis.

For example, the null (Ho) can be a statement about two or more means being equal.

Ho: mu1 = mu2

The alternative could be worded as the means not being equal.

Ha: mu1 not= mu2.

Sometimes it is written as --> Ha: mu1 <> mu2

Now, we want to try to disprove Ho by collecting data to see if the sample means are "different" enough to reject the claim that mu1 and mu2 are equal. How different do they need to be? We determine that by setting an alpha level, which is the chance we are willing to take of rejecting Ho when it is actually true. If we want a lot of evidence that mu1 and mu2 are different before we reject Ho, we set alpha very low, say at .01. If we only want a little bit of evidence, maybe we would set it higher, around .10. In stats textbooks, the typical alpha level is .05.

Then, when we compare the data from the two groups, we compute a test statistic, say a t statistic from a t-test, and we determine how likely it would be to get a t statistic at least that far from zero if in fact Ho is true (far in either direction, since Ha here is two-sided). This probability is the p-value.
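To make the p-value idea concrete, here is a minimal Python sketch. It is only an illustration under some assumptions of mine: the two samples are made-up numbers, the test is the equal-variance (pooled) two-sample t-test with a two-sided alternative, and the p-value is found by numerically integrating the t density with Simpson's rule instead of calling a statistics package. The function names are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(sample1, sample2):
    """Return (t, df) for the equal-variance two-sample t-test."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(sample1) - mean(sample2)) / std_err, df


def t_density(x, df):
    """Density of Student's t distribution (fine for small to moderate df)."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t, df, steps=2000):
    """P(|T| >= |t|) if Ho is true, via Simpson's rule (steps must be even)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_density(0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_half = total * h / 3  # area under the density between 0 and |t|
    return max(0.0, 1 - 2 * central_half)


# Made-up measurements, purely for illustration.
group1 = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3]
group2 = [4.2, 4.8, 4.5, 5.0, 4.4, 4.7]

t_stat, df = pooled_t_statistic(group1, group2)
p_value = two_sided_p_value(t_stat, df)
print(f"t = {t_stat:.3f}, df = {df}, p-value = {p_value:.4f}")
```

In practice you would probably let a statistics package do this for you; the hand-rolled integration is here only so that each step is visible.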

If the p-value is less than or equal to the alpha level we set before collecting the data, then we have enough evidence to reject Ho. If the p-value is greater than alpha, then we don't have enough evidence to reject Ho (sometimes people say that we "accept" Ho, but that's not technically correct - we "fail to reject" Ho.)
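The decision rule itself is short enough to sketch in a few lines of Python. The function name and the p-value of .03 below are hypothetical, chosen only to show how the same result can lead to different conclusions at the three alpha levels mentioned above.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject Ho when the p-value is <= alpha."""
    if not 0 <= p_value <= 1:
        raise ValueError("p_value must be between 0 and 1")
    if not 0 < alpha < 1:
        raise ValueError("alpha must be strictly between 0 and 1")
    # Note the wording: we never "accept" Ho, we only fail to reject it.
    return "reject Ho" if p_value <= alpha else "fail to reject Ho"


p_value = 0.03  # hypothetical p-value, for illustration only
for alpha in (0.01, 0.05, 0.10):
    print(f"alpha = {alpha:.2f}: {decide(p_value, alpha)}")
```

With this p-value, Ho is rejected at alpha = .05 and .10 but not at .01, which is why alpha has to be fixed before looking at the data.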

For a more thorough explanation of hypothesis testing, go to this link and read through the sections:

http://davidmlane.com/hyperstat/logic_hypothesis.html