A box contains 20 balls, some red and some black. We want to test the hypotheses:
Null: there are exactly 4 red balls in the box.
Alternative: there are more than 4 red balls in the box.
Suppose we draw two balls without replacement, and our decision rule is to reject the null hypothesis if both balls are red.
i) calculate alpha, the significance level of the test.
ii) calculate the power of the test for the specific alternative situation where there are in fact 12 red balls and 8 black balls in the box.
A standard question with the definition of power/significance level.
The important thing in this example is that the sampling distribution is the hypergeometric distribution.
Re i): alpha, the Type I error rate, is not calculated. It is fixed (set in advance), say alpha = 5% or alpha = 10%. The test is then said to be of size 5% or 10%.
Hmmm... Alpha is constant by its very definition. It is chosen a priori. Do you mean we can figure out what the p-value is?
Also, what do you mean by cutoff here: "then choose our cutoff to reflect that"? What is this cutoff?
Last edited by d21e7x11; 10-09-2011 at 12:06 PM.
But it is a function of the cutoff value. It doesn't need to be chosen a priori. In practice that is what (typically) happens. But for a homework problem to illustrate these concepts it doesn't get set a priori.
No. What Dason (and you) mean is that in the classical hypothesis testing setup, the significance level is fixed in advance and the rejection rule is chosen to satisfy that condition. Here the rejection rule is given instead, and the question asks what significance level it corresponds to.
Of course, in practice you would not do hypothesis testing in this reverse order. It is just an exercise.
I think it would be interesting to find out whether it was written like this in the textbook ("i) calculate alpha, the significance level of the test.") or whether it's drippydrop22's wording. Again, the significance level of the test is set (ALWAYS!) a priori. Then, after the data are collected, we can calculate the (observed) p-value. The p-value IS NOT the Type I error rate!
Just to let you know... It's not always the significance level that is set a priori. We just need to set something a priori, whether it is the cutoff level (which implies a certain type-I error rate), alpha (which directly implies the type-I error rate), the power against a certain alternative hypothesis (which once again implies a certain type-I error rate but isn't the thing we're setting directly), or the expected FDR.
In the case of the question for the OP they're setting the cutoff level which implies what the type-I error rate given that the null is true will be. It isn't the thing we're setting directly; it's just something we can calculate once we have the cutoff level.
Last edited by Dason; 10-09-2011 at 12:27 PM.
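To make the "cutoff implies alpha" point concrete, here is a rough worked version for this problem, assuming the rejection rule from the OP (reject when both drawn balls are red) and taking the null (exactly 4 red balls out of 20) as true. The notation H_0 is just shorthand for the null and is not from the original problem.

```latex
\alpha
  = P(\text{reject } H_0 \mid H_0 \text{ true})
  = P(\text{both balls red} \mid 4 \text{ red among } 20)
  = \frac{4}{20} \cdot \frac{3}{19}
  = \frac{12}{380}
  = \frac{3}{95}
  \approx 0.032
```

Nothing here is chosen in advance except the rule itself; alpha simply falls out of it.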
Yes, thank you. Now here: "for the OP they're setting the cutoff level which implies what the type-I error rate given that the null is true will be", what is this cut off level? Cut off of what?
It's right there in the OP:
"Suppose we draw two balls without replacement, and our decision rule is to reject the null hypothesis if both balls are red."
Oh, so it's something problem-specific, and then in i) we are being asked to translate it into what alpha is. OK.
That was the wording provided in the problem. I'm still not certain how to calculate alpha given this problem. I was under the impression from lecture that alpha was typically set a priori. We have never dealt with hypergeometric distributions, so I don't think that would be the way he would expect us to solve the problem. Are there any alternative ways to solve this problem?
I have been playing around with the probability of having four red balls (.2) and the probability of drawing two back to back if there are actually four in the bag (.032). But it is just a guess at this point.
Well, you can derive the hypergeometric distribution from a simple combinatorics argument, so that's probably the route your professor was looking for.
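For anyone who wants to check their numbers, the counting argument can be sketched in a few lines of Python. This is only an illustration: the function name `hypergeom_pmf` and its parameter names are made up for this post (not from the problem or any library), and it assumes Python 3.8+ for `math.comb`. The idea is to count the ways of picking k red balls and the remaining draws from the black balls, then divide by the number of ways to pick any balls at all.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(k, population, successes, draws):
    """P(exactly k red balls in `draws` draws without replacement).

    Hypothetical helper for illustration: `population` is the number of
    balls in the box and `successes` is the number of red balls.
    """
    ways_red = comb(successes, k)                        # choose k of the red balls
    ways_black = comb(population - successes, draws - k)  # rest come from the black balls
    ways_total = comb(population, draws)                  # any `draws` balls from the box
    return Fraction(ways_red * ways_black, ways_total)


# Decision rule from the problem: reject the null when both drawn balls are red.
alpha = hypergeom_pmf(2, population=20, successes=4, draws=2)   # null: 4 red balls
power = hypergeom_pmf(2, population=20, successes=12, draws=2)  # alternative: 12 red balls

print(f"alpha = {alpha} (about {float(alpha):.4f})")  # alpha = 3/95 (about 0.0316)
print(f"power = {power} (about {float(power):.4f})")  # power = 33/95 (about 0.3474)
```

Exact fractions are used so the result can be compared directly with a hand calculation; the .032 mentioned earlier in the thread matches the alpha computed here.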
I will try to use that. Thank you all for your help.