Hi guys,
This is my first post here. I am currently enrolled in the Coursera Bayesian Statistics course from Duke University. While I've enjoyed the Statistics with R specialization so far, I think this course is not at the same level as the other ones.
In one of the quizzes from the course there is the following question:
**Suppose that you are trying to decide whether a coin is biased towards
heads ($p$ = 0.75) or tails ($p$ = 0.25). If you decide incorrectly, you incur a
loss of 10. Flipping another coin incurs a cost of 1. If your current posterior
probability of a head-biased coin is 0.6, should you make the decision now or
flip another coin and then decide?**
The possible answers are the following:
*A) Flip another coin, since the minimum posterior expected loss of
making the decision now is 4, while the minimum posterior
expected loss of making the decision after seeing another coin flip
is between 2 and 3.*
*B) Make the decision now, since the minimum posterior expected
loss of making the decision now is 4, while the minimum posterior
expected loss of making the decision after seeing another coin flip
is between 5 and 6.*
*C) Flip another coin, since the minimum posterior expected loss of
making the decision now is 4, while the minimum posterior
expected loss of making the decision after seeing another coin flip
is between 3 and 4.*
*D) Make the decision now, since the minimum posterior expected
loss of making the decision now is 4, while the minimum posterior
expected loss of making the decision after seeing another coin flip
is between 4 and 5.*
I will write down my solution (and the errors that I made before getting to this point) and I would appreciate any feedback, comments or corrections. Since this is my first encounter with Bayesian statistics, I will use the notations from the Coursera course, but I hope they will be clear.
Let $BH$ be the event that the coin is biased towards heads (H) and $BT$ the event that it is biased towards tails (T). The prior probability that the coin is biased towards heads is $P(BH) = 0.5$, and similarly $P(BT) = 0.5$.
From the statement of the problem we know that the posterior probability that the coin is biased towards heads is $P^*(BH) = P(BH|data) = 0.6$. This implies that $P^*(BT) = P(BT|data) = 0.4$.
Let $d_1$ be the decision that the coin is biased towards heads (BH) and $d_2$ the decision that it is BT. Then $L(d_1) = 0$ if $d_1$ is correct and $L(d_1) = 10$ if $d_1$ is wrong. Similarly, $L(d_2) = 0$ if $d_2$ is correct and $L(d_2) = 10$ if $d_2$ is wrong. Using the posterior probabilities, we can compute the expected loss of each decision. We have
$$E(L(d_1)) = P^*(BH) \cdot 0 + P^*(BT) \cdot 10 = 4,$$
$$E(L(d_2)) = P^*(BH) \cdot 10 + P^*(BT) \cdot 0 = 6.$$
So decision $d_1$ minimizes the expected loss, and this minimum expected loss is 4.
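As a sanity check, the computation above can be sketched in a few lines of Python. This is only a minimal sketch: the names `LOSS_WRONG` and `expected_losses` are made up for illustration and do not come from the course, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

LOSS_WRONG = 10  # loss for an incorrect decision, taken from the quiz statement


def expected_losses(p_bh):
    """Return (E[L(d1)], E[L(d2)]) given the current probability of BH."""
    p_bt = 1 - p_bh
    loss_d1 = p_bt * LOSS_WRONG  # d1 (decide BH) is wrong when the coin is BT
    loss_d2 = p_bh * LOSS_WRONG  # d2 (decide BT) is wrong when the coin is BH
    return loss_d1, loss_d2


# Current posterior P*(BH) = 0.6, written as an exact fraction.
loss_d1, loss_d2 = expected_losses(Fraction(3, 5))
min_loss_now = min(loss_d1, loss_d2)
print(loss_d1, loss_d2, min_loss_now)  # 4 6 4
```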
We now consider the case of a new flip. Our posterior probabilities become the priors.
We first assume that this new flip is H (heads). Then the new posterior probability that the coin is BH is
$$P^{**}(BH) = P(BH|\text{new flip} = H) = \frac{P(\text{new flip} = H|BH) \cdot P^*(BH)}{P(\text{new flip} = H|BH) \cdot P^*(BH) + P(\text{new flip} = H|BT) \cdot P^*(BT)} = \frac{0.75 \cdot 0.6}{0.75 \cdot 0.6 + 0.25 \cdot 0.4} = \frac{9}{11}.$$
This implies that $P^{**}(BT) = \frac{2}{11}$ and the expected loss in this case is
$$E(L(d_1)) = P^{**}(BH) \cdot 0 + P^{**}(BT) \cdot 10 = \frac{20}{11},$$
$$E(L(d_2)) = P^{**}(BH) \cdot 10 + P^{**}(BT) \cdot 0 = \frac{90}{11},$$
so we choose decision $d_1$.
We now check what happens if the next flip is T (tails). As before, the new posterior probability that the coin is BH is
$$P^{**}(BH) = P(BH|\text{new flip} = T) = \frac{P(\text{new flip} = T|BH) \cdot P^*(BH)}{P(\text{new flip} = T|BH) \cdot P^*(BH) + P(\text{new flip} = T|BT) \cdot P^*(BT)} = \frac{0.25 \cdot 0.6}{0.25 \cdot 0.6 + 0.75 \cdot 0.4} = \frac{1}{3}.$$
This implies that $P^{**}(BT) = \frac{2}{3}$ and the expected loss in this case is
$$E(L(d_1)) = P^{**}(BH) \cdot 0 + P^{**}(BT) \cdot 10 = \frac{20}{3},$$
$$E(L(d_2)) = P^{**}(BH) \cdot 10 + P^{**}(BT) \cdot 0 = \frac{10}{3},$$
so we choose decision $d_2$.
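The two Bayes updates and the resulting decisions can be checked with a short Python sketch. The helper names `update` and `best_decision` are hypothetical and chosen only for illustration; the likelihoods 0.75 and 0.25 and the loss of 10 come from the quiz statement.

```python
from fractions import Fraction

LOSS_WRONG = 10
P_HEADS_BH = Fraction(3, 4)  # P(flip = H | BH)
P_HEADS_BT = Fraction(1, 4)  # P(flip = H | BT)


def update(p_bh, flip):
    """Posterior probability of BH after observing one flip ('H' or 'T')."""
    like_bh = P_HEADS_BH if flip == "H" else 1 - P_HEADS_BH
    like_bt = P_HEADS_BT if flip == "H" else 1 - P_HEADS_BT
    numerator = like_bh * p_bh
    return numerator / (numerator + like_bt * (1 - p_bh))


def best_decision(p_bh):
    """Return (decision, expected loss) minimizing the posterior expected loss."""
    loss_d1 = (1 - p_bh) * LOSS_WRONG  # decide BH
    loss_d2 = p_bh * LOSS_WRONG        # decide BT
    return ("d1", loss_d1) if loss_d1 <= loss_d2 else ("d2", loss_d2)


current = Fraction(3, 5)  # P*(BH) = 0.6
post_h = update(current, "H")
post_t = update(current, "T")
print(post_h, best_decision(post_h))  # posterior 9/11, decision d1, loss 20/11
print(post_t, best_decision(post_t))  # posterior 1/3, decision d2, loss 10/3
```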
This is where I started choosing among the answers. My initial argument was to take the minimum expected loss when the next flip is H (which is $\frac{20}{11}$) and add 1 (for the cost of the flip) to obtain a number in the interval (2, 3), so I chose answer A). This was obviously wrong, and I realised immediately that I could not exclude the case when the next flip is T.
My second approach was to average the posterior expected losses for the two cases that the new flip is H or T. I obtained
$$\frac{1}{2}\left(\frac{20}{11} + \frac{10}{3}\right) = \frac{170}{66} \in (2, 3).$$
Since the cost of the new flip is 1, I said that the minimum posterior expected loss is between 3 and 4, so my answer was C) and it was correct. However, after thinking a bit more about it, I realised that the two outcomes H and T of the new flip are not equally likely, so I cannot just use the average. I first computed the probability that the next flip is H. This is
$$P(\text{next flip} = H) = P(\text{next flip} = H|BH) \cdot P^{*}(BH) + P(\text{next flip} = H|BT) \cdot P^{*}(BT) = \frac{3}{4} \cdot 0.6 + \frac{1}{4} \cdot 0.4 = 0.55,$$
so $P(\text{next flip} = T) = 0.45$.
Hence the minimum posterior expected loss of making the decision after seeing another coin flip is
$$0.55 \cdot \frac{20}{11} + 0.45 \cdot \frac{10}{3} = 2.5,$$
to which we have to add 1 (for the cost of the flip). We again get a number between 3 and 4, so we arrive at the same answer C) as before.
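The whole "flip once, then decide" calculation can be put together in one short Python sketch and compared with deciding now. The function names `min_loss` and `loss_after_one_flip` are hypothetical, and the sketch assumes, as in my reading of the question, that the cost of the flip is simply added to the expected loss.

```python
from fractions import Fraction

LOSS_WRONG = 10
FLIP_COST = 1
P_HEADS_BH = Fraction(3, 4)  # P(flip = H | BH)
P_HEADS_BT = Fraction(1, 4)  # P(flip = H | BT)


def min_loss(p_bh):
    """Minimum posterior expected loss when deciding with P(BH) = p_bh."""
    return LOSS_WRONG * min(p_bh, 1 - p_bh)


def loss_after_one_flip(p_bh):
    """Expected loss of flipping once and then deciding optimally."""
    # Predictive probability of heads under the current posterior.
    p_h = P_HEADS_BH * p_bh + P_HEADS_BT * (1 - p_bh)
    # Updated probability of BH for each possible outcome.
    post_h = P_HEADS_BH * p_bh / p_h
    post_t = (1 - P_HEADS_BH) * p_bh / (1 - p_h)
    # Weight each outcome's minimum loss by how likely that outcome is.
    weighted = p_h * min_loss(post_h) + (1 - p_h) * min_loss(post_t)
    return weighted + FLIP_COST


current = Fraction(3, 5)  # P*(BH) = 0.6
decide_now = min_loss(current)
flip_first = loss_after_one_flip(current)
print(decide_now, flip_first)  # 4 7/2
```

Since 7/2 = 3.5 is smaller than 4, this sketch agrees with answer C) under the stated assumption about the flip cost.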
Could you please tell me if my argument is valid? And if you have other ideas on how to solve this problem, let me know. I would also appreciate any suggestions for nice introductory books on Bayesian statistics. I looked for similar problems, but I couldn't find a nice example on decisions and discrete expected losses. Is the quantity that I'm computing here (so the 2.5 that I got at the end, before adding the cost of the new flip) the Bayes risk? There is a lot of information available online, but it is not that easy to get to the exact point that one needs when writing a solution.
And one more thing: from the formulation of the question, is it clear that the cost of a new flip should be added to the minimum posterior expected loss? Sometimes I feel like the questions are a bit vague. Here is the feedback that I got when I selected the wrong answer:
**Using Bayes rule and the prior, if you observe tails, you should
predict a tails bias and if you observe heads, you should predict a
heads bias. Use that decision rule to find the posterior expected loss.**
I don't know about you, but I don't think these lines are that helpful to get the correct answer.
Sorry for making this so long. I'm looking forward to your replies.
Andrei