Dice and Don't Cares

zogg

New Member
Hi. I'm working on deriving a probability formula to help in some research I'm doing, but I've hit a wall. I've managed to reduce my problem down to a dice problem, which effectively makes it a textboot problem. There one aspect of the problem I think I've solved, but another that I can't seem to crack.

Part I understand: If I roll 8, 4-sided dice, what is the probability that I will get 4 unique pairs? (ie, 1 1 2 2 3 3 4 4)
My solution: (8 2)(6 2)(4 2)(2 2) / 4^8
[ (8 2) is 8 choose 2 ]
That one seems to work, and in the end simplifies down to a multinomial coefficient. (Which makes sense.)

Now, I expand the problem: If I roll 16, 4 sided dice, what is the probability that I will get 4 unique pairs? (I don't care what the rest of the dice do, but at minimum I need the 4 pairs. So, I could get 1 1 2 2 3 3 4 4 1 2 3 4 4 4 3 and be happy.)

Originally I thought it would be (16 2)(14 2)(12 2)(10 2) / 4^16, but I think I've now realized that that dice I don't care about are causing me trouble.

Does anyone have any thoughts?

BGM

TS Contributor
Let $$X_i$$ be the number of times obtaining the number $$i, i = 1, 2, 3, 4$$

If you toss the fair dice $$n$$ times, then

$$(X_1, X_2, X_3, X_4) \sim \text{Multinomial}\left(n, \frac {1} {4}, \frac {1} {4}, \frac {1} {4}, \frac {1} {4}\right)$$

i.e. The joint probability mass function is given by

$$\Pr\{X_1 = x_1, X_2 = x_2, X_3 = x_3, X_4 = x_4\} = \frac {n!} {x_1!x_2!x_3!x_4!} \frac {1} {4^n}$$

In general it is difficult to count for the number of integer points in a arbitrary set when $$n$$ is large. However since you only counting up to 2, it is still possible to count by hand:

$$\Pr\{X_1 \geq 2, X_2 \geq 2, X_3 \geq 2, X_4 \geq 2\}$$ $$= 1 - \Pr\{(X_1 < 2)\cup(X_2 < 2)\cup(X_3 < 2)\cup(X_4 < 2)\}$$

We can use the inclusion-exclusion principle to evaluate the latter property. Note that
the following vectors

$$(X_1, X_2 + X_3 + X_4), (X_1, X_2, X_3 + X_4)$$

also have multinomial distribution (the former one actually reduce to two categories and actually reduce to binomial). Therefore,

$$\Pr\{X_i < 2\} = \Pr\{X_i = 0\} + \Pr\{X_i = 1\} = \frac {n!} {0!n!} \frac {1^0 \times 3^n} {4^n} + \frac {n!} {1!(n-1)!} \frac {1^1 \times 3^{n-1}} {4^n} =$$ $$\frac {3^{n-1}(3 + n)} {4^n}$$

$$\Pr\{X_i < 2, X_j < 2\}$$
$$= \Pr\{X_i = 0, X_j = 0\} + \Pr\{X_i = 1, X_j = 0\} + \Pr\{X_i = 0, X_j = 1\}$$ $$+ \Pr\{X_i = 1, X_j = 1\}$$
$$= \frac {n!} {0!0!n!} \frac {1^0 \times 1^0 \times 2^n} {4^n} + \frac {n!} {1!0!(n-1)!} \frac {1^1 \times 1^0 \times 2^{n-1}} {4^n}$$
$$+ \frac {n!} {0!1!(n-1)!} \frac {1^0 \times 1^1 \times 2^{n-1}} {4^n} + \frac {n!} {1!1!(n-2)!} \frac {1^1 \times 1^1 \times 2^{n-2}} {4^n}$$
$$= \frac {2^{n-2}[4 + 4n + n(n-1)] } {4^n}$$

Similarly,

$$\Pr\{X_i < 2, X_j < 2, X_k < 2\} = \frac {1 + 3n + 3n(n-1) + n(n-1)(n-2)} {4^n}$$

And obviously $$\Pr\{X_1 < 2, X_2 < 2, X_3 < 2, X_4 < 2\} = 0$$ as $$X_1 + X_2 + X_3 + X_4 \equiv n, n > 4$$

Finally, by symmetry and the inclusion-exclusion principle,

$$\Pr\{(X_1 < 2)\cup(X_2 < 2)\cup(X_3 < 2)\cup(X_4 < 2)\}$$

$$= \binom {4} {1} \Pr\{X_i < 2\} - \binom {4} {2} \Pr\{X_i < 2, X_j < 2\} + \binom {4} {3} \Pr\{X_i < 2, X_j < 2, X_k < 2\}$$ $$- \binom {4} {4} \Pr\{X_1 < 2, X_2 < 2, X_3 < 2, X_4 < 2\}$$

and the answer is left for you to simplify

zogg

New Member
Thanks for the detailed reply, I appreciate it!

Sadly I will need to work through the generic problem (D n-sided dice rolled to get n unique groups of at least k each.)

Your reply gives a great place to start and to reason from. Thanks again.

BGM

TS Contributor
If the numbers $$n, k$$ are quite large, then you may also use the mutivariate normal approximation to help, rather than counting the exact one.