Dice and Don't Cares

zogg

New Member
#1
Hi. I'm working on deriving a probability formula to help in some research I'm doing, but I've hit a wall. I've managed to reduce my problem down to a dice problem, which effectively makes it a textboot problem. There one aspect of the problem I think I've solved, but another that I can't seem to crack.

Part I understand: If I roll 8, 4-sided dice, what is the probability that I will get 4 unique pairs? (ie, 1 1 2 2 3 3 4 4)
My solution: (8 2)(6 2)(4 2)(2 2) / 4^8
[ (8 2) is 8 choose 2 ]
That one seems to work, and in the end simplifies down to a multinomial coefficient. (Which makes sense.)

Now, I expand the problem: If I roll 16, 4 sided dice, what is the probability that I will get 4 unique pairs? (I don't care what the rest of the dice do, but at minimum I need the 4 pairs. So, I could get 1 1 2 2 3 3 4 4 1 2 3 4 4 4 3 and be happy.)

Originally I thought it would be (16 2)(14 2)(12 2)(10 2) / 4^16, but I think I've now realized that that dice I don't care about are causing me trouble.

Does anyone have any thoughts?
 

BGM

TS Contributor
#2
Let \( X_i \) be the number of times obtaining the number \( i, i = 1, 2, 3, 4 \)

If you toss the fair dice \( n \) times, then

\( (X_1, X_2, X_3, X_4) \sim \text{Multinomial}\left(n, \frac {1} {4}, \frac {1} {4}, \frac {1} {4}, \frac {1} {4}\right) \)

i.e. The joint probability mass function is given by

\( \Pr\{X_1 = x_1, X_2 = x_2, X_3 = x_3, X_4 = x_4\} = \frac {n!} {x_1!x_2!x_3!x_4!} \frac {1} {4^n} \)

In general it is difficult to count for the number of integer points in a arbitrary set when \( n \) is large. However since you only counting up to 2, it is still possible to count by hand:

\( \Pr\{X_1 \geq 2, X_2 \geq 2, X_3 \geq 2, X_4 \geq 2\} \) \(
= 1 - \Pr\{(X_1 < 2)\cup(X_2 < 2)\cup(X_3 < 2)\cup(X_4 < 2)\} \)

We can use the inclusion-exclusion principle to evaluate the latter property. Note that
the following vectors

\( (X_1, X_2 + X_3 + X_4), (X_1, X_2, X_3 + X_4) \)

also have multinomial distribution (the former one actually reduce to two categories and actually reduce to binomial). Therefore,

\( \Pr\{X_i < 2\} = \Pr\{X_i = 0\} + \Pr\{X_i = 1\}
= \frac {n!} {0!n!} \frac {1^0 \times 3^n} {4^n} + \frac {n!} {1!(n-1)!} \frac {1^1 \times 3^{n-1}} {4^n} = \) \( \frac {3^{n-1}(3 + n)} {4^n} \)

\( \Pr\{X_i < 2, X_j < 2\} \)
\( = \Pr\{X_i = 0, X_j = 0\} + \Pr\{X_i = 1, X_j = 0\} + \Pr\{X_i = 0, X_j = 1\} \) \(+ \Pr\{X_i = 1, X_j = 1\} \)
\( = \frac {n!} {0!0!n!} \frac {1^0 \times 1^0 \times 2^n} {4^n}
+ \frac {n!} {1!0!(n-1)!} \frac {1^1 \times 1^0 \times 2^{n-1}} {4^n} \)
\( + \frac {n!} {0!1!(n-1)!} \frac {1^0 \times 1^1 \times 2^{n-1}} {4^n}
+ \frac {n!} {1!1!(n-2)!} \frac {1^1 \times 1^1 \times 2^{n-2}} {4^n}\)
\( = \frac {2^{n-2}[4 + 4n + n(n-1)] } {4^n} \)

Similarly,

\( \Pr\{X_i < 2, X_j < 2, X_k < 2\} = \frac {1 + 3n + 3n(n-1) + n(n-1)(n-2)} {4^n} \)

And obviously \( \Pr\{X_1 < 2, X_2 < 2, X_3 < 2, X_4 < 2\} = 0 \) as \( X_1 + X_2 + X_3 + X_4 \equiv n, n > 4 \)

Finally, by symmetry and the inclusion-exclusion principle,

\( \Pr\{(X_1 < 2)\cup(X_2 < 2)\cup(X_3 < 2)\cup(X_4 < 2)\} \)

\( = \binom {4} {1} \Pr\{X_i < 2\} - \binom {4} {2} \Pr\{X_i < 2, X_j < 2\}
+ \binom {4} {3} \Pr\{X_i < 2, X_j < 2, X_k < 2\} \) \(- \binom {4} {4} \Pr\{X_1 < 2, X_2 < 2, X_3 < 2, X_4 < 2\} \)

and the answer is left for you to simplify :p
 

zogg

New Member
#3
Thanks for the detailed reply, I appreciate it!

Sadly I will need to work through the generic problem (D n-sided dice rolled to get n unique groups of at least k each.)

Your reply gives a great place to start and to reason from. Thanks again.
 

BGM

TS Contributor
#4
If the numbers \( n, k \) are quite large, then you may also use the mutivariate normal approximation to help, rather than counting the exact one.