Calculating special lottery odds

#1
#s out of a bag!

Hello, I would appreciate anyone's help with this problem:

There are 11 balls in a bag numbered 0-10.

You get 6 picks out of the bag. Pull 1 ball, write down its number, and put it back (i.e., drawing with replacement?).

If person 1 chooses 6 balls out of the bag, e.g. 123456, what is the probability that person 2 matches 6 balls, 5 balls, 4 balls, 3 balls, 2 balls, 1 ball, or none, given that order does not matter (i.e., a combination? 123456 = 6 correct, 654321 still equals 6 correct, whereas 123457 = 5 correct)?

Greatly appreciate assistance with this problem which has me scrambling!!!

Many thanks!!
 

BGM

TS Contributor
#2
- If your labels are \( \{0, 1, 2, \ldots, 11\} \), then you have 12 labels.

- If person 1 is allowed to draw with replacement and obtains a repeated result, say 111234, then how do you define the number of matches with person 2? Or is person 1 drawing without replacement?

- If you are drawing with replacement, then you are probably working with some multinomial probabilities. But before going on, let's clear up the above questions first.
 
#3
Thanks for your reply.

The labels are {0,1,2,3,4,5,6,7,8,9,10} - 11 labels total.

Yes, person 1 is allowed to draw with replacement. Person 1 can draw 111111, and person 2 can draw 111222 (resulting in a match of 3 numbers). Another example: person 1 can draw 123455, and person 2 can draw 344455 (resulting in a match of 4 numbers: 3, 4, 5 and 5).
 

BGM

TS Contributor
#4
Sorry, I missed your reply. The mathematical formulation is as follows:

Let \( \mathbf{X} = (X_1, X_2, \ldots, X_k) \sim \text{Multinomial}\left(n; \frac{1}{k}, \frac{1}{k}, \ldots, \frac{1}{k}\right) \)
be the counts of each ball picked by person 1, where \( n \) is the number of picks and \( k \) is the number of balls in the bag. In particular, \( n = 6 \) and \( k = 11 \) in your question.

Similarly, we can let \( \mathbf{Y} = (Y_1, Y_2, \ldots, Y_k) \) be another multinomial vector representing the picks from person 2, which has the same distribution as \( \mathbf{X} \) and is independent of it.

The number of matches, by definition, is

\( Z = \sum_{i=1}^k \min\{X_i, Y_i\} \)

and you want to find out the distribution of this discrete random variable.
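To make the definition concrete, here is a minimal sketch in R that evaluates \( Z \) for a single pair of draws. The helper name `count_matches` and its `labels` argument are illustrative assumptions, not something from the posts above; the sample draws are the ones quoted earlier in the thread.

```r
# Hypothetical helper: number of matches between two draws, ignoring order.
# a, b   : vectors of drawn ball labels (repeats allowed)
# labels : all labels in the bag (0-10 in this thread)
count_matches <- function(a, b, labels = 0:10) {
  x <- tabulate(match(a, labels), nbins = length(labels))  # counts X_i for draw a
  y <- tabulate(match(b, labels), nbins = length(labels))  # counts Y_i for draw b
  sum(pmin(x, y))                                          # Z = sum_i min(X_i, Y_i)
}

count_matches(1:6, 6:1)                              # 6: same balls, different order
count_matches(1:6, c(1, 2, 3, 4, 5, 7))              # 5
count_matches(c(1, 1, 1, 1, 1, 1), c(1, 1, 1, 2, 2, 2))  # 3
```

Working with per-label counts rather than the raw sequences is what makes the order of the picks irrelevant.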

I have not found a nice analytical method to tackle this problem yet, so for now I have just used R to simulate 1 million rounds and obtain a numerical solution.

Code:
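m <- 1000000               # number of simulated rounds
k <- 11                    # balls in the bag (labels 0-10)
n <- 6                     # picks per person
pr <- rep(1/k, k)          # every ball is equally likely
x <- rmultinom(m, n, pr)   # k x m matrix of person 1's counts, one column per round
y <- rmultinom(m, n, pr)   # person 2's counts, drawn independently
count <- rep(0, n + 1)     # tallies for Z = 0, 1, ..., n
for (i in 1:m) {
	j <- sum(pmin(x[, i], y[, i]))   # number of matches in round i
	count[j + 1] <- count[j + 1] + 1
}
count/m                    # estimated pmf of Z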

[1] 0.041412 0.209483 0.371843 0.280376 0.087191 0.009471 0.000224
where the last line gives you the estimate of the probability mass function \( \Pr\{Z = z\}, z = 0, 1, \ldots, 6 \) from left to right.
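As a cross-check on the simulation, the pmf can also be computed exactly by brute force, since for \( n = 6 \) and \( k = 11 \) there are only \( \binom{16}{10} = 8008 \) possible count vectors. The sketch below is one possible way to do it, not a closed-form solution. It assumes that, because all balls are equally likely, the distribution of \( Z \) given \( \mathbf{X} \) depends only on the sorted counts of \( \mathbf{X} \). The names `comps`, `pattern` and `pmf` are illustrative.

```r
n <- 6; k <- 11

# Stars and bars: each choice of k-1 bar positions among n+k-1 slots
# corresponds to one count vector (x_1, ..., x_k) with sum n.
bars  <- combn(n + k - 1, k - 1)
comps <- apply(bars, 2, function(b) diff(c(0, b, n + k)) - 1)  # k x 8008 matrix

# Multinomial probability of every possible count vector.
p <- apply(comps, 2, dmultinom, size = n, prob = rep(1/k, k))

# Group person 1's count vectors by their sorted counts (e.g. "00000000123").
pattern <- apply(comps, 2, function(v) paste(sort(v), collapse = ""))

pmf <- numeric(n + 1)
for (pat in unique(pattern)) {
  idx <- which(pattern == pat)
  w   <- sum(p[idx])                            # Pr{X has this pattern}
  z   <- colSums(pmin(comps, comps[, idx[1]]))  # matches against every possible Y
  cond <- vapply(0:n, function(v) sum(p[z == v]), numeric(1))  # Pr{Z = v | X}
  pmf <- pmf + w * cond
}
pmf   # exact Pr{Z = z} for z = 0, 1, ..., 6
```

The result should agree with the simulated values above to within simulation error, which is roughly 0.0005 per entry for 1 million rounds.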
 

BGM

TS Contributor
#5
Note that when \( k \gg n \), the probability of any ball being drawn more than once is very small. In that case, \( Z \) can be approximated by a hypergeometric distribution (as if the draws were made without replacement), with pmf

\( \Pr\{Z = z\} = \frac {\displaystyle \binom {n} {z} \binom {k - n} {n - z}} {\displaystyle \binom {k} {n}}, z = 0, 1, \ldots, n \)

Of course, it is not very suitable for your question, since \( n \) and \( k \) are fairly close, and you can see that the resulting pmf deviates quite a lot from the previous simulation.
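To see how far this question is from the \( k \gg n \) regime, one can compute the chance that a single person's 6 picks contain at least one repeated ball. This is a small sketch; the function name `p_all_distinct` and the comparison value \( k = 1000 \) are illustrative choices only.

```r
# Probability that n draws with replacement from k equally likely balls
# are all different: (k/k) * ((k-1)/k) * ... * ((k-n+1)/k).
p_all_distinct <- function(n, k) prod((k - 0:(n - 1)) / k)

1 - p_all_distinct(6, 11)     # about 0.81: repeats are the norm here
1 - p_all_distinct(6, 1000)   # about 0.015: repeats are rare when k >> n
```

With 11 balls, most draws contain a repeat, so treating them as draws without replacement discards most of the probability mass.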

Code:
dhyper(0:6,6,5,6)
[1] 0.000000000 0.012987013 0.162337662 0.432900433 0.324675325 0.064935065 0.002164502
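For reference, the hypergeometric pmf can also be written out directly from the formula with `choose()` and compared against the simulated values from post #4. This is only a sketch: `hyper_pmf` is an illustrative name, `sim` is copied from the simulation output above, and total variation distance is just one reasonable way to summarize the gap.

```r
# Hypergeometric approximation written out from the formula above.
hyper_pmf <- function(z, n, k) choose(n, z) * choose(k - n, n - z) / choose(k, n)

approx <- hyper_pmf(0:6, n = 6, k = 11)
all.equal(approx, dhyper(0:6, 6, 5, 6))   # same values as the built-in

# Simulated pmf quoted in post #4.
sim <- c(0.041412, 0.209483, 0.371843, 0.280376, 0.087191, 0.009471, 0.000224)

# Total variation distance between the two pmfs (0 = identical, 1 = disjoint).
tv <- 0.5 * sum(abs(sim - approx))
tv   # about 0.447
```

A distance this large confirms that the without-replacement approximation is poor when \( n = 6 \) and \( k = 11 \).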