A bet between a buddy and me regarding the lottery

#1
Hello all,

My coworker is convinced that the Powerball lottery is rigged against those players purchasing more than one game (set of 6 randomly chosen numbers) on a single ticket. His evidence: When he plays more than one game on a single ticket, he notices that a given number often shows up more than once across all games – e.g. the number 41 shows in game 1 and also in game 5. He claims that the lottery system is unfairly biasing subsequent 'random picks' on the ticket based on already-chosen numbers on the ticket, therefore reducing the chance of winning across all games when the paired numbers are not drawn at Powerball time.

Setting aside the fact that he would just as easily win across several games as he would lose across several games when pairs appear, I think he is mistaken in his surprise that a large number of pairs appear across randomly generated sets. And we have devised an experiment (and $1 wager) to test it. But I need a bit of help from you to inform the terms...

The experiment:
My coworker will buy one single ticket with 10 games on it. He will also buy 10 single tickets with only one game on each. I believe that we will see a relatively similar number of pairs across the single tickets as on the 10-game ticket; he posits that the 10-game ticket will have significantly more matched numbers.

My conundrum:
I don't know how many pairs of numbers I should statistically expect, and thus cannot formulate the crux of the wager. If I knew, for example, that I should expect 20 matching numbers across 10 games, then I could propose reasonable terms to my coworker based on that. (Such as – he wins if the difference between matching pairs is +4 on his ticket. Or something. Maybe you have an idea of how that could be most fairly calculated, too?)

The Powerball game in our state asks you to pick 5 numbers from 1 to 59 and one Powerball number from 1 to 35 (a total of 6 numbers per game). No two of the first 5 numbers can be the same, though the Powerball may match one of them.

If I play 10 random games, how many matching numbers can I expect to see on my ticket?

I have no real idea how to go about solving this problem. I see that this board has a great number of students on it looking for homework help; I read the sticky about attempting to solve the problem before asking for answers, but I'm hoping that this applies only to students...? :)

Thanks in advance for any help you may offer.
 

Dason

Ambassador to the humans
#2
What is the difference between buying a single ticket with 10 games on it and 10 tickets with 1 game on each?
 
#3
That's exactly the point of the experiment – to prove that there is in fact no difference. My coworker believes that the lottery is deliberately and nefariously undermining the odds of winning for those playing multiple games on a single ticket. He sees the relatively high incidence of matching number pairs across games and believes that the random number-generating system is not, in fact, random. Because, says he, if the number 41 is shown on several games on that ticket, all those games lose when the number 41 is not drawn.

He's talking about taking this into a class-action lawsuit against the lottery – not a joke. I'd like to calm him down a bit.
 

hlsmith

Omega Contributor
#4
Get that person some meds. Lottery is for fools.


What about the scenario where 41 is a winning number? Then the lottery folks would be producing more winners.


So what is your question: on a ticket with 10 games, how often should you expect to see a number come up more than once?
 
#5
I think your friend is suffering from a severe case of selection/confirmation bias (i.e., counting the hits and ignoring the misses). He will need a few thousand lottery tickets of each kind (single & multiple selections) to confirm his suspicions with any reasonable degree of confidence.

That said, the following table gives the probability that at least one number will be repeated in n selections of five numbers (1 to 59) for n = 2 to 11. For n = 1 the probability is obviously 0, while for n ≥ 12 the probability is 100%. So even if only two selections of five numbers are made on the same ticket, there’s a chance of almost one in three that at least one number will appear in both selections.

Code:
2	30.51%
3	53.45%
4	70.17%
5	81.93%
6	89.81%
7	94.78%
8	97.67%
9	99.15%
10	99.78%
11	99.9998%
Note that this is similar to the so-called Birthday Problem, which illustrates that our commonsense ideas about probabilities can sometimes be wa-a-a-a-ay off.
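For anyone who wants to recompute the table, here is a minimal R sketch of the exact calculation, assuming each game is an independent draw of 5 distinct numbers from 59. Game j has to avoid the 5(j - 1) numbers already used, so the no-repeat probability is a product of ratios of binomial coefficients. The function name `p_repeat` and its arguments are only illustrative.

```r
# Probability that at least one number repeats across n games of k numbers,
# each game drawn without replacement from 1..pool.
# Assumption: the games are independent of each other.
p_repeat <- function(n, pool = 59, k = 5) {
  if (n * k > pool) return(1)          # pigeonhole: a repeat is certain
  # Game j (j = 1..n) must avoid the k * (j - 1) numbers already used.
  used <- k * (seq_len(n) - 1)
  1 - prod(choose(pool - used, k) / choose(pool, k))
}

exact <- sapply(1:12, p_repeat)
data.frame(n = 1:12, prob = round(exact, 6))
```

Under this formula n = 2 works out to 1 - C(54,5)/C(59,5), or about 36.8%, which is somewhat above the table's figure, so the posted percentages are best read as approximate.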
 
#6
hlsmith
What about the scenario where 41 is a winning number? Then the lottery folks would be producing more winners.
Exactly the point I also made to him.

Con-Tester
I think your friend is suffering from a severe case of selection/confirmation bias (i.e., counting the hits and ignoring the misses). He will need a few thousand lottery tickets of each kind (single & multiple selections) to confirm his suspicions with any reasonable degree of confidence.

That said, the following table gives the probability that at least one number will be repeated in n selections of five numbers (1 to 59) for n = 2 to 11. For n = 1 the probability is obviously 0, while for n ≥ 12 the probability is 100%. So even if only two selections of five numbers are made on the same ticket, there’s a chance of almost one in three that at least one number will appear in both selections.
I agree with you both, hlsmith and Con-Tester, in that my coworker has a misinformed opinion about the lottery and how probabilities shake out.

Thank you for helping me with the math here, Con-Tester. I guess the only thing that I could say, given the table you posted, is that there is a 99.78% chance that there will be at least one matching pair of numbers on a 10-game ticket.

How about this – and yes, this is a similar setup to the birthday problem – for n = 10 selections of five numbers, what number of matching pairs would have a probability greater than 50%?

E.g. There is 51% chance that a lottery ticket with 10 selections contains at least X matching pairs.

Is this something that could be expressed?
 
#7
Interestingly enough, I just realized we needn't do a 'real life' experiment by buying tickets at all – I can simulate random lotto picks in Excel and count the matches. Trouble is that the RANDBETWEEN function runs independently in each cell. Sampling without replacement, so that the same number cannot be picked twice within a set of five, requires some VBA. :(
 
#8
R is better suited to this kind of simulation than Excel and because I had some time on my hands I've run a simulation of 100,000 x 10 game tickets for you:

Code:
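set.seed(111)
# Simulate one 10-game ticket and count the repeated numbers on it.
ticket <- function() {
  # Each column is one game: rows 1-5 are regular balls, row 6 is the Powerball.
  games <- replicate(10, c(sample(1:59, 5, replace = FALSE), sample(1:35, 1, replace = FALSE)))
  reg_pairs <- table(games[1:5, ])  # how often each regular number appears
  pb_pairs <- table(games[6, ])     # how often each Powerball number appears
  # Distinct repeated numbers (regular, Powerball), then total balls involved.
  return(c(length(reg_pairs[reg_pairs > 1]), length(pb_pairs[pb_pairs > 1]),
           sum(reg_pairs[reg_pairs > 1]), sum(pb_pairs[pb_pairs > 1])))
}

# One column per simulated ticket; average each of the four counts.
results <- replicate(100000, ticket())
apply(results, 1, mean)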
This returns four mean values from the 100,000 tickets: (1) The mean of the number of distinct balls that are repeated at least once from the 5x10 regular draw. (2) The mean of the number of distinct balls that are repeated at least once from the powerball set. (3) The mean of the total number of duplicated balls from the regular draw. (4) The mean of the total number of duplicated balls from the powerball draw.

Code:
12.12607  1.10163 27.46005  2.29083
So on average a little over 12 distinct regular numbers appear more than once on a 10-game ticket, and about 1.1 Powerball numbers do as well. Put another way, a little over 27 of the 50 regular balls share a value with another ball on the ticket, and about 2.3 of the 10 Powerballs share a value with another Powerball.
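As a rough cross-check on those simulated means, the expected counts can also be worked out directly. This sketch assumes the games are independent, so the number of games containing a given regular number is Binomial(10, 5/59), and Binomial(10, 1/35) for a given Powerball number. `expected_repeats` is a hypothetical helper, not part of the simulation code.

```r
# Expected repeat counts on one ticket, by linearity of expectation.
# Assumption: each number shows up in each game independently with prob. p.
expected_repeats <- function(games, p, pool) {
  p_once  <- dbinom(1, games, p)               # number appears exactly once
  p_multi <- 1 - dbinom(0, games, p) - p_once  # number appears 2+ times
  c(distinct_repeated = pool * p_multi,
    # total balls on the ticket minus the balls that appear only once
    balls_sharing     = games * p * pool - pool * p_once)
}

regular   <- expected_repeats(10, 5 / 59, 59)
powerball <- expected_repeats(10, 1 / 35, 35)
round(rbind(regular, powerball), 3)
```

The results come out at roughly 12.1 and 27.5 for the regular balls and 1.1 and 2.3 for the Powerballs, in line with the simulated values.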

Further, using this simulated data we can determine the approximate probability of getting exactly n pairs of regular balls on a 10 game ticket.

Code:
y <- c()
for (i in 1:26) { 
y[i] <-  mean(results[1,] == i-1)
} 
data.frame(pairs=0:25,prob=y)
Which gives:
Code:
   pairs    prob
   <=4 0.00000
     5 0.00006
     6 0.00067
     7 0.00324
     8 0.01578
     9 0.04845
    10 0.10964
    11 0.18502
    12 0.22047
    13 0.19819
    14 0.12944
    15 0.06139
    16 0.02116
    17 0.00544
    18 0.00095
    19 0.00010
  >=20 0.00000
And the approximate probability of getting exactly n pairs of powerballs:

Code:
z <- c()
for (i in 1:6) { 
  z[i] <-  mean(results[2,] == i-1)
}
data.frame(pairs=0:5,prob=z)

  pairs    prob
     0 0.24272
     1 0.46163
     2 0.24925
     3 0.04412
     4 0.00226
     5 0.00002
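The "at least X pairs with better than 50% probability" question raised earlier can be read off the regular-ball table by summing from the top down. A small sketch using the simulated probabilities exactly as posted (so it inherits their simulation error); the variable names are illustrative:

```r
# "Exactly n" probabilities copied from the simulated regular-ball table.
pairs <- 5:19
prob  <- c(0.00006, 0.00067, 0.00324, 0.01578, 0.04845, 0.10964, 0.18502,
           0.22047, 0.19819, 0.12944, 0.06139, 0.02116, 0.00544, 0.00095,
           0.00010)

# P(at least x) is the sum of the "exactly" probabilities from x upward.
at_least   <- rev(cumsum(rev(prob)))
tail_table <- data.frame(pairs = pairs, at_least = round(at_least, 5))

# Largest count that is still reached on more than half of the tickets.
threshold <- max(tail_table$pairs[tail_table$at_least > 0.5])
threshold
tail_table
```

On these figures a 10-game ticket has about a 64% chance of containing at least 12 repeated regular numbers and about a 42% chance of at least 13, so 12 is the largest count that clears 50%.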
I'll add my voice here to the opinion that your co-worker seems to have a fundamental misunderstanding regarding the probabilities of pairs appearing on a 10 game ticket, and I'm not sure how they believe the lottery benefits from manipulating multi game vs single game tickets. That said, with regard to settling your wager, purchasing 10x1 game tickets is not even remotely close to sufficient for determining if there is any bias in the system - you're going to need a much larger sample size than that.
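To get a feel for how noisy a single comparison is, one can simulate the wager itself under the assumption that nothing is rigged, i.e. the 10-game ticket and the 10 single tickets are both just 10 independent random games. This is only a sketch: the seed, the number of simulations and the names are arbitrary choices, and the "+4" margin is the example floated in the original post.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# Count distinct regular numbers that appear more than once in 10 games.
repeats_in_ten <- function() {
  games <- replicate(10, sample(1:59, 5, replace = FALSE))
  sum(tabulate(as.vector(games), nbins = 59) > 1)
}

n_sims <- 20000
multi  <- replicate(n_sims, repeats_in_ten())  # stands in for the 10-game ticket
single <- replicate(n_sims, repeats_in_ten())  # stands in for 10 single tickets
gap    <- multi - single

# Share of fair trials where the 10-game ticket is ahead by 4 or more.
p_gap4 <- mean(gap >= 4)
c(mean_gap = mean(gap), sd_gap = sd(gap), p_gap_at_least_4 = p_gap4)
```

Whatever `p_gap4` comes out to is the chance that the coworker wins a "+4" wager purely by luck, which is one way to judge whether the terms are fair.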
 
#9
Wow, thanks helicon. This is quite comprehensive and nicely presented! Re: your opinion, I am 100% in agreement – though, as I'm sure you know, those whose noses smell a conspiracy are difficult to sway...even with, or perhaps despite, the logic and reason used to argue. Agreed, too, that our wager is not a sufficient test. It's just a fun thing that we're doing to pass the time.