# Interesting voting probability puzzle

#### Dason

##### Ambassador to the humans
John Jackson and Jack Johnson were going head to head in a mayoral race. Ultimately John Jackson obtained 300 votes and Jack Johnson obtained 200 votes. While counting the votes they announced after each vote what the current results were. What is the probability that at some point during the counting the result was a tie?

I found this problem at another forum I frequent and I thought it was interesting and that TalkStats might enjoy it. If you have an answer put it in [noparse][/noparse] tags so that others who want to try it don't accidentally read your solution. I'll probably add some hints eventually.

#### trinker

##### ggplot2orBust
I don't know the answer but am thinking aloud...

I used this R code to play around with real results:
Code:
table(sample(c(rep("Jack",200), rep("John", 300)), size=500, replace=F)[1:2])
By the second turn there's a 200/499 or 300/499 chance of a tie depending on who was chosen. This is by no means an answer just thinking... Thanks for the teaser Dason. I'm curious about the answer.

#### BGM

##### TS Contributor
I think of the following naive method:

Since tie can only occur at the even number of counts, so the event that there is a tie during the counting process is just the union of each of them.

For each probability one can use the pmf of the hypergeometric distribution, and using the inclusion exclusion principle to evaluate the union.

But it seems that this method is not quite computationally feasible. The number of terms in the inclusion exclusion principle is just too many.

#### Dason

##### Ambassador to the humans
I think of the following naive method:

Since tie can only occur at the even number of counts, so the event that there is a tie during the counting process is just the union of each of them.

For each probability one can use the pmf of the hypergeometric distribution, and using the inclusion exclusion principle to evaluate the union.

But it seems that this method is not quite computationally feasible. The number of terms in the inclusion exclusion principle is just too many.
There is a solution that is quite nice and can be generalized to any number of votes for both parties.

Hint 1:
Consider two cases: 1) Jack gets the first vote, 2) John gets the first vote

#### BGM

##### TS Contributor
To Dason: I have also considered your points before.

If the small party getting the first vote, then eventually there will be a tie. And thus if I consider the difference of the votes, I can have a tree-like structure by putting the difference on the y-axis and the counts on the x-axis. In such case the tie exist when it hit 0, and will not exist if it rise too high.

#### trinker

##### ggplot2orBust
I looked at your spoiler BGM. So simple, I hadn't thought about that point.

#### Dason

##### Ambassador to the humans
I think I'll wait until tomorrow to give another hint.

#### trinker

##### ggplot2orBust
@ no returns they're both tied (0-0) so the probability is 100%?

#### Dason

##### Ambassador to the humans
@ no returns they're both tied (0-0) so the probability is 100%?
Ok - we're not including the point before they start counting. So after they announce at least one vote what is the probability a tie will occur?

#### Dason

##### Ambassador to the humans
Alright time for another hint.

Just a reminder the previous hint was:
Consider two cases: 1) Jack gets the first vote, 2) John gets the first vote

which brought up the point that...
If Jack (the one who gets fewer votes overall) gets the first vote then clearly at some point they'll be tied.

Now consider:
If John gets the first vote is there a way to relate this to a situation where Jack got the first vote? (symmetry is powerful)

#### BGM

##### TS Contributor
Still thinking about your new hints. Another view from the combinatoric persepective:

Consider the successive counts of the number of votes of John and Jack are recorded on the y-axis and x-axis respectively. Then it forms a path from the orign (0,0) to the point (200, 300) over this rectangular grid. A tie occur if this path passing through the line y = x. So now we need to count the proportion of paths passing through this line.

#### Dason

##### Ambassador to the humans
I think I'll post the answer without the justification for now. We'll see if having the answer helps to figure out how to derive the answer.
Let X be the number of votes of the person that lost. Let Y be the number of votes of the person that won. The probability that at some point during the counting there will be a tie is 2*X/(X+Y)

In our case this gives 400/500 = 80%

#### Dason

##### Ambassador to the humans
Ok here is the full solution - spoilered once again in case people want to keep trying and getting hints.
Recap of the solution:
Let X be the number of votes of the person that lost. Let Y be the number of votes of the person that won. The probability that at some point during the counting there will be a tie is 2*X/(X+Y)

Justification:
Consider two cases - When the loser gets the first vote and when the winner gets the first vote.

Case 1 ) Loser gets the first vote: Clearly at some point there must be a tie announced. Hopefully this is easy enough to see - BGM gave a good explanation of why this must be earlier in the thread so feel free to read it up there if you don't see this right away. There is a probability of X/(X + Y) of the loser getting the first vote.

Case 2) Winner gets the first vote: We are looking for the probability that the winner gets the first vote and there is a tie at some point. Assume there is a tie at some point. Given that there is a tie there are an equal number of votes (n) for each candidate at that point. Reorder the votes by swapping the first vote for the winner and the first vote for the loser, swap the second vote for the loser and the second vote for the winner, ..., do this swap for each position. This creates a reordering where the loser obtained the first vote. This also creates a 1-1 correspondence between sequences where the winner got the first vote and a tie occured and where the loser got the first vote. Thus the probability of the winner getting the first vote and a tie being announced is the same as the probability that the loser got the first vote X/(X+Y).

This gives us a total probability of 2X/(X+Y). In our specific example the probability is 400/500 = .80

I might come back to this and reword the solution because I'm not sure I was clear enough at some points. I hope you enjoyed trying this out! I know I had fun doing this puzzle.

#### Dason

##### Ambassador to the humans
Triple posting... oh my.

I just thought of an extension to this puzzle which I haven't put too much thought into yet so I don't actually know the answer. But supposed there were 3 candidates running and candidate 1 obtained X votes, candidate 2 obtained Y votes, and candidate 3 obtained Z votes. Using the same procedure as before (drawing one vote at random and announcing the results every time) what is the probability of a 3 way tie at some point? I don't know if it matters but if it helps we can assume there is a max and that X <= Y < Z. Like I said I don't know if it matters if X and Y aren't the same.

A further extension is to assume there are k candidates and we're looking for the probability of a k-way tie at some point during the counting process.

Edit: I wrote some (inefficient) simulation code. It could probably be sped up considerably but it does the trick for now.
Code:
# How many votes each of the candidates obtained
ns <- c(20, 30, 40)

n <- sum(ns)
votes <- rep(1:length(ns), ns)

trial <- function(){
counts <- numeric(length(ns))

for(i in 1:n){
# Update the counts
counts[round[i]] <- counts[round[i]] + 1
# If all of the counts are equal to the first
# count then we have a tie.
if(sum(!(counts == counts)) == 0){
return(TRUE)
}
}
# We didn't get a tie anytime during the counting
return(FALSE)
}

j <- replicate(10000, trial())
mean(j)