Who is most likely to win, and exactly how likely?

#1
I will try to simplify the problem I'm trying to solve as much as possible. My statistics knowledge is pretty poor, but I have a good mathematical background and hopefully just need pointing in the right direction.

Two players take part in a task where they each attain a score that, for ease of explanation, is between 0 and 100.
Say these players had each done this task multiple times and I know the mean and standard deviation of the distribution of scores of each player.

Is it possible, given just these 2 means and standard deviations, to work out the probability Player 1 would beat Player 2 in any given iteration of 'the task'?
Can this be done analytically with some sort of formula, or would it require more complex simulations?

Thanks
 
#2
Yes, you can easily predict the probability, provided you know what sort of distribution the players’ scores form. (The most common assumption is that they are normally distributed.) It is then merely a case of combining the two distributions into a single one for d = (s₁ – s₂)—i.e. determining the mean and standard deviation for the distribution of the differences between player scores—and then calculating the probability P(d > 0) from the resultant distribution.
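
For the normal case, the combination step can be written out as follows. This sketch assumes the two scores are independent, which is not stated above, and it writes the second argument of N as the variance:

```latex
\begin{align*}
s_1 &\sim N(\mu_1, \sigma_1^2), \qquad s_2 \sim N(\mu_2, \sigma_2^2) \\
d = s_1 - s_2 &\sim N\!\left(\mu_1 - \mu_2,\; \sigma_1^2 + \sigma_2^2\right) \\
P(d > 0) &= \Phi\!\left(\frac{\mu_1 - \mu_2}{\sqrt{\sigma_1^2 + \sigma_2^2}}\right)
\end{align*}
```

Here Φ denotes the cumulative distribution function of the standard normal N(0, 1).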

I hope the above points you in the right direction.
 
#3
Thanks for your reply, this sounds very promising. Assuming the distributions are normal (which they aren't but never mind) would you mind running through an example? Say the means are 50 and 55 and the st. devs are 20 and 10 respectively.
Thanks again!
 
#4
Refer to https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables

In your example, let the player with the higher mean score be s₁ ~ N(55, 10) and the one with the lower mean score be s₂ ~ N(50, 20). Then d = (s₁ – s₂) ~ N(55 – 50, √(10²+20²)) ≡ N(5, √500).

Note that the variances (the squares of the standard deviations) are added together even though the means are subtracted; this relies on the two players' scores being independent. (To see why, treat the subtraction as adding the second distribution with the sign of its mean reversed: negating a variable leaves its variance unchanged.) With μ = 5 and σ = √500, the standard z-score for d = 0 is (0 – 5)/√500 ≈ –0.2236.

You can now use the standard z ~ N(0, 1) table to find that the area between the standard curve and the x-axis from –0.2236 to +∞ is 0.5885, which is the probability that d ≥ 0, or s₁ ≥ s₂.
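
If no z-table is to hand, the same lookup can be sketched in Python. This is only an illustration of the calculation above: the helper name `standard_normal_cdf` is made up for this example, and it builds Φ from the standard library's `math.erf`.

```python
import math

def standard_normal_cdf(z: float) -> float:
    """Phi(z) for the standard normal N(0, 1), computed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Values from the worked example above.
mu_d = 55 - 50                             # mean of d = s1 - s2
sigma_d = math.sqrt(10**2 + 20**2)         # standard deviation of d, i.e. sqrt(500)

z = (0 - mu_d) / sigma_d                   # z-score of d = 0
p_s1_wins = 1.0 - standard_normal_cdf(z)   # area under the curve to the right of z

print(f"z = {z:.4f}, P(d >= 0) = {p_s1_wins:.4f}")
```

The printed values should agree with the z-score and table lookup above to four decimal places.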

Alternatively, you can use the online calculator at http://onlinestatbook.com/lms/calculators/normal_dist.html with the following parameters:
• Mean = 5;
• SD = √500 ≈ 22.36068; and
• Select the “Above” option and enter “0” into the input box before clicking “Recalculate”.
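
On the original question of simulation: the analytic figure can also be cross-checked with a quick Monte Carlo run. The sketch below is a rough illustration, not a recommended method; the function name `simulate_win_probability`, the trial count and the seed are arbitrary choices, and it assumes independent, normally distributed scores just as the analytic answer does.

```python
import random

def simulate_win_probability(mean1, sd1, mean2, sd2, trials=200_000, seed=0):
    """Estimate P(s1 > s2) by drawing independent normal scores for each player."""
    rng = random.Random(seed)   # fixed seed so repeated runs give the same estimate
    wins = 0
    for _ in range(trials):
        s1 = rng.gauss(mean1, sd1)
        s2 = rng.gauss(mean2, sd2)
        if s1 > s2:
            wins += 1
    return wins / trials

# Player with mean 55 and SD 10 against player with mean 50 and SD 20.
estimate = simulate_win_probability(55, 10, 50, 20)
print(f"Simulated P(s1 > s2) = {estimate:.3f}")
```

Because the draws are random, the estimate should land near 0.5885 rather than exactly on it. If the normal assumption is not acceptable, the two `rng.gauss` calls could be swapped for draws from each player's recorded scores.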
 
#6
No problem. But note that I’ve made a notational error. The accepted way of writing the normal distribution is “x ~ N(μ, σ²)”—i.e., the second argument is the variance σ², not the standard deviation σ, as I had it.

So, in the second paragraph it should read:
s₁ ~ N(55, 10²)
s₂ ~ N(50, 20²)
d = (s₁ – s₂) ~ N(55 – 50, (10²+20²)) ≡ N(5, 500)
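
The same parameterisation trap exists in software. As a hedged illustration, Python's standard-library `statistics.NormalDist` takes the standard deviation, not the variance, as its second argument, and subtracting two instances treats them as independent variables:

```python
from statistics import NormalDist

# NormalDist is parameterised by (mean, standard deviation), not by the variance.
s1 = NormalDist(55, 10)
s2 = NormalDist(50, 20)

# Subtracting two NormalDist objects treats them as independent:
# the means are subtracted and the variances are added.
d = s1 - s2

p_s1_wins = 1 - d.cdf(0)   # P(d > 0)
print(d.mean, d.variance, round(p_s1_wins, 4))
```

The reported mean and variance of `d` should match N(5, 500) up to floating-point rounding.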

My error is perhaps understandable because I usually only deal with N(0, 1).