Right now I'm reading "The Drunkard's Walk - How Randomness Rules Our Lives" by Leonard Mlodinow. In chapter 5 he uses an estimate to illustrate the importance of the law of large numbers, which I totally get. But I can't fully follow the estimate itself. I quote:

"For instance, if you polled exactly 5 residents of Basel in Bernoulli's day, a calculation like the ones we discussed in chapter 4 shows that the chances are only about 1 in 3 that you will find that 60 percent of the sample (3 people) supported the mayor.

Only 1 in 3? Shouldn't the true percentage of the mayor's supporters be the most probable outcome when you poll a sample of voters? In fact 1 in 3 is the most probable outcome: the odds of finding 0, 1, 2, 4, or 5 supporters are lower than the odds of finding 3. Nevertheless finding 3 supporters is not likely: because there are so many of these nonrepresentative possibilities, their combined odds add up to twice the odds that your poll accurately reflects the population. And so in a poll of 5 voters, 2 times out of 3 you will observe the "wrong" percentage. (...)"

Again, I get the point he is trying to make, but my confusion comes from somewhere else. Let's start with the numbers. I assume "1 in 3" is an approximation, because when I count all possible combinations of the 5 polled people each supporting or not supporting the mayor, I get exactly 32 possibilities, with

- 1 combination of zero supporters

- 5 combinations of one supporter

- 10 combinations of two supporters

- 10 combinations of three supporters

- 5 combinations of four supporters

- 1 combination of five supporters.

10 combinations with three supporters out of 32 combinations overall is 10/32 ≈ 0.31, which is approximately 1 in 3, right?
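To double-check my counting, here is a small sketch in Python. It only counts combinations, exactly as in the list above, which means it treats all 32 combinations as equally likely (that is my assumption about the calculation, not something the book states). The variable names are my own.

```python
from math import comb

n = 5  # number of polled residents

# Number of ways to pick k supporters among the n polled people.
counts = {k: comb(n, k) for k in range(n + 1)}

# Total number of support / no-support combinations (2**n).
total = sum(counts.values())

# Share of combinations that contain exactly 3 supporters.
share_three = counts[3] / total

print(counts)
print(total)
print(share_three)
```

So by pure counting, 3 supporters accounts for 10 of the 32 combinations, which is where my "approximately 1 in 3" comes from.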

But here's my problem. I would fully agree if support among the voters were "perfectly random", that is, 50 percent of the voters supporting and 50 percent not supporting, because then all 32 combinations would be equally likely. But since the "true" value in the population is 60%, that should have an impact on the sample of polled voters. Why is he not taking that into consideration? Is he just omitting it because it has nothing to do with the conclusion he wants to reach?

To illustrate, suppose the approval rate were 99%. That would change nothing in the calculation as I understand it, right? The number of overall combinations would still be 32, and the highest probability would still be close to 33%, for 60% approval in the sample. The numbers would be exactly the same as in Mlodinow's example. But obviously, when 99% of the population approves, the sample would much more likely consist mostly of supporters, and therefore 60% approval would not be the most likely outcome in a sample of 5.
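To make the 99% case concrete, here is a rough simulation sketch. It assumes that each polled person independently supports the mayor with probability `p` (my own modelling assumption, not taken from the book); the function name `simulate_polls` and its parameters are made up for this illustration.

```python
import random
from collections import Counter


def simulate_polls(p, n=5, trials=100_000, seed=0):
    """Simulate many polls of n people and return the observed
    frequency of each possible number of supporters (0..n).

    Assumption: each person supports the mayor independently
    with probability p.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    tally = Counter()
    for _ in range(trials):
        # Count how many of the n polled people are supporters.
        supporters = sum(rng.random() < p for _ in range(n))
        tally[supporters] += 1
    return {k: tally[k] / trials for k in range(n + 1)}


freq = simulate_polls(0.99)
for k, share in freq.items():
    print(k, share)
```

With `p = 0.99`, almost all simulated polls end up with 5 supporters and hardly any with 3, which is very different from the 10-in-32 share that pure counting gives.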

I have illustrated this with the extreme case of 99%, but the same applies to 60%, even though the differences are far smaller: Mlodinow does not seem to include the true approval rate in his calculation, if I am right about the way he calculated it.

Here is my "but": of course Mlodinow is a scholar and I'm just a casual reader of one of his books. So I suspect that my assumption about how Mlodinow calculated the values is simply wrong and that he did it another way. But I can't find any other possibility, so I'm asking for your help. What am I missing? How did Mlodinow calculate it?

Thanks for your impressions!