# Layperson with a question

#### Phi618

##### New Member
Hello! A friend of mine recently asked me for help with this, and while we kicked around a few ideas, we weren't really satisfied. That, and the fact that we have little formal statistics/probability education... Anyways, hopefully someone here can help us find and understand the solution. Or maybe at a bare minimum, where to look to understand this question, as I wouldn't have the slightest idea on where to look.

[please forgive me if my terminology is incorrect or if my wording is difficult to understand]

Let's use a coin flip as an event to base an example off of. The probability of one particular side landing up is 50%, correct? And is the implication of this that, if we repeat this coin flip many times, the resulting ratio of outcomes should approach 1:1, correct? Is there a way, though, to guess as to when we would reach that point? Specifically, is there a way to say, "We can be x% confident that the actual ratio will be within y% of the expected 1:1 ratio after z number of flips?"

Of course, if anyone does know the answer, could one be so kind as to explain? My friend proposed 99% confident within 1% of 1:1. When I attempted to lay out the possible outcomes, it seemed as though we approached a limit of 2% confident of being within 1% of 1:1 as the number of flips approached infinity. Like I said, though, we really don't know stats and probability formally, so I was hoping to get some advice and knowledge from those that do.

Thanks so much!

#### hlsmith

##### Omega Contributor
Given you have a fair coin, the probability of H or T is 50%. You will see the observed proportion get closer to 0.50 as the number of flips increases (the law of large numbers). The key to solving this problem is to look at the binomial distribution with a probability of success of 0.50 and of failure of 0.50. There should be many calculators for examining when the proportion should converge to 0.5. You can also add a confidence interval (99% CI). There is a formula to do all of this, but I would start off by playing around with an online calculator, or you can try MS Excel, which most people have access to. Let us know how you are proceeding, since we are happy to help.
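For anyone who would rather see this than take it on faith, here is a minimal simulation sketch. The function name `heads_proportion` and the fixed seed are just illustrative choices, and it assumes a perfectly fair coin modeled with Python's pseudo-random generator.

```python
import random

def heads_proportion(num_flips, seed=0):
    """Simulate num_flips fair coin flips and return the fraction that were heads."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) so runs are repeatable
    heads = sum(rng.random() < 0.5 for _ in range(num_flips))
    return heads / num_flips

# The proportion tends to wander less and less far from 0.5 as the flips pile up.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"{n:>7} flips: proportion of heads = {heads_proportion(n):.4f}")
```

The exact numbers depend on the seed, but the general pattern of tightening around 0.5 is what the law of large numbers describes.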

#### Phi618

##### New Member
Thank you for your quick response! Sorry I was a bit slow in replying, but I'm still a bit confused. After reading an overview of binomial distributions, I found a few formulas (and I hope I've used them correctly).

(p^k)(1-p)^(n-k)

n!/[k!(n-k)!]

n = number of events (such as flips of a coin)
k = number of desired outcomes (such as 'heads')
p = probability

The bit I read said the first expression is the probability of each outcome. This makes sense to me. In the case of flipping a fair coin, if we flip 4 times, we get the same result as the reciprocal of 2^4, or 0.0625. Because there are 2^4 possible outcomes, correct?

It said the second expression is the total number of possible desired outcomes. Going back to the example of 4 coin flips, when I am looking for 2 heads, the expression results in 6. In other words, out of 16 possible outcomes from flipping a fair coin four times, 6 of those outcomes have exactly two heads (and therefore two tails), correct?

Finally, it said the product of the first and second expressions is the probability of k out of n ways. This makes sense to me, if I have understood the above. Back to the example of four flips, 6 * 0.0625 = 0.375... this means that if I flip a fair coin four times, I am 37.5% likely to get a head:tail ratio of 1:1, correct?
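The four-flip arithmetic above can be checked with a short sketch. The helper name `binomial_pmf` is made up for illustration; `math.comb` is the standard-library function for n!/[k!(n-k)!].

```python
from math import comb

def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k desired outcomes in n events."""
    ways = comb(n, k)                    # second expression: n!/[k!(n-k)!]
    each = p**k * (1 - p)**(n - k)       # first expression: (p^k)(1-p)^(n-k)
    return ways * each

print(comb(4, 2))          # 6 arrangements with exactly two heads
print(0.5**2 * 0.5**2)     # 0.0625, the probability of any one arrangement
print(binomial_pmf(2, 4))  # 0.375
```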

Please do correct me on any of the above if I am misunderstanding or (unknowingly) misrepresenting what these expressions mean or imply.

Now, the reason I am confused is when I increase the number of flips and still am only looking for a ratio of 1:1, it seems that I get less likely to see that over time. In fact, only with either one or two flips do I get a probability of 50% heads. After that, with each successive flip it seems that the product of the first and second expressions I posted only gets smaller and smaller. This is why I believe I must still not be getting exactly what I'm looking at, if, as you said, the probability will approach 0.5 as the number of flips increase.

Or did I just not increase the flips enough? In other words, does the probability drop near zero, then after a sufficiently large number of flips, begin to go back up towards 0.5? If so, how can I calculate just when that is?

Also, I did not find any calculators that would calculate what I was originally asking about. Either that, or I didn't recognize them when I was looking at them (since I wasn't sure what some of the variables represented and what some of the solutions meant). And we haven't even touched on the confidence interval yet.

Again, I apologize for being the layperson asking the possibly bizarrely specific question. But I appreciate the help and politeness I have been so far afforded. Thanks again!

#### Dason

##### Ambassador to the humans
The probability of EXACTLY 50% decreases and gets arbitrarily close to 0. But the probability of being in the range [0.50 - c, 0.50 + c] for some positive value of c increases as the sample size increases. Basically the probability of being "close" to .5 increases.
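A rough way to see both effects side by side is to add up the exact binomial probabilities. This is only a sketch: the function names are hypothetical, c = 0.01 is taken from the original question, and the work is done with logarithms (`math.lgamma`) so that large factorials do not overflow.

```python
from math import ceil, exp, floor, lgamma, log

def log_pmf(k, n):
    """Log of the probability of exactly k heads in n fair flips."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1) - n * log(2)

def prob_exact_half(n):
    """Probability of exactly n/2 heads (n is assumed to be even)."""
    return exp(log_pmf(n // 2, n))

def prob_within(n, c=0.01):
    """Probability that the proportion of heads lies in [0.5 - c, 0.5 + c]."""
    eps = 1e-9  # small tolerance so floating-point noise does not drop an endpoint
    lo = ceil(n * (0.5 - c) - eps)
    hi = floor(n * (0.5 + c) + eps)
    return sum(exp(log_pmf(k, n)) for k in range(lo, hi + 1))

for n in (100, 1_000, 10_000):
    print(f"n={n:>6}: exactly half = {prob_exact_half(n):.4f}, "
          f"within 1% = {prob_within(n):.4f}")
```

The first column should shrink while the second grows as n increases.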

#### Dason

##### Ambassador to the humans
To be 99% confident that you will get a sample proportion of heads in the range [.49, .51] you would need a sample size of about 16589.

#### Phi618

##### New Member
Thanks Dason! I guess I was completely neglecting that that was my original question (a range as opposed to exactly 0.5). Is there any way you could explain how one arrives at that number (16589)? If you used an online calculator or formula, that would be sufficient, although I am very interested in reading into why it is 16589. Are there a few terms that I should search for (the way 'binomial distribution' in a search engine got me pointed in the generally correct direction)? Or do I already have everything I need in those two expressions I posted, and I just do not yet see how to use them to find what I'm asking about (ranges instead of just one specific probability and varying confidence levels of those)?

So far, I have to say I'm very pleased that I came to a great place to get these questions answered. Thanks again, both of you, and to anyone else that can help! Keep it coming!

#### Dason

##### Ambassador to the humans
Do you know the typical way that a confidence interval is created for a proportion?

#### Phi618

##### New Member
I do not. I do apologize for my ignorance. If the subject matter is too broad to explain here, it is okay to link me things I should read or research. I might be coming back for clarification in such a case though.

#### Dason

##### Ambassador to the humans
No need to apologize. If you're interested in learning the logic behind the calculations I would highly suggest reading up on how to make a confidence interval for a proportion. There is a lot of material on that topic and it's the basis behind the sample size calculation I did. Once you've seen how to make the interval I'd be happy to help walk you through how to go from that to a sample size calculation.

#### Phi618

##### New Member
Alright, I read this:

http://onlinestatbook.com/2/estimation/proportion_ci.html

and I feel as though I have a decent handle on that. If you have any better recommendations for a page on confidence intervals for proportions, I'm more than willing to look at them.

So where should I go from here? I can see how this is useful after an event has been measured (like in a case where I did flip a coin 1000 times and was analyzing the results). I am having trouble seeing how I can use this for my question, which is more theoretical, right? Does that make sense?

#### ArtK

##### New Member
I suggest that you download NIST Special Publication 800-22 Rev. 1a (April 2010). Study 2.1, the monobit test, and also study 1.15, "Testing".

Art

#### Phi618

##### New Member
Thanks Art, but that NIST paper went right over my head. I wasn't even sure how most of what you suggested to read fit into what I'm trying to learn. As I said, I'm a beginner. Anyone else have any suggestions for what I should next read or look into, or can walk me through a solution? Thanks!

#### Dason

##### Ambassador to the humans
Hi Phi - sorry for taking so long to get back. This probably isn't what you wanted to see but it's where I want to start.

Can you provide the formula/equation you know to get a confidence interval for a proportion?

For the quantities in that formula which ones did you provide in your original post and which ones are unknown?

#### Phi618

##### New Member
The formula I found for the confidence interval for a proportion is:

p +/- zσ

where p is the proportion in the sample;
z will vary depending upon the confidence desired;
σ is the standard error of a proportion.

σ = sqrt{[p(1 - p)]/N}

N is the sample size.

As for which values I know and which I don't, based on my original post, mostly don't know. After all, I didn't actually do an experiment or sample, such as "500 flips, 265 heads, 235 tails." I suppose I could use a z table to find its value at 99%, which would be from the original post. So I'm still pretty stuck right now... sorry. Can you show me what I must not be seeing?
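To make the formula concrete, here is a sketch that applies p +/- zσ to the hypothetical sample mentioned above (500 flips, 265 heads). The function name `proportion_ci` is an illustrative choice, and the z value is looked up with the standard library's `statistics.NormalDist` rather than a printed z table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(heads, flips, confidence=0.99):
    """Confidence interval p +/- z*sigma for a sample proportion."""
    p = heads / flips                                   # proportion in the sample
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 2.58 for 99%
    sigma = sqrt(p * (1 - p) / flips)                   # standard error of a proportion
    return p - z * sigma, p + z * sigma

low, high = proportion_ci(265, 500)
print(f"99% interval: {low:.4f} to {high:.4f}")
```

This uses the formula exactly as written in the post; some references add a small continuity correction, which is left out here.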

#### Dason

##### Ambassador to the humans
> I suppose I could use a z table to find its value at 99%, which would be from the original post.

Good!

What else did you specify that you wanted to be true?

#### Phi618

##### New Member
I wanted to be x% [for example, 99%] sure that I am within y% [for example, 1%] of the expected proportion, for a fair coin which is 0.5.

#### Dason

##### Ambassador to the humans
OK, so you know how that 99% plays into it. You said you wanted to be "within 1%". Is there a certain part of the confidence interval equation that is directly related to the "within y%"?

#### Phi618

##### New Member
Is it, perhaps, " +/- zσ " such that zσ = 0.01?

#### Phi618

##### New Member
I think I got it. With p = 0.5, solving the standard error formula for N gives N = 1/(4σ^2). Setting zσ = 0.01 and inserting the z value for 99% (2.58) gives σ = 0.00387597...

Finally, sub that back into the first equation to get N = 16641.

Is that all correct?
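The steps above can be sketched in a few lines. The function name `flips_needed` and its parameters are illustrative, and the whole thing rests on the normal approximation behind the p +/- zσ interval. With the table value z = 2.58 it reproduces 16641; a more precise z (about 2.5758) gives a slightly smaller figure, which would account for the small gap from the "about 16589" quoted earlier in the thread.

```python
from math import ceil
from statistics import NormalDist

def flips_needed(confidence=0.99, margin=0.01, p=0.5, z=None):
    """Smallest N with z * sqrt(p(1-p)/N) <= margin, i.e. N = z^2 p(1-p) / margin^2."""
    if z is None:
        # More precise z than a rounded table value
        z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z**2 * p * (1 - p) / margin**2
    return ceil(round(n, 6))  # round first so floating-point noise cannot bump N up by one

print(flips_needed(z=2.58))  # 16641, matching the hand calculation
print(flips_needed())        # a little smaller, using the unrounded z
```

Changing `confidence` or `margin` answers the general "x% confident of being within y%" version of the question.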