Illustrating Variance of Sample Mean in Very Small Population

magphy

New Member
I'm working through some stats background I never got in high school to make sure I fully understand some concepts at a fundamental level. While exploring the central limit theorem and trying to understand why Var(X¯) = σ^2/n, I hit some trouble. I'm working with concrete numbers in a very small population to see it in action, but when I calculate the variance of the sample mean in the way that's more intuitive to me, it does not give the same result as the sigma-squared-over-n calculation.

Let's say we have a very small population of {2, 4, 9}. We could imagine those are ages of three children. So first, the population stats:
μ = (2+4+9)/3 = 5
σ^2 =
((2-5)^2 + (4-5)^2 + (9-5)^2) / 3
= ((-3)^2 + (-1)^2 + 4^2) / 3
= (9 + 1 + 16) / 3
= 8.67
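
A minimal sketch of the same population calculation in Python, for anyone who wants to check the arithmetic. The variable names (population, mu, sigma2) are just illustrative, and exact fractions are used so that rounding does not hide anything:

```python
from fractions import Fraction

# The three-child population from the post.
population = [2, 4, 9]
N = len(population)

# Population mean: sum of the values divided by the population size.
mu = Fraction(sum(population), N)

# Population variance: average squared deviation from mu (divide by N, not N - 1).
sigma2 = sum((x - mu) ** 2 for x in population) / N

print("mu =", mu)
print("sigma^2 =", sigma2, "which is about", round(float(sigma2), 2))
```

The squared deviations are 9, 1 and 16, so the variance is 26/3, which rounds to 8.67.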

This part is clear. But now, let's say I take a sample of n=2 from that population. There are only 3 possible samples:
• {2, 4}
• {2, 9}
• {4, 9}
If I'm looking for Var(X¯), I assume I would find the mean of each sample (which is X¯), then find the variance of those values. These are the means:
• {2, 4}; mean = 3
• {2, 9}; mean = 5.5
• {4, 9}; mean = 6.5
That is, X¯ takes the values {3, 5.5, 6.5}. The average of those, E(X¯), is 5, which I understand is the same as μ.

This is where things stop making sense. If I apply my "normal" approach to variance among these sample means, I get this:

Var(X¯) =
((3-5)^2 + (5.5-5)^2 + (6.5-5)^2) / 3
= ((-2)^2 + (0.5)^2 + (1.5)^2) / 3
= ( 4 + 0.25 + 2.25) / 3
= 2.17
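
The hand calculation above can be reproduced with a short Python sketch, assuming that "sample" here means an unordered pair drawn without replacement (which is what the three listed samples imply). The names samples, means and var_xbar are illustrative only:

```python
from fractions import Fraction
from itertools import combinations

population = [2, 4, 9]
n = 2  # sample size

# All samples of size n drawn WITHOUT replacement (order ignored).
samples = list(combinations(population, n))

# The mean of each sample, kept as an exact fraction.
means = [Fraction(sum(s), n) for s in samples]

# Each sample is equally likely, so E(X-bar) is the plain average of the means.
e_xbar = sum(means) / len(means)

# Variance of the sample mean over this sample space.
var_xbar = sum((m - e_xbar) ** 2 for m in means) / len(means)

print("samples:", samples)
print("E(X-bar) =", e_xbar)
print("Var(X-bar) =", var_xbar, "which is about", round(float(var_xbar), 2))
```

This gives 13/6, about 2.17, so the arithmetic in this step is fine; the discrepancy has to come from somewhere else.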

BUT if I use the formula σ^2/n, the result is 8.67/2 = 4.33. That's twice the result I got when applying my "normal" approach to variance. What is wrong in my math?

Any guidance would be VERY appreciated!

Dason

Ambassador to the humans
The σ^2/n rule assumes the values in your sample are independent draws of a random variable. For a finite population, that means it assumes you're sampling *with replacement*. If you go through your exercise again, but allow the same value to be drawn more than once when creating your samples, you'll get 9 possible samples, and things should work out when you use that sample space. When the population is large enough, it doesn't really matter whether you sample with replacement. The typical rule of thumb is that if you're sampling more than 5% of the population without replacement, you need to use a finite population correction factor.
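
Here is a rough Python sketch of that suggestion, using the {2, 4, 9} population and n = 2 from the question. It enumerates the 9 ordered samples drawn with replacement, then also checks the without-replacement case against the correction factor. The thread does not spell out the factor; the code assumes the standard form (N - n)/(N - 1), and the variable names are illustrative:

```python
from fractions import Fraction
from itertools import product

population = [2, 4, 9]
N = len(population)
n = 2  # sample size

mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N  # population variance, 26/3

# All ordered samples of size n drawn WITH replacement: 3 * 3 = 9 of them.
samples = list(product(population, repeat=n))
means = [Fraction(sum(s), n) for s in samples]

e_xbar = sum(means) / len(means)
var_with = sum((m - e_xbar) ** 2 for m in means) / len(means)

print("number of samples:", len(samples))
print("Var(X-bar), with replacement:", var_with)
print("sigma^2 / n:", sigma2 / n)

# Assumed standard finite population correction for sampling without replacement.
fpc = Fraction(N - n, N - 1)
var_without = sigma2 / n * fpc
print("sigma^2 / n * (N - n)/(N - 1):", var_without)
```

With replacement, the enumerated variance is 13/3 (about 4.33), matching σ^2/n. Applying the factor (3 - 2)/(3 - 1) = 1/2 gives 13/6 (about 2.17), which is the value from the three without-replacement samples.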