Help understanding the sum of independent variables (new to statistics)

#1
Hello everybody. I'm new to statistics. Let's say I have an independent random variable X, and I'm having difficulty wrapping my head around the difference between, say, 5X and (x1+x2+x3+x4+x5).

Take the derivation of the standard error, for example. I understand it when I look at it, but in my mind there's this other scenario playing out: why can't it be var((x1+x2+x3+x4+x5)/5) = var(5x/5)? That doesn't make sense, since from the first one you get SE = sigma/sqrt(5), while var(5x/5) is just var(x), which gives SE = sigma. I know it's wrong, but I'm having difficulty seeing why the second one is not true, probably because I'm missing or haven't yet learned a fundamental principle.

Comments and feedback will be highly appreciated. Thanks in advance :)
 

rogojel

TS Contributor
#2
Hi,
If I understand your question correctly, the answer is that the equation x1+x2+x3+x4+x5 = 5*x is not valid. x1, x2, etc. are the same random variable in the sense that they have the same distribution, but the actual values will be different, as in running the same random number generator 5 times: you will generally get 5 different values. 5*x would mean that you generate a single random value and add that one value to itself 5 times, which is something completely different.
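
If it helps to see the random number generator analogy in action, here is a rough simulation sketch. It assumes X is normal with mean 0 and sigma = 1, which is purely my choice for illustration (the question does not name a distribution), and the names draw, N_TRIALS and SIGMA are made up for this example. With independent draws the variances add, so var(x1+...+x5) = 5*sigma^2, whereas var(5x) = 25*sigma^2; after dividing by 5 that should show up as a spread of about sigma/sqrt(5) in the first case and sigma in the second.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

N_TRIALS = 100_000  # number of simulated experiments (arbitrary choice)
SIGMA = 1.0         # assumed standard deviation of X


def draw():
    """One draw of X; a normal distribution is assumed purely for illustration."""
    return random.gauss(0.0, SIGMA)


# Case 1: average of five independent draws, (x1 + ... + x5) / 5
mean_of_five = [sum(draw() for _ in range(5)) / 5 for _ in range(N_TRIALS)]

# Case 2: a single draw counted five times, 5x / 5, which is just x again
one_draw_repeated = [5 * draw() / 5 for _ in range(N_TRIALS)]

sd_independent = statistics.pstdev(mean_of_five)
sd_repeated = statistics.pstdev(one_draw_repeated)

print(f"SD of (x1+...+x5)/5: {sd_independent:.3f}  (theory: {SIGMA / 5 ** 0.5:.3f})")
print(f"SD of 5x/5:          {sd_repeated:.3f}  (theory: {SIGMA:.3f})")
```

The simulated spreads should land close to the theoretical values, though the exact digits depend on the seed and the number of trials.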

regards
 

Dason

Ambassador to the humans
#3
Let's take this to the extreme. Consider flipping a coin. So x1 can either be 0 or 1. Now let's flip a thousand coins. Your first case is where we sum the result for the thousand flips. Now how would we get a result of 1000? We would need to flip heads (1) every time. That's highly unlikely. More likely we flip heads approximately half of the flips. Is a thousand or zero possible? Yes. But it isn't likely.

Now consider your second situation, where we decide to save some time and just flip one coin and multiply the result by 1000. What are the possibilities? Only 0 or 1000. We can't even get something like 500. Our results are always the extremes.

So hopefully this illustrates 1) why we need to treat these situations differently and 2) why that second scenario gives a larger variance.
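
To put numbers on the coin example, here is a small sketch that computes both variances exactly from the distributions rather than by simulation. It assumes a fair coin (P = 0.5), which the example above implies but does not state outright, and the helper name variance is just something I made up here. For n independent flips the sum is binomial with variance n*p*(1-p) = 250, while 1000 times a single flip has variance n^2*p*(1-p) = 250000.

```python
from math import comb

N = 1000  # number of flips, as in the example above
P = 0.5   # a fair coin is assumed


def variance(dist):
    """Variance of a discrete distribution given as {value: probability}."""
    mean = sum(v * p for v, p in dist.items())
    return sum(p * (v - mean) ** 2 for v, p in dist.items())


# Scenario 1: sum of N independent flips, i.e. a Binomial(N, P) distribution
sum_of_flips = {k: comb(N, k) * P**k * (1 - P) ** (N - k) for k in range(N + 1)}

# Scenario 2: one flip multiplied by N, so only 0 and N are possible
one_flip_scaled = {0: 1 - P, N: P}

var_sum = variance(sum_of_flips)
var_scaled = variance(one_flip_scaled)

# Probability of landing on an extreme (0 or N) in each scenario
p_extreme_sum = sum_of_flips[0] + sum_of_flips[N]
p_extreme_scaled = one_flip_scaled[0] + one_flip_scaled[N]

print(f"Variance, sum of {N} flips: {var_sum:.1f}")
print(f"Variance, {N} * one flip:   {var_scaled:.1f}")
print(f"P(0 or {N}), sum of flips:  {p_extreme_sum:.3g}")
print(f"P(0 or {N}), one flip:      {p_extreme_scaled:.3g}")
```

The second variance is larger by a factor of N, which is the same factor that separates sigma/sqrt(n) from sigma once you divide by n and take square roots.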