X1, X2, ..., XN are independent and identically exponentially distributed, each with expected value 5. How can I compute X[bar]n when n=20 and N=1000? Then compute the proportion of values of X[bar]n that lie between 6.99 and 7.01.
Repeat the above question with n=100.
My thoughts
I am using R to do this.
As I read it, the question means: draw 1000 samples of iid exponential variables, each with expected value 5, and take the mean of each sample, first with 20 observations per sample and then with 100.
I used
a=(1:1000) # makes a vector [1,2,3,...,1000]
for(i in a){a[i]=mean(rexp(20,5))} # 20 observations and expected value of 5; each slot in the vector gets replaced by a mean
plot(a)
and
a=(1:1000)
for(i in a){a[i]=mean(rexp(100,5))}
plot(a)
I get a y-axis running from 0.1 to 0.35 for n=20 and from 0.14 to 0.26 for n=100. What am I doing wrong? I don't see how to get the proportion of values of X[bar]n that lie between 6.99 and 7.01.
The question didn't say anything about using the normal approximation, but how would I do it if I had to?
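A note on the symptom above: R's rexp(n, rate) takes a rate as its second argument, not a mean, so rexp(20, 5) draws values with mean 1/5 = 0.2, which matches the plotted range. A minimal sketch of the simulation with the rate set to 1/5 follows. The helper names sim_means and prop_between and the seed are made up for illustration, not part of the original question.

```r
set.seed(1)  # arbitrary seed, only so a run can be repeated

# Hypothetical helper: N sample means, each from n exponential draws with mean mu.
# rexp() is parameterised by rate, so a mean of mu corresponds to rate = 1/mu.
sim_means <- function(n, N = 1000, mu = 5) {
  replicate(N, mean(rexp(n, rate = 1 / mu)))
}

# Hypothetical helper: share of values strictly between lo and hi.
prop_between <- function(x, lo = 6.99, hi = 7.01) {
  mean(x > lo & x < hi)
}

xbar20  <- sim_means(20)
xbar100 <- sim_means(100)

prop_between(xbar20)   # proportion of the 1000 means in (6.99, 7.01), n = 20
prop_between(xbar100)  # same for n = 100
```

Because the interval is so narrow, with only N=1000 sample means the count landing inside it will be very small and is often zero, so the simulated proportion is a rough estimate at best.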
so the formula is
x = (mu*z) / sqrt(n) + mu
so then i get
(x-mu)sqrt(n) / mu = z
so
(6.99-5)sqrt(20) / 5 = z
z =~1.77991011
(7.01-5)sqrt(20) / 5 = z
z =~1.797798654
(6.99-5)sqrt(100) / 5 = z
z =3.98
(7.01-5)sqrt(100) / 5 = z
z =4.02
So here I just take the area under the curve between the two z-values for each n, from the *standard normal distribution*?
For n=20 I get 0.0014406382
and for n=100 I get 0.0000053585.
Does this seem right?
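The hand calculation above can be checked in R with pnorm. This is only a sketch of the normal approximation as described in the question. It uses the fact that the standard deviation of an exponential equals its mean, and the function name normal_prop is made up for illustration.

```r
mu    <- 5  # mean of each X
sigma <- 5  # sd of an exponential distribution equals its mean

# Normal-approximation probability that the mean of n draws lies in (lo, hi).
normal_prop <- function(n, lo = 6.99, hi = 7.01) {
  se   <- sigma / sqrt(n)   # standard error of the sample mean
  z_lo <- (lo - mu) / se    # same z as (x - mu) * sqrt(n) / mu
  z_hi <- (hi - mu) / se
  pnorm(z_hi) - pnorm(z_lo) # area under the standard normal curve
}

normal_prop(20)
normal_prop(100)
```

The two calls should agree with the areas quoted above, roughly 0.00144 for n=20 and 5.36e-6 for n=100.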
Yes, I checked your calculation for n=100, and it is correct.
However, note that this is only a normal approximation to the distribution of the sample means.
I also checked empirically for n=100 (1000000 rows of n=100 columns each) and obtained a proportion of 0.00065. So the normal approximation is going to be a bit off.
The reason is that the sampling distribution of the mean is still somewhat positively skewed and has some excess kurtosis. More precisely, the skewness of the sampling distribution is 2/sqrt(n) and the excess kurtosis is 6/n.
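For a cross-check that does not depend on simulation noise, one option is to use the exact distribution: the mean of n iid exponentials with mean 5 follows a Gamma distribution with shape n and scale 5/n. The sketch below compares that exact probability with the normal approximation and prints the skewness and excess kurtosis figures from the formulas above. The function names exact_prop and normal_prop are made up for illustration.

```r
mu <- 5  # mean of each X

# Exact probability that the sample mean lies in (lo, hi),
# using mean of n exponentials ~ Gamma(shape = n, scale = mu / n).
exact_prop <- function(n, lo = 6.99, hi = 7.01) {
  pgamma(hi, shape = n, scale = mu / n) - pgamma(lo, shape = n, scale = mu / n)
}

# Same probability under the normal approximation (sd of the mean is mu / sqrt(n)).
normal_prop <- function(n, lo = 6.99, hi = 7.01) {
  se <- mu / sqrt(n)
  pnorm(hi, mean = mu, sd = se) - pnorm(lo, mean = mu, sd = se)
}

for (n in c(20, 100)) {
  cat(sprintf("n = %3d  exact = %.3g  normal = %.3g  skew = %.3f  excess kurtosis = %.3f\n",
              n, exact_prop(n), normal_prop(n), 2 / sqrt(n), 6 / n))
}
```

Since 7 sits far in the right tail when n=100, the positive skew makes the exact probability larger than the normal one there, while for n=20 the two are much closer.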