
Hi All-

I am trying to figure out how to prove that MSE = SSE/(n-2) is an unbiased estimator of sigma^2 in simple linear regression.

Am I going about this the right way?

Thanks!


Here goes, we know that

(1) Y_i = Beta0 + Beta1X_i + u_i

Thus,

(2) Ybar = Beta0 + Beta1Xbar + ubar.

Subtracting (2) from (1) gives

(3) (Y_i – Ybar) = Beta1(X_i – Xbar) + (u_i – ubar)

It is also true that

(4) e_i = (Y_i – Ybar) – b1(X_i – Xbar)

As such, substituting (3) into (4) yields

(5) e_i = Beta1(X_i – Xbar) + (u_i – ubar) – b1(X_i – Xbar)

Now, squaring and summing will give

(6) Sum[e^2_i] = (b1 – Beta1)^2 * Sum[(X_i – Xbar)^2] + Sum[(u_i – ubar)^2] – 2*(b1 – Beta1)*Sum[(X_i – Xbar)*(u_i – ubar)]

Take expectations on both sides

(7) E[Sum[e^2_i]] = E[ (b1 – Beta1)^2 * Sum[(X_i – Xbar)^2] + Sum[(u_i – ubar)^2] – 2*(b1 – Beta1)*Sum[(X_i – Xbar)*(u_i – ubar)] ].

Next, while taking expectations, you have to impose the classical regression assumptions (the X_i are nonstochastic, and the u_i have mean zero and constant variance Sigma^2 and are uncorrelated across observations), and this will yield

(8) E[Sum[e^2_i]] = Sigma^2 + (N – 1)Sigma^2 – 2*Sigma^2 = (N – 2)*Sigma^2.

Define the MSE as

(9) MSE = Sum[e^2_i] / (N – 2).

Thus,

(10) E[MSE] = E[Sum[e^2_i]] / (N – 2) = Sigma^2

which shows that the MSE is an unbiased estimator of Sigma^2.
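
A quick numerical check of (10); a minimal Monte Carlo sketch (Python with numpy is assumed here; the X_i are held fixed across replications, matching the nonstochastic-X assumption, and all parameter values are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

n, sigma2 = 30, 4.0            # sample size and true Sigma^2 (made up)
beta0, beta1 = 1.0, 2.0        # true coefficients (made up)
x = np.linspace(0.0, 10.0, n)  # fixed (nonstochastic) regressors

mses = []
for _ in range(20_000):
    u = rng.normal(0.0, np.sqrt(sigma2), n)  # iid errors with variance Sigma^2
    y = beta0 + beta1 * x + u
    b1 = np.sum((x - x.mean()) * y) / np.sum((x - x.mean()) ** 2)  # OLS slope
    b0 = y.mean() - b1 * x.mean()                                  # OLS intercept
    e = y - (b0 + b1 * x)                                          # residuals
    mses.append(np.sum(e**2) / (n - 2))                            # SSE / (n - 2)

print(np.mean(mses))  # ~4.0, i.e. E[MSE] = Sigma^2
```

Replacing (n - 2) with n in the last line of the loop drags the average below Sigma^2, which is exactly the bias the (N – 2) divisor removes.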

I can follow your response quite well up to (7); however, I'm having some trouble moving from (7) to (8) with the regression assumptions...

Here's where I am so far

E[(b1 – Beta1)^2 * Sum[(X_i – Xbar)^2]]

= Sum[(X_i – Xbar)^2] * E[(b1 – Beta1)^2]

= Sxx Var(b1)

= Sxx (sigma^2/Sxx)

= sigma^2

...am I following correctly here?

I can't seem to figure out how/why

E[Sum[(u_i – ubar)^2]] = (N – 1)*Sigma^2, and E[2*(b1 – Beta1)*Sum[(X_i – Xbar)*(u_i – ubar)]] = 2*Sigma^2

Can you give me a few tips on how you went about that?

Thank you so much for your help so far!
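
For reference, the Var(b1) = Sigma^2/Sxx step in the middle of that chain can be written out; a short LaTeX sketch under the classical assumptions (fixed X_i; uncorrelated, mean-zero errors with variance sigma^2), using the decomposition b1 = Beta1 + Sum[k_i*u_i] that comes up later in the thread:

```latex
% Var(b_1) = sigma^2 / S_xx, using b_1 = beta_1 + sum_i k_i u_i
% with k_i = (X_i - Xbar)/S_xx and S_xx = sum_i (X_i - Xbar)^2.
\begin{align*}
\operatorname{Var}(b_1)
  &= \operatorname{Var}\Big(\sum_i k_i u_i\Big)
   = \sum_i k_i^2 \operatorname{Var}(u_i)
   && \text{(uncorrelated errors)} \\
  &= \sigma^2 \sum_i \frac{(X_i - \bar X)^2}{S_{xx}^2}
   = \frac{\sigma^2}{S_{xx}} .
\end{align*}
```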


1. Yes, that's fine.

2. Just think of the usual explanation for the expected value of variances:

e.g. Average[ Sum[(X – Xbar)^2] / N ] = E[ Sum[(X – Xbar)^2] / N ] = (N – 1)*Sigma^2 / N. Note: in our term we don't have the N in the denominator, so E[Sum[(u_i – ubar)^2]] = (N – 1)*Sigma^2. (And remember why we divide by N – 1 instead of N when we compute the sample variance.)

3. This is a bit trickier.

The term –2*(b1 – Beta1)*Sum[(X_i – Xbar)*(u_i – ubar)] can be expressed as (removing the b1 and Beta1 terms)

–2*((Sum[(X_i – Xbar)*u_i]) / (Sum[(X_i – Xbar)^2])) * Sum[(X_i – Xbar)*u_i]

Taking expectations while noting that the X_i are nonstochastic gives

–2*E[ (Sum[(X_i – Xbar)*u_i])^2 / Sum[(X_i – Xbar)^2] ]

= -2*E[u_i^2] = –2*Sigma^2

since the u_i are assumed to have constant variance of Sigma^2.
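
The jump from that last expectation to –2*Sigma^2 compresses a step that trips people up, so here is a sketch of the expansion in LaTeX; it assumes the classical setup used above (fixed X_i; uncorrelated, mean-zero errors with variance sigma^2), and it is what the -2*E[u_i^2] shorthand is pointing at:

```latex
% Expand the square; cross terms E[u_i u_j], i != j, vanish for
% uncorrelated mean-zero errors, leaving only the E[u_i^2] = sigma^2 terms.
\begin{align*}
E\Big[\Big(\sum_i (X_i-\bar X)\,u_i\Big)^2\Big]
  &= \sum_i \sum_j (X_i-\bar X)(X_j-\bar X)\,E[u_i u_j]
   = \sigma^2 \sum_i (X_i-\bar X)^2 = \sigma^2 S_{xx}, \\
\text{so}\quad
-2\,E\!\left[\frac{\big(\sum_i (X_i-\bar X)\,u_i\big)^2}{S_{xx}}\right]
  &= -2\,\frac{\sigma^2 S_{xx}}{S_{xx}} = -2\,\sigma^2,
  \qquad S_{xx} = \sum_i (X_i-\bar X)^2 .
\end{align*}
```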

How can you just remove the b1 and Beta1 terms? Also, where did the ubar go from the first term? (When I try to manipulate the first expression to get the second, I seem to end up with a u_i and an ubar still...) And how did we end up with a Sum[(X_i – Xbar)^2] term in the denominator?

I understand why the term –2*((Sum [ X_i – Xbar]*u_i )/ (Sum[ X_i – Xbar]^2)) * Sum[X_i – Xbar]*u_i results in -2*Sigma^2, I just can't seem to get the expression into that form...


Okay let’s take the (B1 – Beta1) term. I’ll try to make things more clear.

We know that B1 can be computed as:

B1 = Sum[(X_i – Xbar)*Y_i] / Sum[(X_i – Xbar)^2] = Sum[k_i*Y_i]

where k_i = (X_i – Xbar) / Sum[(X_i – Xbar)^2].

Next, substitute the population regression function in for Y_i as:

B1 = Sum[k_i*(Beta0 + Beta1*X_i + u_i)]

Expand,

B1 = Beta0*Sum[k_i] + Beta1*Sum[k_i*X_i] + Sum[k_i*u_i]

Simplify,

B1 = Beta1 + Sum[k_i*u_i]

because Beta0*Sum[k_i] = 0 (since Sum[X_i – Xbar] = 0) and Sum[k_i*X_i] = Sum[k_i*(X_i – Xbar)] = 1 (subtracting Xbar*Sum[k_i], which is zero, changes nothing).

Now, in the term (above) –2*E[ (B1 – Beta1)…] substitute in for B1 as

–2*E[ (Beta1 + Sum(k_i*u_i) – Beta1) …]

which is equal to

–2*E[ (Sum(k_i*u_i)) …]

which is what I gave above when you substitute back in the expression for k_i .

And, you should end up with what I gave in my previous post (above)...I hope

Note: the ubar drops out of Sum[(X_i – Xbar)*(u_i – ubar)] because ubar*Sum[X_i – Xbar] = 0, leaving Sum[(X_i – Xbar)*u_i].

I hope this notation helps.
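
Not part of the thread, but the two k_i identities above are easy to verify numerically; a minimal sketch (Python with numpy assumed, and the x values are made up for illustration):

```python
import numpy as np

x = np.array([1.0, 3.0, 4.0, 7.0, 9.0])           # any fixed regressor values
k = (x - x.mean()) / np.sum((x - x.mean()) ** 2)  # k_i = (X_i - Xbar) / Sxx

print(np.sum(k))      # ~0, so Beta0*Sum[k_i] drops out
print(np.sum(k * x))  # ~1, so Beta1*Sum[k_i*X_i] reduces to Beta1
```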

but when you say that...

–2*E[ (Sum[(X_i – Xbar)*u_i])^2 / Sum[(X_i – Xbar)^2] ]

= -2*E[u_i] = –2*Sigma^2

...wouldn't you end up with -2*E[u_i^2] (rather than -2*E[u_i]), because of the ^2 in the previous expression?


Oh Yes, that's correct, I just forgot to put it in. I'll change it. Thanks.


I'm really struggling to see how

–2*E[ (Sum[(X_i – Xbar)*u_i])^2 / Sum[(X_i – Xbar)^2] ] = -2*E[u_i^2].

Is there 'cancellation' involved???

Isn't the u_i part of the summation, as below?

-2*E{(Sum[(X_i - Xbar)*u_i])^2 / Sum[(X_i – Xbar)^2]}

Thanks!
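
The resolution is the cross-term expansion sketched after point 3 above: yes, the u_i stay inside the summation, but when the squared sum is expanded, every cross term E[u_i*u_j] with i ≠ j vanishes, so only the diagonal Sigma^2*Sum[(X_i – Xbar)^2] survives and cancels the denominator. It is also quick to check by simulation; a minimal sketch (numpy assumed, with made-up values for x and Sigma^2):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 30)        # fixed regressors (made up)
sxx = np.sum((x - x.mean()) ** 2)
sigma2 = 4.0                          # true error variance (made up)

# Average of (Sum[(X_i - Xbar)*u_i])^2 / Sxx over many draws of u:
vals = [
    np.sum((x - x.mean()) * rng.normal(0.0, np.sqrt(sigma2), x.size)) ** 2 / sxx
    for _ in range(50_000)
]
print(np.mean(vals))  # ~4.0 = Sigma^2, so the full term is -2*Sigma^2
```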