Results 1 to 15 of 15

Thread: E[MSE] simple linear regression

  1. #1
    B_Miner

    E[MSE] simple linear regression




    Hi All-

    I am trying to figure out how to prove that MSE = SSE/(n-2) is an unbiased estimator of sigma^2 in simple linear regression.

    Expanding SSE = Sum[(Y_i - b0 - b1*X_i)^2], I have (1/(n-2))*E{Sum[Y_i^2 - 2*b1*X_i*Y_i - 2*b0*Y_i + 2*b0*b1*X_i + b0^2 + b1^2*X_i^2]}.

    Are b0 and b1 random variables?

    Am I going about this the right way?

    Thanks!

  2. #2
    Super Moderator
    Dragan
    Location
    Illinois, US
    Quote Originally Posted by B_Miner View Post
    Hi All-

    I am trying to figure out how to prove that MSE = SSE/(n-2) is an unbiased estimator of sigma^2 in simple linear regression.

    Am I going about this the right way?

    Thanks!
    No, you have to bring in the parameters (Beta0, Beta1, u_i) and the estimates (b0, b1, e_i) together. I’ll sketch the proof and then you can do the rest.

    Here goes, we know that
    (1) Y_i = Beta0 + Beta1X_i + u_i

    Thus,
    (2) Ybar = Beta0 + Beta1Xbar + ubar.

    Subtracting (2) from (1) gives
    (3) (Y_i – Ybar) = Beta1(X_i – Xbar) + (u_i – ubar)

    It is also true that
    (4) e_i = (Y_i – Ybar) – b1(X_i – Xbar)

    As such, substituting (3) into (4) yields
    (5) e_i = Beta1(X_i – Xbar) + (u_i – ubar) – b1(X_i – Xbar)

    Now, squaring and summing gives
    (6) Sum[e_i^2] = (b1 – Beta1)^2 * Sum[(X_i – Xbar)^2] + Sum[(u_i – ubar)^2] – 2*(b1 – Beta1)*Sum[(X_i – Xbar)*(u_i – ubar)]

    Take expectations on both sides:
    (7) E[Sum[e_i^2]] = E[ (b1 – Beta1)^2 * Sum[(X_i – Xbar)^2] + Sum[(u_i – ubar)^2] – 2*(b1 – Beta1)*Sum[(X_i – Xbar)*(u_i – ubar)] ].

    Next, while taking expectations, you have to impose the classical regression assumptions, and this yields
    (8) E[Sum[e_i^2]] = Sigma^2 + (N – 1)*Sigma^2 – 2*Sigma^2 = (N – 2)*Sigma^2.

    Define the MSE as
    (9) MSE = Sum[e_i^2] / (N – 2).

    Thus,
    (10) E[MSE] = E[Sum[e_i^2]] / (N – 2) = Sigma^2,

    which shows that the MSE is an unbiased estimator.
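    A quick numerical sanity check of result (10) (my own illustrative sketch, not part of the original derivation; the sample size, coefficients, and Sigma^2 below are arbitrary choices):

    ```python
    # Monte Carlo check of E[MSE] = Sigma^2: simulate Y_i = Beta0 + Beta1*X_i + u_i
    # many times with fixed X_i, fit OLS each time, and average SSE/(n - 2).
    import numpy as np

    rng = np.random.default_rng(0)
    n, beta0, beta1, sigma2 = 20, 1.0, 2.0, 4.0   # arbitrary illustrative values
    x = np.linspace(0.0, 10.0, n)                 # fixed (nonstochastic) regressors
    reps = 20000

    mse = np.empty(reps)
    for r in range(reps):
        u = rng.normal(0.0, np.sqrt(sigma2), n)   # errors with variance sigma2
        y = beta0 + beta1 * x + u
        b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
        b0 = y.mean() - b1 * x.mean()
        e = y - (b0 + b1 * x)                     # residuals e_i
        mse[r] = np.sum(e ** 2) / (n - 2)         # SSE / (n - 2)

    print(np.mean(mse))  # averages out near sigma2 = 4.0
    ```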

  3. The Following 2 Users Say Thank You to Dragan For This Useful Post:

    Englund (02-02-2013), Omar Bassiuony (11-22-2015)

  4. #3
    Thanks so much for this Dragan! It has helped me a lot!

  5. #4
    statgirl11
    Hey Dragan,

    I can follow your response quite well up to (7), however I'm having some trouble moving from (7) to (8) with the regression principles...

    Here's where I am so far

    E[(b1 – Beta1)^2 * Sum[(X_i – Xbar)^2]]
    = Sum[(X_i – Xbar)^2] * E[(b1 – Beta1)^2]
    = Sxx * Var(b1)
    = Sxx * (sigma^2/Sxx)
    = sigma^2

    ...am I following correctly here?

    I can't seem to figure out how/why
    E[Sum[(u_i – ubar)^2]] = (N – 1)*Sigma^2, and E[2*(b1 – Beta1)*Sum[(X_i – Xbar)*(u_i – ubar)]] = 2*Sigma^2

    Can you give me a few tips on how you went about that?

    Thank you so much for your help so far!

  6. #5
    Super Moderator
    Dragan
    Location
    Illinois, US
    Quote Originally Posted by statgirl11 View Post
    Hey Dragan,

    I can follow your response quite well up to (7), however I'm having some trouble moving from (7) to (8) with the regression principles...

    Here's where I am so far

    E[(b1 – Beta1)^2 * Sum[(X_i – Xbar)^2]]
    = Sum[(X_i – Xbar)^2] * E[(b1 – Beta1)^2]
    = Sxx * Var(b1)
    = Sxx * (sigma^2/Sxx)
    = sigma^2

    ...am I following correctly here?

    I can't seem to figure out how/why
    E[Sum[(u_i – ubar)^2]] = (N – 1)*Sigma^2, and E[2*(b1 – Beta1)*Sum[(X_i – Xbar)*(u_i – ubar)]] = 2*Sigma^2

    Can you give me a few tips on how you went about that?

    Thank you so much for your help so far!
    1. Yes, that’s fine.

    2. Just think of the usual result for the expected value of a sample variance:
    e.g. Average[ Sum[(X – Xbar)^2] / N ] = E[ Sum[(X – Xbar)^2] / N ] = (N – 1)*Sigma^2 / N. Note: We don't have N in the denominator here, so E[Sum[(u_i – ubar)^2]] = (N – 1)*Sigma^2 directly. And, remember why we divide by N – 1 instead of N when we compute the sample variance.

    3. This is a bit trickier.

    The term –2*(b1 – Beta1)*Sum[(X_i – Xbar)*(u_i – ubar)] can be expressed as (removing the b1 and Beta1 terms)

    –2*((Sum[(X_i – Xbar)*u_i]) / (Sum[(X_i – Xbar)^2])) * Sum[(X_i – Xbar)*u_i]

    Taking expectations while noting that the X_i are nonstochastic gives

    –2*E[ (Sum[(X_i – Xbar)*u_i])^2 / (Sum[(X_i – Xbar)^2]) ]

    = –2*E[u_i^2] = –2*Sigma^2

    since the u_i are assumed to have constant variance Sigma^2.
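    To see numerically where the (N – 1)*Sigma^2 and 2*Sigma^2 pieces come from, here is an illustrative simulation (my own sketch; the n, X values, and Sigma^2 are arbitrary choices):

    ```python
    # Check E[Sum[(u_i - ubar)^2]] = (n - 1)*sigma2 and
    # E[(b1 - Beta1)*Sum[(X_i - Xbar)*(u_i - ubar)]] = sigma2.
    import numpy as np

    rng = np.random.default_rng(1)
    n, sigma2, reps = 20, 4.0, 20000
    x = np.linspace(0.0, 10.0, n)
    xt = x - x.mean()                  # X_i - Xbar (fixed regressors)
    Sxx = np.sum(xt ** 2)

    ss_u = np.empty(reps)
    cross = np.empty(reps)
    for r in range(reps):
        u = rng.normal(0.0, np.sqrt(sigma2), n)
        ss_u[r] = np.sum((u - u.mean()) ** 2)
        b1_dev = np.sum(xt * u) / Sxx  # b1 - Beta1 = Sum[k_i*u_i]
        cross[r] = b1_dev * np.sum(xt * (u - u.mean()))

    print(np.mean(ss_u))   # near (n - 1)*sigma2 = 76.0
    print(np.mean(cross))  # near sigma2 = 4.0
    ```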

  7. #6
    statgirl11
    I'm still very confused about how we can express the term –2*(b1 – Beta1)*Sum[(X_i – Xbar)*(u_i – ubar)] as –2*((Sum[(X_i – Xbar)*u_i]) / (Sum[(X_i – Xbar)^2])) * Sum[(X_i – Xbar)*u_i].

    How can you just remove the b1 and Beta1 terms? Also, where did the ubar go from the first term? (When I try to manipulate the first expression to get the second, I seem to end up with a u_i and ubar still.) And how did we end up with a Sum[(X_i – Xbar)^2] term in the denominator?

    I understand why the term –2*((Sum[(X_i – Xbar)*u_i]) / (Sum[(X_i – Xbar)^2])) * Sum[(X_i – Xbar)*u_i] results in –2*Sigma^2; I just can't seem to get the expression into that form...

  8. #7
    Super Moderator
    Dragan
    Location
    Illinois, US
    Quote Originally Posted by statgirl11 View Post
    I'm still very confused about how we can express the term –2*(b1 – Beta1)*Sum[(X_i – Xbar)*(u_i – ubar)] as –2*((Sum[(X_i – Xbar)*u_i]) / (Sum[(X_i – Xbar)^2])) * Sum[(X_i – Xbar)*u_i].

    How can you just remove the b1 and Beta1 terms? Also, where did the ubar go from the first term? (When I try to manipulate the first expression to get the second, I seem to end up with a u_i and ubar still.) And how did we end up with a Sum[(X_i – Xbar)^2] term in the denominator?

    I understand why the term –2*((Sum[(X_i – Xbar)*u_i]) / (Sum[(X_i – Xbar)^2])) * Sum[(X_i – Xbar)*u_i] results in –2*Sigma^2; I just can't seem to get the expression into that form...

    Okay, let’s take the (b1 – Beta1) term. I’ll try to make things more clear.

    We know that b1 can be computed as:

    b1 = Sum[(X_i – Xbar)*Y_i] / Sum[(X_i – Xbar)^2] = Sum[k_i*Y_i]

    where k_i = (X_i – Xbar) / Sum[(X_i – Xbar)^2].

    Next, substitute the population regression function in for Y_i:

    b1 = Sum[k_i*(Beta0 + Beta1*X_i + u_i)]

    Expand,

    b1 = Beta0*Sum[k_i] + Beta1*Sum[k_i*X_i] + Sum[k_i*u_i]

    Simplify,

    b1 = Beta1 + Sum[k_i*u_i]

    because Beta0*Sum[k_i] = 0 (since Sum[X_i – Xbar] = 0) and Sum[k_i*X_i] = Sum[k_i*(X_i – Xbar)] = 1.

    Now, in the term (above) –2*E[(b1 – Beta1)…], substitute in for b1:

    –2*E[(Beta1 + Sum[k_i*u_i] – Beta1)…]

    which is equal to

    –2*E[(Sum[k_i*u_i])…]

    which is what I gave above when you substitute back in the expression for k_i.

    And you should end up with what I gave in my previous post (above)... I hope.

    Note: the ubar term drops out because Sum[k_i*ubar] = ubar*Sum[k_i] = 0 (it is not that ubar itself is assumed zero).

    I hope this notation helps.
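    As a sanity check of the algebra above, the identity b1 = Beta1 + Sum[k_i*u_i], along with Sum[k_i] = 0 and Sum[k_i*X_i] = 1, holds exactly for any sample, which is easy to verify numerically (an illustrative sketch; the data below are arbitrary simulated values):

    ```python
    # Verify b1 = Beta1 + Sum[k_i*u_i] exactly on one simulated data set.
    import numpy as np

    rng = np.random.default_rng(2)
    n, beta0, beta1 = 15, 1.0, 2.0
    x = rng.uniform(0.0, 10.0, n)
    u = rng.normal(0.0, 2.0, n)
    y = beta0 + beta1 * x + u

    k = (x - x.mean()) / np.sum((x - x.mean()) ** 2)  # k_i weights
    b1 = np.sum(k * y)                                # OLS slope = Sum[k_i*Y_i]

    print(np.sum(k))                     # 0 (up to rounding): Sum[k_i] = 0
    print(np.sum(k * x))                 # 1: Sum[k_i*X_i] = 1
    print(b1 - (beta1 + np.sum(k * u)))  # 0: the identity holds exactly
    ```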

  9. #8
    statgirl11
    That notation is much much clearer, thank you so much! I understand how you did that now...(sorry, my math is a little rusty)

    but when you say that...

    –2*E[ (Sum[(X_i – Xbar)*u_i])^2 / (Sum[(X_i – Xbar)^2]) ]

    = -2*E[u_i] = –2*Sigma^2

    ...wouldn't you end up with -2*E[u_i^2] (rather than -2*E[u_i]), because of the ^2 in the previous expression?

  10. #9
    Super Moderator
    Dragan
    Location
    Illinois, US
    Quote Originally Posted by statgirl11 View Post
    That notation is much much clearer, thank you so much! I understand how you did that now...(sorry, my math is a little rusty)

    but when you say that...

    –2*E[ (Sum[(X_i – Xbar)*u_i])^2 / (Sum[(X_i – Xbar)^2]) ]

    = -2*E[u_i] = –2*Sigma^2

    ...wouldn't you end up with -2*E[u_i^2] (rather than -2*E[u_i]), because of the ^2 in the previous expression?

    Oh Yes, that's correct, I just forgot to put it in. I'll change it. Thanks.

  11. #10
    statgirl11
    Phew, okay, then I've got it now!

    Thank you so much again for all your help!

  12. #11
    Location
    Rochester, MI

    Re: E[MSE] simple linear regression

    In my book, it states that the expectation of the sum of the squared error terms is equal to the error variance times n–1. But wouldn't it instead be the error variance times n, since by definition the expectation of a sum is the sum of the expectations, and the expectation of each squared error term is sigma^2 by assumption? I'm a bit confused here!

  13. #12

    Re: E[MSE] simple linear regression

    Never mind, I just figured it out: the book was expressing the data in deviations form, with the assumption that the sample mean of the error terms was zero. Therefore, upon taking the expected value of the sum of the squared error terms, they had to account for the sample variance by adjusting for degrees of freedom.

  14. #13
    Location
    Massachusetts

    Re: E[MSE] simple linear regression

    Var(b1) = sigma^2 / Sxx

    Can someone explain this for me? Thanks for your help, guys!
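    (Editorially, this follows from b1 = Beta1 + Sum[k_i*u_i] given earlier in the thread: Var(b1) = Sum[k_i^2]*Sigma^2 = Sigma^2/Sxx. A quick simulation illustrates it; this is my own sketch, with arbitrary n, X values, and Sigma^2.)

    ```python
    # Simulate the sampling distribution of b1 and compare its variance
    # with sigma2 / Sxx.
    import numpy as np

    rng = np.random.default_rng(3)
    n, beta0, beta1, sigma2 = 20, 1.0, 2.0, 4.0
    x = np.linspace(0.0, 10.0, n)
    xt = x - x.mean()
    Sxx = np.sum(xt ** 2)
    reps = 20000

    b1s = np.empty(reps)
    for r in range(reps):
        u = rng.normal(0.0, np.sqrt(sigma2), n)
        y = beta0 + beta1 * x + u
        b1s[r] = np.sum(xt * y) / Sxx      # OLS slope estimate

    print(np.var(b1s), sigma2 / Sxx)       # the two should be close
    ```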

  15. #14

    Re: E[MSE] simple linear regression

    Quote Originally Posted by Dragan View Post
    1. Yes, that’s fine.

    2. Just think of the usual result for the expected value of a sample variance:
    e.g. Average[ Sum[(X – Xbar)^2] / N ] = E[ Sum[(X – Xbar)^2] / N ] = (N – 1)*Sigma^2 / N. Note: We don't have N in the denominator here, so E[Sum[(u_i – ubar)^2]] = (N – 1)*Sigma^2 directly. And, remember why we divide by N – 1 instead of N when we compute the sample variance.

    3. This is a bit trickier.

    The term –2*(b1 – Beta1)*Sum[(X_i – Xbar)*(u_i – ubar)] can be expressed as (removing the b1 and Beta1 terms)

    –2*((Sum[(X_i – Xbar)*u_i]) / (Sum[(X_i – Xbar)^2])) * Sum[(X_i – Xbar)*u_i]

    Taking expectations while noting that the X_i are nonstochastic gives

    –2*E[ (Sum[(X_i – Xbar)*u_i])^2 / (Sum[(X_i – Xbar)^2]) ]

    = –2*E[u_i^2] = –2*Sigma^2

    since the u_i are assumed to have constant variance Sigma^2.
    Hello Dragan, I have been browsing the internet looking for a proof of the unbiasedness result, and I just can't find anything to hang my hat on. They are all a little different, and there is at least one thing that I just don't understand; some of them seem circular to me. Anyway, I am going back to school for a PhD in math, 25 years removed from my last math course. This may seem like simple algebra, but I not only want to follow each step, I want to understand it to the point where, while perhaps not completely intuitive, I can make some logical sense of why it works as it does.

    Where you say "Note: We don't have N in the denominator. And, remember why we divide by N – 1 instead of N when we compute the sample variance" ... I am not sure what you mean. Do you mean that you put N in the denominator inadvertently? I doubt it, because you would have just taken it out.

    Anyway, I appreciate what you wrote here 3 years ago, and I sure hope you are still with this site; you have a nice delivery about you. I just don't see all the pieces. Thanks.

  16. #15

    Re: E[MSE] simple linear regression


    Quote Originally Posted by Dragan View Post
    1. Yes, that’s fine.

    2. Just think of the usual result for the expected value of a sample variance:
    e.g. Average[ Sum[(X – Xbar)^2] / N ] = E[ Sum[(X – Xbar)^2] / N ] = (N – 1)*Sigma^2 / N. Note: We don't have N in the denominator here, so E[Sum[(u_i – ubar)^2]] = (N – 1)*Sigma^2 directly. And, remember why we divide by N – 1 instead of N when we compute the sample variance.

    3. This is a bit trickier.

    The term –2*(b1 – Beta1)*Sum[(X_i – Xbar)*(u_i – ubar)] can be expressed as (removing the b1 and Beta1 terms)

    –2*((Sum[(X_i – Xbar)*u_i]) / (Sum[(X_i – Xbar)^2])) * Sum[(X_i – Xbar)*u_i]

    Taking expectations while noting that the X_i are nonstochastic gives

    –2*E[ (Sum[(X_i – Xbar)*u_i])^2 / (Sum[(X_i – Xbar)^2]) ]

    = –2*E[u_i^2] = –2*Sigma^2

    since the u_i are assumed to have constant variance Sigma^2.

    I'm really struggling to see how
    –2*E[ (Sum[(X_i – Xbar)*u_i])^2 / (Sum[(X_i – Xbar)^2]) ] = –2*E[u_i^2].

    Is there 'cancellation' involved???

    Isn't the u_i part of the summation, as below?
    –2*E{ (Sum[(X_i – Xbar)*u_i])^2 / Sum[(X_i – Xbar)^2] }

    Thanks!
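    (Editorially: yes, the u_i stays inside the sum, so the numerator is the square of the whole sum, (Sum[(X_i – Xbar)*u_i])^2. Its expectation is Sigma^2*Sum[(X_i – Xbar)^2] because the cross terms E[u_i*u_j] vanish for i ≠ j, and that is what lets the Sum[(X_i – Xbar)^2] cancel. A quick numerical check of exactly this step, an illustrative sketch with arbitrary n, X values, and Sigma^2:)

    ```python
    # Check E[(Sum[(X_i - Xbar)*u_i])^2 / Sum[(X_i - Xbar)^2]] = sigma2:
    # square the WHOLE sum, then average over many error draws.
    import numpy as np

    rng = np.random.default_rng(4)
    n, sigma2, reps = 20, 4.0, 20000
    x = np.linspace(0.0, 10.0, n)
    xt = x - x.mean()
    Sxx = np.sum(xt ** 2)

    vals = np.empty(reps)
    for r in range(reps):
        u = rng.normal(0.0, np.sqrt(sigma2), n)
        vals[r] = np.sum(xt * u) ** 2 / Sxx   # whole sum squared, then divided

    print(np.mean(vals))   # near sigma2 = 4.0
    ```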
