
Thread: residuals against Y

  1. #1
    Hi all-

    I'm looking for an argument / example in simple linear regression where, if you plot the residuals against Y, there appears to be a relationship (undefined, I just know the plot is not a random scatter), but if you plot the residuals against the predicted value (or X), the plot shows only random scatter.

    Any ideas? I keep thinking it will come to me, but I've been waiting a long time ;-)

  2. #2
    Hi B Miner,

    Assume that Y = X1b1 + X2b2 + e, where the independent variables are i.i.d. and the error e is orthogonal to X = (X1, X2). Assume that you do not have access to X2, so let u = X2b2 + e. Finally, assume that b1 = 0.

    Estimate the following equation : Y = X1b1 + u. There will be a relationship between u_hat and Y because, with u orthogonal to X1 and b1 = 0, essentially all of the variation in Y comes from u. But there will be no relationship between u_hat and Y_hat = X1b1_hat, because OLS residuals are orthogonal to X1 (and hence to the fitted values) by construction.

    More generally, I believe that this will be the case whenever you plot the residuals against a variable that is orthogonal to your regressors - i.e. orthogonal to the variables that generate the explained variation in your Y's.

    Etienne

    Am I right ?

  3. #3
    I don't know.... I hope not, because I don't fully understand your answer!

    Can you explain in a little more detail? I'm just not catching it, I'm afraid...

  4. #4
    Okay, here is an example :

    I have a dependent variable Y which is obtained as :

    Y = X1b1 + X2b2 + e
    where X1 and X2 are i.i.d. N(0,1). The coefficient b1 is not different from 0 : you can think of b1 as a single draw from a N(0,1) variable.

    Now you don't have X2 in your dataset. Let u = X2b2 + e. You estimate :

    Y = X1b1 + u

    The residuals that you obtain are orthogonal to the predicted value of Y (OLS residuals are always orthogonal to the fitted values, and here X1 is essentially orthogonal to Y). On the other hand, when you plot the residuals against Y, there is a clear relationship, because X2b2 is embedded in u.

    Take a look at the attached figures : they show an example of this with 1000 observations.
    Attached Images: [residuals plotted against the fitted values, and residuals plotted against Y, 1000 observations]
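
    A minimal R sketch of this setup (the original figures were produced elsewhere; the seed and the value b2 = 1 are illustrative assumptions, n = 1000 as above):

    set.seed(1)
    n  <- 1000
    b1 <- rnorm(1)                    # the coefficient itself is a single random draw
    b2 <- 1                           # assumed value for the coefficient on X2
    x1 <- rnorm(n)
    x2 <- rnorm(n)
    e  <- rnorm(n)
    y  <- x1 * b1 + x2 * b2 + e

    fit <- lm(y ~ x1)                 # X2 is left out, so u = X2*b2 + e

    par(mfrow = c(1, 2))
    plot(fitted(fit), resid(fit))     # roughly random scatter
    plot(y, resid(fit))               # clear upward pattern: X2*b2 sits in the residuals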

  5. #5
    This may be a very naive question, but how can X1 be independent of Y, since it is part of its generation?

  6. #6
    In this example the coefficient b1 is itself a random variable. For this reason, there is no variation in Y that originates in X1.

    This is an extreme example, but it illustrates a general point : when your independent variables have very low explanatory power, your plot of the residuals against the predicted value shows only random scatter.

  7. #7
    Thank you! You have been very helpful. Do you use R? If so, check this out:
    The phenomenon appears whenever there is an omitted variable, regardless of explanatory power (in my simulation, b1 is very significant).



    # simulate y from x1 and x2, then omit x2 from the fitted model
    x1 <- rnorm(1000, mean = 0, sd = 1)
    x2 <- rnorm(1000, mean = 0, sd = 1)
    y  <- x1 + x2                      # true model: b1 = b2 = 1
    result <- data.frame(y, x1, x2)
    head(result)

    obj <- lm(y ~ x1, data = result)   # x2 is omitted
    summary(obj)                       # b1 is highly significant here
    plot(result$x1, result$y)
    abline(obj)

    plot(obj$fitted.values, obj$residuals)   # random scatter
    plot(result$y, obj$residuals)            # clear relationship with y
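
    To put numbers on what the two plots show, one could also compare correlations; this is a small addition reusing the obj and result objects from the code above:

    cor(obj$residuals, obj$fitted.values)   # essentially zero: residuals are orthogonal to the fit
    cor(obj$residuals, result$y)            # clearly positive: the omitted x2 lives in the residuals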

  8. #8
    You're welcome.

    By low explanatory power, I meant low R-square. Indeed, the following propositions are equivalent :

    (i) your plot of the residuals against the predicted value shows only random scatter

    (ii) for any given value of your Xs, there are still large variations in Y

    (iii) a large part of the variations in your Ys is left unexplained

    (iv) your R-square is low
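
    One way to connect (i) and (iv) quantitatively: with an intercept in the model, Y = Y_hat + e_hat and the residuals are orthogonal to the fitted values, so Cor(e_hat, Y) = sd(e_hat)/sd(Y) = sqrt(1 - R-square), while Cor(e_hat, Y_hat) is always 0. The residuals-vs-Y pattern is therefore visible whenever R-square is well below 1, even if the regressors are significant, which is consistent with the previous post. A quick check in R, assuming the obj and result objects from the earlier code are still in the workspace:

    r2 <- summary(obj)$r.squared       # R-squared of the fitted model
    sqrt(1 - r2)                       # implied correlation between residuals and y
    cor(resid(obj), result$y)          # matches the line above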

    Yes, I use R; I'll take a look at your example. I did the example I gave you in Stata.

    Bye
