
Originally Posted by CowboyBear
Hi guys,
but it probably should be stressed that it's going to be pretty hard for residuals to be normally distributed when the responses and/or IV's are not
The book is lying, or at least being far from complete in its statements. I agree with terzi.
Regression requires the residuals to be normally distributed. It is very easy for the residuals to be normally distributed when the DV and IV are not. We actually expect the IV and DV not to be normally distributed in this case; that's why you so often hear people saying 'we corrected for the linear trend by using a regression', etc. Here's why:
[Very roughly explained because I'm not going into details]
Let's say there is a strong linear relationship between Y and X, with some error of course.
X shouldn't be normal, as logically you controlled for it. The textbook example is a linearly increasing variable, e.g. ten measurements of Y at each interval (of e.g. time or temperature), so if anything X will be uniform.
Y is the variable related to X as Y ~ b + ax + error. When X is uniform, b + ax is uniform as well, and when the relationship is strong Y itself will be close to uniform. Your linear model will pick up on the first part, b + ax (hence people say they controlled for it); what remains is the error, and that's the only part that should be normal.
Is this making sense??
Here is the same story as an R simulation: