Why do you subtract the mean of a predictor variable from that variable?

noetsi

Fortran must die
#1
An author using temperature as a predictor subtracted its mean value from each temperature data point to deal with "scaling issues."

I don't understand what that is, or why you do this. Is this to make the intercept make more substantive sense?
 

hlsmith

Omega Contributor
#2
I have seen and done this to get a model to converge. I just assumed it keeps large numbers out of the equations, say during factorials or something like that, but in reality I was being ignorant.
 

noetsi

Fortran must die
#3
I thought there might be some theoretical reason like when you center the data to make the intercept make substantive sense.
 

Jake

Cookie Scientist
#4
If the predictor that is being mean-centered is involved in an interaction with another predictor, then mean-centering can help a lot with interpreting the regression coefficient for that other predictor. If the predictor that is being mean-centered is NOT involved in interactions with any other variables, then the only effect this will have, as you note, will be to change the value of the estimated intercept -- which is probably not of much interest in many cases, but there certainly may be cases where the intercept is of interest.
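
To make the interaction point concrete, here is a rough sketch in Python. The coefficients and temperatures are made up for illustration, and the helper name recenter_coefficients is hypothetical. It just rewrites y = b0 + b1*x + b2*z + b3*x*z in terms of the centered x: the coefficient on z goes from "effect of z when x = 0" to "effect of z when x is at its mean", while the predictions stay the same.

```python
from statistics import mean

def recenter_coefficients(b0, b1, b2, b3, x_mean):
    """Rewrite y = b0 + b1*x + b2*z + b3*x*z in terms of xc = x - x_mean."""
    return (
        b0 + b1 * x_mean,  # intercept: predicted y at x = x_mean, z = 0
        b1,                # slope on x (at z = 0) is unchanged
        b2 + b3 * x_mean,  # slope on z is now the effect of z at x = x_mean
        b3,                # interaction coefficient is unchanged
    )

# Hypothetical coefficients and temperatures, invented for this example
b0, b1, b2, b3 = 2.0, 0.5, 1.0, -0.25
temps = [10.0, 15.0, 20.0, 25.0, 30.0]
x_mean = mean(temps)

c0, c1, c2, c3 = recenter_coefficients(b0, b1, b2, b3, x_mean)

def predict_raw(x, z):
    return b0 + b1 * x + b2 * z + b3 * x * z

def predict_centered(x, z):
    xc = x - x_mean
    return c0 + c1 * xc + c2 * z + c3 * xc * z

# Largest disagreement between the two parameterizations over a small grid
max_gap = max(
    abs(predict_raw(x, z) - predict_centered(x, z))
    for x in temps
    for z in (0.0, 1.0, 2.0)
)

print("z coefficient, raw x:     ", b2)
print("z coefficient, centered x:", c2)
print("largest prediction gap:   ", max_gap)
```

With raw temperature, the z coefficient describes a temperature of 0, which may be far outside the data; after centering it describes the average temperature.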
 

rogojel

TS Contributor
#5
Hi,
to add to what Jake said, often the intercept is not interesting precisely because it is the mean Y value at X=0, where X=0 might not make sense or be of any particular interest. If the variable is centered, then the intercept is the mean Y at the mean X, which at least makes practical sense and has more chance of being a practically interesting value, as we sampled around it.
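
A small sketch of this in Python, with invented temperature and response values and a hand-written one-predictor least-squares fit (the function name ols is just for this example): the slope does not change when the predictor is centered, and the intercept becomes the mean of Y.

```python
from statistics import mean

def ols(x, y):
    """Least-squares intercept and slope for a single predictor."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

# Made-up data, for illustration only
temps = [10.0, 15.0, 20.0, 25.0, 30.0]
y = [3.1, 4.9, 7.2, 8.8, 11.0]

temps_c = [t - mean(temps) for t in temps]  # mean-centered predictor

a_raw, b_raw = ols(temps, y)
a_cen, b_cen = ols(temps_c, y)

print("raw:      intercept", a_raw, "slope", b_raw)
print("centered: intercept", a_cen, "slope", b_cen)
print("mean of y:", mean(y))
```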

Also, I think but am not sure, the requirement to build hierarchical models is moot if the IVs are centered. Any opinion on that?

regards
 

Dragan

Super Moderator
#6
I thought there might be some theoretical reason like when you center the data to make the intercept make substantive sense.
Simply put, it's a way to deal with the so-called problem of multicollinearity, which is what Jake was alluding to when you have an interaction (or moderating variable). Outside of that, subtracting the mean of X from each value X_i just makes the intercept term equal to the mean of both the predicted values of Y (Y-hats) and the actual observations of Y. Without subtracting the mean of X, the intercept term takes a different value, but it still ensures that the mean of the predicted scores of Y (Y-hats) equals the mean of the actual observations of Y. That's really all there is to say.
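
As a rough illustration of the multicollinearity point, the Python sketch below uses made-up, balanced data and assumes both predictors are centered before forming the product term. It compares how strongly the product correlates with its components before and after centering. In this balanced toy example the correlation drops to essentially zero; with real data it typically shrinks but need not vanish.

```python
from math import sqrt
from statistics import mean

def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    var_a = sum((ai - ma) ** 2 for ai in a)
    var_b = sum((bi - mb) ** 2 for bi in b)
    return cov / sqrt(var_a * var_b)

# Made-up data: five temperatures observed at two levels of a moderator z
x = [10.0, 15.0, 20.0, 25.0, 30.0] * 2
z = [1.0] * 5 + [2.0] * 5

xc = [xi - mean(x) for xi in x]  # centered predictor
zc = [zi - mean(z) for zi in z]  # centered moderator (an assumption of this sketch)

raw_product = [xi * zi for xi, zi in zip(x, z)]
centered_product = [a * b for a, b in zip(xc, zc)]

r_raw_x = pearson(x, raw_product)
r_raw_z = pearson(z, raw_product)
r_cen_x = pearson(xc, centered_product)
r_cen_z = pearson(zc, centered_product)

print("raw:      corr(x, x*z) =", r_raw_x, " corr(z, x*z) =", r_raw_z)
print("centered: corr(xc, xc*zc) =", r_cen_x, " corr(zc, xc*zc) =", r_cen_z)
```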