log transformations

noetsi

Fortran must die
#1
This has confused me for years.

Commonly in financial analysis you log Y or sometimes an X. I have never been sure whether, when you log Y, you also have to log a given X [or, for that matter, every X in the model]. Similarly, I am not sure whether logging a given X means you also have to log Y.

Another basic question. :p
 

Miner

TS Contributor
#2
Transformations of X can linearize the relationship with Y, but often do not change the distribution of the error terms (i.e., they do not correct heteroskedasticity).
Transformations of Y can linearize the relationship with X, and will change the distribution of the error terms (i.e., they can correct heteroskedasticity).

I base my decision on which to transform on whether I must correct for heteroskedasticity or not. Log transforming both X and Y has a stronger linearizing effect on the relationship. Note: This is strictly to linearize a relationship. In other cases the choice of log-linear versus log-log has a specific interpretation.
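To make the heteroskedasticity point concrete, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the data-generating model (an exponential trend with multiplicative error), the coefficients, and the hypothetical helper `spread_ratio`, which compares the residual spread in the upper half of X to the lower half. A ratio near 1 suggests roughly constant error variance.

```python
import numpy as np

# Simulated data (assumed model): exponential trend with multiplicative error,
# so the scatter of Y grows with its level.
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(0, 4, n)
y = np.exp(1.0 + 0.5 * x + rng.normal(0, 0.4, n))

def spread_ratio(x, response):
    """Residual SD for the upper half of x divided by that for the lower half."""
    slope, intercept = np.polyfit(x, response, 1)  # straight-line least-squares fit
    resid = response - (intercept + slope * x)
    upper = x > np.median(x)
    return resid[upper].std() / resid[~upper].std()

raw_ratio = spread_ratio(x, y)          # Y left on its original scale
log_ratio = spread_ratio(x, np.log(y))  # Y log transformed

print(f"residual spread ratio, raw Y: {raw_ratio:.2f}")
print(f"residual spread ratio, log Y: {log_ratio:.2f}")
```

Under these assumptions the raw-scale fit should show clearly wider residuals at high X, while the fit to log Y should give a ratio close to 1. Logging X instead would change the shape of the curve but leave the error spread in Y as it was.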
 

hlsmith

Omega Contributor
#3
Miner is hitting on a couple of points, but I will add that in economics many variables are naturally exponential (e.g., exponential growth or exponential decay). So this is one reason why economists are log happy. Secondly, and Miner almost mentions this, economists like presenting results on the relative scale instead of in raw values (e.g., a percent increase instead of the increase for a specific unit increase).


Lastly, and this will be redundant to Miner, you are just working to get the error distributions acceptable, so log y or log x or log them both.
 

noetsi

Fortran must die
#4
I take this from your answer, Miner: depending on the specifics, you may need to change X when you change Y [or change Y when you change X], but you do not have to. Whichever you do, it will change the interpretation of the slope.
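As a rough illustration of how the slope's meaning shifts, here is a small sketch on simulated data. The two models and all of their coefficients are assumptions chosen for the example, not anything taken from a real data set. In the log-log fit the slope reads as an elasticity (percent change in Y per 1% change in X); in the log-Y-only fit, a slope b corresponds to a 100*(exp(b) - 1) percent change in Y per one-unit change in X.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
x = rng.uniform(1, 10, n)

# Log-log case (assumed model): Y = 3 * X^1.5 * multiplicative noise.
y_power = 3.0 * x**1.5 * np.exp(rng.normal(0, 0.2, n))
elasticity, _ = np.polyfit(np.log(x), np.log(y_power), 1)

# Log-Y-only case (assumed model): Y = exp(0.2 + 0.05 * X + noise).
y_growth = np.exp(0.2 + 0.05 * x + rng.normal(0, 0.2, n))
b, _ = np.polyfit(x, np.log(y_growth), 1)
pct_per_unit = 100 * (np.exp(b) - 1)  # percent change in Y per unit of X

print(f"log-log slope (elasticity): {elasticity:.3f}")
print(f"log-Y slope: {b:.4f}, about {pct_per_unit:.1f}% per unit of X")
```

The fitted slopes should land near the assumed values of 1.5 and 0.05, which is only a check that the two readings of the slope match the models that generated the data.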
 

Miner

TS Contributor
#5
In industrial statistics, we are just trying to develop a predictive model that we can use to optimize a response. Most of the time we do not have to interpret what the slope means. In other words, we are after the mundane, practical application of the model, not any ground-breaking, publishable research. In many ways, it makes my job much easier. If I can't meet an assumption, I can usually analyze it anyway, and try it out on the process to validate the results.
 

noetsi

Fortran must die
#6
I don't publish either [my formal research and articles were in a non-quant field]. My focus is, except for time series, nearly entirely on the relative impact of predictors or on interpreting slopes.