# 4PL Non-Linear Regression, Transforming observed y values

#### BrainExploding

##### New Member
Good Afternoon,

I'm currently teaching myself how to fit response curves, and have run into some trouble. I need help choosing a proper method to "massage" the observed y values.

Is it wise to log transform the y values, knowing that this changes the vertical distances within the data? If log transforming the y values is not wise, is there an alternative method to "massage" the data? Or, if log transforming the observed y values is okay, what other steps do I have to take to adjust the EC50 (the C variable)?

Any assistance would be appreciated. I've been searching the forum for similar questions.

Thanks,
BrainExploding

#### noetsi

##### No cake for spunky
Logging is commonly used to transform data, most obviously in economics; whether you should do so depends on what is wrong with the data in the first place. It spreads out the data and is commonly useful for making a relationship more linear [although some data is inherently non-linear and this won't affect that] as well as dealing with some forms of unequal error variance. You can't use it with values that are negative or 0, but you can always add a constant to make all values greater than 0.
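As a rough illustration of the unequal-variance point and the add-a-constant trick, here is a minimal sketch in plain Python. The numbers are made up, and `shifted_log10` is a hypothetical helper name, not something from a library.

```python
import math

# Made-up replicate pairs whose scatter grows in proportion to the
# signal: each pair differs by 10%.
pairs = [(10.0, 11.0), (100.0, 110.0), (1000.0, 1100.0)]

raw_gaps = [hi - lo for lo, hi in pairs]
log_gaps = [math.log10(hi) - math.log10(lo) for lo, hi in pairs]

print(raw_gaps)  # gaps grow with the signal
print(log_gaps)  # gaps are the same size on the log scale


def shifted_log10(values, shift=None):
    """Log-transform values that may include zeros or negatives.

    If no shift is given, use the smallest constant that makes every
    value at least 1. The choice of constant is arbitrary and it does
    change the transformed values.
    """
    if shift is None:
        shift = max(0.0, 1.0 - min(values))
    return [math.log10(v + shift) for v in values]


print(shifted_log10([-2.0, 0.0, 7.0]))
```

The shift rule used here (make the minimum equal to 1) is only one possible convention; a different constant gives a different spread.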

Logging is just one of a set of transformations; Tukey's ladder of powers lists a range of others. I have not heard logging criticized for changing the distance - that is sort of the point. I have heard it criticized for being hard to interpret.

#### BrainExploding

##### New Member
Hi Noetsi,

Thanks for your feedback. I think I'm a little bit confused. I was under the assumption that log transforming the y values for a drug response curve (EC50 analysis) changes the vertical distances in the data, which skews the standard errors of the curve's vertical attributes. The curve fitting is performed by iteratively changing the 4 parameters to reduce the sum of squared residuals.
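To make the concern concrete, here is a minimal sketch of the quantity being minimized, on the raw scale and on the log scale. The parameter names `a`, `b`, `c`, `d` are an assumed labeling of a common 4PL form (only "C = EC50" comes from this thread), and the data are made up.

```python
import math


def four_pl(x, a, b, c, d):
    """4PL curve: a = response at zero dose, d = response at infinite dose,
    c = EC50 (the midpoint), b = slope factor."""
    return d + (a - d) / (1.0 + (x / c) ** b)


def ssr(xs, ys, params, log_scale=False):
    """Sum of squared residuals on the raw scale or on the log scale."""
    total = 0.0
    for x, y in zip(xs, ys):
        fitted = four_pl(x, *params)
        residual = math.log(y) - math.log(fitted) if log_scale else y - fitted
        total += residual ** 2
    return total


# Made-up example: every observation sits exactly 1 unit above the curve.
params = (2.0, 1.0, 10.0, 100.0)  # a, b, c (EC50), d
xs = [0.1, 1.0, 10.0, 100.0, 1000.0]
ys = [four_pl(x, *params) + 1.0 for x in xs]

print(ssr(xs, ys, params))                  # each point contributes about 1
print(ssr(xs, ys, params, log_scale=True))  # dominated by the low-response points

# Per-point contributions on the log scale, lowest dose first.
log_terms = [
    (math.log(y) - math.log(four_pl(x, *params))) ** 2 for x, y in zip(xs, ys)
]
print(log_terms)
```

The same 1-unit miss counts far more near the bottom plateau than near the top once y is logged, so the optimizer is effectively solving a differently weighted problem.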

If I'm mistaken about transforming y values, what steps do I need to take to readjust the EC50 value?

For reference, I have to use Excel to fit this curve, and I'm using the Solver add-in to do this. I'm okay with normalizing the concentrations and adjusting the fitted EC50 value based upon the concentrations' adjustment factor.

Thanks,
BrainExploding

#### noetsi

##### No cake for spunky
I don't know anything about EC50. My comments were specifically about why you log data. While logging does change the distance between points (it spreads them out), I have never heard it suggested that it skews standard errors, and it is a very common transformation, particularly in econometrics, so I would think that would come up if it did. Where did you read that logging skews the standard errors?

I work with SAS rather than Excel. Solver is not commonly used for statistical analysis, but for linear programming and optimization. I am not sure exactly what method you are using, so it is hard to comment on it.

Actually, logging data makes skewed distributions more normal, not less.

http://www.ma.utexas.edu/users/mks/statmistakes/skeweddistributions.html

#### BrainExploding

##### New Member
Hi Noetsi,

I read it in a curve fitting guide written by GraphPad for their Prism software. Here is a link:

http://www.mcb5068.wustl.edu/MCB/Lecturers/Baranski/Articles/RegressionBook.pdf

My understanding of logging the x values is that it removes the exponential (growth) aspect that resides within some systems, like technology growth and biological populations; this can make the data more linear. But when fitting sigmoid curves, the vertical distances are what constitute the sum of squared residuals of the observed y values. The fitting is an iterative approach where the 4 parameters are adjusted in order to minimize the sum of squared residuals in y. The fitting methodology follows the method of Marquardt & Levenberg, which blends the method of steepest descent and the method of Gauss-Newton.
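For anyone not tied to Excel, a minimal sketch of this kind of fit in Python, assuming SciPy is available. The parameterization in terms of log10(dose) and log10(EC50) is my own choice for numerical stability, the function and variable names are hypothetical, and the data are synthetic.

```python
import numpy as np
from scipy.optimize import curve_fit


def four_pl_log(log_x, bottom, top, log_ec50, hill):
    """4PL curve written in terms of log10(dose) and log10(EC50)."""
    return bottom + (top - bottom) / (1.0 + 10.0 ** ((log_ec50 - log_x) * hill))


# Synthetic data, made up for illustration only.
rng = np.random.default_rng(0)
log_x = np.linspace(-3.0, 3.0, 13)
true_params = (5.0, 100.0, 0.3, 1.2)
y = four_pl_log(log_x, *true_params) + rng.normal(0.0, 1.0, log_x.size)

# Starting values matter: take the plateaus from the data and the
# midpoint from the middle of the dose range.
p0 = (y.min(), y.max(), float(np.median(log_x)), 1.0)

# With no bounds, curve_fit already defaults to method="lm" (MINPACK's
# Levenberg-Marquardt); it is spelled out here for clarity.
popt, pcov = curve_fit(four_pl_log, log_x, y, p0=p0, method="lm")
std_err = np.sqrt(np.diag(pcov))

for name, est, se in zip(("bottom", "top", "log_ec50", "hill"), popt, std_err):
    print(f"{name}: {est:.3f} +/- {se:.3f}")
```

Here the y values are left untransformed, so the residuals are plain vertical distances. The EC50 on the concentration scale is `10 ** popt[2]`.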

I'm currently being asked to create an application that fits sigmoid curves and log transforms the observed y values, but I keep finding sources that say not to transform the y values. I can see the application of it, for example if the dose response's growth rate is extremely high, or in some other scenario that I'm not knowledgeable about. I think I'm looking for best practices or formulas that describe the s-curve's 4 parameters when the y values are log transformed.
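One possible framing, offered as a sketch rather than a best practice: if the log is applied to both the observed y values and the model curve ("transform both sides"), the 4 parameters keep their original meaning and the EC50 needs no readjustment; only the weighting of the residuals changes. Fitting a plain sigmoid directly to log(y) instead estimates a midpoint on the log scale, which is a different quantity. The example assumes SciPy, a hypothetical log10-dose parameterization, and synthetic data with strictly positive responses.

```python
import numpy as np
from scipy.optimize import curve_fit


def four_pl_log(log_x, bottom, top, log_ec50, hill):
    """4PL curve written in terms of log10(dose) and log10(EC50)."""
    return bottom + (top - bottom) / (1.0 + 10.0 ** ((log_ec50 - log_x) * hill))


def log_four_pl(log_x, bottom, top, log_ec50, hill):
    """log10 of the same curve, so the four parameters keep their meaning."""
    return np.log10(four_pl_log(log_x, bottom, top, log_ec50, hill))


# Synthetic data with multiplicative (about 5%) noise, the situation in
# which logging y is usually argued for.
rng = np.random.default_rng(1)
log_x = np.linspace(-3.0, 3.0, 13)
true_params = (5.0, 100.0, 0.3, 1.2)
y = four_pl_log(log_x, *true_params) * rng.lognormal(0.0, 0.05, log_x.size)
log_y = np.log10(y)

# Transform both sides: log10(y) is fitted by log10(curve).
p0 = (y.min(), y.max(), float(np.median(log_x)), 1.0)
both_sides, _ = curve_fit(log_four_pl, log_x, log_y, p0=p0)

# Naive alternative: fit the plain 4PL shape straight to log10(y).
p0_naive = (log_y.min(), log_y.max(), float(np.median(log_x)), 1.0)
naive, _ = curve_fit(four_pl_log, log_x, log_y, p0=p0_naive)

print("true log EC50:      ", true_params[2])
print("both-sides log EC50:", both_sides[2])
print("naive log EC50:     ", naive[2])  # midpoint of log(y), not the EC50
```

With an increasing curve like this one, the naive midpoint falls at a lower dose than the EC50, because halfway between the plateaus on the log scale corresponds to their geometric mean rather than their arithmetic mean.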

#### noetsi

##### No cake for spunky
In the social sciences, including economics, the concerns you raise apparently do not apply, or at least are not raised. That is probably because they rarely do spectral analysis (sine waves do exist in some ARIMA applications and certain econometrics, but the focus is largely on either stochastic applications or factors such as linear time trends, not wave analysis). I have never seen the concerns you raise with logging discussed in this literature (the problem with logs is seen as one of interpretation, or that in some cases they don't fix the problems they were used for, such as non-linearity or non-normality).

Normally you would turn to something like Box-Cox or Tukey's ladder to determine which transformation to use, but given that the application you are dealing with has issues that do not come up much in the social science analysis they were created for, I do not know if they would help. You might look at them and try to find out if they are a reasonable alternative for your analysis, but given your comments I tend to doubt they are.
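A rough sketch of how a ladder-of-powers search might look in plain Python. Picking the rung with the smallest absolute skewness is an assumed criterion chosen for illustration (Box-Cox proper uses a likelihood), the helper names are hypothetical, and the data are made up.

```python
import math
import statistics


def ladder_transform(values, power):
    """One rung of Tukey's ladder; power 0 is taken to mean the log."""
    if power == 0:
        return [math.log(v) for v in values]
    transformed = [v ** power for v in values]
    # Negative powers reverse the ordering, so flip the sign to preserve it.
    return [-t for t in transformed] if power < 0 else transformed


def skewness(values):
    """Skewness as the third standardized moment (population form)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# Made-up, strictly positive, right-skewed data. By construction it is
# symmetric after logging, so the search should land on the log rung.
data = [math.exp(z / 4.0) for z in range(-8, 9)]

ladder = [-2, -1, -0.5, 0, 0.5, 1, 2]
skew_by_power = {p: skewness(ladder_transform(data, p)) for p in ladder}
best_power = min(skew_by_power, key=lambda p: abs(skew_by_power[p]))

for p in ladder:
    print(p, round(skew_by_power[p], 3))
print("least skewed rung:", best_power)
```

This only looks at the distribution of a single variable; it says nothing about whether the transformation suits a nonlinear curve fit.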

#### BrainExploding

##### New Member
Hi Noetsi,

Thanks for your input, I'll definitely look into Box-Cox and Tukey's ladder. If you're interested in the section that discusses nonlinear transformations of y values for nonlinear regression, you can find it on page 22. The section is titled "Think carefully about nonlinear transforms".

@hssmith, thanks, it's what my brain feels like when I think about Stats, LOL. BTW, great avatar picture. I'm all about "The Walking Dead".

Thanks Everyone,
BrainExploding