I have been reviewing a study in which Multiple Linear Regression (MLR) was used to evaluate the relationship between the concentration of a pollution indicator (response variable, Y), distance from source of pollution (explanatory variable, X1), and time since cessation of pollution (explanatory variable, X2). The author finds no significant interaction between X1 and X2, and then reruns the MLR without the interaction term. She then finds that Y shows a significant linear decrease as X1 increases, and as X2 increases. Everything seems fine to me up to this point.

Then the author asks “How much time will be needed before Y reaches a certain low concentration (i.e. background level)?” To answer this, she substitutes the target value of Y into the fitted multiple regression equation and solves for X2. She then reports that value of X2 as a point estimate, without a confidence interval. I have the following questions and concerns:

1. Is it valid to use the regression equation to predict the value of X2 for a particular value of Y? I’m wondering whether the MLR should instead be rerun with X2 as the response variable, and Y and X1 as the explanatory variables, since she was trying to predict the value of X2, not Y.

2. If the above really is valid, is there a way to calculate a confidence interval for the predicted value of X2? I know there is a formula for calculating the confidence interval for a value of Y, but I don’t know of one for X2 or X1.
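
To make the calculation in question concrete, here is a minimal sketch of what I understand the author to have done: solve the fitted equation Y = b0 + b1*X1 + b2*X2 for X2 at a chosen value of Y. The coefficient values, the target concentration, and the function name are made up for illustration and are not from the study. I am also assuming she fixed X1 at some value, since that is needed to solve for X2.

```python
def time_to_reach(y_target, x1, b0, b1, b2):
    """Solve Y = b0 + b1*X1 + b2*X2 for X2, given a target Y and a fixed X1."""
    if b2 == 0:
        raise ValueError("b2 is zero, so Y does not depend on X2")
    return (y_target - b0 - b1 * x1) / b2


# Hypothetical fitted coefficients (NOT from the study): intercept and
# negative slopes for distance (X1) and time since cessation (X2).
b0, b1, b2 = 100.0, -2.0, -5.0

# Hypothetical background concentration and distance from the source.
x2_hat = time_to_reach(y_target=10.0, x1=5.0, b0=b0, b1=b1, b2=b2)
print(x2_hat)
```

This yields only a point estimate of X2. It carries no measure of uncertainty from the estimated coefficients or the residual scatter, which is the concern behind question 2.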

Thanks for any insights you can provide.

AquaMan.