Lines of regression

#1
I'm a little confused by the following.

There are 12 patients. Each is assigned an index I when first admitted, depending on the severity of their symptoms. Each is then administered a drug, and 30 days later they are given another index F, which will hopefully be lower than I if they are improving.

Calculate the equation of the regression line of F on I.

If I take I = x and F = y, then I can calculate the equation using y = a + bx, where b = Sxy/Sxx, and I get the right answer.

If I switch them, i.e. I = y and F = x, with x = C + Dy where D = Sxy/Syy, then I get the wrong answer.

How am I supposed to know which should be x and which should be y?

Thanks very much for your help.
 

Dason

Ambassador to the humans
#2
It's true that you get a slightly different regression line if you switch the predictor and the response. However, you should know ahead of time which is the predictor and which is the response. If you don't, then regression probably isn't what you want to do. Correlation might be more appropriate, and that doesn't depend on labeling one variable as predictor and one as response.
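
To make that concrete, here is a minimal Python sketch. The six (I, F) pairs are invented for illustration (the thread never gives the patients' data), and the helper names `S` and `regress` are hypothetical. It fits F on I and I on F, then checks that the correlation comes out the same whichever variable is listed first.

```python
# Invented (I, F) pairs for illustration only; not the data from post #1.
I = [8.0, 6.0, 9.0, 5.0, 7.0, 4.0]
F = [6.0, 5.0, 8.0, 3.0, 6.0, 2.0]


def mean(values):
    return sum(values) / len(values)


def S(u, v):
    """Sum of products of deviations from the means (Sxy; Sxx when u is v)."""
    mu, mv = mean(u), mean(v)
    return sum((p - mu) * (q - mv) for p, q in zip(u, v))


def regress(response, predictor):
    """Least-squares line: response = intercept + slope * predictor."""
    slope = S(predictor, response) / S(predictor, predictor)
    intercept = mean(response) - slope * mean(predictor)
    return intercept, slope


a, b = regress(F, I)  # F on I: F = a + b*I
c, d = regress(I, F)  # I on F: I = c + d*F

# Correlation treats both variables symmetrically.
r_IF = S(I, F) / (S(I, I) * S(F, F)) ** 0.5
r_FI = S(F, I) / (S(F, F) * S(I, I)) ** 0.5

print(f"F on I: F = {a:.3f} + {b:.3f} I")
print(f"I on F: I = {c:.3f} + {d:.3f} F")
print(f"F on I rearranged for I has slope {1 / b:.3f}, not {d:.3f}")
print(f"r = {r_IF:.3f} either way; b * d = {b * d:.3f}, r^2 = {r_IF ** 2:.3f}")
```

The product of the two slopes equals r^2, so the two fitted lines coincide only when the points lie exactly on a straight line (|r| = 1).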
 
#3
Thanks very much for taking the time to reply to my post!

The book I'm reading states that you can use regression even when neither the x nor the y variable (I and F in the above example) is controlled, which is still a bit confusing in light of what you said: "you should know ahead of time which is the predictor and which is the response".

I guess I is the predictor in my example? (I had assumed both were not controlled.)

Thank you for the heads up regarding the use of correlation as an alternative.
 

Dason

Ambassador to the humans
#4
Yes, you can use regression when both aren't "controlled", but you still might want to use one to predict the other. That's the point of regression. For instance, I might be interested in predicting a bear's weight from variables that are easier to obtain (maybe I just want to use height, because I can take a picture or something and then calculate the bear's height fairly easily). You can see that obtaining a bear's weight isn't necessarily an easy task, but obtaining its height is quite a bit easier. If I had a sample of bears' heights and weights, then even though I can't control either of those things, I could set up a regression of bear weight on bear height, so that in the future I only need to obtain the height to get an estimate of the bear's weight.

If you really don't care about prediction and you just want a measure of association, then correlation is a little easier to interpret.
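
A rough sketch of that workflow in Python, assuming a made-up sample: the heights and weights below are invented purely for illustration, and `fit_line` and `predict_weight` are hypothetical helper names. Weight is treated as the response and height as the predictor, because height is the quantity that would be measured in future.

```python
# Invented sample of bear heights (cm) and weights (kg), for illustration only.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0, 200.0]
weights_kg = [90.0, 110.0, 135.0, 150.0, 180.0, 205.0]


def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b*x, with b = Sxy/Sxx."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Weight is the response (hard to measure); height is the predictor (easy to measure).
a, b = fit_line(heights_cm, weights_kg)


def predict_weight(height_cm):
    """Estimated weight in kg for a bear of the given height."""
    return a + b * height_cm


new_height = 175.0  # hypothetical new bear, height taken from a photo
print(f"weight = {a:.2f} + {b:.3f} * height")
print(f"Estimated weight at {new_height} cm: {predict_weight(new_height):.1f} kg")
```

Neither variable is controlled here; the roles are fixed only by which quantity is to be predicted from the other.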
 
#5
:( I had made a mistake and the difference between the two was significant. I've just redone the calculation and the answers are the same to 2 decimal places.

Thanks very much for clarifying the concept though, I really appreciate it.
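
For what it's worth, the agreement reported in #5 is what the formulas in #1 predict. With I = y and F = x, D = Sxy/Syy still divides by the sum of squared deviations of I, so x = C + Dy is still the regression of F on I, just with the letters swapped. A quick Python check, using invented numbers (not the patients' data) and a hypothetical helper `S`:

```python
# Invented (I, F) pairs for illustration only; not the data from post #1.
I = [8.0, 6.0, 9.0, 5.0, 7.0, 4.0]
F = [6.0, 5.0, 8.0, 3.0, 6.0, 2.0]


def mean(values):
    return sum(values) / len(values)


def S(u, v):
    """Sum of products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((p - mu) * (q - mv) for p, q in zip(u, v))


# Labelling 1: x = I, y = F; fit y = a + b*x with b = Sxy/Sxx.
x, y = I, F
b = S(x, y) / S(x, x)
a = mean(y) - b * mean(x)

# Labelling 2: x = F, y = I; fit x = C + D*y with D = Sxy/Syy.
x, y = F, I
D = S(x, y) / S(y, y)
C = mean(x) - D * mean(y)

# Both are F = intercept + slope * I, so the coefficients should agree.
print(f"labelling 1: F = {a:.4f} + {b:.4f} I")
print(f"labelling 2: F = {C:.4f} + {D:.4f} I")
```

A genuinely different line only appears when the roles change, i.e. when I is regressed on F instead of F on I.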