# Thread: multiple linear regression: negative Y ?!

1. ## multiple linear regression: negative Y ?!

So I have my equation and it looks weird. The intercept is a small positive value and a lot of the coefficients have large negative values. So that means my outcome, dose, is going to be negative?! How's that supposed to make sense? Or do I have it wrong?

Thanks.

2. ## Re: multiple linear regression: negative Y ?!

Not necessarily. You can't tell until you have the final predictions. A prediction depends not just on the parameters but on the sum-product of the parameters and the IVs. What if some IVs are also negative? It would help if you gave more information about your equation. First of all, I'd compute the predictions and see whether the outcome is really negative.
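To make the sum-product point concrete, here is a tiny sketch with made-up numbers (the coefficients `b0`, `b1`, `b2` and the IV values `x1`, `x2` are hypothetical, not taken from the OP's model). A large negative slope multiplied by a negative IV value contributes a positive amount:

```sas
/* Hypothetical coefficients and IV values, for illustration only */
data _null_;
  b0 = 0.5;    /* small positive intercept */
  b1 = -12;    /* large negative slope */
  b2 = 3;
  x1 = -2;     /* a negative IV value */
  x2 = 4;
  y_hat = b0 + b1*x1 + b2*x2;   /* 0.5 + 24 + 12 = 36.5 */
  put y_hat=;                   /* writes the prediction to the log */
run;
```

So the sign of a prediction can't be read off the coefficients alone; it depends on the IV values each observation actually has.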

4. ## Re: multiple linear regression: negative Y ?!

What negative slopes mean is that the DV goes down as the IV goes up. If the slopes are small (meaning Y won't fall much), they can be negative without Y becoming negative, even with a small intercept.

If you find predictions of Y that are nonsensical (like a negative dosage), several factors could be in play. If your Y is not interval-like and you are using a method like OLS regression, this can generate nonsensical results. Also, if an IV cannot take on a value of 0 (say, the height of an adult: no adult has zero height), the intercept is essentially meaningless, since all it represents is the value of Y when all X are 0. I don't know for sure how that affects predictions of Y, although it is common to center an X that cannot take on a meaningful value of 0.
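As a rough sketch of the centering mentioned above, assume a dataset `stats` with a continuous predictor `bmi` and the model terms that appear later in this thread; the names `stats_c` and `bmi_c` are made up here. Subtracting the mean changes what the intercept refers to (average BMI instead of BMI = 0). In a linear model with an intercept it should leave the fitted values themselves unchanged.

```sas
/* Center bmi at its sample mean; PROC SQL remerges the mean onto every row */
proc sql;
  create table stats_c as
  select *, bmi - mean(bmi) as bmi_c
  from stats;
quit;

/* Refit with the centered predictor. The intercept is now evaluated at
   average bmi (other terms are still at 0 or their reference level). */
proc glm data=stats_c;
  class gender;
  model dose = gender bmi_c A B C A*C / solution clparm;
run;
quit;
```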

6. ## Re: multiple linear regression: negative Y ?!

Originally Posted by noetsi
What negative slopes mean is that the DV goes down as the IV goes up. If the slopes are small (meaning Y won't fall much), they can be negative without Y becoming negative, even with a small intercept.

If you find predictions of Y that are nonsensical (like a negative dosage), several factors could be in play. If your Y is not interval-like and you are using a method like OLS regression, this can generate nonsensical results. Also, if an IV cannot take on a value of 0 (say, the height of an adult: no adult has zero height), the intercept is essentially meaningless, since all it represents is the value of Y when all X are 0. I don't know for sure how that affects predictions of Y, although it is common to center an X that cannot take on a meaningful value of 0.

What do I do now? My Y is continuous, not interval. A couple of slopes have a LARGE negative value. What do I need to remedy? The model seemed to have evolved okay based on p-values and general sense.

7. ## Re: multiple linear regression: negative Y ?!

First of all, let's check whether the predictions are actually negative. Remember I told you to use the OUTPUT statement to calculate residuals and Cook's D statistics? You can calculate the predicted values the same way:

Output out=statsclue1 p=predicted_dv;

Then run a univariate analysis on the variable predicted_dv and see what percentage of the values are negative:

proc univariate data=statsclue1;
var predicted_dv;
run;

9. ## Re: multiple linear regression: negative Y ?!

Umm..don't think I get it. Do you mean:

proc print data=statsclue out=statsclue1 p=predicted_dose;
run; /*this actually didn't run*/

proc univariate data=statsclue1; /*and how is this taking into account the regression equation that I have?*/
var predicted_dose;
run;

Working on it.

10. ## Re: multiple linear regression: negative Y ?!

You were supposed to add the OUTPUT statement to the procedure step where you fit your model...

12. ## Re: multiple linear regression: negative Y ?!

Just to clarify what Dason said: use the OUTPUT statement in PROC GLM.

14. ## Re: multiple linear regression: negative Y ?!

proc glm data=stats out=stats1 p=predicted_dose;
class gender;
model dose=gender bmi A B C A*C/solution clparm;
run;

ERROR 22-322: Syntax error, expecting one of the following: ;, (, ALPHA, DATA, MANOVA, MULTIPASS,
NAMELEN, NOPRINT, ORDER, OUTSTAT, PLOTS.
ERROR 76-322: Syntax error, statement will be ignored.

Trying to figure.

15. ## Re: multiple linear regression: negative Y ?!

Perhaps you can use a suitable link function to force your predictions to be positive, i.e. to map the linear predictor from (-Inf, Inf) to (0, Inf). The log link comes to mind.
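One way this might look in SAS is a generalized linear model with a log link in PROC GENMOD. This is only a sketch: the gamma distribution is my assumption (it requires every observed dose to be strictly positive), and the output dataset name `stats_glm` is made up. The model terms are the ones posted elsewhere in this thread.

```sas
/* Assumes dose > 0 for every observation (required by the gamma distribution) */
proc genmod data=stats;
  class gender;
  model dose = gender bmi A B C A*C / dist=gamma link=log;
  /* PRED= is the predicted mean on the dose scale, i.e. exp(linear predictor) */
  output out=stats_glm pred=predicted_dose;
run;
```

Because the predicted mean is the exponential of the linear predictor, it cannot come out negative, whatever the signs of the coefficients.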

17. ## Re: multiple linear regression: negative Y ?!

Originally Posted by dmancevo
Perhaps you can use a suitable link function to force your predictions to be positive, i.e. to map the linear predictor from (-Inf, Inf) to (0, Inf). The log link comes to mind.

Let's not jump there just yet, when the OP hasn't even been able to check whether the predictions are negative or not.

19. ## Re: multiple linear regression: negative Y ?!

Oh thanks! I got an output. Had to put the output statement AFTER model.
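For anyone following along, the working version presumably looks something like the sketch below, reconstructed from the failed attempt earlier in the thread. `out=` and `p=` are not options of the PROC GLM statement itself; they belong to a separate OUTPUT statement placed after MODEL.

```sas
proc glm data=stats;
  class gender;
  model dose = gender bmi A B C A*C / solution clparm;
  output out=stats1 p=predicted_dose;   /* its own statement, after MODEL */
run;
quit;

/* Summarize the predictions (quantiles and extreme observations) */
proc univariate data=stats1;
  var predicted_dose;
run;
```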

It gives a table of extreme observations. TWO of the lowest observations are NEGATIVE. ?

Trying to figure.

edit: and there's ONE missing value...dunno why. All my dose cells had a value though some values for other variables were missing.
edit: no, the missing count is 40.

20. ## Re: multiple linear regression: negative Y ?!

Don't worry about the missing values for now. See how many negative values you have and what percentage they are of the total predicted values.
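A sketch of one way to get both numbers in a single query, assuming the predictions sit in `stats1` as `predicted_dose` (the column aliases are made up). SAS treats a missing numeric value as smaller than any number, so missing predictions are filtered out first; otherwise they would be counted as negative.

```sas
proc sql;
  select sum(predicted_dose < 0)        as n_negative,
         count(*)                       as n_predicted,
         100 * mean(predicted_dose < 0) as pct_negative
  from stats1
  where predicted_dose is not missing;  /* missing sorts below 0 in SAS */
quit;
```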

21. ## Re: multiple linear regression: negative Y ?!

Not all predicted values are printed. There are quantiles and there are extreme observations. The extreme observations show only TWO negative values, and the quantiles show one big negative value at 0% and another negative value at 1%. At all other levels (5%, 10%, 25%, 50%, 90%, 95%, 99%, 100%) the values are ALL POSITIVE.

So I guess negatives are 1% of the predicted values..?

PS: figured the missing thing, I think. The other variables had some data missing and so those observations were deleted for the analysis. Values used were 40 less than values read. Makes sense.

22. ## Re: multiple linear regression: negative Y ?!

That means somewhere between 1% and 5% of your predicted values are negative. You can also find the exact count with:
Code:
```sas
proc sql;
  select count(predicted_dose)
  from stats1
  where predicted_dose < 0;
quit;
```
So only a very small percentage of your values are predicted to be negative, contrary to what you first thought. If that is a concern, you'll have to think about transformations. But first decide whether such a small number is a concern at all.
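If you do go the transformation route, a log transformation of the outcome is one common option. A minimal sketch, assuming every observed dose is strictly positive; the dataset and variable names `stats_log`, `stats_log1`, `stats_log2`, `log_dose` and `pred_log_dose` are made up for illustration:

```sas
/* Assumes every observed dose is strictly positive, so log(dose) is defined */
data stats_log;
  set stats;
  log_dose = log(dose);
run;

proc glm data=stats_log;
  class gender;
  model log_dose = gender bmi A B C A*C / solution clparm;
  output out=stats_log1 p=pred_log_dose;
run;
quit;

/* Back-transform: exp() of any real number is positive */
data stats_log2;
  set stats_log1;
  predicted_dose = exp(pred_log_dose);
run;
```

Keep in mind that the back-transformed value estimates something closer to the median dose than the mean dose, and that the coefficients now describe multiplicative rather than additive effects.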
