Thread: multiple linear regression: negative Y ?!

  1. #1

    So I have my equation and it looks weird. The intercept is a small positive value and a lot of the coefficients have large negative values. So that means my outcome, dose, is going to be negative?! How's that supposed to make sense? Or do I have it wrong?

    Thanks.

  2. #2

    Not necessarily. You can't say until you have the final predictions. The predictions are the result of not just the parameters but the sum-product of the parameters and the IVs. What if the IVs are also negative? It would help if you provided more information about your equation. First of all, I'd make the predictions and see whether the outcome is really negative.
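
    To make the sum-product point concrete, here is a small worked equation. The symbols and every number in it are made up for illustration; they are not the poster's actual estimates.

    ```latex
    % General form of a prediction with two IVs
    \[ \hat{y} = b_0 + b_1 x_1 + b_2 x_2 \]
    % Hypothetical values: b_0 = 2, b_1 = -50, x_1 = 0.1, b_2 = 0.4, x_2 = 25
    \[ \hat{y} = 2 + (-50)(0.1) + (0.4)(25) = 2 - 5 + 10 = 7 \]
    ```

    Even with a small intercept and a large negative coefficient, the prediction in this made-up case is positive, because what matters is each coefficient multiplied by the value its IV actually takes.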

  3. The Following User Says Thank You to jrai For This Useful Post:

    StatsClue (02-21-2012)

  4. #3

    What negative slopes mean is that the DV will go down as the IV goes up. If the slopes are small (meaning Y won't fall much), the slopes can be negative and Y still not be negative, even with a small intercept.

    If you find predictions of Y that are nonsensical (like a negative dosage), several factors could be in play. If your Y is not interval-like and you are using a method like OLS regression, this can generate nonsensical results. Also, if an IV cannot take on a value of 0 (say the value is the height of an adult; no adult has zero height), that can make your intercept essentially meaningless, since the intercept is just the value of Y when all X are 0. I don't know for sure how that affects predictions of Y, although it is common to center an X that cannot take on a meaningful value of 0.
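
    A minimal sketch of centering in SAS, in case it helps. It borrows the data set and variable names that show up later in this thread (stats, dose, gender, bmi, A, B, C); the names stats_c and bmi_c are made up here.

    ```sas
    /* Create a centered copy of bmi: each value minus the overall mean. */
    /* PROC SQL remerges the mean back onto every row (it prints a NOTE). */
    proc sql;
      create table stats_c as
      select *, bmi - mean(bmi) as bmi_c
      from stats;
    quit;

    /* Refit with the centered variable in place of the raw one. */
    proc glm data=stats_c;
      class gender;
      model dose = gender bmi_c A B C A*C / solution clparm;
    run;
    quit;
    ```

    Centering bmi this way changes what the intercept means (the expected dose at average bmi rather than at bmi = 0, with the other terms at zero or at their reference level), but it should leave the fitted values as they were, so it will not by itself remove a negative prediction.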
    "Facts are stubborn things, but statistics are more pliable." Mark Twain

  5. The Following User Says Thank You to noetsi For This Useful Post:

    StatsClue (02-21-2012)

  6. #4

    Originally Posted by noetsi:
    What negative slopes mean is that the DV will go down as the IV goes up. If the slopes are small (meaning Y won't fall much), the slopes can be negative and Y still not be negative, even with a small intercept.

    If you find predictions of Y that are nonsensical (like a negative dosage), several factors could be in play. If your Y is not interval-like and you are using a method like OLS regression, this can generate nonsensical results. Also, if an IV cannot take on a value of 0 (say the value is the height of an adult; no adult has zero height), that can make your intercept essentially meaningless, since the intercept is just the value of Y when all X are 0. I don't know for sure how that affects predictions of Y, although it is common to center an X that cannot take on a meaningful value of 0.

    What do I do now? My Y is continuous, not interval. A couple of slopes have a LARGE negative value. What do I need to remedy? The model seemed to have evolved okay based on p-values and general sense.

  7. #5

    First of all, let's see whether the predictions are actually negative. Remember I told you to use the OUTPUT statement to calculate residuals and Cook's D statistics? You can calculate the predicted values the same way:

    Output out=statsclue1 p=predicted_dv;

    Do a univariate analysis on the variable predicted_dv and see what % of the values are negative:

    proc univariate data=statsclue1;
    var predicted_dv;
    run;

  8. The Following User Says Thank You to jrai For This Useful Post:

    StatsClue (02-21-2012)

  9. #6

    Umm..don't think I get it. Do you mean:

    proc print data=statsclue out=statsclue1 p=predicted_dose;
    run; /*this actually didn't run*/

    proc univariate data=statsclue1; /*and how is this taking into account the regression equation that I have?*/
    var predicted_dose;
    run;

    Working on it.

  10. #7

    You were supposed to add the output statement to the proc step where you fit your model...
    "His programming is malfunctioning. It begins! Get your weapons, he's going to become a killbot!!!" - bryangoodrich

  11. The Following User Says Thank You to Dason For This Useful Post:

    StatsClue (02-21-2012)

  12. #8

    Just to clarify what Dason said, use output statement in proc GLM.

  13. The Following 2 Users Say Thank You to jrai For This Useful Post:

    Dason (02-21-2012), StatsClue (02-21-2012)

  14. #9

    proc glm data=stats out=stats1 p=predicted_dose;
    class gender;
    model dose=gender bmi A B C A*C/solution clparm;
    run;

    ERROR 22-322: Syntax error, expecting one of the following: ;, (, ALPHA, DATA, MANOVA, MULTIPASS,
    NAMELEN, NOPRINT, ORDER, OUTSTAT, PLOTS.
    ERROR 76-322: Syntax error, statement will be ignored.

    Trying to figure.

  15. #10

    Perhaps you can use a suitable link function to force your prediction to be on the upper right hand quadrant. That is map [-Inf,Inf] -> [0,Inf]. The log function comes to mind.
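
    One way such a model might look in SAS, as a sketch only: a generalized linear model with a log link, fitted with PROC GENMOD. It borrows the data set and variable names posted elsewhere in this thread. The gamma distribution is an assumption made here for illustration (it needs every observed dose to be strictly positive), and the names stats_glm and predicted_dose_glm are made up; nobody in the thread has checked that this suits the data.

    ```sas
    /* Log link: log(mean dose) is modeled as linear in the IVs,      */
    /* so the predicted mean is exp(linear predictor), which is > 0.  */
    proc genmod data=stats;
      class gender;
      model dose = gender bmi A B C A*C / dist=gamma link=log;
      /* pred= stores the predicted mean on the original dose scale */
      output out=stats_glm pred=predicted_dose_glm;
    run;
    ```

    With this setup the coefficients act multiplicatively on the expected dose rather than additively, so they are not directly comparable to the PROC GLM slopes.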

  16. The Following User Says Thank You to dmancevo For This Useful Post:

    StatsClue (02-21-2012)

  17. #11

    Originally Posted by dmancevo:
    Perhaps you can use a suitable link function to force your prediction to be on the upper right hand quadrant. That is map [-Inf,Inf] -> [0,Inf]. The log function comes to mind.
    Let's not jump quite there yet when the OP hasn't even successfully been able to see if the predictions are negative or not.

  18. The Following User Says Thank You to Dason For This Useful Post:

    StatsClue (02-21-2012)

  19. #12

    Oh thanks! I got an output. Had to put the output statement AFTER model.
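
    For anyone following along, the arrangement that runs looks roughly like this. It is pieced together from the code fragments posted in this thread, so treat it as a sketch rather than the exact program that was run.

    ```sas
    /* Fit the model; OUTPUT is its own statement, placed after MODEL. */
    /* out= and p= are not options of the PROC GLM statement itself,   */
    /* which is why putting them there raised ERROR 22-322.            */
    proc glm data=stats;
      class gender;
      model dose = gender bmi A B C A*C / solution clparm;
      output out=stats1 p=predicted_dose;  /* p= saves the predicted values */
    run;
    quit;

    /* Summarize the predicted values (quantiles, extreme observations). */
    proc univariate data=stats1;
      var predicted_dose;
    run;
    ```

    The data set stats1 holds every row of stats plus the new column predicted_dose, so the univariate step is summarizing the predictions from the fitted equation.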

    It gives a table of extreme observations. TWO of the lowest observations are NEGATIVE. ?

    Trying to figure.



    edit: and there's ONE missing value...dunno why. All my dose cells had a value though some values for other variables were missing.
    edit: no, the missing count is 40.
    Last edited by StatsClue; 02-21-2012 at 11:02 PM.

  20. #13

    Don't worry about the missing values for now. See how many negative values you have and what % they are of the total predicted values.

  21. #14

    All predicted values aren't printed. There are quantiles and there are extreme observations. The extreme observations show only TWO negative values and the Quantiles show one big negative value at 0% and another negative value at 1%. At all other levels---5%, 10%,25%,50%,90%,95%,99%,100%, the values are ALL POSITIVE.

    So I guess negatives are 1% of the predicted values..?

    PS: figured the missing thing, I think. The other variables had some data missing and so those observations were deleted for the analysis. Values used were 40 less than values read. Makes sense.

  22. #15

    That means somewhere between 1% and 5% of your predicted values are negative. You can also find the exact number with:
    Code: 
    proc sql;
    select count(predicted_dose)
    from stats1
    where predicted_dose<0;
    quit;
    So this shows that only a very small % of your values are predicted to be negative, contrary to your initial thought. If this is a concern to you, then you'll have to think about transformations. But first decide whether this small number is a concern.
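
    If the exact percentage is wanted as well, a sketch along these lines should give it, assuming the output data set stats1 and the variable predicted_dose from earlier in the thread. The column aliases are made up here.

    ```sas
    /* SAS treats a missing value as smaller than any number, so rows  */
    /* with a missing prediction are dropped before testing for < 0.   */
    proc sql;
      select count(*)                 as n_predicted,
             sum(predicted_dose < 0)  as n_negative,
             mean(predicted_dose < 0) as share_negative format=percent8.2
      from stats1
      where predicted_dose is not missing;
    quit;

    /* List the offending rows to see which IV values produce them. */
    proc print data=stats1;
      where predicted_dose < 0 and not missing(predicted_dose);
    run;
    ```

    Looking at the printed rows can help in deciding whether the negatives are a concern, for example if they all sit at unusual combinations of the IVs.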

  23. The Following User Says Thank You to jrai For This Useful Post:

    StatsClue (02-21-2012)
