
Thread: Relative impact of regressors on Y.

  1. #1
    noetsi

    Relative impact of regressors on Y.




    A question I get asked a lot is: if we have these three predictors of Y, which of the three has the most, the next most, and the least impact? I have tried various approaches and have never found one I am really happy with.

    I need to do this for both interval and binary DVs.
    "The difference between genius and stupidity is that genius has its limits."

  2. #2
    ondansetron

    Re: Relative impact of regressors on Y.

    Quote Originally Posted by noetsi View Post
    A question I get asked a lot is: if we have these three predictors of Y, which of the three has the most, the next most, and the least impact? I have tried various approaches and have never found one I am really happy with.

    I need to do this for both interval and binary DVs.
    The problem with doing this is that it's usually hard to justify a ranking based purely on the size of the estimated beta coefficients. Assume we regress the price of a used car (Y) on mileage, number of previous owners, and transmission type (X1, X2, X3).

    The classic slope interpretation would be: For every 1 unit increase in X(n), we expect Y to increase/decrease by |beta(n)|, holding all else constant.

    The issue arises because you can't easily say that increasing mileage by 1 mile is equivalent to a 1 person increase in previous owners. The units are different, so it doesn't really make sense to say which has the "most impact" on the DV. Sure, one may elicit a larger change in the DV, but that comes from a given change in X(n), which might not be equal to that same change in another X variable.

    I think one (partial) solution is to standardize (at least) the predictors. This way, you can say that a 1 SD change in X1 is associated with a larger change in Y than a 1 SD change in X2. It's not a perfect solution, because each standard deviation is still measured in its own variable's units, but it does help in a small way (I think, anyway, because it puts these 1-unit increases on a scale of "statistical unusualness" within their respective distributions).
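
To make the standardization idea concrete, here is a minimal Python sketch (standard library only, so nothing needs installing). The used-car numbers are invented for illustration: the coefficients, sample size, and noise level are all assumptions, not data from anyone in this thread. It fits the regression in raw units, then rescales each slope by SD(X)/SD(Y) and ranks the predictors both ways.

```python
import random
import statistics as st


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def center(values):
    m = st.fmean(values)
    return [v - m for v in values]


def ols_slopes(columns, y):
    """OLS slopes of y on the columns; centering stands in for the intercept."""
    X = [center(col) for col in columns]
    yc = center(y)
    k = len(X)
    XtX = [[sum(a * b for a, b in zip(X[i], X[j])) for j in range(k)] for i in range(k)]
    Xty = [sum(a * b for a, b in zip(X[i], yc)) for i in range(k)]
    return solve(XtX, Xty)


# Hypothetical used-car data (every number below is made up).
rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 400
mileage = [rng.uniform(5_000, 150_000) for _ in range(n)]   # miles
owners = [float(rng.randint(1, 4)) for _ in range(n)]       # count of previous owners
automatic = [float(rng.random() < 0.5) for _ in range(n)]   # 0/1 dummy
price = [20_000 - 0.05 * m - 300 * o + 1_500 * a + rng.gauss(0, 800)
         for m, o, a in zip(mileage, owners, automatic)]

names = ["mileage", "owners", "automatic"]
X = [mileage, owners, automatic]
raw = dict(zip(names, ols_slopes(X, price)))

# Standardized slope = raw slope * SD(x) / SD(y).
sd_y = st.stdev(price)
standardized = {name: raw[name] * st.stdev(col) / sd_y for name, col in zip(names, X)}


def rank(coefs):
    """Predictor names ordered from largest to smallest absolute coefficient."""
    return sorted(coefs, key=lambda name: abs(coefs[name]), reverse=True)


print("ranked by raw slope:         ", rank(raw))
print("ranked by standardized slope:", rank(standardized))
```

By construction, the raw slopes put transmission first and mileage last, simply because one mile is a tiny unit, while the standardized slopes put mileage first. The two rankings answer different questions, which is the point being made above.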

    Thoughts?

  3. The Following User Says Thank You to ondansetron For This Useful Post:

    noetsi (01-30-2017)

  4. #3
    spunky

    Re: Relative impact of regressors on Y.

    If by "impact" you mean something like "which predictor contributes the most to the R-squared measure" you could use something like Pratt's relative importance measure or Budescu's dominance analysis. Unless you have something weird going on (e.g. suppression, multicollinearity, etc.) they usually agree quite a bit and they break down the R-squared into the percentage of explained variance that each predictor contributes towards the overall model fit.
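
A rough sketch of what these two R-squared decompositions compute, in plain Python. The data are simulated and the variable names (`x1`, `x2`, `x3`) are placeholders; this is not the `relaimpo` implementation or anyone's official code, just the textbook definitions as I understand them. The Pratt measure multiplies each standardized beta by that predictor's zero-order correlation with Y. General dominance averages the gain in R-squared from adding a predictor over all subsets of the other predictors (first within each subset size, then across sizes).

```python
import itertools
import random
import statistics as st


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    ma, mb = st.fmean(a), st.fmean(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5


# Simulated data (all settings are assumptions); x2 is correlated with x1.
rng = random.Random(1)
n = 500
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [0.5 * a + rng.gauss(0, 1) for a in x1]
x3 = [rng.gauss(0, 1) for _ in range(n)]
y = [a + 0.5 * b + 0.2 * c + rng.gauss(0, 1) for a, b, c in zip(x1, x2, x3)]

names = ["x1", "x2", "x3"]
X = [x1, x2, x3]
p = len(X)
R = [[corr(X[i], X[j]) for j in range(p)] for i in range(p)]  # predictor correlations
ry = [corr(col, y) for col in X]                              # correlations with y


def r_squared(subset):
    """R^2 from regressing y on the predictors whose indices are in subset."""
    if not subset:
        return 0.0
    beta = solve([[R[i][j] for j in subset] for i in subset], [ry[i] for i in subset])
    return sum(b * ry[i] for b, i in zip(beta, subset))


full = list(range(p))
r2_full = r_squared(full)

# Pratt: standardized beta times zero-order correlation.
beta_std = solve(R, ry)
pratt = {names[j]: beta_std[j] * ry[j] for j in full}


def general_dominance(j):
    """Average R^2 gain from adding predictor j, within then across subset sizes."""
    others = [i for i in full if i != j]
    by_size = []
    for k in range(p):
        gains = [r_squared(sorted(s + (j,))) - r_squared(list(s))
                 for s in itertools.combinations(others, k)]
        by_size.append(st.fmean(gains))
    return st.fmean(by_size)


dominance = {names[j]: general_dominance(j) for j in full}

print("R^2:", round(r2_full, 3))
print("Pratt:", {k: round(v, 3) for k, v in pratt.items()})
print("General dominance:", {k: round(v, 3) for k, v in dominance.items()})
```

Both sets of numbers add up to the full-model R-squared, which is what makes them readable as shares of explained variance. A Pratt value can go negative when suppression is present, which is one of the "weird" cases mentioned above.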
    for all your psychometric needs! https://psychometroscar.wordpress.com/about/

  5. The Following User Says Thank You to spunky For This Useful Post:

    noetsi (01-30-2017)

  6. #4
    hlsmith

    Re: Relative impact of regressors on Y.

    Look up partial R^2 and omega squared. That is what I usually use. A more intensive way may be to see which variable has the highest variable importance using LASSO regression or elastic net, but these aren't really accessible in base SAS.
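
For anyone who has not met LASSO, here is a small sketch of the idea in plain Python. It is one possible way to read "variable importance" out of a LASSO fit, not the only one: predictors are standardized, the penalty is walked from heavy to light, and each predictor is tagged with the penalty at which its coefficient first becomes nonzero. The data, the penalty grid, and the function names are all assumptions made for illustration.

```python
import random
import statistics as st


def standardize(values):
    """Rescale to mean 0 and (population) standard deviation 1."""
    m, s = st.fmean(values), st.pstdev(values)
    return [(v - m) / s for v in values]


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; return exactly zero when |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso(G, c, lam, sweeps=1000, tol=1e-12):
    """Coordinate descent for (1/2n)*||y - Xb||^2 + lam*||b||_1.

    G is X'X/n and c is X'y/n for standardized columns, so G[j][j] == 1.
    """
    p = len(c)
    beta = [0.0] * p
    for _ in range(sweeps):
        biggest_change = 0.0
        for j in range(p):
            z = c[j] - sum(G[j][k] * beta[k] for k in range(p) if k != j)
            new = soft_threshold(z, lam)
            biggest_change = max(biggest_change, abs(new - beta[j]))
            beta[j] = new
        if biggest_change < tol:
            break
    return beta


# Simulated data (all settings are assumptions): true slopes 1.0, 0.5, 0.2.
rng = random.Random(4)
n = 1000
names = ["x1", "x2", "x3"]
raw = [[rng.gauss(0, 1) for _ in range(n)] for _ in names]
y = [1.0 * a + 0.5 * b + 0.2 * d + rng.gauss(0, 1) for a, b, d in zip(*raw)]

X = [standardize(col) for col in raw]
y_mean = st.fmean(y)
yc = [v - y_mean for v in y]
G = [[sum(a * b for a, b in zip(X[i], X[j])) / n for j in range(3)] for i in range(3)]
c = [sum(a * b for a, b in zip(X[j], yc)) / n for j in range(3)]

# Walk from a heavy penalty to a light one and record where each
# predictor first gets a nonzero coefficient.
lambdas = [1.2 * 0.8 ** k for k in range(20)]
entry = {}
for lam in lambdas:
    for name, b in zip(names, lasso(G, c, lam)):
        if b != 0.0 and name not in entry:
            entry[name] = lam

print("penalty at which each predictor enters:", entry)
```

Predictors that survive a heavier penalty enter earlier, so the entry order gives a rough importance ranking. With correlated predictors the order can be unstable, so treat it as a screening aid rather than a final answer.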
    Stop cowardice, ban guns!

  7. The Following User Says Thank You to hlsmith For This Useful Post:

    noetsi (01-30-2017)

  8. #5
    ondansetron

    Re: Relative impact of regressors on Y.

    I guess it would depend on which way you want to say it "impacts" Y (explained variation vs. magnitude of change in Y). The latter is the one I've heard people try to do more commonly, which is why I phrased my response that way, but the former is reported by many stat packages, and I think it's less controversial.

  9. #6
    hlsmith

    Re: Relative impact of regressors on Y.

    I will throw this out there, just to add to the overall list: there are standardized estimates in linear regression.

    Also, much like the LASSO suggestion, if you have a sufficient amount of data, you can run cross-validation and see how well the variables perform in other subsamples.
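
One way to read the cross-validation suggestion, sketched in plain Python: split the data into folds, fit on all but one fold, predict the held-out fold, and compare the held-out error of the full model against the model with one predictor removed. This is my interpretation rather than a procedure spelled out in the post, and the data, fold count, and function names are assumptions.

```python
import random
import statistics as st


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def fit(rows, y):
    """OLS with an intercept; rows is a list of predictor tuples."""
    X = [[1.0, *row] for row in rows]
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def cv_mse(rows, y, folds=5):
    """Mean squared prediction error on held-out folds."""
    n = len(y)
    errors = []
    for f in range(folds):
        test = range(f, n, folds)  # every folds-th row is held out
        held = set(test)
        train = [i for i in range(n) if i not in held]
        coef = fit([rows[i] for i in train], [y[i] for i in train])
        errors += [(y[i] - predict(coef, rows[i])) ** 2 for i in test]
    return st.fmean(errors)


# Simulated data (all settings are assumptions); x3 has no effect at all.
rng = random.Random(2)
n = 300
names = ["x1", "x2", "x3"]
rows = [tuple(rng.gauss(0, 1) for _ in names) for _ in range(n)]
y = [1.0 * r[0] + 0.5 * r[1] + rng.gauss(0, 1) for r in rows]

baseline = cv_mse(rows, y)
loss = {}
for j, name in enumerate(names):
    reduced = [tuple(v for k, v in enumerate(r) if k != j) for r in rows]
    # How much worse the out-of-sample error gets without this predictor.
    loss[name] = cv_mse(reduced, y) - baseline

print("held-out MSE of full model:", round(baseline, 3))
print("increase in held-out MSE when dropped:", {k: round(v, 3) for k, v in loss.items()})
```

A predictor that matters makes held-out predictions noticeably worse when it is removed, while a pure-noise predictor changes almost nothing (and can even make things slightly better when dropped).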

  10. #7
    noetsi

    Re: Relative impact of regressors on Y.

    For linear models I have used standardized betas as suggested. The problem with that approach is that there is a significant question as to whether standardized betas make sense when some of the predictors are dummy variables, and there are almost always dummy variables in my models. Commonly, given what I analyze, there are more of them than interval predictors.

    Using impact on R squared is an interesting idea although obviously it does not work with categorical DV. For categorical DVs I have used, based on suggestions here years ago, the magnitude of each predictor's Wald statistic to rank impact. SAS does something very similar with one of its built-in functions for binary DVs.
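
For readers who want to see the Wald-based ranking spelled out, here is a minimal plain-Python sketch. It fits a logistic regression by Newton-Raphson and computes each coefficient's Wald chi-square as (estimate / standard error) squared. It is not what SAS does internally; the data, coefficients, and function names are assumptions for illustration only.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def sigmoid(t):
    """Numerically safe logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def score_and_information(X, y, beta):
    """Gradient of the log-likelihood and the Fisher information matrix."""
    k = len(beta)
    p = [sigmoid(sum(b * v for b, v in zip(beta, row))) for row in X]
    score = [sum(row[i] * (t - pi) for row, t, pi in zip(X, y, p)) for i in range(k)]
    info = [[sum(row[i] * row[j] * pi * (1.0 - pi) for row, pi in zip(X, p))
             for j in range(k)] for i in range(k)]
    return score, info


def logistic_wald(rows, y, iterations=25):
    """Newton-Raphson logistic fit; returns coefficients and Wald chi-squares."""
    X = [[1.0, *row] for row in rows]
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iterations):
        score, info = score_and_information(X, y, beta)
        beta = [b + s for b, s in zip(beta, solve(info, score))]
    _, info = score_and_information(X, y, beta)
    # Var(beta_i) is the i-th diagonal entry of the inverse information matrix.
    variances = [solve(info, [float(i == j) for j in range(k)])[i] for i in range(k)]
    return beta, [b * b / v for b, v in zip(beta, variances)]


# Simulated data (all settings are assumptions); x3 has no effect at all.
rng = random.Random(3)
n = 1000
names = ["x1", "dummy", "x3"]
rows, y = [], []
for _ in range(n):
    x1, d, x3 = rng.gauss(0, 1), float(rng.random() < 0.5), rng.gauss(0, 1)
    prob = sigmoid(-0.5 + 1.5 * x1 + 1.0 * d)
    rows.append((x1, d, x3))
    y.append(float(rng.random() < prob))

beta, chi2 = logistic_wald(rows, y)
wald = dict(zip(names, chi2[1:]))  # drop the intercept
ranking = sorted(wald, key=wald.get, reverse=True)

print("Wald chi-square:", {k: round(v, 1) for k, v in wald.items()})
print("ranking:", ranking)
```

One caution about this ranking: the Wald statistic mixes effect size with precision, so it grows with sample size and with how much a predictor varies. It ranks the strength of evidence for each predictor, which is not quite the same thing as the size of its effect.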

    I don't know much about LASSO, although I will look into it. I don't understand what this means (what do you actually do to carry it out?):

    "If you have a sufficient amount of data, you can run cross-validation and see how good variables perform in other subsamples."

  11. The Following User Says Thank You to noetsi For This Useful Post:

    hlsmith (01-30-2017)

  12. #8
    hlsmith

    Re: Relative impact of regressors on Y.

    Yes noetsi, I was just throwing STB (standardized betas) out there to add to the possible options. It too has weaknesses.

    I think you can still use partial R^2 with categorical variables. It would be intuitive with binary variables, though with more groups you would just have to make sure you mention what the reference group is when explaining.
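
A sketch of partial R-squared with a mix of predictor types, in plain Python. One choice here is mine, not the poster's: the three-level categorical variable is treated as a single block of two dummies and dropped as a unit. The data, the group effects, and the names are assumptions for illustration. Partial R-squared for a term is computed as (R2_full - R2_reduced) / (1 - R2_reduced), where the reduced model omits that term.

```python
import random
import statistics as st


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def r_squared(columns, y):
    """R^2 of an OLS fit of y on the given columns plus an intercept."""
    X = [[1.0, *row] for row in zip(*columns)]
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(k)]
    coef = solve(XtX, Xty)
    sse = sum((t - sum(c * v for c, v in zip(coef, row))) ** 2 for row, t in zip(X, y))
    mean_y = st.fmean(y)
    sst = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - sse / sst


def partial_r2(terms, name, y):
    """Share of the variance left unexplained by the other terms that `name` picks up."""
    full = [col for cols in terms.values() for col in cols]
    reduced = [col for key, cols in terms.items() if key != name for col in cols]
    r2_full, r2_reduced = r_squared(full, y), r_squared(reduced, y)
    return (r2_full - r2_reduced) / (1.0 - r2_reduced)


# Simulated data (all settings are assumptions).
rng = random.Random(5)
n = 600
x = [rng.gauss(0, 1) for _ in range(n)]
binary = [float(rng.random() < 0.5) for _ in range(n)]
group = [rng.choice("ABC") for _ in range(n)]
effect = {"A": 0.0, "B": 0.5, "C": 1.5}
y = [xi + 0.8 * bi + effect[g] + rng.gauss(0, 1) for xi, bi, g in zip(x, binary, group)]


def dummies(levels):
    """One 0/1 column per listed level; the unlisted level is the reference group."""
    return [[float(g == level) for g in group] for level in levels]


terms = {"x": [x], "binary": [binary], "group": dummies("BC")}  # reference group A
partial = {name: partial_r2(terms, name, y) for name in terms}

# Same model, but with C as the reference group instead of A.
terms_ref_c = {"x": [x], "binary": [binary], "group": dummies("AB")}
partial_ref_c = partial_r2(terms_ref_c, "group", y)

print("partial R^2:", {k: round(v, 3) for k, v in partial.items()})
print("group, reference C:", round(partial_ref_c, 3))
```

Treated as a block, the categorical variable's partial R-squared does not depend on which level is the reference group, because both codings span the same columns. The reference group only starts to matter when each dummy is reported on its own.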

  13. The Following User Says Thank You to hlsmith For This Useful Post:

    noetsi (01-30-2017)

  14. #9
    spunky

    Re: Relative impact of regressors on Y.

    Quote Originally Posted by noetsi View Post
    Using impact on R squared is an interesting idea although obviously it does not work with categorical DV.
    Well... not exactly. I mean, if you're willing to make a few assumptions about the categorical nature of your DV, the Pratt index has been extended to logistic regression. And Azen extended dominance analysis to logistic regression as well. I'm almost sure they even have a SAS macro somewhere, but then again I don't use SAS so me doesn't know.

    The 'relaimpo' package in R computes all these R-squared partition measures and a few more. But I don't like the other ones.

  15. The Following User Says Thank You to spunky For This Useful Post:

    noetsi (01-30-2017)

  16. #10
    noetsi

    Re: Relative impact of regressors on Y.

    I am going to look those approaches up, spunky; I know neither. My comment on R square is that there is no generally accepted pseudo R square for logistic models. Last time I looked there were like 33 of them, which differed significantly from each other.

  17. #11
    spunky

    Re: Relative impact of regressors on Y.

    Quote Originally Posted by noetsi View Post
    I am going to look those approaches up, spunky; I know neither. My comment on R square is that there is no generally accepted pseudo R square for logistic models. Last time I looked there were like 33 of them, which differed significantly from each other.
    You are absolutely right. Which is why I covered my bases by saying "if you're willing to make a few assumptions about the categorical nature of your DV", because the R-squared-type measure that is proposed in that article does require you to buy into a few things or else it is nonsensical. I do not quite remember all of them, but a big one for me is that it requires you to assume that the binary observed variable arose from the discretization of a continuous, latent variable. Now, if you have a DV where your observed variable is something like "correct/incorrect answer to a test" then sure, I'm willing to believe that maybe there is a latent aptitude score that can only be measured through the response to a test. But if your variable is something more... concrete like, oh I dunno, "man/woman, dead/alive, etc." then yeah, I'd have trouble buying into the latent variable model, in which case the R-squared is nonsensical and the Pratt index is not appropriate.
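
For reference, this is what the latent-variable assumption looks like written out. The post does not say which R-squared the article uses, so the measure shown here (McKelvey and Zavoina's, a widely used latent-variable pseudo R-squared) is an assumption, as is the notation.

```latex
% Latent-variable view of a binary outcome (logit case)
\begin{align*}
y_i^{*} &= \mathbf{x}_i^{\top}\boldsymbol{\beta} + \varepsilon_i,
  \qquad \operatorname{Var}(\varepsilon_i) = \frac{\pi^{2}}{3}, \\
y_i &= \begin{cases} 1 & \text{if } y_i^{*} > 0 \\ 0 & \text{otherwise} \end{cases} \\[4pt]
% Share of the latent variable's variance explained by the predictors
R^{2}_{\mathrm{MZ}} &=
  \frac{\widehat{\operatorname{Var}}\bigl(\mathbf{x}^{\top}\hat{\boldsymbol{\beta}}\bigr)}
       {\widehat{\operatorname{Var}}\bigl(\mathbf{x}^{\top}\hat{\boldsymbol{\beta}}\bigr) + \pi^{2}/3}
\end{align*}
```

The R-squared here describes variance in the unobserved y*, not in the observed 0/1 outcome, which is why it only makes sense if you believe a continuous latent variable sits behind the binary one.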

  18. The Following User Says Thank You to spunky For This Useful Post:

    noetsi (01-30-2017)

  19. #12
    noetsi

    Re: Relative impact of regressors on Y.


    "I do not quite remember all of them, but a big one for me is that it requires you to assume that the binary observed variable arose from the discretization of a continuous, latent variable."
    Some, but by no means all, interpretations of logistic regression assume exactly that.

    BTW, when you get your PhD, are you going to continue to be humble or become an arrogant jerk ...
