Thread: Comparing coefficients in Logistic Regression

  1. #1
    gianmarco

    Hello!
    I would like some guidance on the following.

    I fitted a Logistic Regression model and obtained the coefficients for the model's significant predictors.
    Now, for the sake of a report I am writing, is it sound to rank the coefficients (i.e., order them from largest to smallest) to give an idea of their relative contribution to the prediction of the (binary) dependent variable?

    My understanding is that this would not be a viable option, since the coefficients can refer to variables measured on different scales (as indeed happens in my model). So I am wondering whether it would be sound to standardize them. Is that a viable strategy? On the other hand, I have also read that the interpretation of standardized coefficients is not very straightforward.

    If the latter strategy is not viable, could the 'percentage change' be put to work instead? I got the percentage change from Allison's book on Logistic Regression in SAS. It can be calculated from the odds ratio (OR) of each coefficient: (OR-1)*100. This indicates the percentage change in the odds of the positive outcome of the dependent variable for each 1-unit increase in the independent variable. It may be that ordering the significant predictors by percentage change would make more sense in the context of comparing coefficients.
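
    To make the (OR-1)*100 calculation concrete, here is a minimal Python sketch. The coefficient values and predictor names are made up for illustration; only the formula comes from the description above. Note that the result is still expressed per 1-unit increase, so it depends on the units each predictor is measured in.

```python
import math


def percent_change_in_odds(beta: float) -> float:
    """Percentage change in the odds for a 1-unit increase in the predictor."""
    odds_ratio = math.exp(beta)  # the odds ratio is exp(coefficient)
    return (odds_ratio - 1.0) * 100.0


# Hypothetical logistic regression coefficients (log-odds per unit).
coefs = {"x1": 0.405, "x2": -0.223, "x3": 0.049}

for name, beta in coefs.items():
    change = percent_change_in_odds(beta)
    print(f"{name}: OR = {math.exp(beta):.3f}, change in odds = {change:+.1f}%")
```

    A negative coefficient gives an odds ratio below 1 and therefore a negative percentage change.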

    Cheers
    Gm
    http://cainarchaeology.weebly.com/

  2. #2
    hlsmith

    Yeah, this isn't an easy task, and it is one I have not resolved myself. What you are missing is that one covariate may have a bigger effect but also a giant standard error. You would think, hey, standardize, but as you mentioned, the interpretation of standardized coefficients in logistic regression is highly debated.

    I think the percentage change seems like a good idea, unless others have suggestions.

    P.S., How would you handle categorical variables in comparison to continuous, given your above description of percentage change?
    Stop cowardice, ban guns!

  3. The Following User Says Thank You to hlsmith For This Useful Post:

    gianmarco (07-29-2014)

  4. #3
    noetsi

    I spent a lot of time studying this because we do satisfaction surveys and want to rank the variables' impact on satisfaction. Using coefficients, odds ratios, etc. is not a good way of getting at relative impact. Dozens of standardized coefficients have been created for this purpose (SAS has one of them built in); they serve the same function as beta weights in linear regression. Unfortunately, there are major differences among these coefficients, and there appears to be significant disagreement on which to use. After discussions here I chose to use the Wald statistic associated with each parameter (the higher, the more important).

    Scott Richter wrote an extensive article on the standardized coefficients in logistic regression with his recommendations. I will try to find it and send you the link.
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  5. The Following User Says Thank You to noetsi For This Useful Post:

    gianmarco (07-29-2014)

  6. #4
    Jake

    "Dominance analysis" is one approach to this.
    “In God we trust. All others must bring data.”
    ~W. Edwards Deming

  7. #5
    spunky

    or the extension of the Pratt index done to logistic regression
    for all your psychometric needs! https://psychometroscar.wordpress.com/about/

  8. #6
    gianmarco

    Quote originally posted by hlsmith:
    P.S., How would you handle categorical variables in comparison to continuous, given your above description of percentage change?

    I won't, because I do not have categorical predictors :-)

    Anyway, thanks Noetsi, Jake, and Spunky for your comments. I am looking into dominance analysis, but I guess I will not have the time to get my head around it for the time being... too much pressure in preparing a paper for a presentation, so percentage change will suffice for now.

    As for Noetsi's suggestion, while I grasp the 'meaning' of Allison's percentage change, I am wondering what the Wald statistic is actually communicating. I understand that it should be equal to the square of each Beta divided by the square of its Standard Error...
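
    As a rough illustration of that relationship, the sketch below computes the Wald chi-square as (Beta / Standard Error) squared and ranks predictors by it. The estimates and standard errors are hypothetical, chosen so that the predictor with the largest coefficient also has a very large standard error.

```python
def wald_chi_square(beta: float, se: float) -> float:
    """Wald chi-square for one coefficient: (estimate / standard error) squared."""
    z = beta / se
    return z * z


# Hypothetical (estimate, standard error) pairs, not from a real model.
estimates = {
    "x1": (0.80, 0.40),
    "x2": (0.30, 0.05),
    "x3": (-1.20, 0.90),
}

# Rank predictors from largest to smallest Wald chi-square.
ranked = sorted(
    estimates,
    key=lambda name: wald_chi_square(*estimates[name]),
    reverse=True,
)

for name in ranked:
    beta, se = estimates[name]
    print(f"{name}: beta = {beta:+.2f}, SE = {se:.2f}, Wald = {wald_chi_square(beta, se):.2f}")
```

    In this made-up example x3 has the largest coefficient in absolute value but ranks last, because its estimate is small relative to its uncertainty. The statistic measures how precisely a coefficient is distinguished from zero, which is not quite the same thing as the size of the effect.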

    Cheers
    Gm

  9. #7
    spunky

    I'm pretty sure Budescu and Azen (the people involved in creating dominance analysis) have freely available Excel and SAS macros that do this. Or you can download the relaimpo package for R. It's been a few years, so I can't quite remember whether or not it does logistic regression. But I'm pretty sure there are SAS macros.
    With that being said, I resent the fact that you didn't consider using the Pratt index :-P

  10. #8
    noetsi

    Quote originally posted by gianmarco:
    I won't, because I do not have categorical predictors :-)

    Anyway, thanks Noetsi, Jake, and Spunky for your comments. I am looking into dominance analysis, but I guess I will not have the time to get my head around it for the time being... too much pressure in preparing a paper for a presentation, so percentage change will suffice for now.

    As for Noetsi's suggestion, while I grasp the 'meaning' of Allison's percentage change, I am wondering what the Wald statistic is actually communicating. I understand that it should be equal to the square of each Beta divided by the square of its Standard Error...

    Cheers
    Gm

    To be fair, it is not my suggestion (I am not that clever). Either Dason or Jake suggested it to me, although they both expressed doubts about the value of standardizing the variables for relative contribution, period. But since I needed to do it, this was the approach that seemed best. I never asked why, substantively, it worked; I honestly never thought about it till now.

  11. #9
    Injektilo

    Quote originally posted by spunky:
    I'm pretty sure Budescu and Azen (the people involved in creating dominance analysis) have freely available Excel and SAS macros that do this. Or you can download the relaimpo package for R. It's been a few years, so I can't quite remember whether or not it does logistic regression. But I'm pretty sure there are SAS macros.
    With that being said, I resent the fact that you didn't consider using the Pratt index :-P

    I frequently use the relaimpo package in R, and although it is easy to use, it does not handle logistic regression. You can read more about it here: http://cran.r-project.org/web/packag...o/relaimpo.pdf

  12. The Following User Says Thank You to Injektilo For This Useful Post:

    spunky (07-29-2014)

  13. #10
    gianmarco

    Quote originally posted by spunky:
    With that being said, I resent the fact that you didn't consider using the Pratt index :-P

    I will be glad to consider it, provided that you put me on the right track by elaborating a bit more on it and/or providing further links. :-)

  14. #11
    noetsi

    I went back to the tome I created to help me deal with logistic regression issues.

    This is a Sage monograph, now dated. Pages 52-56 deal with standardized logistic regression coefficients.

    http://books.google.com/books?id=EAI...ession&f=false

    A conference paper that I have, but do not have a link for, might also be of value (I am sure I found it online): "Standardized Coefficients in Logistic Regression" by Jason King from Baylor University, circa 2007.

    This is an excellent article that I once had but no longer have access to. You might look it up.

    http://sf.oxfordjournals.org/content/89/4/1409.abstract

    I believe this is the lead-in to that article:


    There is little consensus on how best to rank predictors in logistic regression. This paper describes and illustrates six possible methods for ranking predictors: 1) standardized coefficients, 2) p‐values of Wald chi‐square statistics, 3) a pseudo partial correlation metric for logistic regression, 4) adequacy, 5) c‐statistics, and 6) information values. There are many other ways, these were chosen because the author used them or saw others do so this way.
    Another view...

    Another solution might be to report the Wald statistics or R-values from logistic regression. They're also scale independent measures that indicate strength and direction of an effect, and have the advantage that they're available for categorical variables as well. On the downside, they're conservative estimates, they tend to be a little lower than the actual likelihood ratio for an effect, and this is stronger as effects are larger.
    https://groups.google.com/forum/?fro...ss/W4Ri--ySjN8

    These are my own comments based on readings so take them with a large grain of salt...

    There is little agreement on how to compare the relative impact of variables in logistic regression, or even on whether you should. The unstandardized parameters and odds ratios cannot be directly compared because of differing variation and scale issues. The fact that variation matters here is another reason not to estimate results with different sample sizes for specific questions.
    And purely for amusement [you had to be involved in the chat discussion to know how little enthusiasm Jake and Dason actually had for my question or my understanding of the issues at hand]:

    Jake suggested (with limited enthusiasm) the following: use the higher Wald statistic to show which variable has more impact, although he and Dason actually thought bivariate comparisons made more sense (for reasons that I don’t understand).
    And probably never will....
    Last edited by noetsi; 07-29-2014 at 05:38 PM.

  15. #12
    spunky

    Quote originally posted by gianmarco:
    I will be glad to consider it, provided that you put me on the right track by elaborating a bit more on it and/or providing further links. :-)

    well, for the sake of simplicity on your part, you're probably better off working with Budescu & Azen's dominance analysis, mostly because there is a "plug-and-use" SAS macro readily available for you to use. i thought the relaimpo package had already been extended to logistic regression, but as Injektilo correctly mentioned, it only works on multiple regression right now.

    the relevant article for this is here. it's basically extending the concept of the Pratt measure of relative "importance" of variables from standard OLS multiple regression to logistic regression. the Pratt measure basically multiplies the correlation coefficient between one specific predictor and the criterion variable by that predictor's standardized regression coefficient. if you do that with all your predictors and add up all the numbers, you'll notice that they add up to the model R-squared, so they are considered measures of 'importance' of each predictor, since they tell you how much variance can be accounted for by every covariate (kind of like the squared semi-partial correlation, but better).

    for logistic regression to work out, an extension of the R-squared measure based on weighted least squares had to be done. this is probably more math than you have time to look over, so maybe you'll consider going over it when you have more time? i'm kind of just throwing it out there though, because my advisor was one of the inventors of that measure, so i kinda have to market it
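
    For the ordinary OLS case only, the additivity property can be checked with a small Python sketch. It uses the usual textbook form of the Pratt measure (standardized coefficient times the predictor-criterion correlation) for a model with two predictors; the three correlations are hypothetical, and the logistic regression extension mentioned above is not implemented here.

```python
def pratt_two_predictors(r_y1: float, r_y2: float, r_12: float):
    """Return (R-squared, Pratt contributions) for an OLS model with two predictors.

    r_y1, r_y2: correlations of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    denom = 1.0 - r_12 ** 2
    # Standardized regression coefficients derived from the correlations.
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    # Pratt contribution of each predictor: standardized beta times correlation.
    contributions = [beta1 * r_y1, beta2 * r_y2]
    r_squared = sum(contributions)  # the contributions add up to R-squared
    return r_squared, contributions


# Hypothetical correlations, chosen only for illustration.
r2, contrib = pratt_two_predictors(r_y1=0.5, r_y2=0.4, r_12=0.3)
shares = [c / r2 for c in contrib]  # proportion of R-squared per predictor

print(f"R-squared = {r2:.4f}")
for i, (c, s) in enumerate(zip(contrib, shares), start=1):
    print(f"x{i}: contribution = {c:.4f}, share of R-squared = {s:.1%}")
```

    Dividing each contribution by R-squared gives shares that sum to 1, which is what makes the measure convenient for ranking.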

  16. #13
    Lazar

    You could use an approach similar to the Boruta algorithm http://cran.r-project.org/web/packag...uta/Boruta.pdf, i.e. extend the logic from random forests to glm models: compare the contribution of a variable to the ROC with that of a set of "shadow" variables (random permutations).
    "I have done things to data. Dirty things. Things I am not proud of."

  17. #14
    gianmarco

    Quote originally posted by noetsi:
    I spent a lot of time studying this because we do satisfaction surveys and want to rank the variables' impact on satisfaction. Using coefficients, odds ratios, etc. is not a good way of getting at relative impact. Dozens of standardized coefficients have been created for this purpose (SAS has one of them built in); they serve the same function as beta weights in linear regression. Unfortunately, there are major differences among these coefficients, and there appears to be significant disagreement on which to use. After discussions here I chose to use the Wald statistic associated with each parameter (the higher, the more important).

    Scott Richter wrote an extensive article on the standardized coefficients in logistic regression with his recommendations. I will try to find it and send you the link.

    Resuming an old thread just to ask how, in R, I can get the Wald statistics from the glm() summary output.

    gm

  18. #15
    Jake

    The Wald statistics are given as part of the default output, as Wald z-statistics. If you want, you can square them and then they are Wald chi-squares.
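
    For a binomial glm() fit, those z statistics are the "z value" column of the coefficient table that summary() prints. The Python sketch below shows the squaring step on hypothetical z values typed in by hand, and checks that the two-sided normal p-value for z is the same as the upper-tail chi-square p-value (1 degree of freedom) for z squared.

```python
import math


def wald_from_z(z: float):
    """Return (Wald chi-square, p-value) for a Wald z statistic.

    The two-sided normal p-value for z equals the upper-tail p-value of a
    chi-square distribution with 1 degree of freedom evaluated at z squared.
    """
    chi_square = z * z
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return chi_square, p_value


# Hypothetical z values, as they might be read off a summary table.
z_values = {"x1": 1.96, "x2": -3.10, "x3": 0.45}

for name, z in z_values.items():
    chi_square, p = wald_from_z(z)
    print(f"{name}: z = {z:+.2f}, Wald chi-square = {chi_square:.3f}, p = {p:.4f}")
```

    Squaring removes the sign, so the direction of each effect has to be read from the z value or the coefficient itself.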

  19. The Following User Says Thank You to Jake For This Useful Post:

    gianmarco (06-02-2015)
