
Thread: Logistic regression

  1. #1
    noetsi

    Logistic regression




    As far as I know, logistic regression does not have beta weights, that is, standardized slopes that show which variable has the greatest relative impact on the DV. Can you interpret odds ratios to show which variable is most important? For example, if X1 has an odds ratio of 1.6, X2 1.9 and X3 2.2, would X3 have more impact on the DV than X2, and X2 more than X1?

    I have never seen odds ratios interpreted this way (that is, as getting at the relative influence of an IV on a DV), so I doubt you can.
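
One reason a raw ranking of odds ratios is shaky is that an odds ratio depends on the units of its IV. A minimal sketch, assuming made-up coefficients (the variable names and values are hypothetical and do not come from any fitted model):

```python
import math

# Hypothetical logit coefficients (change in log odds per one unit of the IV).
beta_income_dollars = 0.00002   # income measured in dollars
beta_satisfaction = 0.47        # satisfaction measured on a 1-5 scale


def odds_ratio(beta, units=1.0):
    """Odds ratio for a change of `units` in the IV."""
    return math.exp(beta * units)


# Same income effect, expressed per dollar and per 10,000 dollars.
or_income_per_dollar = odds_ratio(beta_income_dollars)
or_income_per_10k = odds_ratio(beta_income_dollars, units=10_000)
or_satisfaction = odds_ratio(beta_satisfaction)

print(or_income_per_dollar, or_income_per_10k, or_satisfaction)
```

The income odds ratio looks negligible per dollar and much larger per 10,000 dollars, although the model is the same. So comparing odds ratios across IVs only makes sense once the IVs are on comparable scales.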

    While I am at it, homoscedasticity and multivariate normality are not assumptions of logistic regression, correct? So you don't need to test for them. But linearity between the logit and the IVs is? If so, how do you test for the latter?
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  2. #2
    noetsi

    Re: Logistic regression

    Anyone know how you test for this?

    Fifthly, logistic regression assumes linearity of independent variables and log odds. Whilst it does not require the dependent and independent variables to be related linearly, it requires that the independent variables are linearly related to the log odds. Otherwise the test underestimates the strength of the relationship and rejects the relationship too easily, that is, it comes out not significant (fails to reject the null hypothesis) where it should be significant. A solution to this problem is the categorization of the independent variables, that is, transforming metric variables to ordinal level and then including them in the model.
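
One informal way to look at linearity in the logit is to bin the IV, compute the empirical log odds in each bin, and see whether they rise by a roughly constant amount per unit of the IV. The sketch below assumes made-up binned counts; the 0.5 continuity correction is a common convention, not something taken from the quoted passage. A more formal option is the Box-Tidwell approach, which adds an x*ln(x) term to the model and tests it, but that needs a model-fitting library and is not shown here.

```python
import math


def empirical_logits(groups):
    """groups: list of (x_midpoint, n_events, n_total), one tuple per bin.

    Returns (x_midpoint, empirical log odds) pairs. The 0.5 added to each
    cell keeps the log finite when a bin has no events or no non-events.
    """
    points = []
    for x, events, total in groups:
        logit = math.log((events + 0.5) / (total - events + 0.5))
        points.append((x, logit))
    return points


def slopes(points):
    """Slope of the empirical logit between consecutive bins."""
    return [(y2 - y1) / (x2 - x1)
            for (x1, y1), (x2, y2) in zip(points, points[1:])]


# Made-up data: bin midpoint of the IV, number of events, bin size.
binned = [(1, 12, 100), (2, 20, 100), (3, 31, 100), (4, 45, 100), (5, 60, 100)]

points = empirical_logits(binned)
step_slopes = slopes(points)
print(step_slopes)
```

If the slopes between bins are similar, a single linear term in the logit looks reasonable. Slopes that change steadily or flip sign would suggest a transformation or extra terms.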

  3. #3
    threestars

    Re: Logistic regression

    I would generate predicted probabilities for high and low values of your independent variables (while holding all other variables at their mean, median or mode). Then I would subtract the predicted probabilities to find the change in the predicted probability of the DV equaling 1 as a result of a change in each IV.

    You could generate uncertainty for these differences by simulating draws of your beta coefficients from their estimated sampling distribution.

    Gary King has software to do this. Zelig in R or Clarify in Stata.
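
The point-estimate part of this can be sketched by hand. This is a minimal illustration assuming a hypothetical fitted model (the intercept, coefficients, means and low/high values are invented); it does not reproduce what Zelig or Clarify do, and the simulation step for uncertainty is left out.

```python
import math


def inv_logit(z):
    """Convert log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical fitted model: intercept plus two IVs on a 1-5 scale.
intercept = -2.0
betas = {"x1": 0.47, "x2": 0.64}
means = {"x1": 3.0, "x2": 2.5}
low_high = {"x1": (1.0, 5.0), "x2": (1.0, 5.0)}


def predicted_prob(values):
    """Predicted probability that the DV equals 1 for the given IV values."""
    z = intercept + sum(betas[name] * values[name] for name in betas)
    return inv_logit(z)


def first_difference(name):
    """Change in predicted probability when one IV moves from low to high,
    holding the other IVs at their means."""
    low, high = low_high[name]
    p_low = predicted_prob({**means, name: low})
    p_high = predicted_prob({**means, name: high})
    return p_high - p_low


diffs = {name: first_difference(name) for name in betas}
print(diffs)
```

The differences are on the probability scale, so they can be compared across IVs directly, as long as the chosen low and high values are comparable in meaning.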

  4. The Following User Says Thank You to threestars For This Useful Post:

    noetsi (01-23-2013)

  5. #4
    hlsmith

    Re: Logistic regression

    You don't need normality in that regard, but you are supposed to make sure the IVs are not too strongly related.
    Stop cowardice, ban guns!

  6. The Following User Says Thank You to hlsmith For This Useful Post:

    jessireebob (05-02-2013)

  7. #5
    noetsi

    Re: Logistic regression

    My IVs will be related, because they are dimensions of satisfaction. I might do FA and use the latent variables identified rather than the raw variables.

    Which raises another point. VIF and tolerance (for multicollinearity) work for OLS. Can you use them for logistic regression? I assume so, because the distribution of the IVs doesn't matter in regression, and it's the relationship among the IVs, not that of the IVs to the DV, that matters for multicollinearity.

  8. #6
    Jake

    Re: Logistic regression

    Quote Originally Posted by noetsi
    VIF and tolerance (for multicollinearity) work for OLS. Can you use them for logistic regression?
    Yes, they work much the same in logistic regression as in normal regression.
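
A small sketch of why the outcome model does not enter: the usual VIF is computed from the IVs alone. With exactly two IVs it reduces to 1 / (1 - r^2), where r is their correlation. The two satisfaction scores below are made-up data, and the two-predictor shortcut is an assumption of this example; with more IVs each one is regressed on all the others.

```python
import math


def pearson_r(a, b):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def vif_two_predictors(a, b):
    """VIF when the model has exactly two IVs: 1 / (1 - r^2)."""
    r = pearson_r(a, b)
    return 1.0 / (1.0 - r ** 2)


# Made-up satisfaction dimensions on a 1-5 scale.
sat_service = [1, 2, 2, 3, 4, 4, 5, 5]
sat_price = [2, 1, 3, 3, 3, 5, 4, 5]

vif = vif_two_predictors(sat_service, sat_price)
tolerance = 1.0 / vif
print(vif, tolerance)
```

Nothing in the calculation refers to the DV, which is why the same diagnostic can be run before fitting either an OLS or a logistic model.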
    “In God we trust. All others must bring data.”
    ~W. Edwards Deming

  9. The Following User Says Thank You to Jake For This Useful Post:

    noetsi (01-24-2013)

  10. #7
    JesperHP

    Re: Logistic regression

    You can also use the following argument when interpreting the betas. Differentiate \pi with respect to the j-th independent variable x_{ij}:

    \frac{\partial \pi}{\partial x_{ij}} = \frac{\partial}{\partial x_{ij}} \frac{\exp(x_i^T \beta)}{1+\exp(x_i^T \beta)}

    = \frac{\exp(x_i^T \beta)}{(1+\exp(x_i^T \beta))^2} \beta_j = \frac{\exp(x_i^T \beta)}{1+\exp(x_i^T \beta)} \frac{1}{1+\exp(x_i^T \beta)} \beta_j = \pi (1- \pi) \beta_j.

    Then define the point of equal opportunity as \pi = 0.5. This allows you to interpret beta at the point of equal opportunity, but of course this point can be realized at many different values of the independent variables.
    However, the maximum of \pi (1-\pi) is 0.25, reached at \pi = 0.5, so the largest possible effect of a given independent variable on the probability is 0.25 \beta_j per unit.

    As a robustness check of the coefficients, one can estimate a probit model and check whether the logit coefficients are approximately 1.6 times larger.
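
The derivative result can be checked numerically. A minimal sketch, assuming a one-IV model with invented coefficients (beta0 and beta1 are hypothetical), comparing \pi (1-\pi) \beta_j with a finite-difference slope at the point where \pi = 0.5:

```python
import math


def pi(x, beta0, beta1):
    """Logistic probability for a single IV."""
    z = beta0 + beta1 * x
    return math.exp(z) / (1.0 + math.exp(z))


def marginal_effect(x, beta0, beta1):
    """Analytic slope of pi with respect to x: pi * (1 - pi) * beta1."""
    p = pi(x, beta0, beta1)
    return p * (1.0 - p) * beta1


beta0, beta1 = -1.2, 0.8          # hypothetical coefficients
x_equal = -beta0 / beta1          # value of x where pi = 0.5

# Central finite difference as an independent check of the formula.
h = 1e-6
numeric = (pi(x_equal + h, beta0, beta1) - pi(x_equal - h, beta0, beta1)) / (2 * h)
analytic = marginal_effect(x_equal, beta0, beta1)

print(analytic, numeric)
```

At \pi = 0.5 the analytic value is 0.25 times beta1, and away from that point the effect on the probability is smaller.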

  11. The Following 2 Users Say Thank You to JesperHP For This Useful Post:

    Donald (02-01-2013), noetsi (01-24-2013)

  12. #8
    noetsi

    Re: Logistic regression

    Jake pointed out to me, on chat, that probably the best way to compare the relative value (importance) of IV in logistic regression is to look at the Wald value (the higher the relatively more important). I wanted to say thanks to him on that.

  13. #9
    Jake

    Re: Logistic regression

    Just wanted to clarify that I said that approach would probably be adequate in scenarios where you have no other prior information about the variables at hand... not that it is a great approach in general.

  14. #10
    noetsi

    Re: Logistic regression

    Good point....

  15. #11
    Donald

    Re: Logistic regression

    That's a good question.
    Maybe you can just multiply the unstandardized beta by the standard deviation of its IV (I have read that before). I don't know, maybe someone here can confirm what I said. I personally doubt that it works, because when the independent variable is, say, a "family income" variable, your unstandardized beta is something like 0.000.
    Usually, what I do in a logistic regression is recode the independent variables so that they have only 2 categories. Again, I'm not sure whether such results are interpretable, which is why I rarely use logistic regression.
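
On the "beta times standard deviation" idea, a tiny unstandardized beta is not a problem in itself, because an IV with a tiny beta usually has a large standard deviation. A minimal sketch, assuming a made-up income sample and a hypothetical coefficient:

```python
import math
import statistics

# Made-up family incomes in dollars and a hypothetical logit coefficient.
income = [18_000, 25_000, 32_000, 41_000, 55_000, 60_000, 78_000, 120_000]
beta_income = 0.00002             # change in log odds per dollar

# Change in log odds for a one-standard-deviation increase in income.
sd_income = statistics.stdev(income)
beta_per_sd = beta_income * sd_income
odds_ratio_per_sd = math.exp(beta_per_sd)

# Same data in thousands of dollars: beta grows by 1000, the SD shrinks by 1000.
income_k = [v / 1000 for v in income]
beta_income_k = beta_income * 1000
beta_per_sd_k = beta_income_k * statistics.stdev(income_k)

print(beta_per_sd, odds_ratio_per_sd, beta_per_sd_k)
```

The per-SD coefficient comes out the same whether income is recorded in dollars or in thousands, which is the property that makes it usable for rough comparisons across continuous IVs. It says nothing useful for dummy variables, where a one-SD change has no natural meaning.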

    "Jake pointed out to me, on chat, that probably the best way to compare the relative value (importance) of IV in logistic regression is to look at the Wald value (the higher the relatively more important)."

    I could be wrong (and I hope so) but it seems to me that the Wald value is tied to the unstandardized beta, which is unreliable in interpreting the relative importance of each IV.

  16. The Following User Says Thank You to Donald For This Useful Post:

    noetsi (02-01-2013)

  17. #12
    noetsi

    Re: Logistic regression

    "I could be wrong (and I hope so) but it seems to me that the Wald value is tied to the unstandardized beta, which is unreliable in interpreting the relative importance of each IV."
    Do you know a source for this, as I have not seen it before?

    I have read that standardized betas have limited value in logistic regression because the levels of the DV are restricted (there are only two levels). This would be much like the problem with using beta weights with dummy variables in OLS. For this reason beta weights are not generated for logistic regression by any commercial software (they may in fact not even exist).

  18. #13
    Donald

    Re: Logistic regression

    "Do you know a source for this as I have not seen it before?"
    I have read it here.

    "Equation (8.11) shows how the Wald statistic is calculated and you can see it’s basically identical to the t-statistic in linear regression ... it is the value of the regression coefficient divided by its associated standard error."
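
The quoted definition can be written out in a couple of lines. This sketch assumes a hypothetical coefficient and standard error. It also shows that rescaling the IV multiplies both the coefficient and its standard error by the same factor, so the Wald value does not inherit the unit dependence of the unstandardized beta.

```python
def wald_z(beta, se):
    """Wald statistic in z form: coefficient divided by its standard error."""
    return beta / se


# Hypothetical estimate for income measured in dollars.
beta, se = 0.00002, 0.000008
z_dollars = wald_z(beta, se)

# Same model with income in thousands: beta and se both scale by 1000.
z_thousands = wald_z(beta * 1000, se * 1000)

# Many packages report the squared form, which is compared to chi-square(1).
wald_chi2 = z_dollars ** 2

print(z_dollars, z_thousands, wald_chi2)
```

The Wald value still mixes effect size with precision, so a large sample or a well-measured IV can push it up without the effect being larger in any substantive sense.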

    "For this reason beta weights are not generated for logistic regression by any commercial software (they may in fact not even exist)."
    Hmm, I don't understand. Without beta weights, how could you even begin to interpret the results?

  19. #14
    noetsi

    Re: Logistic regression

    I was actually asking why you thought the unstandardized beta was unreliable, or why you thought there were standardized betas in logistic regression.

    The issue here may be the term beta weights. Some use that to refer to unstandardized slopes. Logistic regression does have those. What it does not have, and what beta weights (as opposed to betas) refer to, is standardized slopes as in OLS, because when the DV has only two levels, the standard deviation of the DV, which is central to standardized slopes, is essentially meaningless.

  20. The Following User Says Thank You to noetsi For This Useful Post:

    Donald (02-01-2013)

  21. #15
    Jake

    Re: Logistic regression


    Quote Originally Posted by Donald
    I could be wrong (and I hope so) but it seems to me that the Wald value is tied to the unstandardized beta, which is unreliable in interpreting the relative importance of each IV.
    What exactly do you mean by "tied to"? All you seem to have shown is that the Wald can be expressed in terms of the unstandardized beta (plus some other stuff). It's not clear how this is a problem as surely any statistic for relative importance of a variable will be in some way expressible in terms of the unstandardized beta plus some other stuff.

  22. The Following 2 Users Say Thank You to Jake For This Useful Post:

    Donald (02-01-2013), noetsi (02-01-2013)
