
Thread: Logistic regression

#46 · threestars · Re: Logistic regression

    You should take a look at those packages I described. They (1) do not require knowledge of calculus and (2) are very easy to explain. The typical result from those models is a predicted probability with a confidence interval.

    When I go to a client I say, here is the predicted probability of this event occurring for this type of individual. Or, if you were to do this activity, here is how that probability would change. I don't mention statistics or derivatives.
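    For instance, here is a minimal R sketch of that kind of client-facing output (the data frame and variable names are made up): build the interval on the link (log-odds) scale, then map it to the probability scale.

    Code:
        # Fit a logistic model; 'dat', 'outcome', 'age', 'treated' are hypothetical.
        fit <- glm(outcome ~ age + treated, data = dat, family = binomial)

        # Predicted probability with a 95% CI for one type of individual.
        new_person <- data.frame(age = 40, treated = 1)
        pred <- predict(fit, newdata = new_person, type = "link", se.fit = TRUE)
        ci <- pred$fit + c(-1.96, 1.96) * pred$se.fit
        cat(sprintf("Predicted probability: %.2f (95%% CI %.2f to %.2f)\n",
                    plogis(pred$fit), plogis(ci[1]), plogis(ci[2])))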

    Those ways of ranking coefficients are all problematic and rarely give you a good answer. I've seen people try to use those methods and it always leads to strange and/or unstable answers. It's an exercise that's probably not worth doing to be frank.

#47 · noetsi · Re: Logistic regression

    I was actually just referring to the article you cited.

    There is obviously strong disagreement about whether standardized coefficients are a good or bad idea (in linear regression as well). I have yet to see an empirical study (or even simulations) showing whether they do or do not work. I suspect, from what Jake has indicated, that this is one of those areas where people simply don't know for sure.

    Without that data it's impossible to judge. I will add your suggestion to the list I look at in making this assessment (my guess is that people use whatever is simplest to calculate, as with the standardized coefficients in SAS or Wald statistics). Thanks.
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

#48 · noetsi · Re: Logistic regression

    Some authors argue that logistic regression is more affected than linear regression (including when comparing odds ratios between models) by variables left out of the model. This is tied to the concept of unobserved heterogeneity. Does anyone have a good source on (or a comment about) that, and how it affects logistic regression models?
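    For what it's worth, the attenuation at issue is easy to see in a small simulation: even when the omitted predictor is independent of the included one (so there is no confounding in the usual sense), dropping it shrinks the logit coefficient toward zero. A minimal R sketch, with made-up numbers:

    Code:
        # Two independent predictors; omitting x2 attenuates the coefficient on x1.
        set.seed(1)
        n  <- 100000
        x1 <- rnorm(n)
        x2 <- rnorm(n)                                   # independent of x1
        y  <- rbinom(n, 1, plogis(1.0 * x1 + 1.5 * x2))  # true coef on x1 is 1.0

        coef(glm(y ~ x1 + x2, family = binomial))["x1"]  # close to 1.0
        coef(glm(y ~ x1,      family = binomial))["x1"]  # noticeably smaller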
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

#49 · noetsi · Re: Logistic regression

    Here is another good question (well, two). F&T argue that if there is evidence of good model fit, you really don't need to worry about outliers. I am not sure why that would be so. They also say that large parameters and SEs are signs of a variety of problems, such as MC. My question here (and I know the answer is relative rather than absolute) is: what does "large" really mean for either?
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

#50 · hlsmith · Re: Logistic regression

    MC = multicollinearity?

    I would think, guessing of course, that this could be translated over to moderately large odds ratios with wide confidence intervals.
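    As an aside, that translation is mechanical: exponentiate the coefficients and their confidence limits. A quick R sketch, assuming some fitted logistic model object 'fit':

    Code:
        # Odds ratios with Wald 95% CIs from a fitted glm object 'fit'
        # (hypothetical; see the glm() call sketched in post #46).
        round(exp(cbind(OR = coef(fit), confint.default(fit))), 2)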

    But of course everything in statistics seems to be arbitrary to some degree.
    Stop cowardice, ban guns!

#51 · threestars · Re: Logistic regression

    Large parameters in models for binary DVs (and others) are typically a sign of separation (a bad thing). I usually solve that problem by putting Cauchy priors (centered at zero with scale 2.5) on the coefficients.

    Models with big coefficients typically don't validate well.

    And outliers aren't necessarily a problem, so long as you make sure there isn't severe measurement error. If your model validates well, that's really all that matters.


#52 · hlsmith · Re: Logistic regression

    Can you elaborate on "putting Cauchy priors (centered at zero with scale 2.5)"? What does this consist of?
    Stop cowardice, ban guns!

#53 · noetsi · Re: Logistic regression

    I understand why complete or partial data separation would be serious (although that is the first I have heard of that approach to addressing it). But the sense I got from the B and F book is that these occur even without separation issues (something SAS normally warns you of in a different way).

    I guess my question still is: what is "big"?

    MC = multicollinearity. My shorthand.
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

#54 · threestars · Re: Logistic regression

    Given my understanding of MC, I don't understand why it would necessarily cause the coefficients in a logit/probit model to be overly large. If you have the right model, the coefficient estimates themselves shouldn't be biased by MC; rather, the standard errors will be inflated. So in that sense I could see the standard errors being bigger than usual, but I don't see why the coefficients would be larger (absent separation).

    As for what's too large: it depends partly on the range of the IV in question. If the range is 0-.2, then all bets are off. But assuming you have a continuous IV, anything larger than about 2-3 will draw my attention.

    Given the way the likelihood is maximized for a logit/probit model, the algorithm typically stops making the coefficient bigger only because the likelihood, while still increasing, is no longer increasing by enough. As I'm sure everyone knows, under separation the coefficient is really infinity.
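    That divergence is easy to reproduce with a toy, completely separated dataset (made-up numbers): glm() warns that fitted probabilities of 0 or 1 occurred, and both the estimate and its SE come out enormous because the MLE doesn't exist.

    Code:
        # Complete separation: y is 1 exactly when x > 0, so the MLE is infinite.
        x <- c(-3, -2, -1, 1, 2, 3)
        y <- c( 0,  0,  0, 1, 1, 1)
        fit_sep <- glm(y ~ x, family = binomial)
        summary(fit_sep)$coefficients  # huge coefficient and standard error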

    The way I place those priors is by estimating a Bayesian logit (which allows prior distributions on the model parameters). I use R for this, specifically the MCMClogit function in MCMCpack. There's a good article by Gelman and others about this approach, linked below.

    http://arxiv.org/pdf/0901.4011.pdf
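    To sketch what that looks like in code (the data and formula here are hypothetical; MCMCpack's documentation shows the same user-supplied-prior pattern):

    Code:
        # Option 1: MCMClogit() with a user-supplied Cauchy log-prior.
        library(MCMCpack)
        cauchy_logprior <- function(beta) sum(dcauchy(beta, 0, 2.5, log = TRUE))
        post <- MCMClogit(outcome ~ age + treated, data = dat,
                          user.prior.density = cauchy_logprior, logfun = TRUE)
        summary(post)

        # Option 2: bayesglm() in the 'arm' package, which implements the
        # Gelman et al. recommendation directly; prior.df = 1 makes the
        # t prior a Cauchy with center 0 and scale 2.5.
        library(arm)
        fit_b <- bayesglm(outcome ~ age + treated, data = dat,
                          family = binomial, prior.scale = 2.5, prior.df = 1)
        coef(fit_b)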

#55 · noetsi · Re: Logistic regression

    I was not trying to say that the coefficients would be large because of MC; only the SEs will be. I was simply pointing out that large coefficients or SEs can occur without data separation. They are separate issues (actually, I am only sure of this for the SEs).

    Sadly, I don't know Bayesian statistics, so that part is beyond me. Allison and others suggest a series of solutions to data separation (although they mainly work best with partial separation), but for myself I usually don't run models where it occurs, because it commonly reflects too little variation in the DV for the models to be worth running. That is where I have encountered this problem.
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

