Thread: How can this be true [linear versus logistic regression]

  1. #1
    Fortran must die
    Points: 50,358, Level: 100
    Level completed: 0%, Points required for next Level: 0
    noetsi's Avatar
    Posts
    6,472
    Thanks
    681
    Thanked 910 Times in 869 Posts

    This comes from a government study that was used to set something of great importance: funding and related factors for a major government program. It contradicts pretty much everything I have read in the last decade (and learned in class).

    "For simplicity and speed and because of the large number of models estimated, the models were estimated using linear probability models, even when the dependent variable was binary. Logit and probit estimation techniques are generally recommended for estimating equations with zero-one dependent variables. However, the authors of the methodology reported that using logit or probit made it more difficult to interpret the results and created some complexities in calculating adjustments."

    Odds ratios or slopes from logistic models are more difficult to interpret, but fitting linear models to a binary DV is simply wrong, or so I have always read.


    "For example, they stated that because logit and probit are non-linear models, the adjustment factor could not be calculated using sample means but rather required calculating probabilities for all observations using the full set of data."

    I don't understand what this means. As far as I can tell, they estimated slopes for the variables and then combined those slopes with other data on the X variables to estimate requirements for agencies to meet. That is, I think the first stage produced the slopes, and the second stage used those slopes with current data on the IVs to estimate what the goals of the agency [the DV] should be.
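One way to read the "sample means" remark is that a linear model lets you plug the average X into the equation and get the average prediction, while a logit model does not. The sketch below illustrates this with made-up coefficients and simulated X values (all numbers are assumptions for illustration, not values from the study).

```python
import math
import random

def logistic(z):
    """Inverse logit: maps a linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def mean(values):
    return sum(values) / len(values)

random.seed(0)
# Hypothetical covariate values and coefficients, chosen only for illustration.
x = [random.gauss(0.0, 2.0) for _ in range(10_000)]
a, b = -1.0, 0.8          # assumed logit intercept and slope
a_lin, b_lin = 0.3, 0.1   # assumed linear probability model intercept and slope

x_bar = mean(x)

# Linear model: the prediction at the mean of x equals the mean of the predictions.
lin_at_mean = a_lin + b_lin * x_bar
lin_mean_of_preds = mean([a_lin + b_lin * xi for xi in x])

# Logit model: the two differ because the logistic function is non-linear,
# so the average probability needs a prediction for every observation.
logit_at_mean = logistic(a + b * x_bar)
logit_mean_of_preds = mean([logistic(a + b * xi) for xi in x])

print(f"linear: at mean = {lin_at_mean:.4f}, mean of predictions = {lin_mean_of_preds:.4f}")
print(f"logit:  at mean = {logit_at_mean:.4f}, mean of predictions = {logit_mean_of_preds:.4f}")
```

If this reading is right, the "complexity" the authors mention is computational convenience: with a logit they would need the full data set, not just a table of means, to compute each adjustment.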

    "Further, the argument was made that econometricians had shown that the drawbacks of using linear probability models, compared with logit and probit techniques, were minimal."

    That is news to me. I have read the exact opposite.
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  3. #2
    noetsi

    They go on to say this (I think this involves estimating the slopes that are used in the second stage, although I am not certain).

    "In order to test the sensitivity of the estimates to this simplification, both techniques for entered employment and retention performance measures for the WIA Adult program were estimated. The coefficients estimates were found to be quite similar if not virtually identical in most cases."

    So why do we do logistic regression if, according to the US government, there is no difference between it and linear regression for a binary DV?
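A note on what "similar coefficients" can mean: raw logit slopes are on the log-odds scale, so they are not directly comparable to linear probability slopes. The usual comparison is between the linear slope and the logit's average marginal effect. The sketch below uses simulated data with assumed true values (not the WIA data) and a hand-written Newton-Raphson fit, so treat it as an illustration rather than a reproduction of the study.

```python
import math
import random

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

random.seed(1)
n = 20_000
true_a, true_b = 0.2, 0.7   # hypothetical data-generating values
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [1 if random.random() < logistic(true_a + true_b * xi) else 0 for xi in x]

# Linear probability model: OLS slope = cov(x, y) / var(x).
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
lpm_slope = sxy / sxx

# Logit fitted by Newton-Raphson on the log-likelihood (intercept a, slope b).
a, b = 0.0, 0.0
for _ in range(25):
    p = [logistic(a + b * xi) for xi in x]
    g0 = sum(yi - pi for yi, pi in zip(y, p))                # score for a
    g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))  # score for b
    w = [pi * (1.0 - pi) for pi in p]
    h00 = sum(w)
    h01 = sum(wi * xi for wi, xi in zip(w, x))
    h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
    det = h00 * h11 - h01 * h01
    a += (h11 * g0 - h01 * g1) / det
    b += (h00 * g1 - h01 * g0) / det

# Average marginal effect of x on P(y = 1) implied by the logit fit.
p = [logistic(a + b * xi) for xi in x]
ame = b * sum(pi * (1.0 - pi) for pi in p) / n

print(f"logit slope (log-odds scale): {b:.3f}")
print(f"logit average marginal effect: {ame:.3f}")
print(f"linear probability slope:      {lpm_slope:.3f}")
```

In a setup like this the linear slope and the average marginal effect land close together, while the raw logit slope is much larger. That is consistent with the study's claim, but it says nothing about individual predictions near 0 or 1.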

  4. #3
    TS Contributor
    Points: 17,899, Level: 85
    Level completed: 10%, Points required for next Level: 451
    CowboyBear's Avatar
    Location
    New Zealand
    Posts
    2,043
    Thanks
    118
    Thanked 422 Times in 324 Posts

    Try to think it through yourself instead of worrying about what authorities say. So to start:

    When you have a binary DV, which assumptions of the linear OLS model are breached?
    What properties of the OLS estimator are those assumptions required for?
    Matt aka CB | twitter.com/matthewmatix

  5. #4
    TS Contributor
    Points: 11,162, Level: 69
    Level completed: 78%, Points required for next Level: 88
    rogojel's Avatar
    Location
    I work in Europe, live in Hungary
    Posts
    1,402
    Thanks
    156
    Thanked 323 Times in 303 Posts

    hi,
    from a practical POV, isn't the argument that in the middle range (probabilities relatively far from 0 or 1) OLS will perform well, the problem being that it can predict senseless values at the extremes?
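A quick way to check both halves of that argument is to fit a straight line to simulated 0/1 data and compare its fitted values with the true probabilities across the range of x. The data-generating curve and the range of x below are assumptions picked so that the true probabilities run close to 0 and 1 at the edges.

```python
import math
import random

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

random.seed(2)
n = 5_000
# Hypothetical setup: x spread over [-4, 4], true P(y = 1 | x) = logistic(x).
x = [random.uniform(-4.0, 4.0) for _ in range(n)]
y = [1 if random.random() < logistic(xi) else 0 for xi in x]

# Linear probability model fitted by ordinary least squares.
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
intercept = y_bar - slope * x_bar

def lpm_fit(x_value):
    """Fitted 'probability' from the linear probability model."""
    return intercept + slope * x_value

for x_value in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"x = {x_value:+.1f}  true p = {logistic(x_value):.3f}  "
          f"linear fit = {lpm_fit(x_value):.3f}")
```

Near the centre of the data the line tracks the true probability; at the edges the fitted values leave the [0, 1] interval, which is the "senseless values" problem.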

    regards

  6. #5
    noetsi

    Originally posted by CowboyBear:
    "Try to think it through yourself instead of worrying about what authorities say. So to start:

    When you have a binary DV, which assumptions of the linear OLS model are breached?
    What properties of the OLS estimator are those assumptions required for?"

    I don't know all the violations, but two I remember. First the data will be always heteroscedastic. Second, nonsensical slopes can be found.

    Since I don't consider myself particularly good at statistics, what experts say matters to me. And more to the point, this is not just a theoretical matter. It involves the setting of goals that my agency, and most DOL and DOE organizations, will have to meet - or there will be major consequences. So if the metrics were set wrong, presumably by real statisticians, that is sort of important.

  7. #6
    Omega Contributor
    Points: 35,194, Level: 100
    Level completed: 0%, Points required for next Level: 0
    hlsmith's Avatar
    Location
    Not Ames, IA
    Posts
    6,675
    Thanks
    382
    Thanked 1,125 Times in 1,088 Posts

    They obviously trend in the same way. I would say deviating from logistic seems sketchy to me, in that you run the risk of model misspecification. They probably made that statement so everyone would be on the same "scale", so to speak, and to make it easy for those who are not familiar with logistic. It seems lazy, and if their staff can't run both, then maybe they aren't the right people. They just need to come up with boilerplate language on how to interpret both for the stats-illiterate people who use the results.


    I bet it revolves around the difficulties of conveying results to politicians and them using the results.
    Stop cowardice, ban guns!

  8. #7
    noetsi

    The analysis is highly complex; these are clearly expert econometricians.

    It appears that econometricians, or some of them anyhow, have decided that since results from logit and OLS [linear probability models when predicting binary variables] are often very similar, it's OK to use OLS. First, this apparently depends on the range of probabilities being estimated: the closer the results are to the extremes (0 or 1), the less well the linear probability model does. But in many cases you are not estimating extreme values, so that is not an issue. Second, they argue that the inherent heteroscedasticity can be dealt with using White standard errors [not sure that is true, but they believe it]. Finally, they argue that while linear probability models are sometimes wrong, so are logistic models [that is, wrong in predicting binary variables, though without nonsensical results - this may also have to do with misspecification].
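On the White SE point: a 0/1 outcome with probability p has variance p(1 - p), so the error variance necessarily changes with X. White (heteroscedasticity-robust) standard errors do not remove that; they leave the OLS coefficients untouched and only change the standard errors used for inference. The sketch below computes both the classical and the HC0 (White) standard error for the slope of a one-regressor model on simulated data; the data-generating values are assumptions.

```python
import math
import random

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def bernoulli_variance(p):
    """Variance of a 0/1 outcome with success probability p."""
    return p * (1.0 - p)

random.seed(3)
n = 5_000
# Hypothetical setup: true P(y = 1 | x) = logistic(x), x uniform on [-3, 3].
x = [random.uniform(-3.0, 3.0) for _ in range(n)]
y = [1 if random.random() < logistic(xi) else 0 for xi in x]

# OLS fit of the linear probability model.
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
intercept = y_bar - slope * x_bar
resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Classical slope SE assumes one common error variance.
se_classical = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)

# White (HC0) slope SE weights each squared residual by (x_i - x_bar)^2.
se_white = math.sqrt(
    sum(((xi - x_bar) ** 2) * e * e for xi, e in zip(x, resid))
) / sxx

print(f"variance of y at p = 0.5: {bernoulli_variance(0.5):.3f}")
print(f"variance of y at p = 0.9: {bernoulli_variance(0.9):.3f}")
print(f"slope = {slope:.4f}, classical SE = {se_classical:.5f}, White SE = {se_white:.5f}")
```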

  9. #8
    TS Contributor
    Points: 12,501, Level: 73
    Level completed: 13%, Points required for next Level: 349

    Posts
    951
    Thanks
    0
    Thanked 103 Times in 100 Posts

    Maybe this is an econometrics thing.

    Here's an article discussing the issues: http://statisticalhorizons.com/linear-vs-logistic
    All things are known because we want to believe in them.

  10. #9
    hlsmith

    noetsi,


    Do you have a link to the source of what you are referencing so we can better put it into context?

  11. #10
    noetsi

    It is a PDF that was sent to me, for which I have no link. This is the pertinent comment by the authors.

    "For simplicity and speed and because of the large number of models estimated, the models were estimated using linear probability models, even when the dependent variable was binary. Logit and probit estimation techniques are generally recommended for estimating equations with zero-one dependent variables. However, the authors of the methodology reported that using logit or probit made it more difficult to interpret the results and created some complexities in calculating adjustments. For example, they stated that because logit and probit are non-linear models, the adjustment factor could not be calculated using sample means but rather required calculating probabilities for all observations using the full set of data. Further, the argument was made that econometricians had shown that the drawbacks of using linear probability models, compared with logit and probit techniques, were minimal. In order to test the sensitivity of the estimates to this simplification, both techniques for entered employment and retention performance measures for the WIA Adult program were estimated. The coefficients estimates were found to be quite similar if not virtually identical in most cases."

    I do have a link to the econometrics book that, according to the authors, establishes that linear probability models are satisfactory equivalents to logistic regression.

    https://pdfs.semanticscholar.org/6bd...e5a0763289.pdf

    If you can use linear probability models for binary variables, why ever run logistic regression? Slopes in logistic regression are very difficult to interpret, you get no true R square, and many tests that exist for linear models [including diagnostics] do not exist for logistic regression.
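Part of why logistic slopes feel hard to interpret is that the same odds ratio implies a different change in probability depending on the starting probability. A small worked example, with an odds ratio of 2 and baseline probabilities chosen purely for illustration:

```python
def apply_odds_ratio(p0, odds_ratio):
    """Probability after multiplying the odds implied by p0 by odds_ratio."""
    odds0 = p0 / (1.0 - p0)
    odds1 = odds0 * odds_ratio
    return odds1 / (1.0 + odds1)

odds_ratio = 2.0  # illustrative value, not from the study
for p0 in (0.05, 0.50, 0.90):
    p1 = apply_odds_ratio(p0, odds_ratio)
    print(f"baseline p = {p0:.2f} -> new p = {p1:.3f} (change = {p1 - p0:+.3f})")
```

A linear probability slope, by contrast, claims the same change in probability at every baseline, which is easier to explain but is exactly the assumption that fails near 0 and 1.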
    Last edited by noetsi; 01-09-2017 at 02:28 PM.

  12. #11
    hlsmith

    Don't forget that binary outcomes can also be put on the risk scale and used for relative risks and risk differences. These allow you to calculate relative risk reduction, absolute risk reduction, number needed to harm, and number needed to treat (e.g., how many people you have to intervene on to get one more outcome of interest compared to the other group).
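For readers who have not met these measures, here is a minimal sketch of how they follow from a 2x2 table. The function name and the counts in the usage lines are hypothetical, picked only to make the arithmetic easy to follow.

```python
import math

def risk_measures(events_treated, n_treated, events_control, n_control):
    """Risk-scale summaries from a 2x2 table of counts."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    risk_difference = risk_treated - risk_control
    relative_risk = risk_treated / risk_control
    return {
        "risk_treated": risk_treated,
        "risk_control": risk_control,
        "risk_difference": risk_difference,
        "relative_risk": relative_risk,
        "relative_risk_reduction": 1.0 - relative_risk,
        # Number needed to treat if the risk went down,
        # number needed to harm if it went up.
        "number_needed": math.inf if risk_difference == 0 else 1.0 / abs(risk_difference),
    }

# Hypothetical counts: 20 of 100 events with the intervention, 30 of 100 without.
result = risk_measures(20, 100, 30, 100)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The risk difference here is the same quantity a linear probability model estimates for a single binary predictor, which is one reason the linear model is attractive when the goal is an additive adjustment.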

  13. #12
    noetsi

    The problem with that is that I have not found a way to calculate relative risk in SAS, and I tried really hard to do so several years ago. Do you know a way to generate relative risk in SAS?

  14. #13
    hlsmith

    Yes, if I remember I will send links tomorrow. It likely uses the GLM procedure.

  15. #14
    hlsmith

  17. #15
    CowboyBear

    Originally posted by noetsi:
    "I don't know all the violations, but two I remember. First the data will be always heteroscedastic. Second, nonsensical slopes can be found."

    What are the consequences of violating the assumption of homoscedasticity? What other assumptions are there? Is it true that odds ratios are always harder to interpret than linear slopes? How might the usefulness of logistic vs linear regression differ depending on whether the goal is explanation or prediction?

    "Since I don't consider myself particularly good at statistics, what experts say matters to me"

    Basically I'm trying to get you to think things through critically yourself - you're perfectly capable of this. Simply asking what the experts conclude works only when they're all in agreement (i.e., never!). But we can critically evaluate the arguments being put forward by experts and think about when they are and aren't valid. That critical, authority-questioning attitude is an essential part of a scientific mindset (regardless of where you're trying to do science).
