Thread: comparing linear vs logistic regression for a simple model with binary variables

  1. #1
    Let's suppose Y and X are two binary variables. If I fit a linear regression model Y = B0 + B1*X and compute the predicted values for Y, I get the same results as I would from a logistic regression. Is there an intuitive explanation for this?

  2. #2
    hlsmith

    Post your code or output. One model uses least squares and the other uses maximum likelihood. They are optimizing two different things, correct?

  3. #3
    Here is an example, but other data would give similar results:


    . tab Y X

               |           X
             Y |         0          1 |     Total
    -----------+----------------------+----------
             0 |       271         45 |       316
             1 |       142         42 |       184
    -----------+----------------------+----------
         Total |       413         87 |       500


    . reg Y X

          Source |       SS       df       MS              Number of obs =     500
    -------------+------------------------------           F(  1,   498) =    6.01
           Model |  1.38710662     1  1.38710662           Prob > F      =  0.0146
        Residual |  114.900893   498  .230724685           R-squared     =  0.0119
    -------------+------------------------------           Adj R-squared =  0.0099
           Total |     116.288   499  .233042084           Root MSE      =  .48034

    ------------------------------------------------------------------------------
               Y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               X |    .138933   .0566627     2.45   0.015     .0276055    .2502604
           _cons |   .3438257   .0236359    14.55   0.000     .2973873     .390264
    ------------------------------------------------------------------------------

    . predict p1
    (option xb assumed; fitted values)

    . logistic Y X

    Logistic regression                               Number of obs   =        500
                                                      LR chi2(1)      =       5.81
                                                      Prob > chi2     =     0.0159
    Log likelihood = -326.03428                       Pseudo R2       =     0.0088

    ------------------------------------------------------------------------------
               Y | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               X |   1.781221   .4243795     2.42   0.015      1.11665    2.841307
           _cons |   .5239852   .0542832    -6.24   0.000     .4276981    .6419494
    ------------------------------------------------------------------------------

    . predict p2
    (option pr assumed; Pr(Y))

    . sum p1 p2

        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
              p1 |       500        .368    .0527235   .3438257   .4827586
              p2 |       500        .368    .0527235   .3438257   .4827586

    . tab p1 p2

        Fitted |         Pr(Y)
        values |  .3438257   .4827586 |     Total
    -----------+----------------------+----------
      .3438257 |       413          0 |       413
      .4827586 |         0         87 |        87
    -----------+----------------------+----------
         Total |       413         87 |       500


    So p1 was created with the linear model and p2 with the logistic model, and p1 = p2 for every observation.
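For readers without Stata, the comparison above can be reproduced from the four cell counts alone. What follows is only a rough Python sketch, not the poster's code: the variable names, the closed-form OLS formulas, and the hand-written Newton-Raphson loop for the logistic fit are all assumptions made for illustration (Stata's own fitting routine is not shown in the thread).

```python
import math

# Cell counts copied from the `tab Y X` output, stored as counts[x][y].
counts = {0: {0: 271, 1: 142}, 1: {0: 45, 1: 42}}

# Expand the table into one (x, y) pair per observation (500 pairs).
data = [(x, y) for x in counts for y in counts[x] for _ in range(counts[x][y])]
n = len(data)

# Linear model Y = b0 + b1*X, fitted with the closed-form OLS formulas.
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
sxx = sum((x - mean_x) ** 2 for x, _ in data)
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

# Logistic model logit(p) = a0 + a1*X, fitted by Newton-Raphson
# (an illustrative solver, chosen here only to avoid external libraries).
a0, a1 = 0.0, 0.0
for _ in range(25):
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in data:
        p = 1.0 / (1.0 + math.exp(-(a0 + a1 * x)))
        w = p * (1.0 - p)
        g0 += y - p            # score for the intercept
        g1 += (y - p) * x      # score for the slope
        h00 += w               # entries of the information matrix
        h01 += w * x
        h11 += w * x * x
    det = h00 * h11 - h01 * h01
    a0 += (h11 * g0 - h01 * g1) / det
    a1 += (h00 * g1 - h01 * g0) / det


def linear_fit(x):
    """Fitted value from the linear model (Stata's p1)."""
    return b0 + b1 * x


def logistic_fit(x):
    """Fitted probability from the logistic model (Stata's p2)."""
    return 1.0 / (1.0 + math.exp(-(a0 + a1 * x)))


for x in (0, 1):
    proportion = counts[x][1] / (counts[x][0] + counts[x][1])
    print(x, round(proportion, 7), round(linear_fit(x), 7), round(logistic_fit(x), 7))
```

Each printed row shows the observed proportion of Y=1 at that value of X next to the two fitted values, so the three numbers can be compared directly with the `sum p1 p2` output.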

  4. #4
    noetsi

    There is actually a dispute about this. The historic argument was that fitting a linear model to a binary DV would generate incorrect results. Many economists, at least, have now come to disagree on that point. They argue that so-called linear probability models (LPMs), which use a linear estimator with a binary DV, are as accurate as logistic regression if you use robust standard errors. I have a serious problem with that based on my training, but it appears to be a common view, at least among economists. To some extent it depends on the specific model you are estimating: if most of the estimated probabilities are in the range of .3 to .7, I think the LPM will probably be OK.

    Remember that the slopes of a logistic regression are not conceptually the same thing as the slopes in a linear regression. Logistic regression deals with changes in the logit, not the raw level of the DV.
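To illustrate the difference in scale with the numbers posted in #3, here is a small Python sketch (the variable and function names are made up for this example). It takes the two odds ratios from the `logistic Y X` output, converts them to probabilities, and compares the resulting difference in probabilities with the linear slope of .138933.

```python
import math

# Values copied from the `logistic Y X` output in post #3.
odds_x0 = 0.5239852      # _cons: odds of Y=1 when X=0
odds_ratio = 1.781221    # X: odds at X=1 divided by odds at X=0


def odds_to_prob(odds):
    """Convert odds to a probability."""
    return odds / (1.0 + odds)


p_x0 = odds_to_prob(odds_x0)
p_x1 = odds_to_prob(odds_x0 * odds_ratio)

# Slope on the logit scale: the change in log-odds when X goes from 0 to 1.
log_odds_slope = math.log(odds_ratio)

# Slope on the raw probability scale: what the linear model reports for X.
risk_difference = p_x1 - p_x0

print("logit-scale slope:      ", round(log_odds_slope, 4))
print("probability-scale slope:", round(risk_difference, 6))
```

The two slopes are different numbers on different scales, yet with a single binary X they describe the same pair of fitted probabilities.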

  5. #5
    Dason


    And yeah, note that it really doesn't matter in your case. If both variables are binary, then whichever way you parameterize your model, you're basically saying "let's fit one value when x=0 and another value when x=1".
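One way to spell that out, as a sketch in notation that does not appear in the thread: write $\hat p_x$ for the fitted value at X = x and $\bar y_x$ for the observed proportion of Y = 1 among observations with X = x. Both estimators then have to satisfy the same two equations, and with a binary x those equations pin each fitted value to its group proportion.

```latex
\begin{align*}
\text{OLS normal equations:}\quad
  & \sum_i \bigl(y_i - \hat p_{x_i}\bigr) = 0, \qquad
    \sum_i x_i \bigl(y_i - \hat p_{x_i}\bigr) = 0,
  && \hat p_x = b_0 + b_1 x \\
\text{Logistic score equations:}\quad
  & \sum_i \bigl(y_i - \hat p_{x_i}\bigr) = 0, \qquad
    \sum_i x_i \bigl(y_i - \hat p_{x_i}\bigr) = 0,
  && \hat p_x = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}} \\
\text{Either way:}\quad
  & \hat p_1 = \bar y_1 = \tfrac{42}{87} \approx 0.4828, \qquad
    \hat p_0 = \bar y_0 = \tfrac{142}{413} \approx 0.3438
\end{align*}
```

The second equation only involves the X = 1 group, so it forces $\hat p_1 = \bar y_1$; subtracting it from the first leaves only the X = 0 group and forces $\hat p_0 = \bar y_0$. The two models differ only in how they label those two numbers (an intercept and a difference, versus a log-odds and a log odds ratio).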
