
Thread: Multicollinearity problem in binary logistic regression

  1. #1
    kranas

    Multicollinearity problem in binary logistic regression




    I'd like to ask for some help with a binary logistic regression. In SPSS I am trying to build a binary logistic regression model with 4 continuous independent variables (sample size: 85).

    I have a dichotomous dependent variable (a clinical form of multiple sclerosis) and quite a few independent variables that individually are quite good predictors of it (I have 10 variables with AUC > 0.8 individually, and each of them is significant if I build a binary logistic regression with that one variable alone).

    I want to build a regression model with the 4 variables that display the best AUC values individually. I would like it to include all 4 variables, because a model with more variables has a higher AUC and should be better at predicting the outcome (clinical form of MS). However, if I add all 4 variables into the equation, most of their p values and confidence intervals indicate that they are not significant. I am pretty sure this is a multicollinearity issue, as the values change considerably if I remove one or more of the variables (even though the VIF values are not more than ~3). The largest model in which all of the included variables remain significant contains two variables.
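    For clarity, this is the kind of VIF check I mean, sketched in Python rather than SPSS (the data frame df and the predictor names are just placeholders, not my actual variables):

    Code:
    # Minimal sketch of a VIF check for the four predictors.
    # df is assumed to hold the 85 cases; "predictor1"..."predictor4" are placeholders.
    import pandas as pd
    from statsmodels.stats.outliers_influence import variance_inflation_factor
    from statsmodels.tools.tools import add_constant

    predictors = ["predictor1", "predictor2", "predictor3", "predictor4"]
    X = add_constant(df[predictors])  # include an intercept column before computing VIFs

    vifs = pd.Series(
        [variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
        index=X.columns,
    )
    print(vifs)  # in my data these all come out around 3 or lower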

    So my question is: is it possible to build a model with more than two independent variables in this case and somehow overcome the multicollinearity issue, or should I stick to only two?

    Thanks in advance!

  2. #2
    gianmarco
    TS Contributor
    Location: Italy

    Re: Multicollinearity problem in binary logistic regression

    Quote Originally Posted by kranas
    I am pretty sure this is a multicollinearity issue, as the values change considerably if I remove one or more of the variables (even though the VIF values are not more than ~3)
    Did you actually check for multicollinearity (i.e., calculate the pairwise correlation coefficients among your predictors)? From what you wrote, it seems that you are just guessing at the presence of highly correlated predictors.
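    Something along these lines would do (a quick Python sketch, since I cannot post SPSS steps here; swap in your own data frame and predictor names):

    Code:
    # Pairwise Pearson correlations among the four predictors.
    # df and the column names are placeholders for your own data.
    import pandas as pd

    predictors = df[["predictor1", "predictor2", "predictor3", "predictor4"]]
    print(predictors.corr(method="pearson"))  # or method="spearman" if the variables are skewed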
    http://cainarchaeology.weebly.com/

  3. #3
    kranas

    Re: Multicollinearity problem in binary logistic regression


    I'm guessing it is because of multicollinearity, because individually (or with at most two variables) the variables are significant in the equation, but after adding all 4 variables into one binary logistic regression in SPSS the p values and confidence intervals increase considerably.
    Here are the regression results with all 4 variables that display the biggest AUC values individually, and the correlation matrix:

    Could this be because of some other issue instead of multicollinearity?
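
    For reference, the model I'm describing looks roughly like this (sketched with Python/statsmodels rather than SPSS; df, the predictor names and the outcome name are placeholders):

    Code:
    # Binary logistic regression with the four predictors.
    # "clinical_form" stands for the dichotomous outcome coded 0/1; names are placeholders.
    import statsmodels.api as sm

    predictors = ["predictor1", "predictor2", "predictor3", "predictor4"]
    X = sm.add_constant(df[predictors])
    y = df["clinical_form"]

    fit = sm.Logit(y, X).fit()
    print(fit.summary())          # coefficients and Wald p-values
    print(fit.conf_int())         # 95% confidence intervals (log-odds scale)
    print(df[predictors].corr())  # the correlation matrix mentioned above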
