# Thread: Logistic Regression - How to interpret an Exp(b) = 0

1. ## Re: Logistic Regression - How to interpret an Exp(b) = 0

Originally Posted by victorxstc
@ jessireebob: The site I provided above (like many other sites) introduces many methods to reduce multicollinearity. I think a combination of some of them would be useful in your case, although the most useful ones seem to be those I already mentioned, plus one method in which you create a third composite variable by combining your correlated variables.
Thanks very much!
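For what it's worth, the composite-variable idea quoted above can be sketched like this: collapse two highly correlated predictors into their first principal component and use that single score in the model instead. This is only a toy illustration with simulated data; the variable names are hypothetical, not from Jessie's dataset.

```python
# Sketch: replacing two nearly collinear predictors with one composite
# (their first principal component). Simulated data, hypothetical names.
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.1, size=200)   # nearly collinear with x1

# Standardize, then take the first principal component as the composite.
X = np.column_stack([x1, x2])
X = (X - X.mean(axis=0)) / X.std(axis=0)
_, _, vt = np.linalg.svd(X, full_matrices=False)
composite = X @ vt[0]                        # one score per observation

# The composite tracks both original variables closely, so it can stand
# in for the correlated pair in the regression.
print(np.corrcoef(composite, x1)[0, 1])
print(np.corrcoef(composite, x2)[0, 1])
```

The sign of a principal component is arbitrary, so the composite may be negatively correlated with the originals; only the magnitude matters.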

2. ## Re: Logistic Regression - How to interpret an Exp(b) = 0

Originally Posted by victorxstc
I agree, but a combination of very large betas together with extremely large SEs is more likely to be caused by multicollinearity. Once it is eliminated, the same dataset would show reasonable betas and reasonable SEs (as far as the test power allows).

@ jessireebob: try removing the highly correlated variables and see whether the betas are still that high in the new model. Since almost all of your variables have this problem, you might also test each of them in a single-IV model and check its new beta. If the betas improve considerably in the single-IV models, multicollinearity is the more likely cause; if they remain as high as in the original model, overdispersion is. You can also test each of your independent variables against your dependent one with a bivariate correlation test and check the correlation coefficients (not as an official reference, but as a helpful guide). Note, though, that both problems can occur together, since variables almost always have some level of correlation, and datasets some level of dispersion.

Thanks noetsi. I agree that large SEs are indicative of overdispersion, but are you sure large coefficients point to overdispersion as well? In a multivariate model, though, that can make sense too.

Data separation is new to me, and a useful thing to learn. But I don't think that, in one dataset, 4 out of 5 variables in the model would be coded like the one Dason mentioned. It is possible for a few variables, but when many variables show this pattern, is it still the likely cause?

I have another question on data separation: does it lead to broad 95% CIs as well? It is understandable that it can result in extreme ORs, but can it also cause strangely broad CIs?
This issue is unclear to me since I am new to overdispersion. However, one of the methods that addresses overdispersion (by Williams) adjusts the coefficients as well as the SEs, so I assume so.

Data separation leads to either 1) no ML estimates at all (or even a failure to converge) or 2) results that are meaningless. Given data separation, it is strongly recommended not to use any estimates that are generated (SPSS won't generate results when this is detected; SAS gives you a warning when they can be generated, which sometimes they cannot be). The estimates may be infinite or not unique.

The point is, what the CIs are is beside the point when you have data separation, because the point estimates either don't exist or should not be used.
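As a toy illustration of why separation produces no usable estimates: with a perfectly separating predictor, the log-likelihood keeps improving as the slope grows, so an iterative fit never settles on a finite value. This is a sketch with made-up data, not anyone's actual analysis.

```python
# Sketch: perfect separation means the ML estimate diverges. The outcome
# is 1 exactly when x > 0, so any increase in the slope improves the
# likelihood and gradient ascent never converges. Toy data only.
import numpy as np

x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([0, 0, 0, 1, 1, 1])          # separated: y == 1 iff x > 0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_beta(n_steps, lr=0.5):
    """Gradient ascent on the logistic log-likelihood (no intercept)."""
    beta = 0.0
    for _ in range(n_steps):
        beta += lr * (x @ (y - sigmoid(beta * x)))  # log-likelihood gradient
    return beta

# The slope keeps growing as we allow more iterations, instead of
# stabilizing at a finite maximum-likelihood estimate.
for n in (100, 1000, 10000):
    print(f"after {n:5d} steps: beta = {fit_beta(n):.2f}")
```

This is the behavior behind the warnings mentioned above: the software either stops at some huge slope (with a huge SE) or refuses to report estimates at all.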

3. ## The Following User Says Thank You to noetsi For This Useful Post:

victorxstc (05-03-2013)

4. ## Re: Logistic Regression - How to interpret an Exp(b) = 0

Originally Posted by noetsi
This issue is unclear to me since I am new to overdispersion. However, one of the methods that addresses overdispersion (by Williams) adjusts the coefficients as well as the SEs, so I assume so.

Data separation leads to either 1) no ML estimates at all (or even a failure to converge) or 2) results that are meaningless. Given data separation, it is strongly recommended not to use any estimates that are generated (SPSS won't generate results when this is detected; SAS gives you a warning when they can be generated, which sometimes they cannot be). The estimates may be infinite or not unique.

The point is, what the CIs are is beside the point when you have data separation, because the point estimates either don't exist or should not be used.
Thanks noetsi, especially for mentioning that overdispersion can change the slopes (although I find that counterintuitive, because pure overdispersion should change only the SEs, in my opinion). Perhaps in a multivariate design a slope is shifted by the other slopes when its SE changes, and perhaps that is why overdispersion indirectly leads to slope changes. But I don't think this is the case in Jessie's project.

But I was trying to say that the artifact Jessie is experiencing is not due to data separation (because the probability that many variables would all show this pattern is very low). Besides, I tried to say that if data separation were the culprit, the 95% CIs would perhaps not be this broad.

P.S. I hope you have seen the website describing that slopes can change considerably in a multicollinearity-affected model (my post above, in reply to hlsmith). I hope it clarifies some of the issues.
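For readers following the multicollinearity side of this thread, the diagnostic being discussed can be sketched with a variance inflation factor (VIF) check: regress each predictor on the others and compute 1/(1 - R^2). VIFs well above ~10 are a common, rough flag. Simulated data and hypothetical names only; the cutoff is a rule of thumb, not a hard threshold.

```python
# Sketch: VIF for each predictor, computed from scratch with numpy.
# x2 is built to be nearly collinear with x1; x3 is independent.
import numpy as np

rng = np.random.default_rng(1)
n = 300
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.2, size=n)     # strongly correlated with x1
x3 = rng.normal(size=n)                     # independent predictor
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """VIF for column j: 1 / (1 - R^2) from regressing X[:, j] on the rest."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])   # add intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - resid.var() / y.var()
    return 1.0 / (1.0 - r2)

for j in range(X.shape[1]):
    print(f"VIF x{j + 1}: {vif(X, j):.1f}")
```

The collinear pair (x1, x2) gets large VIFs while the independent x3 stays near 1, which is the pattern a check like this would surface in a dataset such as Jessie's.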