- Thread starter Steve_t
- Tags logistic regression

Checking for multicollinearity among the independent variables is necessary in linear regression, since multicollinearity inflates the standard errors, which in turn affects the t-statistics and p-values. But in logistic regression, p-values are based on chi-square statistics, so multicollinearity has no effect on the p-values (this is at least my understanding, am I wrong?).

If so: Question 1 - Why check for multicollinearity in logistic regression?

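It is usually considered one of the things you are supposed to do in logistic regression. Multicollinearity inflates the standard errors of the coefficient estimates there as well, so the Wald chi-square tests and their p-values are affected.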

Next, Question 2 - If checking for multicollinearity is necessary, then should the check be run for continuous as well as for dummy variables, or instead should multicollinearity be checked for continuous variables only?

For continuous and for ordinal categorical variables you would check for multicollinearity (MC).

Question 3 - If the check is to be run for dummies as well, then is it OK to calculate such association coefficients as Pearson's Phi, Tschuprow's T and Cramer's V (as the dummies in question are nominal)?

Check for MC by fitting a linear model and running the VIF and Tolerance statistics. Ordinal categorical variables can be coded as 1, 2, 3, etc. You can use the linear model because you don't care about the outcome, just the MC tests, which depend only on the predictors. How you do this depends on what software you are using.
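To illustrate the VIF and Tolerance check described above, here is a minimal sketch in Python with NumPy. It assumes made-up data: the predictor names `x1`, `x2`, `ordinal` and the helper `vif_and_tolerance` are hypothetical, not from any particular package. Each predictor is regressed on the others with ordinary least squares, and the response variable never enters the calculation.

```python
import numpy as np

def vif_and_tolerance(X):
    """Return (VIF, Tolerance) for each column of the predictor matrix X.

    Each predictor is regressed on all the others (plus an intercept).
    VIF_j = 1 / (1 - R_j^2) = SS_total / SS_residual, Tolerance_j = 1 / VIF_j.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vif = np.empty(p)
    for j in range(p):
        target = X[:, j]
        # Design matrix: intercept plus every predictor except column j.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        ss_res = resid @ resid
        ss_tot = np.sum((target - target.mean()) ** 2)
        vif[j] = ss_tot / ss_res
    return vif, 1.0 / vif

# Hypothetical data: x2 is nearly a copy of x1, ordinal is unrelated to both.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)      # strongly collinear with x1
ordinal = rng.integers(1, 4, size=n)    # ordinal variable coded 1, 2, 3

X = np.column_stack([x1, x2, ordinal])
vif, tol = vif_and_tolerance(X)
for name, v, t in zip(["x1", "x2", "ordinal"], vif, tol):
    print(f"{name}: VIF = {v:.2f}, Tolerance = {t:.3f}")
```

In this sketch `x1` and `x2` should show large VIFs (low Tolerance), while the unrelated ordinal predictor should stay close to 1.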

And finally, Question 4 - If calculating association coefficients is OK, can it be considered that there is no serious multicollinearity risk insofar as the coefficient does not exceed 60%? Or is there any alternative rule of thumb on this?

I don't understand this question.

I also think that for nominal variables you would check for collinearity with the other variables. Collinearity among the dummies of the same variable is not necessarily a problem, although you would expect those dummies to be highly related.
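For the association coefficients named in Question 3, the following is a minimal sketch of how they could be computed from a contingency table of two nominal variables. It assumes the uncorrected Pearson chi-square statistic (no continuity correction). The function name `association_coefficients` and the example table are hypothetical.

```python
import numpy as np

def association_coefficients(table):
    """Phi, Cramér's V and Tschuprow's T from an r x c contingency table."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    # Expected counts under independence: (row total * column total) / n.
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    chi2 = np.sum((table - expected) ** 2 / expected)
    r, c = table.shape
    return {
        "phi": np.sqrt(chi2 / n),
        "cramers_v": np.sqrt(chi2 / (n * min(r - 1, c - 1))),
        "tschuprow_t": np.sqrt(chi2 / (n * np.sqrt((r - 1) * (c - 1)))),
    }

# Hypothetical 2 x 2 cross-tabulation of two dummy variables.
table = [[30, 10],
         [10, 30]]
coefs = association_coefficients(table)
print(coefs)
```

For a 2 x 2 table the three coefficients coincide. They only differ once one of the variables has more than two categories.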

Basically, when you have multicollinearity you cannot separate the unique impact of two predictors on the DV. I think you might be confusing linearity with multicollinearity. Logistic regression does not assume a linear relationship between the raw levels of the predictors and the response variable (linearity is assumed on the logit scale). It does assume that multicollinearity is not occurring. A dummy-coded categorical variable has only two levels, so its relationship with the response is trivially linear. Such variables might or might not show multicollinearity.