Hi everyone,

I'm currently writing my thesis on the effects of divorce on children. It's a complex design, so I expect I'll be posting more questions here.

My dependent variable is whether or not the respondent has ever been in a relationship. My independent variables are two dichotomous variables: gender and parental marital status (0 = together, 1 = divorced). I found that the marital status of the parents has an effect, while the gender of the respondent does not. Now I want to check whether the interaction between gender and parental marital status has an effect: Y = b1*MaritalStatus + b2*Gender + b3*(MaritalStatus x Gender)
Both gender and marital status correlate highly with the interaction term. I addressed this by first recoding the variables (0 to -0.5 and 1 to 0.5), then centering them, and building the interaction term from the two centered variables. The correlations are no longer high, so I guess that solved it. However, some people say it's not okay to center categorical data, so is my solution wrong? Is there a better solution?
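To make the effect of the recoding concrete, here is a minimal sketch in plain Python. The cell counts are hypothetical (they are not the thesis data), and `pearson` and `center` are helper names made up for this illustration. Note that recoding 0/1 to -0.5/0.5 and then centering gives the same values as simply subtracting the mean from the 0/1 dummy.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def center(values):
    """Subtract the sample mean from every value."""
    m = sum(values) / len(values)
    return [v - m for v in values]

# Hypothetical cell counts, NOT the thesis data: (gender, divorced) -> n
counts = {(0, 0): 60, (0, 1): 40, (1, 0): 55, (1, 1): 45}
gender, divorced = [], []
for (g, d), n in counts.items():
    gender += [g] * n
    divorced += [d] * n

# Product term built from the raw 0/1 dummies
raw_inter = [g * d for g, d in zip(gender, divorced)]
r_raw_g = pearson(gender, raw_inter)
r_raw_d = pearson(divorced, raw_inter)

# Product term built from the mean-centered dummies
g_c, d_c = center(gender), center(divorced)
cen_inter = [g * d for g, d in zip(g_c, d_c)]
r_cen_g = pearson(g_c, cen_inter)
r_cen_d = pearson(d_c, cen_inter)

print("raw dummies:      ", round(r_raw_g, 3), round(r_raw_d, 3))
print("centered dummies: ", round(r_cen_g, 3), round(r_cen_d, 3))
```

With these made-up counts, the correlations between each predictor and the product term shrink considerably after centering. Centering only re-expresses the same model: the interaction coefficient b3 and its test stay the same, while b1 and b2 change meaning (they become effects at the mean of the other variable rather than at its zero category).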

Second question, still with the same dependent variable. I want to see whether the age at the time of parental divorce (continuous) interacts with gender. I ran into the multicollinearity issue again, so I centered the age variable. This reduces the correlation between gender and the interaction term, but not the correlation between age and the interaction term. How can this be? What did I do wrong?
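One possible explanation for the pattern described above, sketched under assumptions: if gender stays coded 0/1, the product term equals centered age for one gender and zero for the other, so it remains strongly tied to age no matter how age is centered. The data below are hypothetical (ages 1 to 18, identical in both groups), and `pearson` and `center` are helper names made up for this illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def center(values):
    """Subtract the sample mean from every value."""
    m = sum(values) / len(values)
    return [v - m for v in values]

# Hypothetical data, NOT the thesis data: ages 1..18 in each gender group
age = list(range(1, 19)) * 2
gender = [0] * 18 + [1] * 18

def product(a, b):
    return [x * y for x, y in zip(a, b)]

# 1) Nothing centered: gender (0/1) times raw age
r_gender_uncentered = pearson(gender, product(age, gender))

# 2) Only age centered, gender still 0/1
age_c = center(age)
inter_age_only = product(age_c, gender)
r_gender_centered = pearson(gender, inter_age_only)
r_age_centered = pearson(age_c, inter_age_only)

# 3) Both centered: gender becomes -0.5 / 0.5 in this balanced example
gender_c = center(gender)
inter_both = product(age_c, gender_c)
r_age_both = pearson(age_c, inter_both)

print("gender vs. interaction, nothing centered:", round(r_gender_uncentered, 3))
print("gender vs. interaction, age centered:    ", round(r_gender_centered, 3))
print("age vs. interaction, age centered:       ", round(r_age_centered, 3))
print("age vs. interaction, both centered:      ", round(r_age_both, 3))
```

In this sketch, centering age removes the correlation between gender and the product term, but the correlation between age and the product term only drops once gender is centered as well. Real data with unbalanced groups would show smaller, not necessarily zero, correlations.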

Both gender and marital status correlate highly with the interaction term. I addressed this by first recoding the variables (0 to -0.5 and 1 to 0.5), then centering them, and building the interaction term from the two centered variables.
You cannot center binary variables.

You also did not say what exactly you mean by "correlate highly", or
why you consider this a problem for your data analysis. By the way,
it is always necessary to mention the sample size.

With kind regards

K.

I have 844 respondents.
When I run a logistic regression with the variables Gender, Marital status and Gender x Marital status, I get the following correlation matrix:

Gender and Marital status: .519
Gender and Gender x Marital status: -.911
Marital status and Gender x Marital status: -.591

Because the interaction term is built from the two variables, it naturally correlates highly with them. As I understand it, this is a problem because it makes the coefficient estimates unstable. So how can I deal with the collinearity and still test my interaction?
