# Interaction Terms: Logistic Regression

#### rclukey

##### New Member
Dear forum,
trying again...

I would really like some advice on the proper construction and interpretation of a small experiment and analysis using Logistic Regression. I had a sample of 50 people evaluate three different products (X1) and generate a score (X2). Then they indicated how likely they would be to purchase the product (DV - binary). Now, I want to model the relationship between X1, X2 and the DV. I'm running into difficulty setting up the model - I've read lots of forum posts about the use of interaction terms, but I cannot find a clear enough answer on the topic. Take the following model:

DV = Const. + X1 + X1*X2

X1 = product (categorical)
X2 = Centered Product score (continuous)

[assume all effects are significant at p < .001]

What's the problem with this model, if any? I've read some discussions saying this is fine, and others saying you must have the main effect of X2 present in the model. Given inclusion of X2 as a main effect, what is the interpretation, especially in the presence of the interaction term? ... my brain is stuck.
If anyone can help with a very clear explanation of what's happening in the model, I would really appreciate it.

Thank you kindly,
Ryan

#### noetsi

##### No cake for spunky
I am not sure what X2 actually is. A score of what? If X1 and X2 are separate variables and you are generating an interaction between them, then leaving X2 out of your model while specifying the X1*X2 interaction violates one of the "rules" of regression: all main effects of an interaction have to remain in the model. I am also unclear why you removed the intercept. I assume it was because you centered the values in X2, but that is not clear to me from your notation.

So the model should be DV = B0 + B1X1 + B2X2 + B3X1X2 plus an error term.

I would be interested in seeing any comments by a statistician that say you can exclude a main effect - every such comment I have seen says the opposite.

Generally you should not interpret main effects when you have a significant interaction - or if you do, it should be in the form of simple effects, that is, the impact of X1 on the DV at a specific level of X2. Some would argue you can interpret X1 when the interaction of X1 and X2 is ordinal (so the relative importance of the levels of X1 does not change across levels of X2, only the distance between those levels) but not when it is disordinal (so that one level of X1 would have more impact on the DV than a second level of X1 at one level of X2, but this would flip when you conducted the analysis at another level of X2). But not all would agree with this.
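A minimal sketch of the full specification being discussed, in Python with statsmodels and simulated data (the column names `product`, `appeal`, and `purchase`, and all numbers, are illustrative assumptions, not the original poster's data):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "product": np.repeat(["A", "B", "C"], 50),   # X1: categorical, 3 products
    "appeal": rng.normal(0.0, 1.0, 150),         # X2: centered continuous score
})
# Simulated binary purchase intent (DV), for illustration only
linpred = -0.2 + 0.8 * df["appeal"] + 0.5 * (df["product"] == "B")
df["purchase"] = (rng.random(150) < 1.0 / (1.0 + np.exp(-linpred))).astype(int)

# "product * appeal" expands to product + appeal + product:appeal,
# so both main effects stay in the model alongside the interaction.
model = smf.logit("purchase ~ product * appeal", data=df).fit(disp=False)
print(model.params)
```

Note the formula shorthand: `product * appeal` automatically includes both main effects, which is exactly the "all main effects of an interaction stay in the model" rule above.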

#### rclukey

##### New Member
Thank you for your response. Let me see if I can clarify:

1. my notation is probably wrong; the intercept has not been removed from the model.
2. X1 is a categorical variable consisting of 3 different products (i.e. let's say three different cell phones to make it easy), X2 is a score of "appeal" derived by respondents rating the product according to a battery of 25 questions; so X2 is a mean computed across 25 rating questions.
So X2 (appeal score) is intrinsically related to the product (X1).

does that help?

I did not know that bit about not interpreting main effects when there is an interaction. That's good to know.

#### noetsi

##### No cake for spunky
What is the advantage in your theory of having X1? That is, what does it add to the predictive model given what you already learn from X2? I can't see any real advantage to rating a product and also having that product as a separate variable. Ignoring the greatly increased complexity of the interaction, multicollinearity would seem to be a major issue. That will play havoc with any Wald chi-square test you do to determine the significance of X1 and X2.

You may of course have a strong reason to want it there, but if you don't, having two strongly related IVs in the model is generally not a good thing.

#### rclukey

##### New Member
My assumption was that because my data is sampled across three different products, in order to say something about which product is better or worse (in terms of the DV), I would need to include it in the model as a main effect. I also assumed that the product appeal score varies across these products, and the goal is to model this differential effect. This is similar to what I read in this paper:
http://www.princeton.edu/~slynch/soc504/expanding_ols.pdf

here's the quote I'm referring to:
"The simplest interactions involve the interaction between a dummy variable and a continuous
variable. In the education, gender, and depressive symptoms example, we may expect that
education's effect varies across gender, and we may wish to model this differential effect. A
model which does so might look like:
Ŷ = b0 + b1Education + b2Male + b3(Education × Male)"

"education" is a continuous variable measured in years.

So, I drew the parallel that education is a kind of score for an individual, and male/female is categorical like different products. So, if education varies across gender, why wouldn't a product score vary across products? Perhaps I've got some incorrect assumptions going on here.

Another way, which I just thought about, would be to remove the categorical variable from the model and just evaluate the main effect of "appeal" and generate the probabilities. Then, compare the mean probability between each of the products included in the sample. I'm not sure what's commonly done, so I guess I'm also looking for some coaching on how to do this kind of analysis...

thank you kindly for your time.

#### noetsi

##### No cake for spunky
In that example education is not necessarily strongly multicollinear with gender. In your example your variables are almost certainly strongly related, which is a very different reality. The issue is not how the independent variable is measured (categorical versus interval); it is whether they are strongly related to each other.

A product score may well vary across products; the question is whether that helps you predict the dependent variable given that both are in the model. Are both statistically significant in your model? (Unfortunately, high multicollinearity may mean one is not statistically significant when it should be. That is a basic reason you don't want items strongly related to each other and to the DV in your model in the first place.) The questions to ask are 1) does this add predictive ability and 2) do I really need to know it. If either answer is no, it is not usually a good idea to have it in your model.

> Another way, which I just thought about, would be to remove the categorical variable from the model and just evaluate the main effect of "appeal" and generate the probabilities. Then, compare the mean probability between each of the products included in the sample. I'm not sure what's commonly done, so I guess I'm also looking for some coaching on how to do this kind of analysis...
I don't know enough to comment on that, although I would suggest running the model with X1 and X2, and then with just X2 by itself, and seeing which has the lower AIC value. The one with the lower AIC is the better model. This is in terms of estimating the model, of course; it may well be that having both variables in the model adds something you need to know, period. Of course, if there is a significant interaction, understanding what X1 and X2 are telling you is not simple.
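The AIC comparison above can be sketched with statsmodels on simulated data (again, the variable names and numbers are illustrative assumptions):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "product": np.repeat(["A", "B", "C"], 50),
    "appeal": rng.normal(0.0, 1.0, 150),
})
# Simulated DV driven by appeal alone, for illustration
df["purchase"] = (rng.random(150) < 1.0 / (1.0 + np.exp(-df["appeal"]))).astype(int)

# Candidate model with both X1 (product) and X2 (appeal)...
full = smf.logit("purchase ~ product + appeal", data=df).fit(disp=False)
# ...versus the model with X2 alone
reduced = smf.logit("purchase ~ appeal", data=df).fit(disp=False)

print("AIC, product + appeal:", round(full.aic, 1))
print("AIC, appeal only:     ", round(reduced.aic, 1))
# The model with the lower AIC wins the fit-versus-complexity trade-off.
```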

I would read more on interaction effects, especially what simple effects are. If there is a significant interaction, this is the best way to analyze X1 and X2. You should also run your model with X1, X2, and the interaction effect against your dependent variable and do a multicollinearity test (VIF or tolerance). In most statistical software this will have to be done in linear regression. You can just ignore all the other linear results, which will be nonsensical with a dummy DV, and look at what the multicollinearity diagnostics tell you.
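One way to run that VIF check in Python: build the linear-model design matrix (product dummies, appeal, and the interaction) and compute a VIF per column. This is a sketch on simulated data where appeal is deliberately made to track product closely, mimicking the situation in this thread; the names and numbers are assumptions.

```python
import numpy as np
import pandas as pd
from patsy import dmatrix
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(2)
df = pd.DataFrame({"product": np.repeat(["A", "B", "C"], 50)})
# Appeal strongly determined by product (plus a little noise),
# to reproduce the collinearity problem being discussed.
df["appeal"] = df["product"].map({"A": -1.0, "B": 0.0, "C": 1.0}) \
    + rng.normal(0.0, 0.3, 150)

# Design matrix with dummies, the continuous score, and the interaction.
# VIF depends only on the predictors, so the binary DV never enters.
X = dmatrix("product * appeal", data=df, return_type="dataframe")
vifs = pd.Series(
    [variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
    index=X.columns,
)
print(vifs)  # values well above ~10 are usually read as problematic
```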

#### rclukey

##### New Member
Thank you for the explanation. I did the check of MC using LR, and guess what... big time, VIF > 32. So, that explains it. I think it finally dawned on me where I went wrong, and thank you for your time and explanation. In fact, it is like you said at the beginning... there's no need for the categorical variable because the measurements are taken on the categories anyway... so it's redundant to include it in the model. I can just use the single "appeal" variable and estimate probabilities like a normal LR. Done. Thank you very much!
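The simplified analysis described above might look like this in statsmodels: fit the logistic regression on appeal alone, then compare the mean predicted probability across products. Simulated data again; names and numbers are illustrative assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
df = pd.DataFrame({"product": np.repeat(["A", "B", "C"], 50)})
# Appeal differs by product, as in the experiment described in the thread
df["appeal"] = df["product"].map({"A": -0.5, "B": 0.0, "C": 0.5}) \
    + rng.normal(0.0, 1.0, 150)
df["purchase"] = (rng.random(150) < 1.0 / (1.0 + np.exp(-df["appeal"]))).astype(int)

# Single-predictor logistic regression, then predicted probabilities
fit = smf.logit("purchase ~ appeal", data=df).fit(disp=False)
df["p_hat"] = fit.predict(df)

# Compare products via their mean predicted purchase probability
print(df.groupby("product")["p_hat"].mean())
```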

#### noetsi

##### No cake for spunky
You're welcome.

Note that I did not mean to say you went wrong substantively. I don't know the area you are analyzing, so I have no idea. But in terms of methods, it makes it very difficult to determine the statistical significance of a specific variable, and variables that are highly multicollinear and which you really don't need are good candidates for axing...

32 is pretty high.