Chi square as sneak peek into logistic regression


New Member
Let's say I have these variables...

Dependent var: cancelled_flag
Independent vars: region, gender, etc

My ultimate goal is to do a logistic regression resulting in a model that will predict the likelihood of someone cancelling. Is it appropriate to run chi square tests as a sneak peek into which variables will end up in the model?

For example, if the chi square test shows a relationship between region and cancelled_flag, then I might expect it to be a contributor to the logistic regression model.

To be clear, my question is: would this be a REASONABLE process? We wouldn't rely on the chi square test to make any decisions; we would just be looking for a quick gut check before modeling.
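
As a rough illustration of the gut check described here, the sketch below computes the Pearson chi square statistic for a region by cancelled_flag table in plain Python. The counts and the names `chi_square` and `region_table` are made up for the example, not taken from any real data. A library routine such as `scipy.stats.chi2_contingency` would normally be used to get the p-value as well.

```python
def chi_square(table):
    """Return the Pearson chi square statistic and degrees of freedom
    for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if region and cancelled_flag were independent
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical counts: one row per region, columns = (cancelled, not cancelled)
region_table = [
    [30, 70],
    [20, 80],
    [50, 50],
]

stat, dof = chi_square(region_table)
print(f"chi square = {stat:.2f} on {dof} degrees of freedom")
```

With these invented counts the statistic comes out at about 21 on 2 degrees of freedom, which would suggest that region and cancelled_flag are related. That only describes the one-variable-at-a-time relationship; it says nothing yet about how region behaves once other variables are in the model.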


New Member
Logistic regression variables may act differently

Hi, when you enter multiple variables, one variable can change the way another behaves, so a variable may not act as expected. As an example, I built a model predicting whether a customer would buy extended service for computers.

In this model, I had a variable indicating whether a customer was a business customer or a private one. I also had a variable indicating whether a customer had a mainframe server.

While trying different models with different variables, I noticed that when each of these variables was entered separately, it had a strong positive effect on purchasing extended service (since both indicate that the user is a heavy hardware consumer with a need for support). In contrast, when both variables were in the model, the business variable had a negative effect on the odds while the server variable had a positive effect on the odds. This phenomenon occurred because the business variable was "dissolved" in the mainframe server variable: once the server variable was in the model, it accounted for most of what the business variable had been capturing on its own.
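
The sign flip described above can be reproduced with a small sketch. The counts below are invented purely for illustration (they are not the data from my model), and the logistic regression is fitted with a hand-written Newton-Raphson loop so that the example needs nothing beyond the Python standard library. The names `CELLS`, `solve` and `fit_logit` are hypothetical helpers, not part of any package.

```python
import math

# Hypothetical grouped data: (business, server, customers, buyers).
# Most business customers own a server, and server owners buy far more often.
CELLS = [
    (0, 0, 1000, 100),
    (0, 1, 50, 30),
    (1, 0, 100, 5),
    (1, 1, 400, 200),
]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_logit(rows, iters=25):
    """Fit a logistic regression by Newton-Raphson.

    rows: list of (features, n, successes). An intercept is added automatically,
    so the returned list is [intercept, coef_1, coef_2, ...].
    """
    k = len(rows[0][0]) + 1
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for feats, n, y in rows:
            x = [1.0] + list(feats)
            p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, x))))
            for i in range(k):
                grad[i] += x[i] * (y - n * p)
                for j in range(k):
                    hess[i][j] += n * p * (1 - p) * x[i] * x[j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


business_only = fit_logit([((b,), n, y) for b, s, n, y in CELLS])
server_only = fit_logit([((s,), n, y) for b, s, n, y in CELLS])
both = fit_logit([((b, s), n, y) for b, s, n, y in CELLS])

print("business alone:", round(business_only[1], 3))
print("server alone:  ", round(server_only[1], 3))
print("both in model:  business =", round(both[1], 3), " server =", round(both[2], 3))
```

With these made-up counts, each variable has a positive coefficient when entered alone, but the business coefficient turns negative once the server variable is also in the model. In this toy data, business customers buy more often overall only because they are much more likely to own a server; among customers with the same server status, they buy slightly less often.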

In conclusion, when you add multiple variables to a logistic regression, it is very hard to predict the effect each variable will have in the final model.

The Analysis Studio software handles this by providing a logistic regression wizard. It lets you go back and forth between models until your model makes sense both statistically and realistically.

Analysis Studio can be found at :