Chi Square and Regression

#1
Hi all,

First time post so please go easy on me :p

I have a bunch of categorical variables that I am thinking of using as predictor variables in a logit regression. I have conducted some chi-square tests between these variables and my outcome variable, some of which have come back as insignificant. Should I drop these variables from my regression, or is there any rationale for keeping them in?

Many thanks for any input you can provide!
 

hlsmith

Omega Contributor
#2
There are many arguments for dropping or keeping them. Perhaps the "insignificant" results were something like p = 0.06 in an underpowered sample, for variables that are of interest in the analysis (known predictors). In such a scenario, I would argue that keeping them is a reasonable option.
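To illustrate the "underpowered" point, here is a minimal sketch using made-up counts (not anyone's real data). The helper `chi2_2x2` is a hypothetical name; it computes the ordinary Pearson chi-square for a 2x2 table without continuity correction. The same 60% vs 50% difference in outcome rates lands just above .05 at one sample size and well below it when the sample is doubled.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and p-value
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))

# Made-up counts: rows = predictor level, columns = (outcome yes, outcome no).
# Both tables have identical rates: 60% vs 50%.
stat_small, p_small = chi2_2x2(108, 72, 90, 90)      # 180 per group
stat_large, p_large = chi2_2x2(216, 144, 180, 180)   # 360 per group

print(f"180 per group: chi2={stat_small:.2f}, p={p_small:.4f}")
print(f"360 per group: chi2={stat_large:.2f}, p={p_large:.4f}")
```

The effect is the same size in both tables; only the amount of data changed, which is why a p-value slightly over the cutoff is weak grounds for dropping a known predictor.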
 
#3
That really depends, and to be honest, you do not know the multivariable structure of the model yet either.

i.e. male might be insignificant on its own, but male teacher (the interaction between those two binary variables) might be significant.

You just don't know until you look at the data.
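A toy sketch of the male / male teacher situation, with counts invented purely for illustration. The names `chi2_2x2`, `counts`, and `sex_table` are hypothetical. The chi-square of sex against the outcome is exactly zero when everyone is pooled, yet among teachers alone the association is very strong, because the effect of sex reverses between the two roles.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and p-value
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))

# Invented counts of (outcome yes, outcome no) for each sex/role cell.
counts = {
    ("male", "teacher"): (70, 30),
    ("male", "other"): (30, 70),
    ("female", "teacher"): (30, 70),
    ("female", "other"): (70, 30),
}

def sex_table(roles):
    """Collapse over the given roles into a sex-by-outcome 2x2 table,
    returned as (male yes, male no, female yes, female no)."""
    cells = []
    for sex in ("male", "female"):
        cells.append(sum(counts[(sex, r)][0] for r in roles))
        cells.append(sum(counts[(sex, r)][1] for r in roles))
    return tuple(cells)

stat_all, p_all = chi2_2x2(*sex_table(["teacher", "other"]))
stat_teachers, p_teachers = chi2_2x2(*sex_table(["teacher"]))

print(f"sex vs outcome, everyone: chi2={stat_all:.2f}, p={p_all:.3g}")
print(f"sex vs outcome, teachers: chi2={stat_teachers:.2f}, p={p_teachers:.3g}")
```

A screening rule based on the pooled test would have thrown sex out before the interaction term was ever tried.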


From another perspective, you should really let theory guide you somewhat in what you include in your model. I am a strong proponent of grounding my statistical work in the literature, especially when it comes to model construction. And if you are sending this out to a journal, they are most certainly going to ask why you did not control for such and such variable.

In summation, including a non-significant factor in a theory-driven model is OK because it controls for that variable. That makes it easier to argue for causation (not prove it, but at least provide more evidence).
 

noetsi

Fortran must die
#4
Although I know this is commonly not done in practice, you really are supposed to base which variables go in the model on theory, not on such tests anyhow :p And if theory supports a variable, leaving it in the model when the p-value is close to, say, .05 may make sense (the use of .05 rather than .06 is essentially an arbitrary convention). Moreover, univariate tests (one predictor at a time), which is what it looks like you ran, commonly do not predict multivariate relationships anyhow.

If you have some theory or even a hunch to support using the variables, I would not drop them simply because you ran a chi-square test on individual predictors. Honestly, I don't think you gain anything from running univariate tests, period.