What to do about violations of assumptions

noetsi

Fortran must die
#1
This is one example of a larger topic

Say you are doing logistic regression and you have influential data points. You can't use robust regression - so what do you use? Non-parametrics?

Does Cook's D even work with logistic regression?
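For what it's worth, influence measures do extend to logistic regression through the GLM weighted hat matrix and Pearson residuals. Below is a minimal numpy sketch, not code from any particular package: the IRLS fitter, the toy data, and the specific Cook's D approximation used (one common variant) are all illustrative.

```python
import numpy as np

def logistic_cooks_d(X, y, n_iter=50):
    """Approximate Cook's D for logistic regression via IRLS and the GLM hat matrix."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):                      # IRLS (Fisher scoring)
        eta = X @ beta
        mu = 1.0 / (1.0 + np.exp(-eta))          # fitted probabilities
        w = mu * (1.0 - mu)                      # IRLS weights
        z = eta + (y - mu) / w                   # working response
        beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
    XtWX_inv = np.linalg.inv(X.T @ (w[:, None] * X))
    h = w * np.einsum('ij,jk,ik->i', X, XtWX_inv, X)   # GLM leverages
    r_pearson = (y - mu) / np.sqrt(w)
    # one common approximation: D_i = r_i^2 * h_i / (p * (1 - h_i)^2)
    return r_pearson**2 * h / (p * (1.0 - h)**2)

# toy, non-separable data (purely illustrative)
rng = np.random.default_rng(1)
x = rng.normal(size=40)
X = np.column_stack([np.ones(40), x])
y = (rng.random(40) < 1.0 / (1.0 + np.exp(-x))).astype(float)
d = logistic_cooks_d(X, y)
```

Points with large D relative to the rest of the sample are the candidates for a closer look, same as in the linear case.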
 

noetsi

Fortran must die
#2
It is easy to test for multicollinearity (MC), but I have found no broadly supported solution if you cannot gather more data and don't want to combine multiple variables into one [neither is a realistic solution in my analysis].

So how does one deal with MC?
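For reference, the standard detection tool here is the variance inflation factor: regress each predictor on the others and compute VIF_j = 1 / (1 - R²_j). A small numpy sketch (the function name and test data are made up for illustration):

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (predictors only, no intercept)."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        # regress column j on all the other columns, with an intercept
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        sst = (y - y.mean()) @ (y - y.mean())
        r2 = 1.0 - (resid @ resid) / sst
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)   # nearly collinear with x1
x3 = rng.normal(size=n)
v = vif(np.column_stack([x1, x2, x3]))
```

A common rule of thumb flags VIF above 5 or 10; in the toy data the two near-duplicate columns blow past that while the independent column stays near 1.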
 

noetsi

Fortran must die
#3
While I am at it, do the rules of thumb for Cook's D apply to predictor (covariate) values in logistic regression, or are the rules different for logistic and linear regression? So far I have found little on this [what I have found leads me to suspect the rules of thumb are the same for the two approaches].
 
#4
IIRC, the leverage component of Cook's D comes from the hat matrix, which involves only the X's (covariates). In that case, you can run a regular OLS regression with your variables and inspect the leverages as usual.
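To illustrate that point: the diagonal of the hat matrix H = X(X'X)⁻¹X' gives the leverages, and it is computable from the design matrix alone, with no response involved. A quick numpy sketch on made-up data:

```python
import numpy as np

def leverages(X):
    """Diagonal of the hat matrix H = X (X'X)^{-1} X', via a thin QR for stability."""
    Q, _ = np.linalg.qr(X)
    return np.einsum('ij,ij->i', Q, Q)   # row-wise squared norms of Q = diag(H)

rng = np.random.default_rng(2)
X = np.column_stack([np.ones(30), rng.normal(size=(30, 2))])
h = leverages(X)
```

A sanity check: the leverages always sum to the number of columns of X (here 3), and each lies between 0 and 1.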
 

noetsi

Fortran must die
#5
That makes sense, although some comments on logistic regression I read suggested different rules of thumb than I saw for linear regression. This may simply be different authors whose own experiences led them to different rules...

This is certainly the case with Hosmer and Lemeshow who base their recommendations on their own experience with logistic regression.
 

hlsmith

Omega Contributor
#6
Multicollinearity can be addressed by doing nothing :))), dropping a variable, or creating a new variable (e.g., a construct-based composite).
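One illustrative way to build such a construct is to collapse the collinear measures into their first principal component after standardizing. A numpy sketch (the function name and data are hypothetical, and PCA is just one of several ways to form a composite):

```python
import numpy as np

def first_pc_construct(X):
    """Collapse a block of collinear columns into one construct:
    the first principal component of the standardized variables."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[0]   # scores on the first principal component

rng = np.random.default_rng(3)
base = rng.normal(size=100)
# three noisy copies of the same underlying quantity
X = np.column_stack([base + 0.1 * rng.normal(size=100) for _ in range(3)])
c = first_pc_construct(X)
```

When the columns really do measure one underlying thing, the construct correlates very highly with each of them, which is what justifies using it in their place.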
 

noetsi

Fortran must die
#8
Where MC comes up is when you are trying to get the unique impact of a predictor, and commonly I am. The problem with collapsing two variables together is that it tells you what the combined variable does, not what the individual variables do by themselves. I was wondering if partial regression plots get at this, but I am guessing not...

I guess you could argue that if MC makes it impossible to get at the unique impact of a variable, then these workarounds really distort reality. :p
 

hlsmith

Omega Contributor
#9
You still get at the unique impact of the variables, either through odds ratios or partial R^2, but the standard errors get wonky. You know this.
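The "wonky" standard errors are easy to see in a simulation: same model and sample size, but the coefficient SEs blow up when the two predictors are nearly collinear. Everything below (data, noise levels) is made up purely for illustration:

```python
import numpy as np

def ols_se(X, y):
    """Classical OLS coefficient standard errors."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ (X.T @ y)
    resid = y - X @ beta
    s2 = resid @ resid / (n - p)          # residual variance estimate
    return np.sqrt(s2 * np.diag(XtX_inv))

rng = np.random.default_rng(4)
n = 500
x1 = rng.normal(size=n)
se = {}
for label, noise in [("independent", 1.0), ("collinear", 0.05)]:
    # in the collinear scenario, x2 is a near-copy of x1
    x2 = x1 + rng.normal(scale=noise, size=n) if label == "collinear" else rng.normal(size=n)
    y = 1.0 + 0.5 * x1 + 0.5 * x2 + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x1, x2])
    se[label] = ols_se(X, y)
```

The coefficient estimates remain unbiased in both scenarios; it is only their standard errors that inflate, which is exactly why significance tests suffer.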
 

noetsi

Fortran must die
#10
But you can't test whether those effects are statistically significant, and at least in my line of work that is pretty important. People care a lot about tests of statistical significance. This is true in journals as well, at least social science ones, where few pay attention to specific effect sizes, which are hard to interpret substantively. They want to know whether this is a real effect, that is, statistically significant.

I rarely look at odds ratios for linear regression, actually; I use slopes. :)