mishery
10-25-2005, 12:56 PM
I saw a suggestion on a web page for dealing with collinearity I had not seen before.
http://www2.chass.ncsu.edu/garson/pa765/tacq.htm
"Treat the common variance as a separate variable and decontaminate each covariate by regressing them on the others and using the residuals. That is, analyze the common variance as a separate variable."
Does anyone know how to actually do this? How do I calculate the common variance? PCA? The web page does not say.
I calculated the residuals as described (the residual of X1 regressed on X2, X3 and X4, the residual of X2 regressed on X1, X3 and X4, etc.), but they seem to correlate very highly. Am I doing something wrong?
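For reference, a minimal sketch of the residual calculation described above, using NumPy on simulated data. The data, the shared-factor setup and the helper name residualize are assumptions made for illustration and are not from the web page. The sketch also checks one general property of this procedure: the correlation between two such residuals equals the negative of the partial correlation between the two original variables, so residuals built this way are generally not uncorrelated with each other.

```python
import numpy as np

def residualize(target, others):
    """Residuals of `target` after OLS regression on `others` (intercept included)."""
    design = np.column_stack([np.ones(len(target)), others])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef

# Simulated covariates sharing a common factor (illustrative assumption only).
rng = np.random.default_rng(0)
n = 500
shared = rng.normal(size=n)
X = np.column_stack([shared + rng.normal(size=n) for _ in range(3)])

# Residual of each covariate regressed on all the other covariates.
resid = np.column_stack([
    residualize(X[:, j], np.delete(X, j, axis=1)) for j in range(X.shape[1])
])

raw_corr = np.corrcoef(X, rowvar=False)
resid_corr = np.corrcoef(resid, rowvar=False)

# Partial correlations, obtained from the inverse of the correlation matrix.
precision = np.linalg.inv(raw_corr)
scale = np.sqrt(np.diag(precision))
partial = -precision / np.outer(scale, scale)
np.fill_diagonal(partial, 1.0)

print("correlations of raw covariates:\n", raw_corr.round(3))
print("correlations of residuals:\n", resid_corr.round(3))
print("partial correlations:\n", partial.round(3))
```

Comparing the last two printed matrices shows the off-diagonal entries matching in size with opposite sign.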
Thanks from a newbie.
quark
10-25-2005, 01:46 PM
The common solution for collinearity is to remove some variables, since you have two or more variables measuring the same thing. I am not sure how to work on the variance. If the variables correlate highly, I doubt that you can "decontaminate" them.
Just my two cents. :)
JohnM
10-25-2005, 04:31 PM
Multiple regression is one of those procedures that requires developing a strong theory prior to its use. The reason is that it is VERY sensitive to the idiosyncrasies of the sample.
You need to understand how the independent variables may determine the dependent variable (in a general, non-mathematical way - i.e., strong positive, weak positive, strong negative, weak negative), and then build your model.
You can also use step-wise procedures to help you build or reduce your model step-by-step, and monitor R^2 for significant changes.
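A minimal sketch of what monitoring R^2 between steps could look like, assuming simulated data and ordinary least squares via NumPy. The variable names and the helpers r_squared and f_change are hypothetical. The statistic is the standard F for an R^2 change between nested models; a p-value would come from the F distribution with q and n - p_full - 1 degrees of freedom.

```python
import numpy as np

def r_squared(y, X):
    """R^2 of an OLS fit of y on X (intercept included)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def f_change(r2_reduced, r2_full, n, p_full, q):
    """F statistic for the R^2 change when q predictors are added.

    p_full is the number of predictors (excluding the intercept) in the
    larger model; n is the sample size.
    """
    return ((r2_full - r2_reduced) / q) / ((1.0 - r2_full) / (n - p_full - 1))

# Simulated, correlated predictors (illustrative assumption only).
rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
y = 1.0 * x1 + 0.5 * x2 + rng.normal(size=n)

r2_step1 = r_squared(y, x1)                          # step 1: x1 only
r2_step2 = r_squared(y, np.column_stack([x1, x2]))   # step 2: x1 and x2
f_stat = f_change(r2_step1, r2_step2, n=n, p_full=2, q=1)

print(f"R^2 step 1: {r2_step1:.3f}")
print(f"R^2 step 2: {r2_step2:.3f}")
print(f"F for the change: {f_stat:.2f}")
```

When a single predictor is added (q = 1), this F equals the square of that predictor's t statistic in the larger model.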
If you do see collinearity, don't just rely on a statistical technique for dealing with it - try to understand why the independent variables may be related....
mishery
10-26-2005, 05:19 AM
In my initial post, I used four IVs to illustrate things, but in fact I have only three.
I have a clear understanding of why the IVs are related for this sample.
I have evidence from other work, where there is less intercorrelation, that IV1, IV2 and IV3 have independent effects on the DV. Typically IV1 has the strongest effect and the other two account for small but usually significant amounts of variance.
For this particular sample, the three IVs correlate around 0.6. I have an a priori hypothesis that, for this sample, after fitting the first two IVs, IV3 should not influence the DV.
If I had just two correlated IVs I would fit IV1 and then IV2 adjusted for IV1 (the residual for IV2 from a regression with IV2 as the DV and IV1 as the IV).
I vaguely remember someone saying that I couldn't do the following hierarchical regression...
1. IV1
2. IV2 adjusted for IV1
3. IV3 adjusted for IV1 and IV2
I can't do this, right?
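For what it is worth, here is a minimal sketch of what the three steps above would compute, assuming simulated data in which the IVs correlate around 0.6 and IV3 has no effect on the DV. The data, coefficients and helper names are illustrative assumptions, not the real sample. It only demonstrates an algebraic property of OLS: the sequentially adjusted predictors are mutually uncorrelated, and entering them gives the same R^2 at each step as entering the raw IVs in the same order. Whether this is the right analysis for the hypothesis is a separate question.

```python
import numpy as np

def ols_residuals(y, X):
    """Residuals from an OLS regression of y on X (intercept included)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef

def r_squared(y, X):
    resid = ols_residuals(y, X)
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

# Simulated IVs with pairwise correlation near 0.6 (illustrative assumption).
rng = np.random.default_rng(1)
n = 400
shared = rng.normal(size=n)
iv1, iv2, iv3 = (shared + 0.8 * rng.normal(size=n) for _ in range(3))
dv = 0.6 * iv1 + 0.3 * iv2 + rng.normal(size=n)  # assumed: IV3 has no effect

# Step 2 and step 3 predictors, adjusted for everything entered earlier.
iv2_adj = ols_residuals(iv2, iv1)
iv3_adj = ols_residuals(iv3, np.column_stack([iv1, iv2]))

adjusted = np.column_stack([iv1, iv2_adj, iv3_adj])
raw = np.column_stack([iv1, iv2, iv3])

adj_corr = np.corrcoef(adjusted, rowvar=False)
r2_adjusted = [r_squared(dv, adjusted[:, :k]) for k in (1, 2, 3)]
r2_raw = [r_squared(dv, raw[:, :k]) for k in (1, 2, 3)]

# The R^2 gained at step 3 equals the squared correlation of the DV with iv3_adj.
step3_gain = r2_raw[2] - r2_raw[1]
sr2_iv3 = np.corrcoef(dv, iv3_adj)[0, 1] ** 2

print("correlations among adjusted predictors:\n", adj_corr.round(3))
print("R^2 by step, adjusted predictors:", np.round(r2_adjusted, 4))
print("R^2 by step, raw predictors:     ", np.round(r2_raw, 4))
print("step 3 gain vs squared semipartial:", round(step3_gain, 5), round(sr2_iv3, 5))
```

In this setup, the increment in R^2 at each step is the same whether the raw or the adjusted version of the new IV is entered.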
JohnM
10-26-2005, 09:37 AM
To be honest, hierarchical is beyond my area of knowledge/expertise (I've never used it), and any advice I might offer would probably lead you astray.
I'm going to defer to anyone else who feels comfortable commenting on it...