range of possible values for regression coefficients given another one?

spunky

Super Moderator
#1
i feel like the answer to my question is a 'no' but i'll ask it anyway just to be absolutely sure.

say you have 3 variables X, Y and Z with pairwise correlations [MATH]r_{xy}[/MATH], [MATH]r_{yz}[/MATH], [MATH]r_{xz}[/MATH]. we know from the determinant of the correlation matrix (which cannot be negative) that if, for instance, [MATH]r_{xy}[/MATH] and [MATH]r_{xz}[/MATH] are fixed, then [MATH]r_{yz}[/MATH] must necessarily fall within the interval:

[MATH]r_{xy}r_{xz}-\sqrt{(1-r^{2}_{xy})(1-r^{2}_{xz})}\leq r_{yz}\leq r_{xy}r_{xz}+\sqrt{(1-r^{2}_{xy})(1-r^{2}_{xz})}[/MATH]
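a quick way to sanity-check that interval numerically is sketched below. the function names `r_yz_bounds` and `corr_det` and the example values 0.5 and 0.3 are just made up for illustration. at either end of the interval the determinant of the 3x3 correlation matrix should be zero, and just outside it should turn negative.

```python
import math

def r_yz_bounds(r_xy, r_xz):
    """Interval for r_yz implied by fixed r_xy and r_xz."""
    centre = r_xy * r_xz
    half_width = math.sqrt((1 - r_xy**2) * (1 - r_xz**2))
    return centre - half_width, centre + half_width

def corr_det(r_xy, r_xz, r_yz):
    """Determinant of the 3x3 correlation matrix of X, Y and Z."""
    return 1 + 2 * r_xy * r_xz * r_yz - r_xy**2 - r_xz**2 - r_yz**2

# illustrative values only
low, high = r_yz_bounds(0.5, 0.3)
print("r_yz must lie in", (low, high))
print("determinant at the lower end:", corr_det(0.5, 0.3, low))
print("determinant at the upper end:", corr_det(0.5, 0.3, high))
```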

so the question now becomes... if we consider the OLS regression models [MATH]Y=b_{0}+b_{1}X[/MATH] (simple) and [MATH]Y=b_{0}+b_{1}X+b_{2}Z[/MATH] (multiple), is there some way to calculate the range of values that [MATH]b_{1}[/MATH] can have when [MATH]Z[/MATH] gets introduced into the model? in general, [MATH]b_{1}[/MATH] will not be the same in the first and in the second model. i was hoping some function of the correlations/covariances and variances of the constituent variables could give me a range of values...
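for reference, these are the standard textbook expressions for the slope of X in each model, written in terms of the correlations. the symbols [MATH]s_{x}[/MATH] and [MATH]s_{y}[/MATH] (sample standard deviations of X and Y) and the superscripts that label the two models are notation introduced here, not something from earlier in the thread:

[MATH]b_{1}^{(X)} = r_{xy}\frac{s_{y}}{s_{x}}[/MATH]

[MATH]b_{1}^{(X,Z)} = \frac{r_{xy}-r_{yz}r_{xz}}{1-r^{2}_{xz}}\cdot\frac{s_{y}}{s_{x}}[/MATH]

the second expression reduces to the first when [MATH]r_{xz}=0[/MATH], so any change in [MATH]b_{1}[/MATH] comes entirely from how Z correlates with X and Y.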

thaaanks!
 

spunky

Super Moderator
#3
hello. yes, it is a small part of a wider problem here (i'll try to be brief).

here in social-sciency land we have a regression-based method called 'mediation' where you have three variables (a predictor/independent variable X, a response/dependent variable Y and a mediator Z). the way it works is by first running the regression:

[MATH]Y=b_{0} + b_{1}X[/MATH] and you look to see if [MATH]b_{1}[/MATH] is significant

then you do other regressions (you predict Z from X, you predict Y from Z, nothing too important for this question).

what matters, however, is that when you then run this regression:

[MATH]Y=b_{0} + b_{1}X + b_{2}Z[/MATH] you need to see how the coefficient [MATH]b_{1}[/MATH] changes. if it becomes non-significant then you say Z "fully mediates" the relationship between X and Y (which rarely happens). if [MATH]b_{1}[/MATH] is still significant but it's reduced (the most common case) then that means Z "partially mediates" the relationship between X and Y.
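the steps above can be sketched on simulated data as follows. everything here is an assumption made for illustration: the sample size, the generating coefficients (0.6, 0.3, 0.5), the helper names, and the fact that it only compares point estimates and skips the significance tests entirely. it uses the closed-form OLS slopes from sample covariances rather than any particular stats package.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    """Sample covariance (sample variance when u is v)."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def simple_slope(x, y):
    """OLS slope of y on x with an intercept."""
    return cov(x, y) / cov(x, x)

def two_predictor_slopes(x, z, y):
    """OLS slopes (b1 for x, b2 for z) of y on x and z with an intercept."""
    sxx, szz, sxz = cov(x, x), cov(z, z), cov(x, z)
    sxy, szy = cov(x, y), cov(z, y)
    det = sxx * szz - sxz**2
    return (szz * sxy - sxz * szy) / det, (sxx * szy - sxz * sxy) / det

# hypothetical data-generating model: X -> Z -> Y plus a direct X -> Y path
rng = random.Random(0)
n = 500
x = [rng.gauss(0, 1) for _ in range(n)]
z = [0.6 * xi + rng.gauss(0, 1) for xi in x]
y = [0.3 * xi + 0.5 * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]

total = simple_slope(x, y)                  # b1 in Y = b0 + b1*X
a = simple_slope(x, z)                      # slope when predicting Z from X
direct, b = two_predictor_slopes(x, z, y)   # b1 and b2 in Y = b0 + b1*X + b2*Z
indirect = a * b

print("b1 without Z:", total)
print("b1 with Z:   ", direct)
print("a * b:       ", indirect)
```

for OLS the drop in [MATH]b_{1}[/MATH] is exactly the product of the X-to-Z slope and [MATH]b_{2}[/MATH], so the first printed number equals the sum of the other two up to rounding.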

we reviewed these concepts in class last tuesday and i was thinking to myself: "well, it seems like in the most common case of partial mediation (i.e. [MATH]b_{1}[/MATH] is still significant but smaller once Z is introduced in the regression equation) it would be useful to know the range of values [MATH]b_{1}[/MATH] could have." that made me think about Dragan's formulas for regression coefficients and the bounds that correlations impose on each other to keep the correlation matrix positive-definite. that's when i thought "what if i could find a way to provide a range of values that [MATH]b_{1}[/MATH] can have when Z is introduced versus absent in the regression equation?", which prompted my question.

but the formulas that Dragan posted have too much going on within them. i suspect that whatever range of values i came up with for [MATH]b_{1}[/MATH], some change in one of the other quantities could always push [MATH]b_{1}[/MATH] outside it.
 

spunky

Super Moderator
#4
for future reference, i was able to find someone who articulated more or less what i wanted to say in this thread. although the premise of my original question is wrong (i.e. that specific b-coefficient has no limits in its range), there ARE limits, imposed by the positive definiteness of the correlation/covariance matrix, on the values that some of these coefficients can take together.
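a small numerical illustration of why that single coefficient has no limits, assuming standardized variables (so [MATH]b_{1}[/MATH] depends only on the correlations) and assuming [MATH]r_{xy}[/MATH] is held at an arbitrary 0.5 while [MATH]r_{xz}[/MATH] is free to vary. the function names are made up for this sketch. for each [MATH]r_{xz}[/MATH] it takes the two admissible extremes of [MATH]r_{yz}[/MATH] and evaluates the usual two-predictor standardized slope there; since the slope is linear in [MATH]r_{yz}[/MATH], those endpoints give its full range.

```python
import math

def std_b1(r_xy, r_xz, r_yz):
    """Standardized slope of X in the regression of Y on X and Z."""
    return (r_xy - r_yz * r_xz) / (1 - r_xz**2)

def std_b1_range(r_xy, r_xz):
    """Smallest and largest std_b1 over every admissible r_yz."""
    half_width = math.sqrt((1 - r_xy**2) * (1 - r_xz**2))
    ends = (r_xy * r_xz - half_width, r_xy * r_xz + half_width)
    values = [std_b1(r_xy, r_xz, r_yz) for r_yz in ends]
    return min(values), max(values)

r_xy = 0.5  # illustrative value only
for r_xz in (0.0, 0.5, 0.9, 0.99, 0.999):
    smallest, largest = std_b1_range(r_xy, r_xz)
    print(f"r_xz={r_xz}: b1 between {smallest:.3f} and {largest:.3f}")
```

the printed interval keeps widening as [MATH]r_{xz}[/MATH] approaches 1, so no finite range for [MATH]b_{1}[/MATH] survives unless [MATH]r_{xz}[/MATH] is pinned down as well.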

the source is here. it starts on page #12 of the PDF:

http://quantpsy.org/pubs/preacher_kelley_2011.pdf