Positive correlation but negative coefficient in regression??

#1
I'm a little confused by some results I have...

I have two predictor variables and an outcome variable. The first predictor is positively correlated with the outcome variable (r = 0.8, p < 0.05) and the second predictor is negatively correlated with the outcome variable (r = -0.6, p < 0.05). Simple linear regression also identified a positive and negative relationship for the first and second predictors, respectively.

However, when conducting a multiple regression, the beta coefficients for the predictors are BOTH positive. How is this possible? The relationship between the second predictor and the outcome variable is clearly negative, so how would I get positive coefficients?

Any explanation/advice would be most welcome.....
 
#3
It would be helpful if you also reported the correlation between the first and second predictors.
Sorry, the predictor variables show a strong negative correlation (r = -0.75). However, I have run collinearity diagnostics (VIF, tolerance levels, condition indices) and think it was OK to continue with both variables despite the fairly strong correlation...
 

Dragan

Super Moderator
#4
You should be able to determine the standardized regression weight for the second predictor using the following formula:

[math]\beta _{2}=\frac{r_{y2}-r_{y1}r_{12}}{1-r_{12}^{2}}[/math]

where the subscript y denotes the outcome variable and the subscripts 1 and 2 denote the two predictors.
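
For example, with the rounded correlations reported in this thread (r_y1 = 0.8, r_y2 = -0.6, r_12 = -0.75), the numerator all but cancels:

[math]\beta _{2}=\frac{-0.6-(0.8)(-0.75)}{1-(-0.75)^{2}}=\frac{0}{0.4375}\approx 0[/math]

so the sign of the second predictor's standardized weight is extremely sensitive here; presumably the unrounded correlations tip the numerator slightly positive, which would be consistent with the positive coefficient you are getting in the multiple regression.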
 

jpkelley

TS Contributor
#7
This is interesting. I haven't personally run into this, but I can't imagine why not. Statto, can you post your beta estimates, standard errors and p-values? I'm curious what this suppression looks like.

Dragan and others, wouldn't you see some correlation between the two independent variables in this case? Not being argumentative here. I've been trying to simulate a good example of this kind of suppression, and I can't work through how such results would occur without a correlation between the two IVs.
 
#8
As requested......

CORRELATION RESULTS
Predictor one and outcome variable (r = 0.8, p < 0.05)
Predictor two and outcome variable (r = -0.6, p < 0.05)
Predictor one and predictor two (r = -0.75, p < 0.05)

REGRESSION RESULTS (beta coefficient with SE in parentheses and t and p values)

LINEAR (predictor one and outcome)

Constant 0.101 (0.325), t = 0.311
Predictor one 0.090 (0.008), t = 10.827, p < 0.05

LINEAR (predictor two and outcome)

Constant 8.748 (1.412), t = 6.194
Predictor two -1.463 (0.327), t = -4.468, p < 0.05

MULTIPLE REGRESSION (predictor one and two)
Constant -3.17 (1.62), t = -1.96
Predictor one 0.11 (0.01), t = 9.14, p < 0.05
Predictor two 0.66 (0.32), t = 2.06, p < 0.05
 

Jake

Cookie Scientist
#10
As Dragan wrote, this is suppression at its finest. I find that it helps to visualize the relationships between the 3 variables when trying to understand this phenomenon:
Code:
    Y
    /\
 + /  \ +
  /    \
 X1----X2
    -
So X1 and X2 are both positively related to Y, but negatively related to each other. Now let's think about what happens when either of these predictors goes up or down. (Note: I do not intend to imply that these variables are causally related, but I do find it helpful to pretend that they are for a moment when reasoning through these examples.) When X1 goes up, Y goes up. But when X1 goes up, X2 goes down. And when X2 goes down, Y goes down. So you have X1 affecting Y through two separate paths: the direct effect of X1 is positive, but the indirect effect of X1 via X2 is negative. Often these two influences will cancel each other out, leaving you with a nonsignificant simple correlation/regression between X1 and Y. But if one "path" is sufficiently stronger than the other, that will manifest itself in the simple correlation as well!

In your case, the direct relationship between the outcome variable and the second predictor is slightly positive, but the indirect relationship between the outcome variable and the second predictor via the first predictor is strongly negative. So when you look at the simple correlation/regression between the outcome and the second predictor--ignoring the influence of the first predictor--it looks like there is a negative correlation. And there is, in a sense, BUT that is due to the indirect effect of the second predictor via the first predictor, which the analysis does not take into account. The multiple regression, which does take those covariances into account, reveals that the true direct relationship between the second predictor and the outcome is slightly positive.

Make sense?
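
If it helps to see this with numbers, here is a minimal simulation sketch in Python (the effect sizes are made up, not statto's data; it only assumes numpy). The second predictor is given a small positive direct effect on Y, yet its zero-order correlation with Y comes out negative because of the strong negative correlation between the two predictors:
Code:
import numpy as np

rng = np.random.default_rng(42)
n = 1000

# Two predictors with a strong negative correlation (about -0.75), as in this thread.
cov = np.array([[1.0, -0.75],
                [-0.75, 1.0]])
x1, x2 = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

# Both predictors get a positive DIRECT effect on y (the 1.0 and 0.3 are hypothetical).
y = 1.0 * x1 + 0.3 * x2 + rng.normal(0.0, 0.5, size=n)

# Zero-order correlations: r(y, x2) comes out negative because the indirect
# path through x1 swamps x2's small positive direct effect.
print("r(y, x1) =", round(np.corrcoef(y, x1)[0, 1], 2))
print("r(y, x2) =", round(np.corrcoef(y, x2)[0, 1], 2))

# Multiple regression (ordinary least squares): both slopes come out positive.
design = np.column_stack([np.ones(n), x1, x2])
intercept, b1, b2 = np.linalg.lstsq(design, y, rcond=None)[0]
print("b1 =", round(b1, 2), " b2 =", round(b2, 2))
With these made-up numbers, r(y, x2) lands around -0.5 even though b2 is about +0.3, which is exactly the sign pattern in statto's results.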

I was taught about suppression but have never actually seen strong suppression effects in the data that I've collected and handled either. It's kind of like catching a rare Pokemon... :D
 
#11
Thanks Jake and dragan for your input and explanations.
I guess from your responses that I should interpret the direction of the relationships in the multiple regression as they are, as this gives a better representation of the true relationship between the outcome and predictors. But should I comment on this 'positive net suppression' in my write-up?

Thanks again for your help and I'm glad that we've now all seen such a great example of suppression!
 

Link

Ninja say what!?!
#12
Very detailed explanation, Jake. I noticed that you joined just last month. I'd also like to welcome you to the forum.
 

jpkelley

TS Contributor
#13
Great explanation, Jake. Admittedly, I skipped statto's second posting, where there was mention of the predictors being negatively correlated. I went down the rabbit hole trying to figure out how one would get suppression if there wasn't a correlation. My jumpiness aside, this was a stellar explanation. I need constant reminders--as I think many people do--to go back to the original data and make a flow diagram of all possible upstream and downstream effects.
 

spunky

Super Moderator
#14
I was taught about suppression but have never actually seen strong suppression effects in the data that I've collected and handled either. It's kind of like catching a rare Pokemon... :D
i actually did quite a bit of research on suppression effects last year with another friend of mine, and i'd like to add a little quote from an article that i think is relevant. it comes from Paulhus, D., Robins, R., Trzesniewski, K., Tracy, J. (2004). Two Replicable Suppressor Situations in Personality Research. Multivariate Behavioral Research, 301-326:

"In sum, we concur with Tzelgov and Henik (1991) as well as Collins and Schmidt (1997) that the number of genuine suppressor situations in behavioral science may be far greater than has been assumed and a more vigorous search for such effects is warranted."

because i've always been kind of interested in causal modeling, suppression effects come up again and again, but, as you can read in Paulhus et al.'s article, research on suppression effects came to a virtual halt around the 1970s because personality researchers were hoping to use them to control for response styles in personality questionnaires but couldn't... and talk of them didn't come back until path analysis and this whole mediation/moderation craziness in psychology and regression started taking off (as a preamble to structural equation modeling).

Judea Pearl's magnificent book "Causality" also brings it to light as one of the classic problems that needs to be addressed if we are to ever try and attempt to use correlational research to model causal situations...

As soon as we finish, fingers crossed, my friend and i will publish our meta-analysis/quantitative systematic review of suppression effects in the behavioural sciences and of how people seem to treat this little thing as a cute-yet-mysterious curiosity that is either left uninterpreted or commented on as an avenue for "future research"... i just think, in general, whenever you see suppression it sort of challenges the researcher to expand his or her theory to try and explain the results of the analysis, which i find fascinating because, even with a little bit of imagination, you can see that these changes in signs or other versions of suppression kind of help support the theory they come from, but they require a lot more work to interpret...
 

jpkelley

TS Contributor
#15
Interesting, spunky. Good call about the meta-analysis. What fields did you include...human behavioral sciences only, or did you delve into the ecology and animal behavior fields as well? In any case, I'll be interested to see the results (hint, hint...maybe an unpublished manuscript???), as I think many people in my field will be very interested.
 

spunky

Super Moderator
#16
Interesting, spunky. Good call about the meta-analysis. What fields did you include...human behavioral sciences only, or did you delve into the ecology and animal behavior fields as well?
only human behaviour stuff... the other 1/2 of my academic training is in psychology (and this friend's in philosophy) so we have to keep it within areas we understand, so we can make sense of whatever people are interpreting... but that quote from Paulhus' article is what kind of prompted this whole research thing and it's like opening a pandora's box... suppression is really out there and it really happens a lot more often than people give it credit for, but i don't really know why most social sciences researchers either choose to ignore it completely (the most common solution), blame it on sampling error or present it in the "future research is needed" section... very, very few people actually identify it as suppression and, those who do, do it kind of like "ok, so we have a suppressor variable there. moving on to the F test, we can see that blah, blah, blah..."
 

Jake

Cookie Scientist
#17
Sounds like a very interesting paper, spunky. As you can see, I'm as guilty as the next behavioral researcher in regarding suppression effects as quirky oddities...
 
#18
Hi all,
I'm glad that the results I've presented have stimulated some interesting conversation and have highlighted some important considerations. I now understand the results I am getting and, given the nature of my variables, can see how one is 'suppressing' the other.
However, I am still none the wiser about how to report this.

I have reported the positive correlation between predictor 1 and the outcome, and the negative correlation between predictor 2 and the outcome. Can I then say in my regression write-up something along the lines of:

'Despite the negative correlation observed between predictor 2 and the outcome, there is evidence of a strong suppression effect, with the results of the multiple regression indicating that both predictors are positively related to the outcome variable.'

and leave it at that? (with further explanation in the discussion)
 
#20
Do your predictor variables and your outcome variable have to be correlated in the original bivariate analysis in order to do a simple or multiple regression between them?