Interpreting coefficients in a multiple regression

I am running a multiple regression with two independent variables. IV#1 has a negative correlation with the dependent variable and, on its own, an R^2 of 80%, a P-value of zero, and a negative coefficient (makes sense). IV#2 has a positive correlation with the dependent variable and, on its own, an R^2 of 90%, a P-value of zero, and a positive coefficient (again, makes sense).

When I run a multiple regression with both variables, the R^2 is above 90%, significance F is zero, and both variables have P-values below 5%. However, the coefficients for both are now positive. How do I interpret that, and is it an issue? It seems unintuitive to have a positive coefficient for variable 1. I assume there is something about the correlation between the two variables that explains it, but if I were to alter only IV#1, the result is unintuitive (i.e., if I increase IV#1, I would expect a decrease in the predicted value, not an increase).



Less is more. Stay pure. Stay poor.
Do you have contextual familiarity with the dataset? If so, do these relationships make sense directionally? As @Dason mentioned, there could be correlation, but why? There are multiple possible scenarios. The relationship may be partially mediated, since the effect changes but is not totally removed in the full model. One of the variables could also be a common cause of the other variables, or an effect of a common cause.

Statistics has a hard time truly discerning the cause; familiarity with the data-generating process is the only way to know for sure. You should perhaps look at the bivariate associations and do some visualizations as well (e.g., scatter plots).
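To illustrate what checking the bivariate associations might look like, here is a minimal sketch in Python (NumPy only) on simulated data. Everything in it is an assumption made for illustration: the variable names `x1`, `x2`, `y`, the helper `ols`, and the data-generating numbers are hypothetical and are not the poster's dataset. It only shows one way in which two strongly negatively correlated predictors can produce a negative slope for IV#1 on its own and a positive coefficient in the joint model.

```python
import numpy as np

# Simulated data (hypothetical values, chosen only to reproduce the pattern).
rng = np.random.default_rng(0)
n = 500
x2 = rng.normal(size=n)
x1 = -x2 + rng.normal(scale=0.5, size=n)   # x1 strongly negatively related to x2
y = 0.5 * x1 + 2.0 * x2 + rng.normal(scale=0.3, size=n)  # both true effects positive


def ols(y, *cols):
    """Least-squares fit with an intercept; returns [intercept, slopes...]."""
    X = np.column_stack([np.ones(len(y)), *cols])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Pairwise (bivariate) correlations; rows/columns are ordered x1, x2, y.
corr = np.corrcoef(np.vstack([x1, x2, y]))

slope_x1_alone = ols(y, x1)[1]   # simple regression of y on x1
slope_x2_alone = ols(y, x2)[1]   # simple regression of y on x2
joint = ols(y, x1, x2)           # multiple regression on both

print("corr(x1, x2):", corr[0, 1])
print("corr(x1, y): ", corr[0, 2])
print("corr(x2, y): ", corr[1, 2])
print("slope of x1 alone:", slope_x1_alone)
print("slope of x2 alone:", slope_x2_alone)
print("joint coefficients (x1, x2):", joint[1], joint[2])
```

In this simulation the slope for `x1` alone absorbs part of the effect of `x2`, because the two move in opposite directions. The joint coefficient answers a different question: what happens when `x1` changes while `x2` is held fixed. Whether the same mechanism applies to a real dataset depends on the data-generating process.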


Less is more. Stay pure. Stay poor.
But why are they correlated? Is this a practice dataset or a real one?

If it is real, you need to state what the model's purpose is and what the underlying causal relationship between the variables is. Draw a picture of how you think they are related, and let that, along with your purpose, direct your actions.


TS Contributor
Could you maybe tell us something about the research question(s),
what these variables are, how they were measured, the sample size,
and precisely how large the correlation coefficient between the
independent variables is?

With kind regards

Yes, they are definitely correlated. Would it make sense to remove one altogether and find a less correlated variable?
Hi Bentley,
First of all, it is difficult to cover and record ALL variables that might influence a particular outcome variable. Usually, and it seems in your case, the outcome variable depends on more than one variable. Therefore, regressing the outcome on IV1 alone or IV2 alone can give you biased results if a relevant variable is left out (omitted variable bias). Hence, comparing the individual regressions with the multiple regression is not very helpful. You should set the objective of your regression first, determine the main regressor, and then add control variables.
Secondly, a change in the sign of a coefficient can occur when dummy variables are used. Are you using any? It can also be that the outcome depends on the state of one of the variables, in which case you may need to add an interaction term. Consider the regressions below:

Y = a + b1X1 + b2X2 + e
Y = a' + b1'X1 + b2'X2 + b3X1*X2 + e'
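
As a rough sketch of how the two specifications above could be compared, the following Python snippet (NumPy only) fits both on simulated data. The data, the coefficient values, and the helper names `fit` and `effect_of_x1` are assumptions made for illustration, not taken from the actual dataset. With an interaction term, the effect of X1 on Y is no longer a single number: it is b1' + b3*X2, so its sign can depend on the value of X2.

```python
import numpy as np

# Simulated data with a built-in interaction (hypothetical values).
rng = np.random.default_rng(1)
n = 400
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.5 * x1 + 2.0 * x2 + 1.5 * x1 * x2 + rng.normal(scale=0.5, size=n)


def fit(y, *cols):
    """Least-squares fit with an intercept; returns (coefficients, R^2)."""
    X = np.column_stack([np.ones(len(y)), *cols])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coef, r2


# Y = a + b1*X1 + b2*X2 + e
coef_main, r2_main = fit(y, x1, x2)
# Y = a' + b1'*X1 + b2'*X2 + b3*X1*X2 + e'
coef_int, r2_int = fit(y, x1, x2, x1 * x2)


def effect_of_x1(x2_value):
    """Estimated change in Y per unit of X1 at a given X2: b1' + b3*X2."""
    return coef_int[1] + coef_int[3] * x2_value


print("main-effects model:", coef_main, "R^2 =", r2_main)
print("interaction model: ", coef_int, "R^2 =", r2_int)
print("effect of X1 at X2 = -1:", effect_of_x1(-1.0))
print("effect of X1 at X2 = +1:", effect_of_x1(1.0))
```

If the estimated b3 is close to zero and R^2 barely changes between the two fits, the interaction is probably not what drives a sign change, and the correlation between the predictors is the more likely explanation.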

If you can provide more details on the variables, it will be helpful.
Hope this helps.