Multiple regression question

#1
Hello everyone, I cannot find the solution to the following question and don't really know how to exclude any of the alternatives.

We regress an outcome variable Y on two continuous variables X1 and X2.

Model 1: predicted y = 10 + 5X1
Model 2: predicted y = 5 + 5X1 + 2X2

Which of the following statements can we deduce from above?

A X1 and X2 are uncorrelated
B X1 has a larger effect on Y than X2
C The mean of X1 is equal to 10
D None of the alternatives above is correct

I'd appreciate any help. Thanks!
 

hlsmith

Omega Contributor
#2
Well, we aren't going to answer it for you outright, but we will help you if you provide your own effort first.


I wouldn't pick "D" right away, or perhaps any of "A", "B", "C", but which answers come into play, then?
 

noetsi

Fortran must die
#3
I don't see how you could know X1 and X2 are correlated from the formula alone. So I don't see how A could be correct.
 
#4
Haha, fair enough!

Well, I would exclude C because I think 10 is the intercept and not the mean. You cannot say anything about the mean, right?
I think that B is also wrong, because just on the basis of the two regression equations I cannot compare their effects on the dependent variable. For that you would need a coefficient table with the p-values or the correlations?
Somehow I think you can tell whether they are correlated or not, but I don't know how. So I cannot really say if it is A or D.
 

rogojel

TS Contributor
#5
hi,
I think your argument against B is not right. You do have the coefficients, and unless the test is really unfair, you can assume that both models are significant. (By the way, if you think the models are not necessarily significant, then evidently only D can be right.)

The question is just ambiguous enough for B to be a possibility, imo.

regards
 

ondansetron

TS Contributor
#6
I don't see how you could know X1 and X2 are correlated from the formula alone. So I don't see how A could be correct.
Answer "A" could be correct. The mathematics behind OLS let us reason about whether the two variables are correlated from how the coefficient on X1 behaves when X2 (which has a non-zero coefficient here) is added. If X1 and X2 are uncorrelated, the parameter estimate of X1 will be unchanged by the addition of X2 to the model; conversely, if the X1 estimate is unchanged and X2's coefficient is nonzero, the two must be uncorrelated. The parameter estimates of X1 and X2 would change when the other is added to the model only if there is some "shared" information between them pertaining to Y.


Let:
1) y = b0 + b1*x1
2) y = B0 + B1*x1 + B2*x2

Then b1 = B1 + B2*d1, where d1 is the slope from regressing x2 on x1.

For b1 to equal B1, either the correlation between x1 and x2 must be zero (since x1 and x2 are not constants, their variances/SDs are strictly positive), or B2 must actually equal zero. We can also see various ways in which b1 can merely approximate B1.

I more or less summarized this example from Wooldridge, Introductory Econometrics: A Modern Approach, since he did a pretty good job with it. So, you can do a bit of thinking with the output and the theory and make reasonable conclusions regarding the correlation between x1 and x2 or Xi and all other X's (if the model had more than 2 X variables).
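Not part of the thread, but here is a minimal numpy sketch of that identity; the data-generating process (the 0.6 cross-loading, the noise, the sample size) is made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Simulated predictors: x2 partly depends on x1 (set 0.6 to 0.0 for the uncorrelated case)
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
y = 5 + 5 * x1 + 2 * x2 + rng.normal(size=n)

def ols(X, y):
    # Prepend an intercept column and return the least-squares coefficients.
    X = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

b0, b1 = ols(x1.reshape(-1, 1), y)               # short model: y on x1 only
B0, B1, B2 = ols(np.column_stack([x1, x2]), y)   # long model: y on x1 and x2
d0, d1 = ols(x1.reshape(-1, 1), x2)              # auxiliary regression: x2 on x1

print(b1, B1 + B2 * d1)   # equal up to floating-point error: b1 = B1 + B2*d1
print(b1 - B1)            # close to 0 only when x1 and x2 are uncorrelated (d1 ~ 0)
```

With the cross-loading set to 0.0 the x1 coefficient is essentially the same in both fits, which is exactly the pattern in the quiz question (5 in both Model 1 and Model 2) with B2 = 2 nonzero.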

I think B has problems that have come up before: you can safely state that a 1 unit change in X1 changes Y by a greater magnitude than a 1 unit change in X2 does, but it gets hairy to say anything beyond a plain statement like that when X1 and X2 are measured on different scales. A 1 inch increase in height may decrease life expectancy by 5 years, while a 1 pound increase in yearly meat consumption may decrease life expectancy by 2 years, but who's to say a 1 inch change is equivalent to, or as meaningful as, a 1 pound change in meat consumption? The most we can do with that, I think, is speak strictly in terms of magnitudes and not importance, and choice B seems to stay close enough to the safe answer of just magnitudes.

Choice A is possible, too, I think, based on the explanation I mentioned from Wooldridge. I think I might be missing something or the question wasn't written as well as it could have been.
 

hlsmith

Omega Contributor
#7
Additional random food for thought: what if X1 and X2 are measured in exactly the same units, but one of them is then transformed into different incremental units? They could have exactly the same effect on Y if expressed on the same scale, yet show different coefficients.
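A quick sketch of that point (again with simulated, made-up data, not anything from the thread): re-expressing one predictor in coarser units inflates its coefficient even though the underlying effect on Y is identical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

x1 = rng.normal(size=n)                 # say, a length measured in inches
x2_in = rng.normal(size=n)              # a second length, also in inches
y = 5 + 5 * x1 + 5 * x2_in + rng.normal(size=n)   # both have the same true effect on y

def ols(X, y):
    X = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Same data, fit once with x2 in inches and once with x2 converted to feet
_, c1, c2_in = ols(np.column_stack([x1, x2_in]), y)
_, c1b, c2_ft = ols(np.column_stack([x1, x2_in / 12]), y)

print(c2_in)   # ~5 per inch
print(c2_ft)   # ~60 per foot: the same effect, just a 12x larger per-unit coefficient
```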
 
#8
Not to read too much into the question, but as presented it says the models "predicted" the relationships listed. It doesn't actually say that the predictions were correct.
 

ondansetron

TS Contributor
#9
Not to read too much into the question, but as presented it says the models "predicted" the relationships listed. It doesn't actually say that the predictions were correct.
I think "predicted" is used to mean "y-hat" as in predicted y-values are computed using the right hand side (RHS) of the equations. Notice this makes sense since there is no expectation on the left hand side nor is there any random error component on the RHS.
 
#10
Thanks for the help. The correct answer turns out to be A. So yeah, I think the slope of the first predictor (which is 5) should change when adding the second if they were correlated; but it does not, and therefore X1 and X2 are uncorrelated.