## Econometrics/Regression/Statistics Help!

Hi guys!

I know this may be a little self-centered of me, asking for homework help as my first post, but I am extremely desperate! My tutor doesn't answer my homework questions when I email her, and we don't have time to discuss this assignment because she only goes through stuff she has planned during class hours!

So here it goes... (I've made my best attempt to answer to the best of my knowledge)

Question 3 [10 marks]
The Gauss-Markov Theorem shows that OLS is BLUE, so we hope and expect that our coefficient estimates will be unbiased and minimum variance. Suppose, however, that you had to choose one or the other.
a. If you had to pick one, would you rather have an unbiased non-minimum variance estimate or a biased minimum variance one? Explain.
For this, I think I would pick unbiased with non-minimum variance, purely because OLS is meant to be BLUE (Best Linear Unbiased Estimator), so that's kind of my explanation behind it. However, I have a hunch that this is completely incorrect, and I am going to be wrong.
b. Are there circumstances in which you might change your answer to part a.? (Hint: Does it matter how biased or less-than-minimum variance the estimates are?)
Since my answer in part a) is likely incorrect, I'm not sure what to put here. The only time it may change is when we aren't dealing with the Gauss-Markov Theorem? (And thus the variances aren't BLUE?)
c. Can you think of a way to systematically choose between estimates that have varying amounts of bias and less-than-minimum variance?
No idea here.
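
One criterion that usually comes up for this kind of comparison is mean squared error (MSE), which equals the variance plus the squared bias. I'm not certain this is what the marker wants, so treat the following as a minimal sketch only. The function name `mse` and all of the numbers are made up for illustration and do not come from the assignment.

```python
def mse(bias: float, variance: float) -> float:
    """Mean squared error of an estimator: variance plus squared bias."""
    return variance + bias ** 2


# Hypothetical estimators, with numbers chosen only to show the trade-off.
unbiased_but_noisy = mse(bias=0.0, variance=4.0)
slightly_biased_but_tight = mse(bias=1.0, variance=1.0)
badly_biased_but_tight = mse(bias=3.0, variance=1.0)

print("unbiased, high variance:", unbiased_but_noisy)
print("small bias, low variance:", slightly_biased_but_tight)
print("large bias, low variance:", badly_biased_but_tight)
```

Under this criterion the ranking depends on how large the bias is relative to the variance, which seems to be what the hint in part b is getting at.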

Question 4 [10 marks]
Suppose that you estimate a model of house prices to determine the impact of having beach frontage on the value of a house. You do some research and decide to use the size of the lot instead of the size of the house for a number of theoretical and data availability reasons. Your results (with standard errors in parentheses) are:

PRICEi-hat = 40 + 35.0 LOTi - 2.0 AGEi + 10.0 BEDi
(5.0) (1.0) (10.0)
N = 30
Where:
PRICEi = the price of the ith house (in thousands of dollars)
LOTi = the size of the lot of the ith house (in thousands of square feet)
AGEi = the age of the ith house in years
BEDi = the number of bedrooms in the ith house

a. You expect the variables LOT and BED to have positive coefficients. Create and test the appropriate hypotheses to evaluate these expectations at the 5 percent level.
So here, do I conduct a hypothesis test where H0 is that b2 and b4 are > 0, and Ha is that either b2 or b4 is less than or equal to 0? (Also, would I be looking at p-values here to decide which hypothesis to reject?)

b. You expect AGE to have a negative coefficient. Create and test the appropriate hypotheses to evaluate these expectations at the 10 percent level.
Similar to above? Except I'm testing H0: b3<0 and alpha = 0.10. Then I need to use the p-value to compare?
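
For parts a and b, the t-statistics can at least be worked out from the numbers given. This is a rough sketch under several assumptions: that the standard errors (5.0), (1.0), (10.0) belong to LOT, AGE and BED in that order, that the usual textbook setup applies (the expected sign goes in the alternative hypothesis), and that the one-sided critical values for 26 degrees of freedom are the ones read from a standard t-table. The names `estimates`, `t_stat`, `T_CRIT_5PCT` and `T_CRIT_10PCT` are just labels I made up.

```python
# Coefficients and standard errors copied from the estimated equation.
# Assumption: the standard errors are listed in the order LOT, AGE, BED.
estimates = {"LOT": (35.0, 5.0), "AGE": (-2.0, 1.0), "BED": (10.0, 10.0)}

n_obs = 30
n_slopes = 3
df = n_obs - n_slopes - 1  # degrees of freedom: N - K - 1

# Assumed one-sided critical values for 26 df from a standard t-table.
T_CRIT_5PCT = 1.706
T_CRIT_10PCT = 1.315


def t_stat(coef: float, se: float, hypothesized: float = 0.0) -> float:
    """t-statistic for testing a coefficient against a hypothesized value."""
    return (coef - hypothesized) / se


t_values = {name: t_stat(coef, se) for name, (coef, se) in estimates.items()}

# Part a (5 percent): H0: beta <= 0 vs HA: beta > 0, reject if t > critical value.
reject_lot = t_values["LOT"] > T_CRIT_5PCT
reject_bed = t_values["BED"] > T_CRIT_5PCT

# Part b (10 percent): H0: beta >= 0 vs HA: beta < 0, reject if t < -critical value.
reject_age = t_values["AGE"] < -T_CRIT_10PCT

print("df:", df)
print("t-values:", t_values)
print("reject H0 for LOT, BED, AGE:", reject_lot, reject_bed, reject_age)
```

This compares each t-statistic with a critical value rather than using p-values; as far as I understand, the two approaches lead to the same decision.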

c. What is the main problem with your equation? What explanation or solution can you think of for this problem?
I think the main problem is that:
1 - Bed is not significant, when it is expected to be significant (If you think about it, the number of bedrooms will directly impact the price of a house?)
2 - The number of bedrooms will be related to lot size, so there is a problem with correlation between the independent variables.

Question 5 [10 marks]
Consider the following estimated regression equation (standard error in parentheses):
Yt-hat = -120 + 0.10 Ft + 5.33 Rt
(0.05) (1.0)

Where
Yt = the corn yield (bushels/acre) in year t
Ft = fertilizer intensity (pounds/acre) in year t
Rt = rainfall (inches) in year t

a. Carefully state the meaning of the coefficients 0.10 and 5.33 in the equation in terms of the impact of F and R on Y.
I can do this.
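
To double-check my reading of part a, here is a small sketch that plugs values into the estimated equation. The function name `predicted_yield` and the starting values of 100 pounds/acre and 30 inches are hypothetical, picked only so the arithmetic is easy to follow.

```python
def predicted_yield(fertilizer: float, rainfall: float) -> float:
    """Fitted corn yield (bushels/acre) from the estimated equation."""
    return -120.0 + 0.10 * fertilizer + 5.33 * rainfall


# Hypothetical baseline: 100 pounds/acre of fertilizer, 30 inches of rain.
base = predicted_yield(100.0, 30.0)

# Raise one variable by one unit while holding the other constant.
one_more_pound = predicted_yield(101.0, 30.0) - base
one_more_inch = predicted_yield(100.0, 31.0) - base

print("baseline fitted yield:", round(base, 2))
print("change from +1 pound/acre of fertilizer:", round(one_more_pound, 2))
print("change from +1 inch of rain:", round(one_more_inch, 2))
```

The two differences should match the slope coefficients, which is the "holding the other variable constant" interpretation.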
b. Does the constant term of -120 really mean that negative amounts of corn are possible? Explain.
I was hinted by a classmate to look at the Gauss-Markov Theorem on this question, but I have no clue for what part and how. For b), I would've written that it's impossible to produce a negative amount of corn.
c. Suppose you were told that the true value of the coefficient on F is known to be 0.20. Does this show that the estimate is biased? Why or why not?
I'm not sure what to do here. I'm supposed to compare the value of 0.20 with -120?
d. Suppose you were told that the equation does not meet all the classical assumptions and, therefore, is not BLUE. Does this mean that the true coefficient on R is definitely not equal to 5.33? Why or why not?
When someone asked this question in class, the tutor responded that "it doesn't necessarily mean it's biased", and left it at that. Do I write that there's no evidence to prove that it is biased, hence it remains unbiased until proven to be biased?

Question 6 [10 marks]
Discuss the following statements briefly giving explanations as to their degree of correctness:

The following questions all required judgment

a. A study that only presents a linear equation is of virtually no use to the researcher or the reader.
I think this really depends on the number of independent variables that would be statistically significant in the prediction/estimation of the dependent variable. (For example, you would have a lot more factors that determine whether a plane will run (mechanics, petrol, etc.) in comparison to whether you will get out of bed (tired or not).)

b. You should run a whole variety of non-linear equations and pick the one with the biggest R squared and the best F-statistic.
This would really depend on the model. The high R-squared could coincidentally be caused by an independent variable which shouldn't even be in the model. We need to take a look at our independent variables and see whether the model is a good fit.

c. A study that only presents a double-log transformed equation is fine because it gives you elasticities automatically and that is what most economists really want to see.
Although it automatically gives elasticities, we may want to see incremental changes by making unit changes to the independent variables. Hence double-log functions are not always the most useful.

d. It would not be a good idea to try a quadratic on every variable in an equation 'just to see' if it would work, without taking prior suggestions from economic theory as to which variables should be given a quadratic transformation.
No idea on this question. I thought a model was only quadratic if it behaved that way as we changed the values of independent variables.

Question 7 [10 marks]
Suppose a regression model has been estimated to explain the number of people jogging a kilometre or more on the school running track to help decide whether to build a second track. Two possible explanatory equations are estimated.
A:
B:
Where
Y: the number of joggers on a given day
= centimetres of rain that day
= hours of sunshine that day
= the high temperature for that day (in degrees Celsius)
= the number of classes with an assignment due the next day
a. Which of the two equations do you prefer? Why?
I put the one with a higher R-squared.
b. How is it possible to get different estimated signs for the coefficient of the same variable using the same data?
The coefficient they are talking about is in regards to the number of hours of sun during that day.
I wrote that some people enjoy the sun while running, while others hate it (due to heat). Am I correct in saying so?

Thanks to anybody who can help - every little bit is appreciated!

I'm trying to lift my GPA because it's really bad at the moment, so I'm aiming high in this assignment.

THANK YOU EVERYONE!