Simple Linear Regression Coefficient Estimation Across n Samples
Hello,
I was hoping that somebody could provide some assistance on the following topic:
Consider a random variable $Y$ with mean $\alpha + \beta x$ and variance $\sigma^2$. Assume $n$ independent values of $Y$ are observed and are split into two samples, labelled $1$ and $2$; furthermore, assume that the observations in both samples share a slope/gradient ($\beta$), but have differing intercepts ($\alpha^{1}$, $\alpha^{2}$). How can it be shown that:
$$\hat{\beta} = \frac{S_{xx}^{1}\hat{\beta}^{1} + S_{xx}^{2}\hat{\beta}^{2}}{S_{xx}^{1} + S_{xx}^{2}}$$
where $\hat{\beta}^{1}$ and $\hat{\beta}^{2}$ are the estimates of $\beta$ derived from each sample, and $S_{xx}^{1}$ and $S_{xx}^{2}$ are the sums of squares of the $x$ values (independent variables) about their mean for each sample.
In addition, what form does have?
Finally, for clarity, following on from the above model, is it correct to assume that the intercept ($\alpha$) for each sample will be given by:
And the gradient/slope ($\beta$) is given by:
(Note: The superscripts in the notation above ($^{1}$, $^{2}$, etc.) are intended to denote the sample to which each data point belongs; they are not exponents.)
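A quick numerical sanity check of the identity in question, namely that the common-slope least-squares estimate equals the average of the per-sample slopes weighted by each sample's sum of squares of $x$ about its mean. This is only a minimal sketch: the data are made up, the helper names (`sxx`, `sxy`, `pooled_slope`, `common_slope_sse`) are hypothetical, and it assumes each sample's intercept is set to its least-squares value for whatever common slope is being tried.

```python
def mean(v):
    return sum(v) / len(v)

def sxx(x):
    """Sum of squares of x about its mean."""
    m = mean(x)
    return sum((xi - m) ** 2 for xi in x)

def sxy(x, y):
    """Sum of cross-products of x and y about their means."""
    mx, my = mean(x), mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))

def slope(x, y):
    """Ordinary least-squares slope for a single sample."""
    return sxy(x, y) / sxx(x)

def pooled_slope(samples):
    """Average of the per-sample slopes, weighted by each sample's S_xx."""
    num = sum(sxx(x) * slope(x, y) for x, y in samples)
    den = sum(sxx(x) for x, _ in samples)
    return num / den

def common_slope_sse(samples, beta):
    """Residual sum of squares for a given common slope, where each sample
    uses its own least-squares intercept for that slope."""
    total = 0.0
    for x, y in samples:
        a = mean(y) - beta * mean(x)
        total += sum((yi - a - beta * xi) ** 2 for xi, yi in zip(x, y))
    return total

# Made-up data: two samples with similar slopes but different intercepts.
sample1 = ([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8])
sample2 = ([2.0, 4.0, 6.0, 8.0, 10.0], [9.0, 13.2, 16.9, 21.1, 24.8])
samples = [sample1, sample2]

b = pooled_slope(samples)
print("pooled slope:", b)
# If the identity holds, nudging the slope either way should increase the SSE.
print("SSE at b:      ", common_slope_sse(samples, b))
print("SSE at b+0.001:", common_slope_sse(samples, b + 1e-3))
print("SSE at b-0.001:", common_slope_sse(samples, b - 1e-3))
```

If the weighted-average form is right, the residual sum of squares at `b` should be smaller than at any nearby slope, which is what the last three lines let you compare.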
Edit:
I should probably mention that, originally, I had assumed that I could use the method of least squares to sum the squared-error terms from each sample, set the partial derivative w.r.t. $\beta$ equal to zero and solve for $\hat{\beta}$; such that:
However, this does not appear to be equivalent to the form given above, so I'm assuming this is not a valid approach.
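For reference, a sketch of how that least-squares calculation can be carried through when each sample keeps its own intercept. The symbols $n_i$ (sample sizes), $\bar{x}^{i}$, $\bar{y}^{i}$ (sample means) and $S_{xy}^{i}$ (sum of cross-products about the means) are introduced here for illustration, and superscripts are sample labels rather than exponents. Under those assumptions, the pooled form does seem to drop out:

```latex
\begin{align*}
S(\alpha^{1}, \alpha^{2}, \beta)
  &= \sum_{i=1}^{2} \sum_{j=1}^{n_i} \left( y^{i}_{j} - \alpha^{i} - \beta x^{i}_{j} \right)^{2} \\[4pt]
\frac{\partial S}{\partial \alpha^{i}} = 0
  &\;\Rightarrow\; \hat{\alpha}^{i} = \bar{y}^{i} - \hat{\beta}\,\bar{x}^{i} \\[4pt]
\frac{\partial S}{\partial \beta} = 0
  &\;\Rightarrow\; \sum_{i=1}^{2} \sum_{j=1}^{n_i} x^{i}_{j}
     \left( y^{i}_{j} - \hat{\alpha}^{i} - \hat{\beta} x^{i}_{j} \right) = 0 \\[4pt]
\text{substituting } \hat{\alpha}^{i}:\quad
  & \sum_{i=1}^{2} \sum_{j=1}^{n_i} \left( x^{i}_{j} - \bar{x}^{i} \right)
     \left[ \left( y^{i}_{j} - \bar{y}^{i} \right) - \hat{\beta} \left( x^{i}_{j} - \bar{x}^{i} \right) \right]
   = \sum_{i=1}^{2} \left( S_{xy}^{i} - \hat{\beta}\, S_{xx}^{i} \right) = 0 \\[4pt]
\hat{\beta}
  &= \frac{S_{xy}^{1} + S_{xy}^{2}}{S_{xx}^{1} + S_{xx}^{2}}
   = \frac{S_{xx}^{1}\hat{\beta}^{1} + S_{xx}^{2}\hat{\beta}^{2}}{S_{xx}^{1} + S_{xx}^{2}},
  \qquad \text{since } \hat{\beta}^{i} = \frac{S_{xy}^{i}}{S_{xx}^{i}}.
\end{align*}
```

The step that is easy to miss is the substitution: replacing $x^{i}_{j}$ by $x^{i}_{j} - \bar{x}^{i}$ is allowed because the bracketed residuals sum to zero within each sample once $\hat{\alpha}^{i}$ is plugged in.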
Thanks in advance!
Last edited by Rupert; 10-24-2012 at 07:35 AM.
Reason: Improved formatting. Additional Information.
Re: Simple Linear Regression Coefficient Estimation Across n Samples
This is a detailed and very specific series of questions; is this homework? If so, there are policies for submitting homework questions, such as showing your progress and attempts to solve.
If you want to check whether the slopes are equal with only the intercepts potentially varying, you can merge the datasets together, create a new variable dataset_source (1 or 2), then place that variable in the model. If dataset_source is significant, the intercepts differ; to check whether the slopes differ (cross), also include the interaction between dataset_source and x and test that term. Let us know your particular hang-ups in solving this question.
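A minimal sketch of that merged-dataset model, on made-up data and with hypothetical helper names (`solve`, `ols`). It assumes dataset_source is recoded as 0/1 and only computes the point estimates; judging significance would additionally need standard errors, which are omitted here.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)

# Made-up merged data: (x, y, dataset_source) with dataset_source in {1, 2}.
rows = [
    (1.0, 2.1, 1), (2.0, 3.9, 1), (3.0, 6.2, 1), (4.0, 7.8, 1),
    (2.0, 9.0, 2), (4.0, 13.2, 2), (6.0, 16.9, 2), (8.0, 21.1, 2), (10.0, 24.8, 2),
]

# Design matrix columns: intercept, x, indicator d, interaction d * x,
# where d = dataset_source - 1 (0 for the first dataset, 1 for the second).
X = [[1.0, x, float(src - 1), (src - 1) * x] for x, _, src in rows]
y = [yy for _, yy, _ in rows]

b0, b_x, b_d, b_dx = ols(X, y)
print("intercept (dataset 1):", b0)
print("slope (dataset 1):    ", b_x)
print("intercept difference: ", b_d)   # coefficient on dataset_source
print("slope difference:     ", b_dx)  # coefficient on the interaction
```

With this coding, the coefficient on the indicator is the difference in intercepts and the coefficient on the interaction is the difference in slopes, so the interaction term is the one to test when asking whether the lines are parallel.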
Re: Simple Linear Regression Coefficient Estimation Across n Samples
Hi hlsmith,
Firstly, thanks for your response.
Yes, this question is part of a homework exercise.
As you can see, my original attempts to prove that $\hat{\beta}$ can be written in the above form involved using least squares (I'm pretty sure this is how it is supposed to be done); however, setting the partial derivative of the sum of squares w.r.t. $\beta$ equal to zero and solving for each sample has not yielded the above form. Thus, I have concluded that there is an error either in the algebra or in this reasoning.
My difficulty is, given the information above, showing that $\hat{\beta}$ can be expressed in the above form.
Re: Simple Linear Regression Coefficient Estimation Across n Samples
Good luck; hopefully others chime in, since I am not the best person to evaluate or prove such equations. Less than 24 hours ago I did not know that the flipped-looking "e" ($\partial$) was actually a rounded "d", but I am hoping to get there.