
Thread: Simple Linear Regression Coefficient Estimation Across n Samples

  1. #1

    Hello,

    I was hoping that somebody could provide some assistance on the following topic:

    Consider a random variable Y with mean E[Y] = \alpha + \beta x and variance \sigma^{2}. Assume n independent values of Y are observed and split into two samples of sizes n_{1} and n_{2} (n_{2} = n-n_{1}). Furthermore, assume that the observations in both samples share a slope/gradient (\beta) but have differing intercepts (\alpha_{1} and \alpha_{2}). How can it be shown that:

    \hat{\beta} = \frac{w_{1}\hat{\beta}_{1} + w_{2}\hat{\beta}_{2}}{w_{1}+w_{2}}

    where \hat{\beta}_{1} and \hat{\beta}_{2} are the estimates of \beta derived from each sample, and w_{1} and w_{2} are the sums of squares of the x values (independent variable) about their mean within each sample.
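
    As a sanity check before attempting a proof, the identity can be tested numerically. The following is only a sketch on simulated data: the function names `slope_and_weight` and `common_slope`, the sample sizes, and the true parameter values are all made up for illustration. It fits the shared-slope model by least squares with one indicator column per sample, then compares the fitted slope with the weighted average of the two within-sample slopes.

```python
import numpy as np

def slope_and_weight(x, y):
    """Within-sample least-squares slope and w = sum of (x - xbar)^2."""
    xc = x - x.mean()
    w = np.sum(xc ** 2)
    return np.sum(xc * (y - y.mean())) / w, w

def common_slope(x1, y1, x2, y2):
    """Least-squares fit of y = alpha_j + beta * x with one shared slope."""
    x = np.concatenate([x1, x2])
    y = np.concatenate([y1, y2])
    g = np.concatenate([np.zeros(len(x1)), np.ones(len(x2))])
    # Columns: indicator of sample 1, indicator of sample 2, shared x.
    X = np.column_stack([1.0 - g, g, x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # alpha_1, alpha_2, beta

# Simulated data (assumed values): same slope 2.0, different intercepts.
rng = np.random.default_rng(0)
x1 = rng.uniform(0, 10, 12)
x2 = rng.uniform(5, 20, 20)
y1 = 1.0 + 2.0 * x1 + rng.normal(0, 1, 12)
y2 = 4.0 + 2.0 * x2 + rng.normal(0, 1, 20)

b1, w1 = slope_and_weight(x1, y1)
b2, w2 = slope_and_weight(x2, y2)
weighted = (w1 * b1 + w2 * b2) / (w1 + w2)
alpha1, alpha2, beta = common_slope(x1, y1, x2, y2)

print("weighted average of slopes:", weighted)
print("shared-slope LS estimate:  ", beta)
```

    The two printed values should agree up to floating-point error, which suggests the stated form is the least-squares solution of the shared-slope model rather than an approximation.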

    In addition, what form does Var(\hat{\beta}) have?
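
    One possible route to the variance, sketched under the assumptions that the x values are fixed, that the two samples are independent, and that the standard simple-regression result Var(\hat{\beta}_{j}) = \sigma^{2}/w_{j} holds within each sample:

```latex
\operatorname{Var}(\hat{\beta})
  = \frac{w_{1}^{2}\operatorname{Var}(\hat{\beta}_{1}) + w_{2}^{2}\operatorname{Var}(\hat{\beta}_{2})}{(w_{1}+w_{2})^{2}}
  = \frac{w_{1}^{2}\,\dfrac{\sigma^{2}}{w_{1}} + w_{2}^{2}\,\dfrac{\sigma^{2}}{w_{2}}}{(w_{1}+w_{2})^{2}}
  = \frac{\sigma^{2}}{w_{1}+w_{2}}
```

    The first equality uses independence of \hat{\beta}_{1} and \hat{\beta}_{2}, so no covariance term appears.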




    Finally, for clarity, following on from the above model, is it correct to assume that the estimated intercept for each sample is given by:

    \hat{\alpha}_{j} = \bar{Y}^{(j)}-\hat{\beta}_{j}\bar{X}^{(j)} \text{, for } j=1,2

    and that the estimated gradient/slope for each sample is given by:

    \hat{\beta}_{j} = \frac{\sum_{i=1}^{n_{j}} X_{i}^{(j)}Y_{i}^{(j)}-n_{j}\bar{X}^{(j)}\bar{Y}^{(j)}}{\sum_{i=1}^{n_{j}} (X_{i}^{(j)})^{2}-n_{j}(\bar{X}^{(j)})^{2}} = \frac{S_{XY}^{(j)}}{S_{XX}^{(j)}} \text{, for } j=1,2

    (Note: the (j) superscript in the above (X^{(j)}, Y^{(j)}, etc.) denotes the sample to which each data point belongs; it is not an exponent.)


    Edit:

    I should probably mention that, originally, I had assumed that I could use the method of least squares to sum the SS_{E} terms from each sample, set the partial derivative w.r.t. \beta equal to zero and solve for \beta; such that:

    \sum_{j=1}^{2} \frac{\partial SS_{E}^{(j)}}{\partial \beta} = \frac{\partial SS_{E}^{(1)}}{\partial \beta} + \frac{\partial SS_{E}^{(2)}}{\partial \beta} = 0

    \Rightarrow \frac{\partial}{\partial \beta} \left[ \sum_{i=1}^{n_{1}} (y_{i}^{(1)}-\hat{\alpha}_{1}-\hat{\beta}x_{i}^{(1)})^{2} + \sum_{i=n_{1}+1}^{n} (y_{i}^{(2)}-\hat{\alpha}_{2}-\hat{\beta}x_{i}^{(2)})^{2} \right] = 0

    \Rightarrow -2 \sum_{i=1}^{n_{1}} (y_{i}^{(1)}-\hat{\alpha}_{1}-\hat{\beta}x_{i}^{(1)})\,x_{i}^{(1)} - 2 \sum_{i=n_{1}+1}^{n} (y_{i}^{(2)}-\hat{\alpha}_{2}-\hat{\beta}x_{i}^{(2)})\,x_{i}^{(2)} = 0

    However, this does not appear to be equivalent to the form given above, so I'm assuming this is not a valid assumption.

    Thanks in advance!
    Last edited by Rupert; 10-24-2012 at 07:35 AM. Reason: Improved formatting. Additional Information.

  2. #2
    hlsmith

    This is a detailed and very specific series of questions. Is this homework? If so, there are policies for submitting homework questions, such as showing your progress and attempts at a solution.

    If you want to check whether the slopes are equal with only the intercepts potentially varying, you can merge the datasets, add a new variable dataset_source (1 or 2), and place that variable in the model. If dataset_source is significant, the intercepts differ; to test whether the slopes differ (cross), also include the dataset_source-by-x interaction and check whether that term is significant. Let us know your particular hang-ups in solving this question.
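
    The merged-dataset idea can be sketched as follows. This is an illustration on simulated data, not a full analysis: the 0/1 coding of `source`, the sample sizes, and the true coefficients are assumptions, and no p-values are computed here (a statistics package would be needed for that). In this parameterisation the coefficient on `source` alone is the intercept shift between the two datasets, while the coefficient on the `source * x` interaction is the difference between the two slopes.

```python
import numpy as np

# Simulated merged dataset (assumed values): 20 rows from each source.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 40)
source = np.repeat([0.0, 1.0], 20)  # hypothetical dataset_source, coded 0/1
y = 1.0 + 3.0 * source + 2.0 * x + rng.normal(0, 1, 40)

# Model A: common slope, intercept shift only.
XA = np.column_stack([np.ones_like(x), source, x])
coefA, *_ = np.linalg.lstsq(XA, y, rcond=None)

# Model B: adds the source-by-x interaction, so the slopes may differ.
XB = np.column_stack([np.ones_like(x), source, x, source * x])
coefB, *_ = np.linalg.lstsq(XB, y, rcond=None)

# Separate straight-line fits per source; np.polyfit returns [slope, intercept].
slope0 = np.polyfit(x[source == 0], y[source == 0], 1)[0]
slope1 = np.polyfit(x[source == 1], y[source == 1], 1)[0]

print("Model A intercept shift:", coefA[1], "common slope:", coefA[2])
print("Model B slope difference:", coefB[3])
print("slope1 - slope0:         ", slope1 - slope0)
```

    The interaction coefficient in Model B matches the difference of the separately fitted slopes, which is why a test on that term (rather than on `source` itself) addresses whether the slopes differ.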

  3. #3

    Hi hlsmith,

    Firstly, thanks for your response.

    Yes, this question is part of a homework exercise.
    As you can see, my original attempt to prove that \hat{\beta} can be written in the above form used least squares (I'm pretty sure this is how it is supposed to be done); however, setting the partial derivative of SS_{E} w.r.t. \beta equal to zero and solving for each sample has not yielded the above form. Thus, I have concluded that there is an error either in the algebra or in the reasoning.

    My difficulty is, given the information above, showing that \beta can be expressed in the above form.
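
    For reference, here is one way the least-squares argument might be completed. It is a sketch under the stated model (shared slope, separate intercepts), writing S for the total error sum of squares, w_{j} = S_{XX}^{(j)}, and using the fact that the within-sample slope satisfies S_{XY}^{(j)} = w_{j}\hat{\beta}_{j}. The step that is easy to miss is that the intercepts must be estimated jointly with the common slope, so they involve \hat{\beta} rather than \hat{\beta}_{j}.

```latex
% Objective: S(\alpha_1, \alpha_2, \beta) = \sum_{j=1}^{2} \sum_{i=1}^{n_j} (y_i^{(j)} - \alpha_j - \beta x_i^{(j)})^2

% Step 1: \partial S / \partial \alpha_j = 0 gives the intercepts in terms of the common slope
\hat{\alpha}_{j} = \bar{y}^{(j)} - \hat{\beta}\,\bar{x}^{(j)}, \qquad j = 1, 2

% Step 2: substitute into \partial S / \partial \beta = 0
\sum_{j=1}^{2} \sum_{i=1}^{n_{j}}
  \left[ (y_{i}^{(j)} - \bar{y}^{(j)}) - \hat{\beta}\,(x_{i}^{(j)} - \bar{x}^{(j)}) \right] x_{i}^{(j)} = 0

% Step 3: the bracketed terms sum to zero within each sample,
% so x_i^{(j)} may be replaced by x_i^{(j)} - \bar{x}^{(j)}
\sum_{j=1}^{2} \left[ S_{XY}^{(j)} - \hat{\beta}\, w_{j} \right] = 0

% Step 4: solve for \hat{\beta} and use S_{XY}^{(j)} = w_j \hat{\beta}_j
\hat{\beta} = \frac{S_{XY}^{(1)} + S_{XY}^{(2)}}{w_{1} + w_{2}}
            = \frac{w_{1}\hat{\beta}_{1} + w_{2}\hat{\beta}_{2}}{w_{1} + w_{2}}
```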

  4. #4
    hlsmith

    Good luck. Hopefully others will chime in, since I am not the best person to evaluate or check such equations. Less than 24 hours ago I did not know that the flipped-looking "e" was actually a rounded "d", but I am hoping to get there.
