
Thread: hypothesis testing for least squares fitting

  1. #1
    I have a spectrum that is composed of several components, and I use least squares fitting to determine the amount of each one. For example: Spectrum = c1*x1 + c2*x2 + c3*x3 + c4*x4 + c5*x5. The c's are the percentages I get from the least squares fit, and the x's are the spectra of the individual components. Now I add a 6th component, x6, and want to know whether it is present in my spectrum. I do the linear least squares fit again, this time with 6 variables. The fit will improve simply because I allow it more freedom. What I want to know is: is the 6-component model a statistically significant improvement, or can I reject it because the fit only improved through the extra degree of freedom? Essentially, I want to be able to say that x6 is not statistically significant and can be rejected. I believe hypothesis testing is what I want. My null hypothesis is the 5-component model and my alternative is the 6-component model.
    The problem is, I need an average, and it doesn't make sense to take an average because this is an absorbance spectrum and not something centered on 0 or some other average value. Can I use the residuals (the fit spectrum minus the actual spectrum) to do hypothesis testing? If so, how do I do hypothesis testing on them?

    Thanks,
    Lisa
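
For readers following along, here is a minimal sketch of the setup described in this post: fitting the same spectrum with 5 and then 6 component spectra and collecting the residual sum of squares from each fit. Everything in it is an assumption for illustration: the component spectra and the measured spectrum are synthetic random data, NumPy is assumed to be available, and the helper name `fit_rss` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points = 229  # points per spectrum, as in the thread

# Synthetic stand-ins for the component spectra x1..x6 (one column each).
components = rng.random((n_points, 6))
# Synthetic "measured" spectrum: x6 truly contributes nothing here.
true_c = np.array([0.30, 0.25, 0.20, 0.15, 0.10, 0.0])
spectrum = components @ true_c + rng.normal(scale=1e-3, size=n_points)

def fit_rss(X, y):
    """Least squares fit of y on the columns of X; returns (coefficients, RSS)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, float(resid @ resid)

c5, rss5 = fit_rss(components[:, :5], spectrum)  # 5-component model
c6, rss6 = fit_rss(components, spectrum)         # 6-component model

print("RSS with 5 components:", rss5)
print("RSS with 6 components:", rss6)
```

Because the 5-component model is nested inside the 6-component one, the second RSS can never be larger than the first, which is exactly why a drop in RSS alone does not show that x6 is present.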

  2. #2
    Running a linear regression (using Excel, for example) is essentially the same as least squares fitting. To check the significance of the variables in the model, look at the t-statistic calculated for each coefficient as part of the regression. As a rule of thumb, if the absolute value of the t-statistic for a variable is greater than about 2, it is significant at roughly the 5% level. Also check the |t| of the intercept; if it is not significant, force it to be zero.
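
For anyone not using Excel, the t-statistics mentioned in this reply can be computed directly from a least squares fit. The sketch below assumes NumPy, uses synthetic data in place of real spectra, and the function name `t_statistics` is hypothetical. It relies on the usual ordinary least squares assumptions (independent errors with constant variance).

```python
import numpy as np

def t_statistics(X, y):
    """Return (coefficients, t-statistics) for an ordinary least squares fit."""
    n, p = X.shape
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (n - p)          # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)     # covariance of the coefficients
    std_err = np.sqrt(np.diag(cov))
    return coef, coef / std_err

# Synthetic example: six component spectra, the sixth has a true weight of 0.
rng = np.random.default_rng(1)
X = rng.random((229, 6))
y = X @ np.array([0.30, 0.25, 0.20, 0.15, 0.10, 0.0])
y = y + rng.normal(scale=0.01, size=229)

coef, t = t_statistics(X, y)
for i, (c, ti) in enumerate(zip(coef, t), start=1):
    print(f"c{i} = {c:.4f}, t = {ti:.2f}")
```

When exactly one term is added to a model, the square of its t-statistic equals the F statistic for comparing the two nested models, so the two approaches give the same answer.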

  3. #3
    Andre, if the constant is not significant, you still keep it. You do not want to force it to be zero.

  4. #4
    Masteras - I'm curious, could you please expand on why you would want to keep the intercept in your equation if it is not significantly contributing to the result?

  5. #5
    Quote, originally posted by Andre Smit:
    Masteras - I'm curious, could you please expand on why you would want to keep the intercept in your equation if it is not significantly contributing to the result?

    It depends on what one is doing.

    For example, the role of the intercept term is to ensure that the mean of the predicted scores is equal to the mean of the dependent variable. This may or may not be important.
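
A small numeric check of that point, as a sketch with made-up data (the x and y values below are assumptions chosen only for illustration): with an intercept, the fitted values average to the mean of y; when the line is forced through the origin, they generally do not.

```python
# Made-up data for a one-predictor regression.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Fit with an intercept: y = intercept + slope * x
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sxy / sxx
intercept = y_bar - slope * x_bar
fitted_with = [intercept + slope * xi for xi in x]

# Fit forced through the origin: y = slope0 * x
slope0 = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
fitted_without = [slope0 * xi for xi in x]

mean_with = sum(fitted_with) / n
mean_without = sum(fitted_without) / n
print("mean of y:                  ", y_bar)
print("mean fitted, with intercept:", mean_with)
print("mean fitted, through origin:", mean_without)
```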

  6. #6
    Got it - i.e. if you'll be using the regression equation for predictions. I guess with an insignificant intercept the prediction error would be very large.

  7. #7
    I have already done the least squares fitting on hundreds of data sets. Now I want to take those fits and determine, for a few of them, whether one model is significantly better than the other. I don't use Excel and would like to do this with what I have already calculated, using a mathematical expression. I've been looking at the F-test on Wikipedia: http://en.wikipedia.org/wiki/F-test. I already have residuals, so I'd like to use

    F = ((RSS1 - RSS2)/(p2 - p1)) / (RSS2/(n - p2))

    but I'm not sure if I'm doing this correctly. I have 2 models, one with 5 components and one with 6. Each spectrum has 229 points. RSS is the residual sum of squares. So, do I take the fit spectrum minus the actual spectrum to get the residuals, square each of those residuals (229 of them), and add them all together? I do this for both model 1 and model 2 to get RSS1 and RSS2. p1 and p2 are the numbers of parameters in the two models, so I think these would be 5 and 6. n is the number of data points, so I believe this is 229. If I'm doing this right, here are the results for 2 different experiments.
    1st one:
    F = ((4.5248e-5 - 1.3452e-5)/1) / (1.3452e-5/(229 - 6)) = 527.1135
    2nd one:
    F = ((1.3288e-5 - 1.3197e-5)/1) / (1.3197e-5/(229 - 6)) = 1.537697

    I then go to an F-table and look for F(1,223). I used this one: http://www.itl.nist.gov/div898/handb...n3/eda3673.htm for 5% significance.
    It doesn't go up to 223, but the values aren't changing much at high degrees of freedom, so I used 100. The critical value is 3.936.

    So, for the first F-test, can I say the 2nd model is significant with 95% confidence? And for the 2nd F-test, is it right that I cannot say the 6th component is statistically significant and must use the 1st model with only 5 components?

    Lisa
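
The arithmetic in this post can be reproduced with a few lines of plain Python. This is only a sketch: the RSS values, parameter counts, and the 3.936 critical value are the ones quoted in the post (the latter being the table entry for F(1,100), used as a stand-in for F(1,223)), and the function name `nested_f_statistic` is hypothetical.

```python
def nested_f_statistic(rss1, rss2, p1, p2, n):
    """F statistic comparing a restricted model (1) with a fuller nested model (2)."""
    return ((rss1 - rss2) / (p2 - p1)) / (rss2 / (n - p2))

F_CRIT_5PCT = 3.936  # table value for F(1, 100) quoted in the post

# (RSS1, RSS2) pairs as quoted in the post.
experiments = {
    "experiment 1": (4.5248e-5, 1.3452e-5),
    "experiment 2": (1.3288e-5, 1.3197e-5),
}

results = {}
for name, (rss1, rss2) in experiments.items():
    f_stat = nested_f_statistic(rss1, rss2, p1=5, p2=6, n=229)
    results[name] = f_stat
    verdict = "reject" if f_stat > F_CRIT_5PCT else "fail to reject"
    print(f"{name}: F = {f_stat:.4f} -> {verdict} the 5-component model at 5%")
```

With the rounded RSS values above, the first statistic comes out at about 527.1 and the second at about 1.54, in line with the numbers in the post.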

  8. #8
    There's a formula for comparing nested models

  9. #9
    Thanks, I looked up nested models and it looks like I calculated the F values correctly. Now I need help with how to state the conclusions.
    Case 1: F = 527 for F(1,223). I used this table: http://www.itl.nist.gov/div898/handb...n3/eda3673.htm and for 1% significance (F(1,100)), F critical is 6.9. My F is WAY bigger, so do I say I reject model 1 with 99% significance or with a 99% confidence interval? How do I state the conclusion?
    Case 2: F = 3.85 for F(1,223). This is lower than the F critical value for 5% significance but not for 10%. What do I state here? That I cannot reject model 1 with ??? significance?

    Please help! I'm trying to wrap up my thesis.

    Thanks,
    Lisa
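
One way to sidestep the table lookup is to compute a p-value for each F statistic and report it against a chosen significance level. The sketch below uses only the Python standard library. It relies on the fact that an F(1, d) variable is the square of a t variable with d degrees of freedom, and integrates the t density numerically with Simpson's rule. The function names are hypothetical, and the F values are the ones quoted in the post above.

```python
import math

def t_pdf(t, dof):
    """Density of Student's t distribution with `dof` degrees of freedom."""
    log_norm = math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
    return (math.exp(log_norm) / math.sqrt(dof * math.pi)
            * (1 + t * t / dof) ** (-(dof + 1) / 2))

def f_test_p_value(f_stat, dof_resid, steps=4000):
    """Upper-tail p-value for F(1, dof_resid), using F = t^2 and Simpson's rule."""
    t = math.sqrt(f_stat)
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, dof_resid) + t_pdf(t, dof_resid)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof_resid)
    central_area = 2 * total * h / 3  # P(-t < T < t)
    return max(0.0, 1.0 - central_area)

p_values = {}
for label, f_stat in [("case 1", 527.0), ("case 2", 3.85)]:
    p = f_test_p_value(f_stat, dof_resid=223)
    p_values[label] = p
    print(f"{label}: F(1, 223) = {f_stat}, p = {p:.3g}")
```

A common way to word the result is in terms of the significance level rather than "99% significance": for case 1, "the 5-component model is rejected in favor of the 6-component model at the 1% significance level"; for case 2, "the 5-component model cannot be rejected at the 5% level", optionally quoting the p-value itself. Failing to reject is not proof that x6 is absent, only that the data do not show a significant improvement from including it.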
