Thread: Statistically different tops for 2 parabolas with non-normal distributed data

  1. #1

    In my data, situations 1 and 2 each give a parabolic relation between the variables X and Y (both parabolas have a maximum).

    My question is whether it is possible to test if the maxima of the two parabolas differ significantly from each other, given that the tops of the parabolas do not coincide with the means of the data (for both parabolas, the vast majority of the observations lie to the left of the maximum).

    If it is possible, how can I measure this? (e.g., which test can do this?)

    Thanks a lot

  2. #2
    Dason

    You could do this in a variety of ways, none of which I would describe as 'simple'. To start with, let me ask a question: if you fit a quadratic regression (a regression model of the form y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \epsilon_i) to each set of data, do the residuals appear to be approximately normally distributed?
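A minimal sketch of the check being asked about, in Python rather than SPSS. The data are simulated and the helper name `quadratic_residuals` is made up for illustration; the Shapiro-Wilk test is only one of several ways to look at residual normality (a Q-Q plot is often more informative).

```python
import numpy as np
from scipy import stats

def quadratic_residuals(x, y):
    """Fit y = b0 + b1*x + b2*x^2 by least squares; return coefficients and residuals."""
    X = np.column_stack([np.ones_like(x), x, x**2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

# Simulated data (illustrative only): a concave parabola plus normal noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 1.0 + 2.0 * x - 0.15 * x**2 + rng.normal(0, 1, 200)

beta, resid = quadratic_residuals(x, y)
w_stat, p_value = stats.shapiro(resid)  # a small p-value suggests non-normal residuals
print("coefficients:", beta)
print("Shapiro-Wilk p-value:", p_value)
```

The same fit would be run separately on the data for each situation.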
    I don't have emotions and sometimes that makes me very sad.

  3. The Following User Says Thank You to Dason For This Useful Post:

    rob1 (08-02-2014)

  4. #3

    Yes, I did the quadratic regressions, and the residuals appear to be normally distributed.

    I don't know whether this matters, but I use SPSS for the analysis.
    Last edited by rob1; 07-30-2014 at 06:14 PM.

  5. #4

    Let me continue from Dason's question, using Dason's notation, which is standard.

    From elementary facts about the parabola, we know that the maximum value is

    \frac {4\beta_0\beta_2 - \beta_1^2} {4\beta_2}

    Note: I am assuming you mean the maximum value of y rather than the axis of symmetry x = \frac {-\beta_1} {2\beta_2} (the x-value at which the maximum occurs), but the calculations apply similarly in either case.
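For reference, both quantities follow from completing the square. The symbols x^* and y^* are labels introduced here for the location and the height of the vertex; they are not part of the original notation.

```latex
\begin{aligned}
y &= \beta_0 + \beta_1 x + \beta_2 x^2 \\
  &= \beta_2\left(x + \frac{\beta_1}{2\beta_2}\right)^2 + \beta_0 - \frac{\beta_1^2}{4\beta_2} \\
  &= \beta_2\left(x - x^*\right)^2 + y^*,
\qquad x^* = \frac{-\beta_1}{2\beta_2},
\quad y^* = \frac{4\beta_0\beta_2 - \beta_1^2}{4\beta_2}.
\end{aligned}
```

With \beta_2 < 0 the squared term is never positive, so y^* is the maximum and it is attained at x = x^*.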

    So you will be testing

    H_0: \frac {4\alpha_0\alpha_2 - \alpha_1^2} {4\alpha_2} = \frac {4\beta_0\beta_2 - \beta_1^2} {4\beta_2}

    where \alpha and \beta denote the parameters of the two situations respectively.

    From a modelling perspective, you need to specify whether the two situations are independent. If the same two variables are measured in two different "situations", it is likely that the observations from the two situations are dependent. Using a dummy variable, the model could be written as

    y_i = D_i(\alpha_0 + \alpha_1x_i + \alpha_2x_i^2) + (1 - D_i)(\beta_0 + \beta_1x_i + \beta_2x_i^2) + \epsilon_i

    where D_i \in \{0, 1\} is the dummy variable indicating the situation. The homogeneity of the error variance can also be addressed.

    One way to test it is using the bootstrap; another way is using the delta method. Let's confirm the model set-up first.
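A rough sketch of the delta-method route, written for the axis of symmetry x^* = -\beta_1 / (2\beta_2); the maximum value would use the same recipe with a different gradient. Assumptions made here and not settled in the thread: the two situations are independent, each is fitted separately by ordinary least squares (which corresponds to the dummy-variable model with a separate error variance per situation), and a normal approximation is used for the test. The data are simulated and `vertex_and_variance` is a made-up helper name.

```python
import numpy as np
from scipy import stats

def vertex_and_variance(x, y):
    """Fit y = b0 + b1*x + b2*x^2 by OLS; return x* = -b1/(2*b2) and its delta-method variance."""
    X = np.column_stack([np.ones_like(x), x, x**2])
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    sigma2 = resid @ resid / (len(y) - 3)   # residual variance estimate
    cov_b = sigma2 * XtX_inv                # estimated covariance of (b0, b1, b2)
    x_star = -b[1] / (2 * b[2])
    # Gradient of x* with respect to (b0, b1, b2)
    grad = np.array([0.0, -1 / (2 * b[2]), b[1] / (2 * b[2] ** 2)])
    return x_star, grad @ cov_b @ grad

# Simulated data (illustrative only): true vertices at x = 6 and x = 8
rng = np.random.default_rng(1)
x1 = rng.uniform(0, 10, 200)
y1 = 1 + 2.4 * x1 - 0.2 * x1**2 + rng.normal(0, 1, 200)
x2 = rng.uniform(0, 10, 200)
y2 = 1 + 3.2 * x2 - 0.2 * x2**2 + rng.normal(0, 1, 200)

v1, var1 = vertex_and_variance(x1, y1)
v2, var2 = vertex_and_variance(x2, y2)
z = (v1 - v2) / np.sqrt(var1 + var2)        # independence assumed, so the variances add
p_value = 2 * stats.norm.sf(abs(z))
print(v1, v2, z, p_value)
```

If the two situations are dependent (for example, the same subjects measured twice), the variances no longer simply add and a joint model would be needed.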

  6. The Following User Says Thank You to BGM For This Useful Post:

    rob1 (08-02-2014)

  7. #5

    Thanks so far

    I read your post again and I confirm the model set-up, except that I am comparing the axes of symmetry rather than the maximum values.
    Last edited by rob1; 08-01-2014 at 03:21 PM.

  8. #6
    Dason

    Bootstrapping would probably be the easiest approach, but otherwise I would suggest the delta method, as BGM mentioned.

  9. #7

    First I must say that I have little experience in implementing the bootstrap procedure in practice, so please correct me if I am wrong.

    Now you want to test

    H_0: \frac {-\alpha_1} {2\alpha_2} = \frac {-\beta_1} {2\beta_2}

    which (provided \alpha_2 and \beta_2 are non-zero) is equivalent to

    H_0: \alpha_1\beta_2 - \beta_1\alpha_2 = 0

    The basic idea is that, since \hat{\alpha}_1, \hat{\alpha}_2, \hat{\beta}_1, \hat{\beta}_2 are consistent estimators, we may use

    T = \hat{\alpha}_1\hat{\beta}_2 - \hat{\beta}_1\hat{\alpha}_2

    as the test statistic and reject H_0 when it is significantly different from 0.

    To decide whether it is significant, you need the distribution of T under H_0. From it you find the quantiles corresponding to the given significance level and use them to define the rejection/acceptance region.

    The steps could be like the following:

    1. Suppose you have m and n pairs of data for the two situations respectively. Re-sample from the original sample with replacement, keeping the same sample size for each situation.

    2. Using the generated sample, estimate all the parameters under the H_0 constraint. This requires estimating both situations jointly, and you may need a Lagrange multiplier if you are seeking a closed-form solution.

    3. Calculate the test statistic T in this sample, and record it.

    4. Return to step 1 and repeat B times. Use the sample percentiles of the recorded T_1, T_2, \ldots, T_B to construct the acceptance region. E.g. if your significance level is 5\%, use the 2.5th and 97.5th percentiles as the end-points of the acceptance region (interval).

    Once you obtain the acceptance region, you can calculate T for the original sample again and make the decision.
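A sketch of a simpler bootstrap variant, not the constrained procedure in the steps above: it resamples (x, y) pairs within each situation, refits the unconstrained quadratics, and builds a percentile interval for T; H_0 is rejected at the 5% level if that interval excludes 0. It assumes the two situations are independent. The data are simulated, and the function names and the choice B = 1000 are illustrative only.

```python
import numpy as np

def quad_coefs(x, y):
    """Least-squares coefficients (b0, b1, b2) of y = b0 + b1*x + b2*x^2."""
    X = np.column_stack([np.ones_like(x), x, x**2])
    return np.linalg.lstsq(X, y, rcond=None)[0]

def bootstrap_T_interval(x1, y1, x2, y2, B=1000, seed=0):
    """Percentile bootstrap interval for T = a1*b2 - b1*a2 (pairs resampled within groups)."""
    rng = np.random.default_rng(seed)
    m, n = len(x1), len(x2)
    T = np.empty(B)
    for k in range(B):
        i = rng.integers(0, m, m)           # resample situation 1 with replacement
        j = rng.integers(0, n, n)           # resample situation 2 with replacement
        a = quad_coefs(x1[i], y1[i])
        b = quad_coefs(x2[j], y2[j])
        T[k] = a[1] * b[2] - b[1] * a[2]
    return np.percentile(T, [2.5, 97.5])

# Simulated data (illustrative only): true vertices at x = 6 and x = 8
rng = np.random.default_rng(1)
x1 = rng.uniform(0, 10, 200)
y1 = 1 + 2.4 * x1 - 0.2 * x1**2 + rng.normal(0, 1, 200)
x2 = rng.uniform(0, 10, 200)
y2 = 1 + 3.2 * x2 - 0.2 * x2**2 + rng.normal(0, 1, 200)

lo, hi = bootstrap_T_interval(x1, y1, x2, y2)
reject_h0 = not (lo <= 0 <= hi)
print(lo, hi, reject_h0)
```

Resampling pairs does not rely on the residuals being normal, which may matter if that assumption is in doubt.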

  10. #8
    Dason

    Actually this would probably be fairly simple to do using non-linear regression... What software are you using?

  11. #9
    Dason

    If one sets up a model of the form

    y_i = (\beta_0 + \alpha_0I_i) + (\beta_1 + \alpha_1I_i)(x_i - (x_0 + a_0I_i))^2 + \epsilon_i

    you could fit that using a non-linear regression routine. If I_i is an indicator that is 0 for group 1 and 1 for group 2, then the question "does the max occur at different x-values for these two groups?" translates into testing the null hypothesis a_0 = 0.

    Really, all this is is a way of writing a quadratic function in the form
    y = \beta_0 + \beta_1(x - x_0)^2, which is functionally equivalent to the typical way we write quadratics when doing regression but is non-linear in the parameters. We simply allow the two groups to have different parameters.
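A sketch of how this model could be fitted in Python with `scipy.optimize.curve_fit` (SPSS has its own non-linear regression procedure, not shown here). The parameter `shift` plays the role of a_0. The starting values taken from a pooled quadratic fit and the Wald-type t test on `shift` are choices made for this illustration, and the data are simulated.

```python
import numpy as np
from scipy import stats
from scipy.optimize import curve_fit

def vertex_model(xg, beta0, alpha0, beta1, alpha1, x0, shift):
    """Vertex-form quadratic with group-specific height, curvature and vertex location."""
    x, g = xg                               # g is the 0/1 group indicator I_i
    return (beta0 + alpha0 * g) + (beta1 + alpha1 * g) * (x - (x0 + shift * g)) ** 2

# Simulated data (illustrative only): vertex at x = 6 in group 0 and x = 8 in group 1
rng = np.random.default_rng(2)
n = 200
x = rng.uniform(0, 10, 2 * n)
g = np.repeat([0.0, 1.0], n)
y = vertex_model((x, g), 8.0, 5.0, -0.2, 0.0, 6.0, 2.0) + rng.normal(0, 1, 2 * n)

# Starting values from an ordinary quadratic fit to the pooled data
c2, c1, c0 = np.polyfit(x, y, 2)
p0 = [c0 - c1**2 / (4 * c2), 0.0, c2, 0.0, -c1 / (2 * c2), 0.0]

popt, pcov = curve_fit(vertex_model, np.vstack([x, g]), y, p0=p0)
se = np.sqrt(np.diag(pcov))                 # approximate standard errors
t_stat = popt[5] / se[5]                    # Wald-type statistic for H0: shift = 0
p_value = 2 * stats.t.sf(abs(t_stat), df=len(y) - len(popt))
print("estimated shift:", popt[5], "se:", se[5], "p-value:", p_value)
```

The standard errors from `curve_fit` are asymptotic approximations, so with small samples the bootstrap may be a safer way to judge the estimated shift.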
