Thread: heteroskedasticity and non normal residuals in linear regression - please help!

  1. #31
    GretaGarbo

    So, depth change is finishing depth - starting depth. If you do the regression Change ~ starting depth, there is NO correlation using these simulated values.
    But you have simulated it yourself, and Dason has shown analytically, that there is a relation between the change and the baseline value. Of course there is a relation!

    But maybe I misunderstand something here.

    A paired t-test is a repeated-measures study: a very simple, very concrete and often very useful one, but still a repeated-measures study (where each individual acts as its own control).

    Maybe the baseline should not be used as an explanatory variable, but handled in a repeated-measures model instead.

  2. #32
    SiBorg

    There is no relationship if you keep the measurements 'paired'... but I've got to dash, so I will post later tonight.

  3. #33
    SiBorg

    Dason, can you prove that two completely random variables (from a random and not a normal distribution) when regressed as Y-X~X will have a correlation proportional to -1*X? My simulation of two completely random variables seemed to have this correlation just as you predicted for two normally distributed variables. I was interested in whether this could be proved mathematically.
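
A short simulation sketch of the question above, in Python with NumPy. The distributions, sample size and variable names are illustrative assumptions, not the thread's data. For independent X and Y with finite variances, Cov(Y - X, X) = -Var(X), so the least-squares slope of Y - X on X is -1 and the correlation is -sd(X) / sqrt(Var(X) + Var(Y)), whatever the shape of the two distributions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Two independent, non-normal variables (hypothetical stand-ins for
# starting depth x and finishing depth y).
x = rng.exponential(scale=2.0, size=n)        # right-skewed
y = rng.uniform(low=0.0, high=10.0, size=n)   # flat

change = y - x

# Least-squares fit of change on x; polyfit returns [slope, intercept].
slope, intercept = np.polyfit(x, change, deg=1)
corr = np.corrcoef(x, change)[0, 1]

# Value implied by Cov(Y - X, X) = -Var(X) under independence.
expected_corr = -np.std(x) / np.sqrt(np.var(x) + np.var(y))

print(f"slope of (y - x) on x: {slope:.3f}")
print(f"correlation: {corr:.3f}, implied by the variances: {expected_corr:.3f}")
```

The slope comes out close to -1 even though neither variable is normal, which is the algebra above and not a property of the normal distribution.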

  4. #34
    SiBorg

    Another question has popped into my head.

    The central limit theorem should suggest that if my sample size is large enough then my residuals should tend to a normal distribution. So, if they don't, then does that mean I can't say that the central limit theorem will help?

    On the other hand, the vast majority of my sample lies on the diagonal line of the QQ plot. Since the confidence intervals are estimated based on a normal distribution, does it matter that, say 40 of 260 points are off the scale when the rest lie on the plot? In other words, if 80% of my residuals are normally distributed, should this be OK for the calculation of confidence intervals?

    How best should I convince a reviewer that my linear model is OK when my residuals do not pass the Shapiro-Wilk normality test?

  5. #35
    victorxstc

    Originally posted by SiBorg:
    "Since the confidence intervals are estimated based on a normal distribution, does it matter that, say 40 of 260 points are off the scale when the rest lie on the plot? In other words, if 80% of my residuals are normally distributed, should this be OK for the calculation of confidence intervals?"
    I am also eager to know the answer to this question. I have seen many studies (including some of my own) that published confidence intervals for differences, and I have encountered journal editors and reviewers asking for CIs for differences, even though differences between two populations are not necessarily normally distributed. Despite that, authors, editors, reviewers (and readers) still seem indifferent to the distribution of the sample for which the CI is computed. Or, at least, they may have no other option.

    Originally posted by SiBorg:
    "How best should I convince a reviewer that my linear model is OK when my residuals do not pass the Shapiro-Wilk normality test?"
    If I were you, rather than reporting the P value (which could give a biased reviewer an excuse to invalidate your results), I would justify the model with the QQ plot, since its subjective nature leaves you some room to maneuver. Besides, I think you could appeal to the CLT given your large number of observations (as Dason stated previously, I think).

  6. #36
    Dason


    Originally posted by SiBorg:
    "The central limit theorem should suggest that if my sample size is large enough then my residuals should tend to a normal distribution. So, if they don't, then does that mean I can't say that the central limit theorem will help?"
    You appear to be showing a fundamental misunderstanding of the central limit theorem. The CLT doesn't apply to the data itself. It applies to sample means or in this case the estimated parameters in the model. We can show that with enough data the sampling distribution of the estimated parameters will be approximately normal even if the original errors aren't normally distributed.
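
The point above can be checked with a small simulation. This is only a sketch under assumed values: the sample size of 260 matches the thread, but the design, the true slope and the exponential error distribution are hypothetical. It refits a simple regression many times with strongly skewed errors and compares the skewness of the errors with the skewness of the estimated slopes.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 260, 5000
true_slope = 0.5                        # assumed value, for illustration only

x = rng.uniform(0.0, 10.0, size=n)      # fixed, hypothetical design
slopes = np.empty(reps)
for i in range(reps):
    # Skewed errors with mean zero (exponential shifted by its mean).
    errors = rng.exponential(1.0, size=n) - 1.0
    y = 2.0 + true_slope * x + errors
    slopes[i] = np.polyfit(x, y, deg=1)[0]   # estimated slope


def skewness(a):
    """Sample skewness: mean of the cubed standardised values."""
    a = np.asarray(a, dtype=float)
    z = (a - a.mean()) / a.std()
    return float(np.mean(z ** 3))


error_skew = skewness(rng.exponential(1.0, size=100_000) - 1.0)
slope_skew = skewness(slopes)

print(f"skewness of the errors:           {error_skew:.2f}")
print(f"skewness of the estimated slopes: {slope_skew:.2f}")
```

The errors stay skewed no matter how large n is; it is the sampling distribution of the estimate that becomes close to symmetric and bell-shaped.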
    Originally posted by SiBorg:
    "On the other hand, the vast majority of my sample lies on the diagonal line of the QQ plot. Since the confidence intervals are estimated based on a normal distribution, does it matter that, say 40 of 260 points are off the scale when the rest lie on the plot? In other words, if 80% of my residuals are normally distributed, should this be OK for the calculation of confidence intervals?"
    This appears to be nonsense. Well... I guess not completely nonsense but I think you're misunderstanding the qqplot. You don't have a situation in which "80% of the residuals are normally distributed" - you just have a situation in which the residuals probably aren't perfectly normally distributed. Even if you had a situation in which your error term was a mixture of a normal distribution and something else - the qqplot wouldn't be able to tell you exactly which points weren't from the normal distribution. And that really doesn't matter anyways. We might care about outliers but using the qqplot isn't the way to identify them.
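
To illustrate the mixture remark, here is a rough sketch with made-up numbers: 260 residuals, of which 52 are drawn from a wider normal distribution (both the proportion and the standard deviations are arbitrary assumptions). The points that would stray from the line in a QQ plot are roughly the most extreme ones, and those turn out to be a blend of both components.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 260
n_contam = 52   # hypothetical share of "non-standard" errors

# Label which observations come from the wider component.
is_contaminant = np.zeros(n, dtype=bool)
is_contaminant[:n_contam] = True

resid = np.where(
    is_contaminant,
    rng.normal(0.0, 3.0, size=n),   # wider component
    rng.normal(0.0, 1.0, size=n),   # standard normal component
)

# Rank by absolute size: the largest values sit in the tails of a QQ plot.
order = np.argsort(np.abs(resid))
most_extreme = order[-n_contam:]
central = order[:-n_contam]

normal_in_tails = int(np.sum(~is_contaminant[most_extreme]))
contaminants_in_centre = int(np.sum(is_contaminant[central]))

print(f"standard-normal points among the {n_contam} most extreme: {normal_in_tails}")
print(f"wide-component points among the central {n - n_contam}: {contaminants_in_centre}")
```

Some points in the tails come from the plain normal component, and some points from the wider component sit unremarkably in the middle, so position on the plot does not identify where a point came from.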

    Originally posted by victorxstc:
    "I am also eager to know the answer to this question. I have seen many studies (including some of my own) that published confidence intervals for differences, and I have encountered journal editors and reviewers asking for CIs for differences, even though differences between two populations are not necessarily normally distributed. Despite that, authors, editors, reviewers (and readers) still seem indifferent to the distribution of the sample for which the CI is computed. Or, at least, they may have no other option."
    You don't actually say what you're taking differences of. But the CLT can still apply to differences in many regards so I don't have a (major) problem with using normal based methods if the sample size is large enough.
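
As a rough check of that claim, the sketch below simulates skewed paired differences and counts how often a normal-based 95% interval for the mean difference covers the true value. The sample size of 260 is taken from the thread; the shifted exponential distribution and the true mean difference are assumptions made up for the example.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 260, 4000
true_mean_diff = 0.3   # assumed value, for illustration only

# Skewed differences: exponential shifted so its mean is true_mean_diff.
diffs = rng.exponential(1.0, size=(reps, n)) - 1.0 + true_mean_diff

means = diffs.mean(axis=1)
std_errors = diffs.std(axis=1, ddof=1) / np.sqrt(n)

# Normal-based 95% interval for each simulated sample.
lower = means - 1.96 * std_errors
upper = means + 1.96 * std_errors

coverage = float(np.mean((lower <= true_mean_diff) & (true_mean_diff <= upper)))
print(f"empirical coverage of the nominal 95% interval: {coverage:.3f}")
```

With a sample this size the empirical coverage should land near the nominal 95% even though the individual differences are far from normal; with much smaller samples or heavier tails it can fall short.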
