Thread: Multivariate normality

  1. #1
    noetsi

    I know various methods of determining whether the bivariate relationship between two variables is normal. But, as Dason has drilled into me, it's multivariate normality (that is, the normality of the residuals) that actually matters.

    I am not sure how to test for that. If you plotted the residuals into a QQ plot and it suggested normality, would that be a valid way to be sure the regression model had multivariate normality?
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  2. #2
    TheEcologist

    Although I am not particularly a fan of normality tests, I know that tests like the Shapiro-Wilk Multivariate Normality Test exist.

    A quick Google search also shows there are R packages (PDF alert).

    Maybe you could look into the theory behind these tests and come up with something satisfactory?
    The true ideals of great philosophies always seem to get lost somewhere along the road..

  3. The Following User Says Thank You to TheEcologist For This Useful Post:

    noetsi (06-21-2013)

  4. #3
    Dason

    I think you still have a misunderstanding. The reason I kept mentioning multivariate normality is because you were talking about the response variable being normally distributed. Some authors might say that Y needs to be normal but if that's the case then they're talking about multivariate normality for Y where the mean vector is a function of X. If all we want to do is check the normality assumption we can stick to univariate normal tests because the previous statement is the same as asking if the residuals are univariately normally distributed...
    I don't have emotions and sometimes that makes me very sad.
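A minimal sketch of the point above, assuming simulated data (the variable names, sample size, and coefficients are made up for illustration). The response `y` is clearly non-normal because its mean moves with `x`, yet the normality assumption refers to the residuals of the fitted regression, so univariate tools such as a QQ plot or `shapiro.test` are applied to those.

```r
set.seed(1)
n <- 500
x <- runif(n, 0, 10)            # predictor, spread evenly over 0..10
y <- 2 + 3 * x + rnorm(n)       # normal errors around a line in x

fit <- lm(y ~ x)
res <- residuals(fit)

# y pooled over all x values is close to uniform, not normal
p_y   <- shapiro.test(y)$p.value
# the residuals are what the normality assumption is about
p_res <- shapiro.test(res)$p.value

# QQ plot of the residuals against the normal distribution
qqnorm(res)
qqline(res)
```

Comparing `p_y` with `p_res` shows why checking the raw response can mislead: the marginal distribution of `y` mixes over different means, while the residuals have that mean structure removed.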

  5. The Following User Says Thank You to Dason For This Useful Post:

    noetsi (06-21-2013)

  6. #4
    spunky

    the psych package in R also has mardia's test of multivariate skewness/kurtosis which, if statistically significant, gives you evidence to suspect your distribution is not multivariate normal.

    i know noetsi uses Mplus, and Mplus also gives you mardia's test.

    now, for what reason in particular do you need to test for multivariate normality?
    for all your psychometric needs! https://psychometroscar.wordpress.com/about/
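For readers without the psych package or Mplus, the idea behind Mardia's test can be sketched in base R. This is a hand-rolled illustration using the textbook large-sample formulas, not the psych package's implementation; the function name `mardia_sketch` and the simulated data are assumptions made for the example.

```r
# Hypothetical helper: Mardia's multivariate skewness and kurtosis
mardia_sketch <- function(X) {
  X  <- as.matrix(X)
  n  <- nrow(X)
  p  <- ncol(X)
  Xc <- scale(X, center = TRUE, scale = FALSE)  # centre each column
  S  <- crossprod(Xc) / n                       # ML covariance (divide by n)
  D  <- Xc %*% solve(S) %*% t(Xc)               # d_ij = (x_i - m)' S^-1 (x_j - m)

  b1 <- sum(D^3) / n^2                          # multivariate skewness
  b2 <- sum(diag(D)^2) / n                      # multivariate kurtosis

  skew_stat <- n * b1 / 6                       # approx. chi-square under normality
  skew_df   <- p * (p + 1) * (p + 2) / 6
  kurt_z    <- (b2 - p * (p + 2)) / sqrt(8 * p * (p + 2) / n)  # approx. N(0, 1)

  list(b1 = b1, b2 = b2,
       p_skew = pchisq(skew_stat, df = skew_df, lower.tail = FALSE),
       p_kurt = 2 * pnorm(abs(kurt_z), lower.tail = FALSE))
}

set.seed(42)
X_norm <- matrix(rnorm(300 * 3), ncol = 3)  # multivariate normal by construction
X_skew <- matrix(rexp(300 * 3),  ncol = 3)  # strongly skewed margins

res_norm <- mardia_sketch(X_norm)
res_skew <- mardia_sketch(X_skew)
```

Small p-values in `p_skew` or `p_kurt` are evidence against multivariate normality; as with any such test, a large p-value does not confirm it.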

  7. The Following User Says Thank You to spunky For This Useful Post:

    noetsi (06-21-2013)

  8. #5
    spunky

    Quote: Originally posted by TheEcologist
    Although I am not particularly a fan of normality tests
    .... because...?

  9. #6
    Dason

    Quote: Originally posted by spunky
    .... because...?
    I can't answer for him but I feel similarly. Typically they aren't that great with small sample sizes and once you get a large enough sample size to detect departure from normality then you have a large enough sample to not care about normality...
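The trade-off described here can be explored with a short simulation. The sketch below assumes a mild departure from normality (a t distribution with 10 degrees of freedom, chosen only for illustration) and estimates how often `shapiro.test` rejects at the 5% level for a few arbitrary sample sizes; `reject_rate` is a hypothetical helper.

```r
set.seed(123)

# Share of simulated samples of size n for which Shapiro-Wilk rejects at 5%
reject_rate <- function(n, reps = 500) {
  mean(replicate(reps, shapiro.test(rt(n, df = 10))$p.value < 0.05))
}

sizes <- c(20, 200, 2000)
rates <- sapply(sizes, reject_rate)
names(rates) <- sizes
print(rates)
```

If the rejection rate is low at small n and high at large n, that matches the argument: the test has little power where normality would matter most, and flags even mild departures once the sample is large.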

  10. #7
    noetsi

    I don't use them because they have a strong reputation for very weak power.

    I cannot use Mplus at work (the state will not allow its purchase, nor let me purchase it personally and place it on the computer - don't ask why) and it will be a while before I learn R.

    Is it legitimate to use the residuals of a regression in a QQ plot to test for normality?

    once you get a large enough sample size to detect departure from normality then you have a large enough sample to not care about normality...
    Why would that ever be true? I understand the CLT comes into play at a certain point, but I read treatments of normality in the literature all the time and I have almost never seen one argue that at a certain sample size normality does not matter.

  11. #8
    TheEcologist

    Quote: Originally posted by Dason
    I can't answer for him but I feel similarly. Typically they aren't that great with small sample sizes and once you get a large enough sample size to detect departure from normality then you have a large enough sample to not care about normality...
    Exactly. Once you have a large sample size, these tests also tend to detect significant "non-normality" when the departures from normality are meaningless.

    I mean look what one outlier in 5000 does to a shapiro.test

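    Code:
    # test once: 4999 standard-normal draws plus a single outlier at -5.5
    shapiro.test(c(rnorm(4999), -5.5))
    # repeat the test 100 times and collect the p-values
    pvals <- replicate(100, shapiro.test(c(rnorm(4999), -5.5))$p.value)
    # density of the p-values, with the 0.05 cutoff marked in red
    plot(density(pvals))
    abline(v = 0.05, col = 'red')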
    You can bet your pretty pink panties that the "sampling distribution" of the above is normal.
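One way to probe the "large enough sample to not care" argument in a regression setting is a small simulation. The sketch below assumes a simple regression with strongly skewed (centred exponential) errors; the sample size, true slope, and number of replications are arbitrary choices. It estimates how often the usual 95% confidence interval for the slope covers the true value.

```r
set.seed(2013)
n          <- 500
reps       <- 2000
true_slope <- 3

covered <- replicate(reps, {
  x  <- runif(n)
  e  <- rexp(n) - 1                  # skewed errors with mean zero
  y  <- 1 + true_slope * x + e
  ci <- confint(lm(y ~ x))["x", ]    # 95% interval for the slope
  ci[1] <= true_slope && true_slope <= ci[2]
})

coverage <- mean(covered)
print(coverage)
```

If the coverage comes out close to 0.95, the non-normal errors did little harm to inference on the slope at this sample size. This says nothing about prediction intervals, or about heavy-tailed errors with extreme outliers, which is the concern noetsi raises later in the thread.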

  12. #9
    noetsi

    Of course, years ago I read a Stanford professor's argument that outliers could make the results of ANOVA invalid regardless of the sample size (that is, asymptotic methods were no protection against bias; he argued the bias could actually get worse with larger samples given this issue).

  13. #10
    trinker

    I don't have pretty pink panties. Did I not get my pair with TS membership?
    "If you torture the data long enough it will eventually confess."
    -Ronald Harry Coase -

  14. #11
    trinker

    I can't remember who (link or vinux maybe) wrote a post about multivariate normality on the old statspedia, wherever that got to.

  15. #12
    spunky

    Quote: Originally posted by Dason
    large enough sample size to detect departure from normality then you have a large enough sample to not care about normality...
    uhm... i can see how this is true in the case of least-squares but would it also apply for ML? i remember reading in Pawitan's classic book on everything-you-need-to-know-about-maximum-likelihood that the choice of likelihood could (or could not) create a whole bunch of problems in terms of parameter bias, etc. so i do think that multivariate normality should be satisfied (at least as much as possible) if one is choosing a normal likelihood model, or something similar to it.

  16. #13
    spunky

    Quote: Originally posted by TheEcologist
    Exactly, also once you have a large sample size these tests also tend detect significant "non-normality" when departures from normality are meaningless.
    oh pff... that's just bad data practice not to check (and dump) outliers before the analysis

  17. The Following User Says Thank You to spunky For This Useful Post:

    trinker (06-21-2013)

  18. #14
    TheEcologist

    Quote: Originally posted by spunky
    oh pff... that's just bad data practice not to check (and dump) outliers before the analysis
    I'm sorry, but what you are describing is actually bad practise, IMO very bad practise. It's sad that this is still taught as "common statistical sense" in some fields.

    You should only ever "dump" outliers, kicking and screaming, being very certain they are errors.
    You should certainly not dump them on a reflex!

    Best thing I can do is quote the part of my FAQ on this:

    Quote: Originally posted by TheEcologist
    How do I remove or deal with outliers?

    Removing outliers can make your data look more normal but, contrary to what is sometimes perceived, outlier removal is subjective: there is no truly objective way of removing outliers.

    Always remember that these points remain observations and you should not just throw them out on a whim. Instead you should have good reasons to remove your outliers. There may be many truly valid reasons to remove data points. These include outliers caused by measurement errors, incorrectly entered data points, or values that are impossible in real life. If you feel that any outliers are erroneous data points and you can validate this, then you should feel free to remove them.

    On the other hand, if you see no reason why your outliers are erroneous measurements then there is no truly objective way to remove them. They are true observations and you may have to consider that the assumptions of your test do not correspond to the reality of your situation. You could always try a non-parametric test (which in general are less sensitive to outliers) or some other analysis that does not require the assumption that your data is normally distributed.
    Or from Wikipedia:
    Deletion of outlier data is a controversial practice frowned on by many scientists and science instructors.
    Outliers are data; dumping them is bad data practise and you should feel very dirty every time you do it without a very good reason.
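To illustrate why a rank-based or otherwise robust summary can be an alternative to deleting a point, here is a small sketch with made-up numbers (the vector `x` and the extreme value 25 are purely illustrative, not data from this thread). It compares how much one extreme observation moves the mean and standard deviation versus the median and the median absolute deviation.

```r
# Ten hypothetical measurements clustered around 10
x     <- c(9.8, 10.1, 10.3, 9.9, 10.0, 10.2, 9.7, 10.4, 10.1, 9.9)
# The same data with one extreme, but possibly genuine, observation kept in
x_out <- c(x, 25)

# Location: how far does each summary move when the extreme point is included?
shift_mean   <- mean(x_out)   - mean(x)
shift_median <- median(x_out) - median(x)

# Spread: classical versus robust
ratio_sd  <- sd(x_out)  / sd(x)
ratio_mad <- mad(x_out) / mad(x)

print(c(shift_mean = shift_mean, shift_median = shift_median))
print(c(ratio_sd = ratio_sd, ratio_mad = ratio_mad))
```

The extreme value stays in the data set either way; the choice is only about which summary, or which test built on it, is allowed to be dominated by a single point.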
    Last edited by TheEcologist; 06-23-2013 at 11:15 AM.

  19. The Following User Says Thank You to TheEcologist For This Useful Post:

    Englund (06-24-2013)

  20. #15
    spunky

    oh, pff....

    if dumping outliers is good enough for NASA then it's good enough for me


    nah, in all seriousness. i was really into this (and other good statistical practices) like a few years ago... then when i started doing stats consulting for students and profs alike i realised everybody was dumping them (or tinkering with their data in other unspeakable ways) and kept on saying no... then the weeks became months and the months became years and i started noticing that even after people had gone through the mandatory research methods courses where they were instructed to not do it... they were still doing it.

    i felt too tired to swim against the current and just lost interest. i now know it shouldn't be done and i guess i'm quite happy with that.
