
Thread: Normality assumption for PCA?

  1. #1
    Normality assumption for PCA?



    I know that the classical Pearson correlation coefficient is only valid when
    data are normally distributed. For this, I generally use the Shapiro–Wilk
    normality test.

    I was recently wondering whether the data also need to be normally
    distributed to use PCA. I didn't find a clear answer to this in the
    literature, but I read that PCA assumes multivariate normality of the data.
    I was wondering (1) if you agree with this, (2) what this actually means,
    and (3) if there is a test to check this.

    Thank you very much!

  2. #2
    Re: Normality assumption for PCA?

    That is not a very strict requirement. If you have multivariate normality, then great, but if you don't, results can still be interpreted. PCA is not a p-value driven technique.

    Checking that assumption is difficult. I would just check normality for each variable separately along with skewness and kurtosis stats.
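
    A minimal sketch of the per-variable check suggested above, assuming NumPy and SciPy are available. The helper name univariate_normality_report and the simulated data are made up for illustration; scipy.stats.shapiro, scipy.stats.skew and scipy.stats.kurtosis are the SciPy functions doing the work.

```python
import numpy as np
from scipy import stats


def univariate_normality_report(X, names=None):
    """Per-column Shapiro-Wilk test plus sample skewness and excess kurtosis."""
    X = np.asarray(X, dtype=float)
    names = names or [f"var{j}" for j in range(X.shape[1])]
    report = []
    for j, name in enumerate(names):
        col = X[:, j]
        w_stat, p_value = stats.shapiro(col)
        report.append({
            "variable": name,
            "shapiro_W": float(w_stat),
            "shapiro_p": float(p_value),
            "skewness": float(stats.skew(col)),
            # Fisher definition: close to 0 for normally distributed data
            "excess_kurtosis": float(stats.kurtosis(col)),
        })
    return report


# Hypothetical data: one roughly normal column, one strongly right-skewed column
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.normal(size=200),
    rng.exponential(size=200),
])

for row in univariate_normality_report(X, ["normal_like", "skewed"]):
    print(row)
```

    Keep in mind that this is only a screening step: normal-looking margins do not guarantee that the variables are jointly multivariate normal.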

  3. #3
    Re: Normality assumption for PCA?

    Quote: Originally posted by seb3343
    I know that the classical Pearson correlation coefficient is only valid when data are normally distributed.
    That is not quite true. You are free to compute Pearson's correlation for data with any distribution. Maybe not always a smart thing to do, but there is no law against it. It's when you start computing p-values that the requirements become stricter.

    Similar thing with PCA. You are free to PCA any data you wish, but it may work better for multivariate normal data.
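
    To illustrate the point, here is a small sketch (assuming NumPy; the pca helper and the simulated data are invented for this example) that runs PCA on deliberately non-normal data. The decomposition goes through without complaint: the scores come out uncorrelated and the component variances add up to the total variance, whatever the distribution.

```python
import numpy as np


def pca(X):
    """PCA via SVD of the centered data.

    Returns (scores, components, explained_variance); components are rows.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained_variance = s**2 / (X.shape[0] - 1)
    scores = Xc @ Vt.T
    return scores, Vt, explained_variance


# Hypothetical non-normal data: skewed, correlated columns plus a uniform one
rng = np.random.default_rng(1)
z = rng.exponential(size=(500, 1))
X = np.hstack([
    z + 0.1 * rng.uniform(size=(500, 1)),
    2 * z + rng.exponential(size=(500, 1)),
    rng.uniform(size=(500, 1)),
])

# Pearson's r can be computed for any distribution as well
r = np.corrcoef(X[:, 0], X[:, 1])[0, 1]

scores, components, var = pca(X)
score_cov = np.cov(scores, rowvar=False)

print("Pearson r between first two columns:", r)
print("Covariance of scores (should be diagonal):")
print(np.round(score_cov, 6))
print("Sum of component variances:", var.sum())
print("Total variance of X:", np.trace(np.cov(X, rowvar=False)))
```

    One reason PCA "may work better" for multivariate normal data: the scores are always uncorrelated, but only under multivariate normality does uncorrelated also mean independent.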

  4. #4
    bugman (Super Moderator)

    Re: Normality assumption for PCA?

    Like ohammer said, but just an additional note:

    If you are using PCA for modelling purposes (either subsequent gradient analyses or regression), then normality would be ideal. If it's for data reduction or exploratory purposes, then normality (as previous posters have mentioned) is not a strict requirement.
    The earth is round: P<0.05

  5. #5
    Re: Normality assumption for PCA?

    Thank you for your answers. This clarifies my concerns.

    I asked the same question to several statisticians in parallel and got quite different answers. In the end, I guess it all depends on what I want to do with the data (as bugman says, if it's for data reduction or exploratory purposes, then normality is not a strict requirement).

    Here are the other answers that I got:

    (1) PCA is a purely geometrical technique - there is no need for a statistical hypothesis

    (2) Multivariate normality is an assumption of PCA, but not a critical assumption. You can test for multivariate normality with a version of Shapiro-Wilk for multivariate normality.

    (3) For PCA, there are assumptions about the data (that it is continuous and normally distributed), but these can be overlooked if the purpose of the analysis is to generate further hypotheses
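
    For anyone who wants to try a multivariate normality test in practice, here is a rough sketch assuming NumPy and SciPy. Note that it does not implement the multivariate Shapiro-Wilk version mentioned in answer (2); it uses Mardia's multivariate skewness and kurtosis statistics instead, a different test that is short to write by hand. The function name mardia_test and the simulated data are hypothetical.

```python
import numpy as np
from scipy import stats


def mardia_test(X):
    """Mardia's multivariate skewness and kurtosis tests (asymptotic p-values)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n  # maximum-likelihood covariance (divides by n)
    # D[i, j] = (x_i - mean)' S^-1 (x_j - mean)
    D = Xc @ np.linalg.solve(S, Xc.T)

    b1 = (D**3).sum() / n**2          # multivariate skewness
    b2 = (np.diag(D)**2).sum() / n    # multivariate kurtosis

    skew_stat = n * b1 / 6.0
    skew_df = p * (p + 1) * (p + 2) / 6.0
    skew_p = stats.chi2.sf(skew_stat, skew_df)

    kurt_z = (b2 - p * (p + 2)) / np.sqrt(8.0 * p * (p + 2) / n)
    kurt_p = 2.0 * stats.norm.sf(abs(kurt_z))

    return {
        "skew_stat": float(skew_stat),
        "skew_df": float(skew_df),
        "skew_p": float(skew_p),
        "kurt_z": float(kurt_z),
        "kurt_p": float(kurt_p),
    }


# Hypothetical data: one multivariate normal sample, one clearly skewed sample
rng = np.random.default_rng(2)
X_normal = rng.normal(size=(300, 3))
X_skewed = rng.exponential(size=(300, 3))

res_norm = mardia_test(X_normal)
res_exp = mardia_test(X_skewed)
print("normal sample:", res_norm)
print("skewed sample:", res_exp)
```

    The p-values rely on large-sample approximations, so treat them as a rough guide, and remember that with many observations even small departures from normality will be flagged.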

    Thanks!

    Sebastien

  6. #6
    Re: Normality assumption for PCA?

    Quote: Originally posted by seb3343
    I asked the same question to several statisticians in parallel and I got quite different answers.
    Haha, I guess they are correct on average!

  7. #7
    bugman (Super Moderator)

    Re: Normality assumption for PCA?

    @ Ohammer

    - Excellent!

  8. #8
    Re: Normality assumption for PCA?


    This book gives a perfect answer to your question:
    Principal Component Analysis, 2nd edition (2002), by I. T. Jolliffe

    http://www.amazon.com/Principal-Comp.../dp/0387954422
