Principal Component Analysis: What determines sign of variable loading?

Yoshi

New Member
#1
Hi,

I have a question about how the signs of variable loadings on a component (factor) are determined in the PCA.

I am studying anxiety/fear response in animals.

I tested a group of animals by showing a negative stimulus and measuring their behavioural responses to the stimulus.
The measured variables are:
1) Distance an animal kept from the stimulus.
2) Locomotion of an animal
3) Number of vocalization A
4) Number of vocalization B

I have run Principal Component Analysis with all the variables using SPSS.
2 components came up to have eigenvalue bigger than 1.0.
The rotated component matrix shows that Distance and Locomotion load highly on the 1st component, and the vocalizations load highly on the 2nd component.
The loading for the distance is -.938 and the one for the locomotion is .884. The loadings for the vocalizations are .893 and .864 respectively.

Then, I tested another group of animals with the same test and measured the same variables.
The PCA in this group gave a result similar to the first group's (we expected them to be similar).
The only difference was the signs of the variable loadings.
In the second analysis, for the 1st component, the loading of the distance was .933 and that of the locomotion was -.933. For the 2nd component, the loadings for the vocalizations were .921 and .916.

So, in the first group I have a negative loading for the distance and a positive loading for the locomotion. In the second group, I have a positive loading for the distance and a negative loading for the locomotion.

In terms of having the opposite signs, the results make sense, since an animal that keeps a wider distance is probably more scared and does not move. However, I do not understand how the sign (negative/positive) is determined in the PCA.
The two groups are supposed to be similar and they are except for the signs.
Is it because of how the rotation happened?
Is there any way to match the signs?

If you can help me with this question, I would appreciate it.

Thank you, :)
 

CB

Super Moderator
#2
I have run Principal Component Analysis with all the variables using SPSS.
2 components came up to have eigenvalue bigger than 1.0.
Careful; I know the eigenvalue > 1 rule is the default in SPSS, but it is a very poor way of deciding how many factors to retain. See for example Zwick & Velicer, 1986. This method tends to overestimate the appropriate number of factors. If you're restricted to SPSS you're probably best served by a scree test, although ideally parallel analysis or Velicer's minimum average partial correlation method would be better (you can apply these criteria for determining the number of components or factors using R).
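
For readers who want to see what parallel analysis does, here is a minimal sketch in Python with NumPy rather than R. The function name `parallel_analysis`, the 95th-percentile threshold, the 200 simulations, and the simulated demo data are all illustrative assumptions, not something taken from this thread or from SPSS. The idea is to keep only the leading components whose eigenvalues beat those obtained from uncorrelated random data of the same size.

```python
import numpy as np

def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Sketch of Horn's parallel analysis for PCA on a correlation matrix.

    Returns (n_retain, observed_eigenvalues, threshold_eigenvalues).
    """
    rng = np.random.default_rng(seed)
    n, p = data.shape
    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    # Eigenvalues from uncorrelated normal data with the same n and p.
    simulated = np.empty((n_sims, p))
    for i in range(n_sims):
        noise = rng.standard_normal((n, p))
        simulated[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    threshold = np.percentile(simulated, percentile, axis=0)
    # Retain leading components until one fails to beat its threshold.
    exceeds = observed > threshold
    n_retain = p if exceeds.all() else int(np.argmin(exceeds))
    return n_retain, observed, threshold

# Hypothetical demo data (NOT the poster's measurements): four variables
# driven by two underlying factors plus noise.
demo_rng = np.random.default_rng(123)
factors = demo_rng.standard_normal((100, 2))
demo = np.column_stack([factors[:, 0], -factors[:, 0],
                        factors[:, 1], factors[:, 1]])
demo = demo + 0.3 * demo_rng.standard_normal(demo.shape)
n_retain, observed, threshold = parallel_analysis(demo)
print("components to retain:", n_retain)
```

With only four variables, as in the original question, any retention rule has very little to work with, so treat the count as a guide rather than a verdict.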

However, I do not understand how the sign (negative/positive) is determined in the PCA.
The two groups are supposed to be similar and they are except for the signs.
As I understand it, the sign of loadings in a PCA or factor analysis is fairly arbitrary and doesn't convey anything substantively important. The same factor model may fit well in two highly similar samples (for example, even two resamples from the same original sample), but minor differences between the samples may result in a model with loadings of a different sign on one or more factors being produced in one of the samples.

It would be possible to produce loadings of the same sign in each sample by rotating the factor solution in one sample towards the factor solution in the other sample, or by rotating both towards a pre-specified target matrix. This is beyond what is possible in SPSS, though; you'd have to consider how much of a problem this is for your study.
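
If the only mismatch between the two samples is the sign of a whole component, something much simpler than a full target rotation may be enough: reflect each component so that it points the same way as in the other sample. The sketch below is that simpler swap-in, not the target rotation described above. The helper name `align_signs` is hypothetical, it assumes the components are already in the same order in both samples, and the cross-loadings (which were not reported in the thread) are filled with 0.0 purely as placeholders.

```python
import numpy as np

def align_signs(loadings, reference):
    """Flip whole columns of `loadings` so that each component points the
    same way as the matching column of `reference`."""
    loadings = np.asarray(loadings, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Column-wise dot product: a negative value means the component
    # is reflected relative to the reference.
    agreement = np.sum(loadings * reference, axis=0)
    signs = np.where(agreement < 0, -1.0, 1.0)
    return loadings * signs, signs

# Rows: distance, locomotion, vocalization A, vocalization B.
# Reported loadings from the thread; 0.0 entries are placeholders for
# the unreported cross-loadings.
group1 = np.array([[-0.938, 0.0],
                   [ 0.884, 0.0],
                   [ 0.0,   0.893],
                   [ 0.0,   0.864]])
group2 = np.array([[ 0.933, 0.0],
                   [-0.933, 0.0],
                   [ 0.0,   0.921],
                   [ 0.0,   0.916]])

aligned, signs = align_signs(group2, group1)
print(signs)    # which components were reflected
print(aligned)  # group 2 loadings, oriented like group 1
```

If you also saved component scores, multiply the score column for any flipped component by -1 as well, so that loadings and scores stay consistent.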
 

spunky

Can't make spagetti
#3
As I understand it, the sign of loadings in a PCA or factor analysis is fairly arbitrary and doesn't convey anything substantively important.
so... good ol' Guttman's factor indeterminacy problem rears its ugly head again, huh? yep... the sign of the loadings is definitely an artifact of the rotation. there's this really nice book edited by the legendary C.R. Rao called Handbook of Statistics, Volume 26: Psychometrics where you can see a nice mathematical proof of that... i don't remember which chapter is the factor analysis one though... (chp 7 or 9 i think...)
 

Yoshi

New Member
#4
Hi,

Thank you very much for your replies.

I checked the component matrix given in the output of SPSS for the two groups.
The signs of the loadings were already reversed between the groups. In the first group, the locomotion is positive and the distance is negative; in the second group they are reversed.
From what I know, the component matrix gives the factor loadings before a rotation.
So, if the change in the signs is the artifact of rotation, why are they different already in the component matrix?

Thank you,
 

spunky

Can't make spagetti
#5
So, if the change in the signs is the artifact of rotation, why are they different already in the component matrix?
because the orientation of the components is arbitrary from the start; there is no inherent meaning or fixed set of coordinates to it... each eigenvector is only pinned down to unit length, so v and -v are equally valid, and whichever one your version of SPSS happens to pick is the one you'll see (keep in mind the standardized component scores all have a variance of one, and flipping a sign doesn't change that). that's kind of why i mentioned the formal name of the problem in the psychometric literature (the factor indeterminacy problem, which is where you'll find most of the research concerning this particular phenomenon): you can do an infinite number of different rotations on them, sign flips included, and still obtain the same solution to the covariance/correlation matrix decomposition. i wouldn't be surprised if you were to do principal components on the exact same data twice but with a different computer program and get different loading signs from the ones you're getting now.
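
A small numerical check of this point, as a sketch in Python with NumPy on made-up random data (not the poster's measurements): if v is an eigenvector of the correlation matrix R, then -v satisfies the same eigenvalue equation, and flipping the sign of a whole column of loadings leaves the reproduced correlation matrix unchanged. All variable names here are illustrative.

```python
import numpy as np

# Made-up data: 50 observations of 4 variables.
rng = np.random.default_rng(1)
data = rng.standard_normal((50, 4))
R = np.corrcoef(data, rowvar=False)

# eigh returns eigenvalues in ascending order; put the largest first.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Unrotated loadings: each eigenvector scaled by sqrt(eigenvalue).
loadings = eigvecs * np.sqrt(eigvals)

# Reflect the first component only.
flipped = loadings * np.array([-1.0, 1.0, 1.0, 1.0])

# -v satisfies the same eigenvalue equation as v.
v = eigvecs[:, 0]
same_eigen = np.allclose(R @ (-v), eigvals[0] * (-v))

# Both sets of loadings reproduce the same correlation matrix.
same_fit = np.allclose(loadings @ loadings.T, flipped @ flipped.T)

print(same_eigen, same_fit)
```

Nothing in the decomposition itself prefers one orientation, which is why two programs, or two samples, can legitimately report opposite signs for the same component.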