# Thread: Factor analysis, what type of data can I use?

1. ## Re: Factor analysis, what type of data can I use?

I just read an article about the polychoric correlation coefficient. The author, J. Ekström, uses Karl Pearson's smallpox recovery data to illustrate the limitations of polychoric correlations when the data are not normally distributed. The data form a 2x2 table of counts: outcome (recovered or died, coded 0/1) by vaccination status (not vaccinated or vaccinated, coded 0/1).

A chi-square test on this table shows a highly significant association between vaccination and recovery (p < 0.0001, n = 2081).
The polychoric correlation coefficient is also significant, but it is far from 1: in R, the data yielded a coefficient of 0.60. I believe this divergence is due to the skewed nature of the sample, which is made up of highly polarized, and therefore skewed, binary variables.

Any further articles, ideas, thesis arguments are welcome.

PS: I am including a brief summary of the R script I used, since it took me a while to get the results because of a small typo on my side.

Packages you need to install:
install.packages(c("mvtnorm", "sfsmisc", "polycor"))

library(polycor)
# x is the 2x2 contingency table of counts (or use polychor(x, y) with two ordered variables)
polychor(x)
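
For anyone who wants to reproduce both numbers, here is a minimal sketch. The four counts are the ones usually reproduced for Pearson's smallpox table; they are an assumption here (they do sum to the n = 2081 quoted above), so please check them against the article. The object names `smallpox`, `chi`, and `rho` are made up for illustration.

```r
# Counts as commonly reproduced for Pearson's smallpox table.
# ASSUMPTION: verify against the article before relying on them.
smallpox <- matrix(c(1562, 42,
                      383, 94),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(vaccinated = c("yes", "no"),
                                   outcome    = c("recovered", "died")))

# Chi-square test of independence (base R)
chi <- chisq.test(smallpox)
chi$p.value

# Polychoric correlation from the table of counts; for a 2x2 table this is
# the tetrachoric case. Skipped if the polycor package is not installed.
rho <- NA_real_
if (requireNamespace("polycor", quietly = TRUE)) {
  rho <- polycor::polychor(smallpox)
}
rho
```

Keep in mind that the two results answer different questions: the p-value says the association is unlikely to be zero, while the coefficient estimates how strong it is, so a tiny p-value and a coefficient well below 1 can coexist.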

2. ## Re: Factor analysis, what type of data can I use?

I don't understand how you can use logistic regression in the context of EFA. I have never seen logistic regression used for data reduction; it does not generate latent factors. If you can do that, that's amazing.

I would think M or S estimators would be even better for skewed data, but they cannot be used with a non-interval dependent variable.

3. ## Re: Factor analysis, what type of data can I use?

Originally Posted by noetsi
I don't understand how you can use logistic regression in the context of EFA. I have never seen logistic regression used for data reduction; it does not generate latent factors. If you can do that, that's amazing.
You're totally right. Sorry for the misunderstanding. The way I see it, Exploratory Factor Analysis helps researchers identify relationships among variables, much as a simple correlation matrix does. It is a first-step, exploratory method. In my research it matters all the more because I have 45 variables. I should say this is the first time I have conducted an EFA; I just downloaded an article that will help me understand the method in depth.
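
Since the thread is about running EFA on ordinal data, here is a rough sketch of one way to do it in R: compute polychoric correlations with `hetcor()` from the polycor package and pass the matrix to base R's `factanal()`. The data are simulated and every name (`items`, `make_item`, `efa`) is hypothetical; this is only an illustration under those assumptions, not a recipe for the 45-variable survey.

```r
set.seed(42)
n <- 400
f <- rnorm(n)  # one simulated latent factor

# Made-up ordinal item: latent score plus noise, cut into 4 ordered levels
make_item <- function() {
  z <- 0.7 * f + sqrt(1 - 0.7^2) * rnorm(n)
  cut(z, breaks = c(-Inf, -0.5, 0.5, 1.2, Inf), ordered_result = TRUE)
}
items <- as.data.frame(
  setNames(lapply(1:6, function(j) make_item()), paste0("item", 1:6))
)

efa <- NULL
if (requireNamespace("polycor", quietly = TRUE)) {
  # hetcor() uses polychoric correlations for pairs of ordered factors
  R <- polycor::hetcor(items, std.err = FALSE)$correlations
  # One-factor EFA on the polychoric matrix instead of Pearson correlations
  efa <- factanal(covmat = R, factors = 1, n.obs = n)
  print(efa$loadings)
}
```

With real data the number of factors would have to be chosen (for example from a scree plot) rather than fixed at 1 as it is here.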

Depending on the results I get from the EFA, I can then use a regression technique (logistic regression in my case) as a second step, to assess the relationship between the independent variables and a response variable and how well they predict it. For instance, I could test how well an ordinal variable (e.g. respondents' perception of their national government) predicts whether they vote or not.
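
As a sketch of that second step, here is a logistic regression of a binary outcome on a single ordinal predictor. The data are simulated, and the variable names (`perception`, `voted`, `survey`) and the effect size are invented purely for illustration.

```r
set.seed(1)
n <- 500

# Hypothetical 5-point perception item (1 = very negative ... 5 = very positive)
perception <- sample(1:5, n, replace = TRUE)

# Simulated outcome: probability of voting rises with perception (made-up effect)
voted <- rbinom(n, 1, plogis(-1 + 0.4 * perception))

survey <- data.frame(voted,
                     perception = factor(perception, ordered = TRUE))

# Logistic regression; R codes an ordered factor with polynomial contrasts,
# so the coefficients are labelled perception.L, perception.Q, and so on
fit <- glm(voted ~ perception, data = survey, family = binomial)
summary(fit)
```

Treating the item as a numeric score or as an unordered factor are the usual alternatives; which one is appropriate depends on how evenly spaced the response categories can be assumed to be.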

4. ## Re: Factor analysis, what type of data can I use?

I still have not figured out how to add papers here. Here are some useful links.

This shows how to do polychoric correlations in SAS.

http://support.sas.com/kb/25/010.html

A brief discussion of the use of polychoric correlations, including other software that supports them.

http://www.john-uebersax.com/stat/sem.htm

This stakes out an important difference between EFA and PCA (I believe you will be doing the latter if you use the SAS default for EFA, and probably the default in other software as well). Not everyone agrees with this view.

http://www2.sas.com/proceedings/sugi30/203-30.pdf

A list of assumptions.
http://en.wikiversity.org/wiki/Explo...is/Assumptions

http://sites.stat.psu.edu/~ajw13/sta...or_assump.html

General articles on EFA methods (summaries of the state of the art to some extent).

http://pareonline.net/pdf/v10n7.pdf

http://mvint.usbmed.edu.co:8002/ojs/...ewFile/464/605

http://www.bama.ua.edu/~jcsenkbeil/g...20Analysis.pdf

http://www.cob.unt.edu/slides/paswan...y_Huffcutt.pdf

http://psych.unl.edu/psycrs/948_2011/2b_EFA_PCA.pdf
