# Factor analysis

#### LEMONIAS

Hello! Apologies if the answer is very obvious, but I am new to this. I am running an exploratory factor analysis for a scale development study. Which criterion would you suggest I use for factor extraction if the sample is small (N = 150)? Thanks in advance.
P.S. I would much appreciate it if you could point me to a useful source on writing up an EFA in APA style.

#### spunky

by "criterion" do you mean which pointer to help you investigate the # of factors, or which method of factor extraction you want?

oh, and N=150? <--- this doesn't look super good for factor analysis :/

#### LEMONIAS

Unfortunately I ran out of time and couldn't collect more data. By criterion I mean the pointer for determining the number of factors extracted (Kaiser's, scree plot, etc.).
Another question I have: what if an item that loads on a factor is theoretically completely unrelated to the other items of that factor? (I know that I should increase the sample.)

#### spunky

I see. Well then you should be using Parallel Analysis. The majority of the literature supports this as the superior method to investigate the number of factors.
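
For readers who want to see what Parallel Analysis actually computes, here is a minimal sketch in Python, assuming NumPy is available. It follows Horn's basic version: eigenvalues of the observed Pearson correlation matrix are compared with the 95th percentile of eigenvalues from random normal data of the same size. The function name `parallel_analysis` and its parameters are made up for illustration, and this is not the SPSS syntax discussed in this thread.

```python
import numpy as np

def parallel_analysis(data, n_sims=500, percentile=95, seed=0):
    """Horn-style parallel analysis on the Pearson correlation matrix.

    Returns (n_factors, observed_eigenvalues, random_thresholds).
    """
    data = np.asarray(data, dtype=float)
    n, p = data.shape
    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]

    rng = np.random.default_rng(seed)
    simulated = np.empty((n_sims, p))
    for i in range(n_sims):
        noise = rng.standard_normal((n, p))  # uncorrelated data, same shape
        simulated[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    thresholds = np.percentile(simulated, percentile, axis=0)

    # Count the leading eigenvalues that beat their random counterpart.
    exceeds = observed > thresholds
    n_factors = p if exceeds.all() else int(np.argmin(exceeds))
    return n_factors, observed, thresholds
```

Calling `parallel_analysis(responses)` on an N x items array returns the suggested number of factors together with both sets of eigenvalues, so the comparison can be plotted next to a scree plot.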

Regarding your second question I guess I would take time to investigate this item. Like how is it worded? Maybe it resembles more the wording of the other items that should load on that factor. What would be the item-total-correlation and alpha-if-item dropped for that item? You know, things like that. If it's a small loading I guess you can dismiss it as an artifact of your small sample. If it's a large loading it's worth being investigated.
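
The two item diagnostics mentioned above can be sketched in a few lines of Python, assuming NumPy and a complete N x items array of numeric responses. The function names are hypothetical, and the item-total correlation here is the "corrected" version (the item is left out of the total it is correlated with).

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an N x k array of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var_sum / total_var)

def item_diagnostics(items):
    """Per item: (index, corrected item-total correlation, alpha if item dropped)."""
    items = np.asarray(items, dtype=float)
    results = []
    for j in range(items.shape[1]):
        rest = np.delete(items, j, axis=1)  # all items except item j
        r = np.corrcoef(items[:, j], rest.sum(axis=1))[0, 1]
        results.append((j, r, cronbach_alpha(rest)))
    return results
```

An item with a low corrected item-total correlation whose removal raises alpha is a candidate for the closer look described above.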

Also, since you're doing Factor Analysis on a scale and response formats in scales are typically discrete (Yes/No, 5-point Likert-type options, etc.) remember that you should be analyzing the polychoric correlation matrix as opposed to Pearson correlations.
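
To make the idea of a polychoric correlation concrete, here is a rough two-step sketch in Python, assuming NumPy and SciPy are installed. Thresholds are taken from the marginal proportions of each item, and the correlation of the assumed underlying bivariate normal is then chosen by maximum likelihood. The name `polychoric` is made up for this example, and using +/-8 in place of infinite thresholds is a shortcut of the sketch. It handles one pair of items and is not a substitute for tested software.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import multivariate_normal, norm

def polychoric(x, y):
    """Two-step estimate of the polychoric correlation of two ordinal variables."""
    _, xi = np.unique(x, return_inverse=True)
    _, yi = np.unique(y, return_inverse=True)
    table = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(table, (xi, yi), 1)  # contingency table of category counts
    n = table.sum()

    def thresholds(margin):
        # Step 1: cut points on the latent normal scale from cumulative proportions.
        cuts = norm.ppf(np.cumsum(margin)[:-1] / n)
        return np.concatenate(([-8.0], cuts, [8.0]))  # +/-8 stands in for +/-infinity

    a = thresholds(table.sum(axis=1))
    b = thresholds(table.sum(axis=0))

    def neg_loglik(rho):
        cov = [[1.0, rho], [rho, 1.0]]
        cdf = np.array([[multivariate_normal.cdf([ai, bj], mean=[0.0, 0.0], cov=cov)
                         for bj in b] for ai in a])
        # Each cell probability is a rectangle under the bivariate normal density.
        cell = cdf[1:, 1:] - cdf[:-1, 1:] - cdf[1:, :-1] + cdf[:-1, :-1]
        return -(table * np.log(np.clip(cell, 1e-12, None))).sum()

    # Step 2: maximise the likelihood over rho with the thresholds held fixed.
    return minimize_scalar(neg_loglik, bounds=(-0.99, 0.99), method="bounded").x
```

Applying this to every pair of items would give a polychoric correlation matrix that can be factor analyzed in place of the Pearson matrix. Pearson correlations computed on the category codes tend to be attenuated relative to this estimate.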

#### LEMONIAS

Thanks much for the answer. I'm trying to run a parallel analysis and get this message:

"Error # 34 in column 24. Text: screedata.sav
SPSS Statistics cannot access a file with the given file specification. The file specification is either syntactically invalid, specifies an invalid drive, specifies a protected directory, specifies a protected file, or specifies a non-sharable file.
Execution of this command stops.
------ END MATRIX ---"

I get values for only ROOT, RAW DATA, MEANS & PERCENTILE and no other outputs. Any ideas? Thanks.

#### LEMONIAS

^ OK, I fixed it. If anyone could tell me how to compute polychoric correlations in SPSS, that would be great! Thanks

#### spunky

to the best of my knowledge (which stops at Version 17), SPSS can't do that by itself. there's probably syntax out there, if you know how to work with syntax?

#### spunky

> There are apps that do this calculation for parallel analysis. I used one recommended by a paper I read [which, however, may involve a charge after 40 free days, something I am pursuing now].
>
> You might try this site and see if it works.

but this one assumes continuous data when the OP has ordinal data. Parallel Analysis is a powerful tool, but even it can be misled about the number of factors when working with ordinal data, unless you can get it to generate tetrachoric (or polychoric) correlation matrices.

#### noetsi

I use polychoric correlations although the author who suggested that site does not suggest it matters for parallel analysis. I don't know where in the app it assumes continuous data, where do you see that?

It doesn't ask you for the correlation or the covariance matrix in the app....

#### spunky

well, well well... so there's conflicting evidence with some published articles saying it shouldn't matter which one you use:

http://www.ppsw.rug.nl/~metimmer/timmermanlorenzoseva_PM_2011.pdf

http://epm.sagepub.com/content/69/5/748.short

and others saying that you should use the polychoric-based parallel analysis as opposed to the continuous one with ordinal data:

https://www.apa.org/pubs/journals/features/met-a0030005.pdf

this is a mystery! and it clearly needs some spunky in it to fix things up!

> I don't know where in the app it assumes continuous data, where do you see that?
>
> It doesn't ask you for the correlation or the covariance matrix in the app....

oh they all do, unless stated otherwise. like, there's quite a few things on the internet that do parallel analysis for you. the one we used in class is this one so you don't even have to download anything:

http://ires.ku.edu/~smishra/parallelengine.htm

#### noetsi

I actually had that, but did not post it, because I did not see eigenvalues in its output, which I thought was what you were looking for. Note it's been several years since I did this; we use factor analysis rarely.

Does the app you posted not assume continuous data? I assume you really mean normally distributed data; continuous is how you get that. If so, with thousands of data points, does it really matter?

It's nice to commonly work with 10-50 thousand data points... well, in the case of assumptions anyhow.

#### LEMONIAS

Well, I suck at syntax. I just tried one and couldn't get results. Which way would you suggest I use to get the polychoric correlations?

#### noetsi

This may be doing principal component analysis rather than factor analysis; the differences continue to confuse me.

One approach to adapting factor analysis for ordinal variables is to use polychoric correlations rather than the Pearson correlations that are used by SPSS Factor. SPSS does not have a built-in procedure for computing polychoric correlations, but there is an extension command (SPSSINC HETCOR) to print polychoric and polyserial correlations, available in the SPSS Community for SPSS Statistics versions from 17.0 upwards. (Click the "Downloads for IBM SPSS Statistics" link, then the "Extension Commands" link under "Tools and Utilities". Look for the downloadable file SPSSINC_HETCOR.zip in the list of extension commands there.) This command calculates polyserial, polychoric, and Pearson correlations between variables, with the type determined by the variable measurement levels. The package includes a dialog box interface for the procedure. Beginning with Version 19, this file is installed with R Essentials. The Programmability Plug-ins and Essentials for SPSS Statistics versions 19 and above are available on the "Downloads for IBM SPSS Statistics" page. There may also be SPSS macros available on the internet to do this. See Technote 1479694 for instructions to read a correlation matrix in text format into SPSS and then analyze that matrix with the Factor procedure.
http://www-01.ibm.com/support/docview.wss?uid=swg21477550

This is the most useful page I have ever found on this topic. If you go down to the software portion, it discusses an SPSS macro to do this.

http://www.john-uebersax.com/stat/tetra.htm

#### LEMONIAS

I see that this macro is for tetrachoric correlations. I've been looking for two days now for how to do it in SPSS, but I haven't figured it out, and it's pretty urgent for me to finish this. Could someone who knows how run a PA and polychoric correlations with my data and send me the output? I'd be really grateful.

#### noetsi

I can't because I use a state computer. The link I sent from John Uebersax should tell you exactly how it is done. It is what I do with SAS: run what he tells you to do.

#### LEMONIAS

> if you know how to convert an SPSS file into a comma-delimited file (.csv) you can use an app i programmed on the intra-webs. sometimes it takes SPSS files if it's from an older version (say V17 or before) but I'm finding it's easier if you feed it a .csv file.
>
> https://psychometroscar.wordpress.com/shiny-apps-resources/

Thanks! I tried it, though, and got the message "Error: non-numeric argument to binary operator". Any ideas?

#### spunky

did you upload a .csv file like i instructed you to do?
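
One way to check a .csv before uploading it is sketched below in Python, using only the standard library. It is a guess that the error above comes from text in the data (value labels such as "Agree", or blank cells); that is what this kind of message usually indicates, but the thread does not confirm the cause. The function name `non_numeric_cells` is hypothetical, and the file is assumed to have a header row.

```python
import csv

def non_numeric_cells(path, max_report=10):
    """Return up to max_report (row, column, value) triples that are not numbers."""
    problems = []
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        for row_number, row in enumerate(reader, start=2):  # row 1 is the header
            for column, value in row.items():
                try:
                    float(value)
                except (TypeError, ValueError):
                    # Text labels, blanks, and ragged rows all end up here.
                    problems.append((row_number, column, value))
                    if len(problems) >= max_report:
                        return problems
    return problems
```

An empty list means every cell could be read as a number. Anything that is listed would need to be recoded to numeric values, or the file exported again with values instead of labels.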

#### LEMONIAS

Thanks very much. I did it with a .sav file eventually.
Another question: can I extend my findings by using the factor scores in a regression model? I mean, it is clear from the derived factors which one is the outcome and which is the predictor.
Is it valid to do this?