# SPSS Factor analysis - Should I use all variables or can I choose the variables and create constructs?

#### SkyXY

##### New Member
Hi,

I think I have a fundamental problem understanding factor analysis. I want to do a regression analysis of the relationship between Company X's financial indicators and its investment behaviour over the last 10 years. Since I have a lot of financial ratios, I would like to group them. For example, ROE, ROI, ROS and Profit Margin all belong to the construct "Profitability". For the construct "Investment", the dependent variable, I would like to group capital expenditure, R&D investment and investment capital cash flow. The goal is to find out what influences Investment the most and how it is influenced.

To group them, I would like to use factor analysis in SPSS, choose only the variables I think belong together, and create a new variable (construct) from them. Then I would look at the KMO and Bartlett's test to judge the quality of the constructs. If the quality is good enough, I want to do a correlation analysis afterwards and then the multiple regression.
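For readers who want to see what those two diagnostics compute, here is a minimal sketch in Python with NumPy (not SPSS syntax). The data are simulated, and the names `ratios`, `bartlett_sphericity` and `kmo_overall` are hypothetical helpers made up for this illustration; they are not SPSS or library functions.

```python
import numpy as np


def bartlett_sphericity(data):
    """Return (chi-square statistic, degrees of freedom) for Bartlett's test."""
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    # Statistic: -(n - 1 - (2p + 5) / 6) * ln|R|, with R the correlation matrix.
    _, logdet = np.linalg.slogdet(corr)
    chi2 = -(n - 1 - (2 * p + 5) / 6) * logdet
    dof = p * (p - 1) // 2
    return chi2, dof


def kmo_overall(data):
    """Return the overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    scale = np.sqrt(np.diag(inv))
    # Partial correlations come from the scaled, negated inverse correlation matrix.
    partial = -inv / np.outer(scale, scale)
    off_diag = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off_diag] ** 2)
    p2 = np.sum(partial[off_diag] ** 2)
    return r2 / (r2 + p2)


# Simulated stand-in for four profitability ratios sharing one common signal.
rng = np.random.default_rng(0)
n = 200
signal = rng.normal(size=n)
ratios = np.column_stack([signal + 0.5 * rng.normal(size=n) for _ in range(4)])

chi2, dof = bartlett_sphericity(ratios)
kmo = kmo_overall(ratios)
print(f"Bartlett chi2 = {chi2:.1f} on {dof} df, overall KMO = {kmo:.2f}")
```

A large Bartlett statistic relative to its degrees of freedom, and a KMO closer to 1 than to 0.5, are usually read as signs that the chosen variables share enough common variance to be grouped.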

Unfortunately, the more I read about factor analysis, the more confused I get. As far as I understand, there is Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA). SPSS can only do EFA. But EFA is used to find hidden relationships between the variables without having prior ideas or knowledge about these hidden relationships. CFA is used to define a construct beforehand and then examine whether the data fit the construct. Is this right so far?

It seems there is also a practical difference in doing both, since SPSS can only do EFA. So is it possible to do what I want, using EFA to group the financial indicators and manually choosing which of the variables I put together, or do I HAVE to do the EFA with all variables?

BR


#### spunky

##### Doesn't actually exist
It seems there is also a practical difference in doing both, since SPSS can only do EFA. So is it possible to do what I want, using EFA to group the financial indicators and manually choose which of the variables I put together or do I HAVE to do the EFA with all variables.
If you're using SPSS (restricted to EFA only) then you HAVE to use all the variables and let the algorithm do what it wants. If you want to tell a priori which observed variables should be an indicator of which latent variable/construct then you're talking about Confirmatory Factor Analysis (CFA).

#### SkyXY

##### New Member
If you're using SPSS (restricted to EFA only) then you HAVE to use all the variables and let the algorithm do what it wants. If you want to tell a priori which observed variables should be an indicator of which latent variable/construct then you're talking about Confirmatory Factor Analysis (CFA).
This is also what confuses me regarding SPSS. EFA should be used with the whole data set, alright. So, if I want to build constructs beforehand I have to use CFA? Then what about Principal Component Analysis (PCA)? As far as I understand, they are two different types of analysis and PCA doesn't require the whole data set, right? But in SPSS, PCA seems to be part of the Factor Analysis section. So can I use PCA instead to create constructs? Let's assume I do a correlation analysis beforehand to verify the relationship between the variables used for each construct, of course.

Otherwise, let's assume I use all variables, then I use these factors for multiple regression. How do I interpret the result afterwards, since I don't really know what these factors represent?

#### spunky

##### Doesn't actually exist
So, if I want to build constructs beforehand I have to use CFA?
Correct

Then what about Principal Component Analysis (PCA)? As far as I understand, it's two different types of analysis and it doesn't require the whole data set right?
Related but not the same analysis, true. PCA is used for dimension reduction of the covariance matrix but it doesn't posit a statistical model for the data. And whether or not it requires "the whole data set" depends on what you're trying to do. Notice that you *cannot* use SPSS to tell the PCA algorithm in advance which indicators should make up which constructs.
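To make "dimension reduction of the covariance matrix" concrete, here is a minimal sketch in Python with NumPy (not SPSS). It runs PCA as an eigen-decomposition of the correlation matrix, i.e. the covariance matrix of standardized variables. The five indicators are simulated and every variable name is an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300

# Simulated indicators: three driven by one signal, two by another.
a = rng.normal(size=n)
b = rng.normal(size=n)
X = np.column_stack([
    a + 0.4 * rng.normal(size=n),
    a + 0.4 * rng.normal(size=n),
    a + 0.4 * rng.normal(size=n),
    b + 0.4 * rng.normal(size=n),
    b + 0.4 * rng.normal(size=n),
])

# Standardize so the covariance matrix of Z equals the correlation matrix of X.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
corr = np.corrcoef(Z, rowvar=False)

# eigh returns eigenvalues in ascending order; reorder to largest first.
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Share of total variance carried by each component.
explained = eigvals / eigvals.sum()

# Component scores: the data projected onto the eigenvectors.
scores = Z @ eigvecs

print("explained variance ratio:", np.round(explained, 3))
print("loadings of first component:", np.round(eigvecs[:, 0], 3))
```

Note that the algorithm decides the weights on its own: nothing in the code says which indicator belongs to which component. The grouping only shows up afterwards in the loadings, which is the point made above.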

But in SPSS PCA seems to be part of the section: Factor Analysis. So can I use PCA instead to create constructs? Let's assume I will do a correlation analysis beforehand to verify the relationship between the variables used for each construct of course.
This gets a little bit into the weeds of whether you assume formative vs. reflective indicators. Formative indicators are variance-based ones whereas reflective ones are covariance-based. And whether you're interested in one or the other depends on your research hypothesis/theoretical framework.

Otherwise, let's assume I use all variables, then I use these factors for multiple regression. How do I interpret the result afterwards, since I don't really know what these factors represent?
Two things. (a) PCA regression is a known and trusted technique. Sticking factor scores in a multiple regression model is a bad idea most of the time because they are NOT invariant under rotation so you can always rotate the scores until you get the set of values that gives you the results that you want. (b) You... don't interpret. You merely describe and use that as a pointer for future research so that you (or other people) can gather more data and use the results that you offered as pointers for a more confirmatory model.
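The rotation problem in (a) can be illustrated with a small sketch in Python with NumPy. As a simplifying assumption, it uses the first two principal component scores and an arbitrary 30-degree orthogonal rotation as a stand-in for rotated factor scores. The data and the helper `ols` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200

# Simulated correlated predictors and an outcome that depends on two of them.
X = rng.normal(size=(n, 4)) @ rng.normal(size=(4, 4))
y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=n)

# Scores on the two largest principal components of the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
top2 = eigvecs[:, np.argsort(eigvals)[::-1][:2]]
scores = Z @ top2


def ols(predictors, outcome):
    """Least-squares fit with an intercept; returns (coefficients, fitted values)."""
    design = np.column_stack([np.ones(len(outcome)), predictors])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef, design @ coef


# Any orthogonal rotation of the scores spans the same two-dimensional space.
theta = np.deg2rad(30)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
rotated = scores @ rotation

coef_a, fit_a = ols(scores, y)
coef_b, fit_b = ols(rotated, y)

print("slopes before rotation:", np.round(coef_a[1:], 3))
print("slopes after rotation: ", np.round(coef_b[1:], 3))
print("same fitted values:", np.allclose(fit_a, fit_b))
```

The fitted values are identical, yet the individual slopes change with the rotation angle. The data alone do not pin down which set of coefficients is "the" answer, which is why interpreting regression weights on rotated scores is shaky.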

Btw, just as a quick side note, I don't really think you can do what you're trying to do using SPSS. You'll probably need latent variable software like Mplus (for covariance-based analyses) or SmartPLS (for variance-based analyses). Or the statistical programming language R (which is free).

#### SkyXY

##### New Member
Well, I would like to research the enterprise transformation of Company X. This transformation has mostly been enabled by its increasing investments over the last decade. So I would like to use statistical analysis to find out the relationship between the financial indicators and investment. But since many of the financial indicators point to the same thing, e.g. profitability (ROE, ROIC, ROA etc.), I would like to reduce the number of variables and group those that should fit together in theory (profitability ratios, liquidity ratios, efficiency ratios etc.). A hypothesis would be, for example: the higher the profitability, the more Company X invests in R&D, since it has more capital to spend. So would you recommend CFA in this instance?
Sorry for this kind of question but I'm kind of new to this.