Subscale CFA


An existing questionnaire I am using (23 items, supposedly 2 subscales, N = 700) has shown poor fit to the 2-factor model as specified by its designers (confirmatory factor analysis yielded a CFI and TLI both <.70).

Subsequently, I performed a CFA on the items of the subscale I am most interested in, removing two items that contributed very little variance (on a 3-point scale, ~99% scored 0, the remaining ~1% scored 1). I fitted a one-factor model on this subscale, which yielded acceptable fit indices (CFI and TLI >.90, RMSEA = .027).
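One way to document the lack of variance before dropping items is to tabulate how concentrated the responses are per item. The snippet below is only a minimal sketch: the function name `flag_low_variance_items`, the 95% threshold, and the demo data are assumptions for illustration, not part of the questionnaire or any published cutoff.

```python
import numpy as np

def flag_low_variance_items(responses, max_modal_share=0.95):
    """Return column indices of items whose most frequent response
    is given by more than `max_modal_share` of respondents.
    The 0.95 default is an arbitrary, illustrative threshold."""
    responses = np.asarray(responses)
    n_respondents = responses.shape[0]
    flagged = []
    for j in range(responses.shape[1]):
        # Count how often each response category occurs for item j
        _, counts = np.unique(responses[:, j], return_counts=True)
        modal_share = counts.max() / n_respondents
        if modal_share > max_modal_share:
            flagged.append(j)
    return flagged

# Hypothetical data: 700 respondents, items scored 0-2
spread_item = np.arange(700) % 3                       # evenly spread responses
floor_item = np.r_[np.zeros(693, int), np.ones(7, int)]  # 99% scored 0, 1% scored 1
data = np.column_stack([spread_item, spread_item, floor_item])

flagged = flag_low_variance_items(data)
print(flagged)
```

Reporting the response distribution of each flagged item alongside the decision makes the removal easier to defend than a fit improvement alone.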

What I am wondering now:
1. Can I justify removing these two items due to extremely little variance?
2. Is it valid to use this subscale in subsequent analyses, after the CFA on the entire questionnaire yielded a bad fit?

After a lot of googling, I have failed to come up with satisfactory answers, so all help is much appreciated!

If you could provide some sources/precedents for your answers, that would be awesome!

Thanks in advance!

First off, you have a very nice sample size. Has this scale been validated by research that is not connected to the scale author(s)?
If a stringent validation of scale content has been done then your reported CFA can be incredibly insightful. If there is very little theory or previous underlying validity then you would be best to take an Exploratory Factor Analysis (EFA) approach with the subset of your choice.
To answer your questions:
1. You can justify removing the two items if their loadings are weak or nonexistent when you use an EFA. If you use a CFA, then you should keep the two items, as they should not change your model fit.
2. It is valid to use the subscale even if your previous CFA on the full questionnaire was not a good fit.
The bottom line is: if the scale was developed using strong theory and has been validated, then using a CFA on the subset is safe, but keep the two items. If the scale is lacking in theoretical foundation and prior validity, then an EFA approach would be better, and depending on the results the two items might end up being cut anyway. Furthermore, if the scale has not been widely validated, then you should assess its reliability using Cronbach's alpha. In SPSS specifically, you can find how reliability would change if you omitted those two items from the subset.
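For anyone without SPSS, the same "alpha if item deleted" check can be sketched in a few lines. This is a minimal illustration using the standard formula for Cronbach's alpha; the function names and the small demo matrix are hypothetical, and it assumes complete data (no missing responses).

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a respondents-by-items score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)        # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the sum score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def alpha_if_item_deleted(items):
    """Alpha recomputed with each item left out in turn."""
    items = np.asarray(items, dtype=float)
    return [cronbach_alpha(np.delete(items, j, axis=1))
            for j in range(items.shape[1])]

# Hypothetical demo: three identical items plus one unrelated item
x = np.arange(10)
demo = np.column_stack([x, x, x, np.tile([0, 1], 5)])

overall = cronbach_alpha(demo)
dropped = alpha_if_item_deleted(demo)
print(round(overall, 3), [round(a, 3) for a in dropped])
```

If alpha rises noticeably when an item is left out, that item is contributing little to the internal consistency of the subscale.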

If you're unfamiliar with EFA vs CFA this is a good link to another question posted online:
Thank you very much for your advice!

The specified 2-factor structure has been validated in multiple populations, by different authors, which is why I decided to initially go with a CFA. However, it has not yet been used in the current population, which may provide an argument for EFA.
When I perform a CFA on the subscale without removing the two items, the CFI drops to .808, and TLI drops to .76, so CFA is not feasible then.
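For reference, the fit indices discussed in this thread are commonly defined as below. The notation is an assumption introduced here: subscript M denotes the fitted model, subscript B the baseline (independence) model, and N the sample size. Some software uses N rather than N - 1 in the RMSEA denominator.

```latex
\begin{align*}
\mathrm{CFI}   &= 1 - \frac{\max(\chi^2_M - df_M,\ 0)}{\max(\chi^2_M - df_M,\ \chi^2_B - df_B,\ 0)} \\
\mathrm{TLI}   &= \frac{\chi^2_B / df_B - \chi^2_M / df_M}{\chi^2_B / df_B - 1} \\
\mathrm{RMSEA} &= \sqrt{\frac{\max(\chi^2_M - df_M,\ 0)}{df_M \, (N - 1)}}
\end{align*}
```

CFI and TLI are computed relative to the baseline model while RMSEA is not, so the two kinds of index can move quite differently when items that barely covary with the rest are added or removed.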

Performing an EFA using the covariance matrix unfortunately yields very unclear results, with low explained variance, small factor loadings and multiple cross-loadings. Moreover, the EFA solutions do not make any conceptual sense, so I am afraid the questionnaire just isn't suitable for the current population.

However, thanks once more for your insightful advice.