
Thread: How to obtain individual factor scores after PCAs on multiple imputed datasets?

  1. #1
    Hello,

    I really hope someone here can help me. I've searched the internet for hours and could not find any clue on how to solve my problem.

    I want to run a principal component analysis (PCA) on my dataset to combine my variables into a few components. The main analysis, however, is not the PCA but a logistic regression with the PCA components as predictors. This means I need to compute each individual's factor scores before I can proceed with the logistic regression. The problem is that my dataset has a lot of missing values, so I cannot simply use listwise deletion because it would reduce my N too much. I therefore have to use an imputation method. My method of choice is multiple imputation (MI), and I've already generated 5 imputed datasets.

    BUT WHAT NOW??? The literature on MI usually recommends running the main analysis on each imputed dataset and then combining the results. So I could run 5 PCAs on my five imputed datasets, but then I still need individual factor scores to continue with the logistic regression. And here's my dilemma: how can I do that with multiple imputed datasets?! I would have to combine the imputed values from the 5 datasets directly, but this is not the usual way to analyse imputed datasets, and the idea of modelling the uncertainty would get lost. I also do not know how to combine the individual estimates for the missing values if I tried it. Should I just take the mean, for each case, of the 5 different values I obtained from the multiple imputation?!

    If it were possible to create one single dataset out of the 5 imputed ones I could - theoretically - do all the analysis including the PCA just on that one data set...

    Do you have any, ANY ideas how to solve this problem?!

    Greetings,
    Marina

  2. #2

    Thank you all for reading this awfully long and much too complicated post... but I have finally, FINALLY, after hours and hours of research, found a SOLUTION (and it isn't really as complicated as I thought), so consider this TOPIC SOLVED! Thx

  3. #3
    Dason

    Would you mind sharing your solution?

  4. #4

    I've changed my general approach a little. Instead of using MI for the PCA, I used an EM algorithm to estimate the covariances/correlations of my original sample directly, without the detour of imputing missing values. Then I used the estimated correlation matrix as direct input to a single PCA. With the resulting components I calculate factor scores for each of my five imputed datasets, followed by five logistic regressions. Finally, I combine the results of all five logistic regressions. That's it, and I think it's statistically the best solution. Originally I had wanted to use a full information maximum likelihood (FIML) algorithm to estimate my covariances, but I couldn't find a properly integrated syntax or program, so I used the EM estimation option in SPSS (without imputing there, because those imputations are biased). For imputing I used an R library (Amelia).

    Greetings,
    Marina
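
A minimal Python sketch of the workflow described in this post, not the poster's actual SPSS/R code. It assumes an EM-estimated correlation matrix is already available; `R_em`, `means_em`, `sds_em` and the simulated `imputed` datasets are made-up stand-ins. The scores here are plain component scores (standardized data projected onto the eigenvectors), which may differ by a scaling from the factor scores SPSS saves.

```python
import numpy as np

def pca_from_correlation(R, n_components):
    """Eigendecompose a correlation matrix; return the largest components first."""
    eigvals, eigvecs = np.linalg.eigh(R)              # eigh returns ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # indices of the largest eigenvalues
    return eigvals[order], eigvecs[:, order]

def component_scores(X, means, sds, eigvecs):
    """Standardize X with fixed means/SDs, then project onto the fixed eigenvectors."""
    Z = (X - means) / sds
    return Z @ eigvecs

# Hypothetical stand-ins for the EM estimates and the five imputed datasets.
rng = np.random.default_rng(0)
R_em = np.array([[1.0, 0.6, 0.5],
                 [0.6, 1.0, 0.4],
                 [0.5, 0.4, 1.0]])
means_em = np.zeros(3)
sds_em = np.ones(3)
imputed = [rng.multivariate_normal(means_em, R_em, size=200) for _ in range(5)]

# One PCA on the EM correlation matrix, then the same weights applied to every dataset.
eigvals, eigvecs = pca_from_correlation(R_em, n_components=2)
scores_per_imputation = [component_scores(X, means_em, sds_em, eigvecs) for X in imputed]
```

Because the component weights are fixed once, the five score sets differ only through the imputed values, so the components mean the same thing in each of the five logistic regressions.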

  6. #5
    Hi Marina, and many thanks for this brilliant idea! However, I have some questions about it, and I would be most grateful if you could help me with them.

    1) Could you please explain what you mean by using an EM algorithm to estimate the covariances/correlations of your original sample directly? Could you describe the steps in SPSS, or give the SPSS syntax, so that I can do it myself?
    2) How did you use the estimated correlation matrix as direct input to the PCA?

    Thank you so much for your support on this.

    Christos

  7. #6
    Lazar

    Two simpler ways to do it:

    1. Mplus will do what you want in a single step using either MI or FIML.
    2. Do a PCA on each dataset individually and treat the resulting factor scores as plausible values (i.e. each dataset contains slightly different PCA results; they should be fairly similar if your missing data model is efficient). Then run your logistic regression on each dataset and only then combine the results across the imputed datasets.
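
The thread does not say how the per-dataset regression results are combined. The usual approach is Rubin's rules, sketched below for a single coefficient under that assumption. The function name `pool_rubin` and the example numbers are hypothetical; they are not results from anyone's data.

```python
from statistics import mean, variance

def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed datasets with Rubin's rules.

    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    pooled = mean(estimates)                        # average of the m estimates
    within = mean([se ** 2 for se in std_errors])   # average sampling variance
    between = variance(estimates)                   # variance across imputations (m - 1 denominator)
    total = within + (1 + 1 / m) * between          # total variance of the pooled estimate
    return pooled, total ** 0.5

# Made-up logistic regression coefficients for one component, one per imputed dataset.
coefs = [0.50, 0.55, 0.45, 0.52, 0.48]
ses = [0.10, 0.10, 0.10, 0.10, 0.10]
pooled_coef, pooled_se = pool_rubin(coefs, ses)
```

The between-imputation term is what carries the uncertainty due to the missing data into the final standard error.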

  9. #7

    Thank you so much for your reply.

    One silly question: if my only purpose is to do a PCA on a complete dataset and stop at the extraction and interpretation of the factors, should I still do the logistic regression???

    Many thanks

    C

  10. #8
    Lazar

    This is somewhat tough. It depends on how many factors you extract, how well defined the factor structure is, and (most importantly) how much missing data you have and how effective your missing data model is. In most cases I would think that if your imputed datasets gave you wildly different results, such that it was not possible to integrate the findings, then you have to go back to the drawing board and improve your missing data model, or give up in defeat.

    You can help everything along a little by a) having a clear idea of the number of factors you want to extract and what the likely factor structure is, and b) using some form of target rotation so that you give the imputations less chance to diverge from each other. Of course the easiest thing is to let Mplus or SPSS do it for you. Recent versions of SPSS allow for multiple imputation and should automatically combine results for most analyses (I don't use SPSS much, so I cannot be sure which analyses it does this for and which it doesn't).

    On a side note, I think FIML is the way to go where possible. You will not have these problems of integrating multiple datasets with FIML.
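
One way to read the target rotation suggestion is an orthogonal Procrustes rotation of each imputation's loadings toward a common target. The sketch below assumes that reading. The `target` matrix and the rotated copy standing in for one imputation's loadings are invented for illustration, and `rotate_to_target` is a hypothetical helper name.

```python
import numpy as np

def rotate_to_target(loadings, target):
    """Orthogonal Procrustes: find T minimizing ||loadings @ T - target|| (Frobenius norm)."""
    U, _, Vt = np.linalg.svd(loadings.T @ target)
    T = U @ Vt                     # orthogonal rotation (may include a reflection)
    return loadings @ T, T

# Hypothetical target structure: 4 items, 2 components.
target = np.array([[0.8, 0.1],
                   [0.7, 0.2],
                   [0.1, 0.9],
                   [0.2, 0.6]])

# Stand-in for one imputation's loadings: the same structure, rotated and sign-flipped.
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
Q[:, 1] *= -1                      # flip the sign of the second component
loadings_imp = target @ Q

aligned, T = rotate_to_target(loadings_imp, target)
```

If the component scores were computed from the unrotated solution, the same `T` would be applied to them (`scores @ T`) so that component 1 means the same thing in every imputed dataset.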

  11. #9
    Thank you very much for this detailed answer. To be honest, I used Amelia, which uses something like EM to impute the missing data, but instead of one solution it gives 5 imputed datasets. The reason I did that was that I have Likert-scale items with ordinal data, so I could not do FIML or multiple imputation (at least not quickly), and I did something in the middle. Now my problem was how to combine these results to do the PCA and finish this part of the analysis.

    Again thank you soooooooo much for this reply. I really appreciate it.

    Chris

  12. #10
    Lazar

    Both FIML and MI will work perfectly fine with Likert ordinal scales. You can either assume an underlying continuous variable (in which case both MI and FIML will be relatively fast) or directly impute ordinal data (in which case you will need a relatively small missing data model and be willing to watch Amelia or whatever grind away in the background for some time).

    I have left Amelia running for days at a time!

  13. #11
    Weeeeeeeell... regarding FIML, I only know AMOS. AMOS assumes that everything is continuous and normally distributed, so if you want to include dummy variables, you have to dummy code them in SPSS before you attach them in AMOS. I wanted to avoid that, so I chose Amelia, but I guess now this has consumed more time than I had anticipated...

  14. #12
    Lazar

    This is because AMOS is a piece of crap. If you are using Amelia, you know R, in which case I would suggest the lavaan, sem, or even OpenMx packages over AMOS.

  15. #13

    Actually no, I only used AmeliaView, which was friendlier for me to use. But I will look into the other open-source software or R tools.

    Many thanks anyway for this insightful conversation!

  16. #14

    Hey!

    I have a similar problem and can't seem to find a solution. Maybe someone can help me!!

    I have to validate a questionnaire, so I'm doing correlations (with other questionnaires) and a PCA. I did multiple imputation with SPSS, so now I have the original data and 5 imputations. Correlations weren't a problem with multiply imputed data in SPSS.

    NOW: My problem is that SPSS doesn't offer PCA for imputed data, so I don't get a "pooled" result!

    Is there any other way for me to do it? Or do I have to handle my missing data for the PCA some other way?

    If someone has any advice on what to do and how to do it, I would be very grateful!!!

    Thanks in advance!

    Susanna

  17. #15

    This is how I did it:

    1. Create dummy variables for Imputation_ (one per imputed data set).

    2. Put your factor analysis syntax between the "filter by" commands.

    3. Run 5 factor analyses and save the factor scores 5 times, each time filtered by a different imputed data set. Make sure you rename the factor score variables each time to prevent confusion.

    4. Sort the data set by respondent id.

    5. Cut and paste the columns with factor scores one by one so that each respondent's factor scores sit on one horizontal line instead of a diagonal one.

    6. Take the mean of the 5 factor scores for each individual with the MEAN.5 function, so that no missing values are allowed.

    7. Now you have N averaged factor scores and 5*N missing factor scores, but that's OK because every respondent has a mean factor score now and the 5 duplicate respondents will be deleted listwise anyway in subsequent analyses.
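
The averaging in steps 4 to 6 can be sketched outside SPSS as follows. This is an illustration in Python rather than the SPSS syntax the post refers to. The long-format `rows` (imputation number, respondent id, factor score) are made-up data, and `average_scores` is a hypothetical helper that mimics the "all 5 values required" behaviour of MEAN.5.

```python
from collections import defaultdict
from statistics import mean

def average_scores(rows, n_imputations):
    """Average each respondent's factor score across imputations.

    Returns None for a respondent unless all n_imputations scores are present.
    """
    by_id = defaultdict(list)
    for _imputation, respondent_id, score in rows:
        scores = by_id[respondent_id]     # registers the respondent even if the score is missing
        if score is not None:
            scores.append(score)
    return {rid: (mean(s) if len(s) == n_imputations else None)
            for rid, s in by_id.items()}

# Made-up scores: respondent 1 has all five, respondent 2 is missing one.
rows = [
    (1, 1, 0.1), (2, 1, 0.2), (3, 1, 0.3), (4, 1, 0.4), (5, 1, 0.5),
    (1, 2, 1.0), (2, 2, 1.2), (3, 2, None), (4, 2, 0.9), (5, 2, 1.1),
]
averaged = average_scores(rows, n_imputations=5)
```

Note that averaging the scores and then running one regression gives a single set of standard errors, so the between-imputation variability that the original poster was worried about losing is not reflected in them.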

    Please let me know if anything is unclear or if you know a more efficient way.

    Kind regards,

    Paul Tromp
    Research Master in Social and Behavioral Sciences, Tilburg University
