
Thread: Missing Data Analysis Question

  1. #1

    Hi gurus,

    I would really appreciate some advice on handling missing values.

    Briefly, subjects rated 21 objects (obj 1 ...obj 21) on 14 attributes (attr1...attr14). Given the size of the questionnaire (21*14), we split the questionnaire into two parts:

    N1 (about 150) subjects rated Obj1..Obj11 on all 14 attributes.
    N2 (about 120) subjects rated Obj12...Obj 21 on all 14 attributes.

    Both N1 and N2 come from the same population (based on chi-squared tests on different demographic variables).

    Our Goal: Perform Exploratory factor Analysis and obtain factor scores on all 21 Objects.

    My questions are:

    Q1. I basically need a complete correlation matrix (21 objects * 14 attributes). About 50% of the data is missing. What kind of imputation should I use to generate the other 50% of missing data?

    Q2. Can you recommend a text or journal article which discusses modern missing data analysis procedures? I'm OK with linear algebra and basic stat concepts and can handle notation.

    Q3. Can you point me to a programming manual/tutorial for SPSS?

    thanks in advance,

    statpuzz

  2. #2
    Lazar

    I think I may be the bearer of bad news here. I don't think you will be able to impute data, as there is no overlap at all between the two groups and hence no basis on which a missing data model could be built. Missing by design is common, but not the way you have it. Typically there is always some overlap that can be used to build a missing data model (see Craig Enders' 2010 book, Applied Missing Data Analysis).
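
    To make the "no overlap" point concrete, here is a minimal sketch (not anyone's actual analysis) that lays out the design described in the first post and counts how many pairs of rating variables were never answered by the same subject. The column layout and the helper names `columns_for` and `joint_n` are assumptions made purely for illustration; the group sizes are the approximate ones quoted above.

```python
# Hypothetical layout: 21 objects x 14 attributes = 294 rating variables.
N_OBJECTS, N_ATTRS = 21, 14
N1, N2 = 150, 120  # approximate group sizes quoted in the thread


def columns_for(objects):
    """Column indices of every (object, attribute) rating for the given objects."""
    return {(obj - 1) * N_ATTRS + a for obj in objects for a in range(N_ATTRS)}


group1_cols = columns_for(range(1, 12))   # Obj1..Obj11
group2_cols = columns_for(range(12, 22))  # Obj12..Obj21


def joint_n(col_a, col_b):
    """Number of subjects who answered both columns."""
    n = 0
    if col_a in group1_cols and col_b in group1_cols:
        n += N1
    if col_a in group2_cols and col_b in group2_cols:
        n += N2
    return n


# Share of the full (subjects x variables) data matrix that is empty.
total_cells = (N1 + N2) * N_OBJECTS * N_ATTRS
observed_cells = N1 * len(group1_cols) + N2 * len(group2_cols)
missing_fraction = 1 - observed_cells / total_cells

# Pairs of variables that no subject ever answered together.
n_cols = N_OBJECTS * N_ATTRS
unobserved_pairs = sum(
    1
    for i in range(n_cols)
    for j in range(i + 1, n_cols)
    if joint_n(i, j) == 0
)

print(f"missing fraction: {missing_fraction:.3f}")
print(f"variable pairs with no joint observations: {unobserved_pairs}")
```

    Every correlation between a variable from the first block and a variable from the second block falls into that zero-count set, which is why the data alone cannot inform those entries of the correlation matrix.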

  3. The Following User Says Thank You to Lazar For This Useful Post:

    statpuzz (10-16-2012)

  4. #3

    Thanks for the response.

    I do have some common data (about 8 items on demographics) across the two groups. Will that help?

    Statpuzz
    Last edited by statpuzz; 10-16-2012 at 05:14 PM.

  5. #4
    Lazar

    It is better than nothing (how much better will depend on how strongly associated those demographics are with the ratings of the objects). I think you could give it a go and see if you are successful, but check iteration plots (for means and standard deviations) and convergence carefully to see if the results are sensible. Even if the results are OK, you may have to be prepared, though, for extremely large standard errors, given the large amount of uncertainty there will be in your missing data model.

    What I would think about doing, if you have a chance, is to collect a third sample that rated objects 5 to 15 or something, thus giving you the overlap you need.
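
    As a rough illustration of what a bridging sample buys, the sketch below counts object pairs that no single questionnaire version contains, before and after adding a third version covering objects 5 to 15. The function name `uncovered_pairs` and the exact bridging range are assumptions for illustration only.

```python
from itertools import combinations


def uncovered_pairs(forms, n_objects=21):
    """Object pairs that no single questionnaire version contains together."""
    return [
        (a, b)
        for a, b in combinations(range(1, n_objects + 1), 2)
        if not any(a in form and b in form for form in forms)
    ]


# The two questionnaire versions actually fielded.
original = [set(range(1, 12)), set(range(12, 22))]
# Hypothetical third version: "objects 5 to 15 or something".
with_bridge = original + [set(range(5, 16))]

before = uncovered_pairs(original)
after = uncovered_pairs(with_bridge)

print(f"object pairs never rated together, original design: {len(before)}")
print(f"object pairs never rated together, with bridge:     {len(after)}")
```

    The bridge gives direct joint observations for pairs inside its range. Pairs that straddle it, such as objects 1 and 21, are still never rated by the same subject, so any estimate of their association would rest on the model linking them through the bridging objects.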

  6. The Following User Says Thank You to Lazar For This Useful Post:

    statpuzz (10-16-2012)

  7. #5

    Thanks again Lazar,

    On reflection, we plan to collect new data.

    Based on the "missing by design" phrase in your comment, I started searching for published papers which deal with split-questionnaire designs (SQD) and found some. I am trying to download and read some of the papers.

    I came across a specific planned missing design called the 3-form design (Graham, 2006). If you have some insights, I would appreciate it if you could share them.
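
    For readers unfamiliar with it, the 3-form design splits the content into a common block X plus three blocks A, B and C, and each form carries X and two of the three. A minimal sketch follows, assuming (purely for illustration) that the blocks are made of objects and that the 21 objects are assigned to blocks as shown; neither assumption comes from the thread.

```python
from itertools import combinations

# Hypothetical assignment of the 21 objects to the four blocks.
blocks = {
    "X": set(range(1, 7)),    # common block, on every form
    "A": set(range(7, 12)),
    "B": set(range(12, 17)),
    "C": set(range(17, 22)),
}

# Each form contains X plus two of the three rotating blocks.
forms = {
    "form1": blocks["X"] | blocks["A"] | blocks["B"],
    "form2": blocks["X"] | blocks["A"] | blocks["C"],
    "form3": blocks["X"] | blocks["B"] | blocks["C"],
}

# Object pairs that never appear together on any form.
never_together = [
    (a, b)
    for a, b in combinations(range(1, 22), 2)
    if not any(a in f and b in f for f in forms.values())
]

for name, objects in forms.items():
    print(name, "rates", len(objects), "of 21 objects")
print("object pairs never rated together:", len(never_together))
```

    Under this layout each respondent rates 16 of the 21 objects, yet every pair of objects appears together on at least one form, which is the overlap the original two-sample split lacked.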

    Statpuzz

  8. #6
    hlsmith

    Were the two original samples large enough to detect a statistically significant difference between them on the demographics? That always seems to be a concern when you want to show that samples are comparable: when you are hoping for non-significance, did the test have enough power?

    However, the listed plan seems like a plausible way to bridge the two sets and answer some questions.
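
    The power concern above can be put into numbers with a quick sketch. It uses the normal approximation for a two-sided test of a difference in proportions, which for a single binary demographic is close to a 2x2 chi-squared test. The proportions 0.50 and 0.60 are invented for illustration, and `two_proportion_power` is a hypothetical helper, not a library function.

```python
from statistics import NormalDist


def two_proportion_power(p1, p2, n1, n2, alpha=0.05):
    """Approximate power of a two-sided z-test for a difference in proportions."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Standard error of the difference under the assumed true proportions.
    se = (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2) ** 0.5
    z_effect = abs(p1 - p2) / se
    # Probability of landing in either rejection region.
    return std_normal.cdf(z_effect - z_crit) + std_normal.cdf(-z_effect - z_crit)


# Assumed example: a binary demographic at 50% in one sample and 60% in the other.
power = two_proportion_power(0.50, 0.60, n1=150, n2=120)
print(f"approximate power: {power:.2f}")
```

    With samples of 150 and 120, a ten-point difference under these assumed proportions would be detected well under the conventional 80% of the time, so a non-significant result by itself says little about whether the samples really match.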

  9. #7

    Hlsmith,

    The sample sizes were 150 and 120 for the two samples - good enough to conclude that they were not significantly different.

    The issue is that the data were collected in 2007 and deal with perceptions/attitudes, which change quite a bit. If I collected additional data on a subset of objects overlapping the two samples, as Lazar suggested, I guess I would have to make the strong assumption that partial correlations are the same in 2012, even if means have changed.

    On more pragmatic grounds, the setup cost for conducting a survey is high, but the marginal cost per subject is not really high. Thus, I thought I might as well collect a new sample. Still planning.
