Hi everyone,
I am working to validate a 10-item scale with a 5-point Likert response format in a sample of 673. I'm trying to find a "happy medium" among all the advice I've been hearing about how to handle missing data. I am primarily using SPSS.
My adviser wants me to delete cases missing 3 or more items (17 cases, about 2.5% of the sample) and do mean substitution for the rest of the missing data. I've heard lots of conflicting info on what to do.
Six cases are missing all 10 items. Eleven cases are missing 3-9 items.
I ran the MCAR test, which indicated there is a pattern in the missing data (that is, the data do not look missing completely at random), and the pattern is pretty obvious when I look at it: 52 cases are missing the same two items.
Missing data is also more frequent among people of certain ethnic groups, and I need to look at a couple of other variables to see if there are also differences there.
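For illustration only, here is a minimal plain-Python sketch of the kind of tally described above: how many items each case is missing, and what share of each group has any missing item. The data are made-up toy values, and the names `responses`, `group`, `n_missing`, and `share_with_missing` are hypothetical, not anything from my actual SPSS file.

```python
from collections import Counter

# Toy responses only (None = missing item); NOT the real sample.
responses = [
    {"group": "A", "items": [3, 4, 2, 5, 3, 4, 4, 2, 3, 5]},
    {"group": "A", "items": [3, 4, 2, None, 3, 4, None, 2, 3, 5]},
    {"group": "B", "items": [1, 2, 2, None, 3, 1, None, 2, 3, 2]},
    {"group": "B", "items": [None] * 10},
]

def n_missing(items):
    """Count how many of a respondent's items are missing."""
    return sum(v is None for v in items)

# How many cases are missing 0, 1, 2, ... items.
missing_counts = Counter(n_missing(r["items"]) for r in responses)

def share_with_missing(rows):
    """Proportion of cases in each group with at least one missing item."""
    flags_by_group = {}
    for r in rows:
        flags_by_group.setdefault(r["group"], []).append(n_missing(r["items"]) > 0)
    return {g: sum(flags) / len(flags) for g, flags in flags_by_group.items()}

group_rates = share_with_missing(responses)

for k in sorted(missing_counts):
    print(f"missing {k} items: {missing_counts[k]} case(s)")
print(group_rates)
```

A table like this only describes the pattern; it is not a substitute for a formal test of the missingness mechanism.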
Could someone offer me a quick step-by-step approach for addressing the missing data? My first thoughts were to:
1. Drop cases missing 3 or more items, then
2. Use multiple imputation in SPSS, which I have not formally learned, and this paper is due soon! I don't know where to start with it.
I've been told that it's OK to do individual mean substitution for respondents who are missing two items or fewer, but that something else would have to be done for people who are missing more. Is it OK to use a few different methods for dealing with missing data in the same data set? That is, delete the cases that are completely missing, use mean substitution for those missing 2 or fewer, and use imputation for people who are missing more? This seems too complicated, though.
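To make the rule I was told about concrete, here is a rough plain-Python sketch, assuming "individual mean substitution" means replacing a respondent's missing items with that respondent's own mean across the items they did answer (if it means the item mean across respondents, the code would differ). The function name `person_mean_impute` and the `max_missing` parameter are hypothetical, and the sketch does not do multiple imputation.

```python
def person_mean_impute(items, max_missing=2):
    """Fill missing items (None) with the respondent's own mean.

    Returns None when the case is missing more than `max_missing` items,
    meaning the case would be dropped or handled some other way.
    """
    answered = [v for v in items if v is not None]
    n_miss = len(items) - len(answered)
    if n_miss > max_missing or not answered:
        return None
    person_mean = sum(answered) / len(answered)
    return [person_mean if v is None else v for v in items]

# Toy cases only, not real data.
two_missing = [4, 4, None, 4, 2, 2, None, 4, 2, 2]
three_missing = [None, None, None, 1, 1, 1, 1, 1, 1, 1]

print(person_mean_impute(two_missing))    # filled with this person's mean
print(person_mean_impute(three_missing))  # None: too many items missing
```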
Any help is appreciated!!!