For example, participant 1 filled in questionnaires 1, 4, and 5 but not 2 and 3. Participant 2 filled in 2, 3, 4, and 5 but not 1, and so forth. When a questionnaire is missing, all of its items are missing; no respondent answered only some of the items within a questionnaire. Some questionnaires were not answered by 51% of participants. Can anyone tell me what the best approach would be in this situation?

I read a lot about MCAR, MAR and MNAR and would say the data is MCAR, as far as I understand. Please correct me if I'm wrong. I also checked the randomness in SPSS: I went to Analyze -> Missing Value Analysis, selected "t tests with groups formed by indicator variables" under the Descriptives option, and clicked EM. I thought that might be a good idea. As I said, I'm completely new to this and would really appreciate some help.

Thank you in advance.

I think that there's a lot to unpack here, since you didn't give much information as to what you are planning on doing with the data. As such, I'll just have to speak generally about your situation/data as a whole, and I'm assuming that you will be working with 3+ variables at a time (multivariate statistical analysis).

The way that I would approach this (and I'm not saying that this is the quickest or prettiest way) is by subsetting my data based upon the participants and questionnaires of interest. This ensures that missing data is omitted from whatever associations/analyses I conduct. Sort of like the data that goes into Multiple Correspondence Analysis (MCA), your variables can be the questionnaires themselves, but those variables have categories (the questions on those questionnaires). If a participant didn't complete a questionnaire, their data can't really be considered in any analyses *using* that questionnaire, as all of the categorical data *within* that variable (questionnaire) would be blank. Thus, your sample size would simply decrease upon omission. You would then have to do your own comparison of the other variables before and after omitting the missing data to determine whether or not the difference is significant. If the changes *are* significantly different in some variables, then you'd have to ask yourself whether or not those changes are relevant to your study.
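To make the before/after comparison concrete, here is a minimal Python sketch. The data, the variable names (`age`, `q1`, `q2`) and the helper functions are all made up for illustration; `age` stands in for any variable that you have for everyone. It keeps only the participants who completed the questionnaire of interest and then compares the kept and dropped groups on that other variable with Welch's t statistic.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical data: None marks a questionnaire that was not filled in.
# "age" is an assumed variable that is observed for every participant.
participants = [
    {"id": 1, "age": 34, "q1": 12, "q2": None},
    {"id": 2, "age": 29, "q1": None, "q2": 7},
    {"id": 3, "age": 41, "q1": 15, "q2": 9},
    {"id": 4, "age": 38, "q1": 11, "q2": None},
    {"id": 5, "age": 25, "q1": None, "q2": 6},
    {"id": 6, "age": 45, "q1": 14, "q2": 8},
]

def split_by_completion(rows, questionnaires):
    """Return (completers, non_completers) for the given questionnaires."""
    done = [r for r in rows if all(r[q] is not None for q in questionnaires)]
    done_ids = {r["id"] for r in done}
    not_done = [r for r in rows if r["id"] not in done_ids]
    return done, not_done

def welch_t(a, b):
    """Welch's t statistic for two independent samples (statistic only, no p-value)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

# Subset to the people who filled in questionnaire 1, then compare groups on age.
kept, dropped = split_by_completion(participants, ["q1"])
t = welch_t([r["age"] for r in kept], [r["age"] for r in dropped])
print(len(kept), len(dropped), round(t, 2))
```

A large absolute t value would suggest that the people you drop differ from the people you keep on that variable, which is exactly the kind of change you would then have to weigh against the goals of your study.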

If you *did* still want to fill in missing data (again, I would *not* suggest this), then you're really limited in what you can do. Pairwise Deletion, in which each bivariate correlation is estimated from all data available for that pair of study variables, really only performs well with MCAR data with a *large* sample size and with less than or equal to 5% of the data missing. The problem then, if you decide to use your Pairwise-Deleted data, is that most multivariate analyses will not run on it. If you are interested as to why, Pairwise Deletion has a much greater tendency than Listwise Deletion (another technique for dealing with multivariate missing data) to result in a non-positive definite variance-covariance matrix.
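Here is a small, deliberately extreme Python sketch of that problem. The scores are invented, and `pearson` and `pairwise_r` are hypothetical helpers written just for this example. Each pair of variables is observed on a different group of people, so every correlation looks fine on its own, but the three together form a matrix that no complete dataset could produce.

```python
from math import sqrt

# Hypothetical scores on three questionnaires (X, Y, Z); None = not filled in.
# No participant has all three, so each correlation uses different people.
rows = [
    (1, 1, None), (2, 2, None), (3, 3, None),   # X and Y only
    (1, None, 1), (2, None, 2), (3, None, 3),   # X and Z only
    (None, 1, 3), (None, 2, 2), (None, 3, 1),   # Y and Z only
]

def pearson(pairs):
    """Pearson correlation for a list of (x, y) pairs."""
    xs, ys = zip(*pairs)
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def pairwise_r(data, i, j):
    """Correlation of columns i and j using every row where both are observed."""
    pairs = [(r[i], r[j]) for r in data if r[i] is not None and r[j] is not None]
    return pearson(pairs)

r_xy = pairwise_r(rows, 0, 1)
r_xz = pairwise_r(rows, 0, 2)
r_yz = pairwise_r(rows, 1, 2)

# Determinant of the 3x3 correlation matrix built from the pairwise estimates.
# A valid correlation matrix can never have a negative determinant.
det = 1 + 2 * r_xy * r_xz * r_yz - r_xy ** 2 - r_xz ** 2 - r_yz ** 2

# Listwise deletion keeps only rows with no missing values at all.
complete_cases = [r for r in rows if None not in r]

print(r_xy, r_xz, r_yz, det, len(complete_cases))
```

In this toy data X correlates perfectly with both Y and Z, yet Y and Z correlate perfectly negatively, so the determinant comes out negative and procedures that need to invert the matrix will fail. Listwise deletion avoids the inconsistency, but here it leaves zero complete cases, which is the trade-off you face when nobody filled in every questionnaire.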

The most promise for multivariate missing data lies in Multivariate Imputation (MVI). I'd look into it if you *really* can't eliminate your missing data. Disclaimer: it is sensitive to missing data bias *and* high measurement error, increasingly so with smaller sample sizes. In this case, your input sample size would be the number of people WITH data, with the intention that MVI will output the remainder of your data.
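To give a feel for the general idea, here is a toy Python sketch of stochastic regression imputation repeated several times. This is a simplification and not a full MVI procedure: a proper multiple imputation would also redraw the regression parameters for each imputed dataset and pool the variances, and in practice you would use dedicated software. The scores and the names `x`, `y`, `fit_line` and `impute_once` are assumptions made up for this example.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical scores: x is observed for everyone, y is missing (None) for some.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, None, 8.2, None, 12.1, 13.8, None]

# The model is fitted only on the people WITH data.
observed = [(a, b) for a, b in zip(x, y) if b is not None]

def fit_line(pairs):
    """Least-squares fit of y = intercept + slope * x, plus the residual SD."""
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    slope = sum((a - mx) * (b - my) for a, b in pairs) / sum((a - mx) ** 2 for a in xs)
    intercept = my - slope * mx
    residuals = [b - (intercept + slope * a) for a, b in pairs]
    sd = (sum(e ** 2 for e in residuals) / (len(pairs) - 2)) ** 0.5
    return intercept, slope, sd

intercept, slope, sd = fit_line(observed)

def impute_once():
    """Fill each missing y with its prediction plus random noise; keep observed values."""
    return [b if b is not None else intercept + slope * a + random.gauss(0, sd)
            for a, b in zip(x, y)]

# Create several completed datasets and average the estimate of interest.
m = 20
completed = [impute_once() for _ in range(m)]
pooled_mean = mean(mean(dataset) for dataset in completed)
print(round(pooled_mean, 2))
```

Note that everything the imputation "knows" comes from the five people with data, which is why small samples, biased missingness, and noisy measurements hurt it so much.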

Again, just because this method shows the most promise, that does NOT mean that you should use it here. With 51% of your data missing for some of your questionnaires/variables (and thus "x" number of categories within that variable), I would suggest data elimination and analyzing the differences between data before and after missing data elimination! Hope I was able to help in SOME way!