Replacing missing data with EM

Hello everyone,

For my undergraduate dissertation in Psychology I have conducted research testing the executive function (EF) skills of 40 children, obtaining five EF scores. Several children did not complete all the tests, so I have some missing data (13.75% of values).

I have been browsing the web for appropriate replacement methods (as my sample is so small, I cannot afford to delete the incomplete cases) and came across the EM (expectation-maximization) method. As my values are MCAR (missing completely at random), this seemed like a good solution, as it is supposed to be more accurate than mean substitution or regression imputation (right?).

So, I have run the required SPSS procedures and now have a dataset with my replaced scores, but I face a puzzle. Firstly, how can I 'test' whether these scores are actually accurate? Or do I just assume that they are? Secondly, and most importantly, what do I need to report in my write-up for this procedure? Is it enough to write that this method was used to replace missing data, or do I need to report some coefficients etc.?

I'd be grateful for some advice!
Many thanks!


Probably A Mammal
There is no way to test the accuracy of the replaced values, because you don't know what the true values would have been. There are quite a number of methods you could use to fill in the holes, as it were. I've heard of EM but never looked into it, and it is hard to say which method is 'right'. I'd report which method you used, and maybe some descriptive statistics before and after replacement, to give an idea of how the method changed your data. I'd also probably try a number of methods and defend my choice of method (or run the entire analysis with the alternatives, including no filling, as a sensitivity analysis).
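
To make the before/after comparison concrete, here is a rough sketch in plain Python of what I mean. Everything in it is made up for illustration: the scores, the variable names ef_a and ef_b, and the helper functions are all hypothetical. It is not EM and it is not what SPSS does internally; it only compares the observed values against two simpler fill-in methods (mean substitution and single regression imputation), so you can see how each one shifts the mean and SD.

```python
from statistics import mean, stdev

# Hypothetical scores for illustration only; None marks a missing value.
ef_a = [12.0, 15.0, 11.0, 18.0, 14.0, 16.0, 13.0, 17.0]  # complete predictor
ef_b = [22.0, None, 20.0, 30.0, None, 27.0, 23.0, 29.0]  # has missing values


def describe(values):
    """Return (n, mean, SD) for a list with no missing values."""
    return len(values), round(mean(values), 2), round(stdev(values), 2)


def mean_substitution(values):
    """Replace each missing value with the mean of the observed values."""
    observed_mean = mean(v for v in values if v is not None)
    return [observed_mean if v is None else v for v in values]


def regression_imputation(x, y):
    """Replace missing y with the prediction from a least-squares
    line y ~ x fitted on the complete pairs only."""
    pairs = [(a, b) for a, b in zip(x, y) if b is not None]
    mx = mean(a for a, _ in pairs)
    my = mean(b for _, b in pairs)
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return [intercept + slope * a if b is None else b for a, b in zip(x, y)]


observed = [v for v in ef_b if v is not None]
results = {
    "observed only": describe(observed),
    "mean substitution": describe(mean_substitution(ef_b)),
    "regression imputation": describe(regression_imputation(ef_a, ef_b)),
}

# One row per method: a small table like this could go in the write-up.
for method, (n, m, sd) in results.items():
    print(f"{method:22s} n={n}  mean={m}  SD={sd}")
```

With mean substitution the mean stays where it was but the SD shrinks, because every filled-in value sits exactly on the mean. That is the kind of change I'd want to show the reader, and you could add a row for your EM-replaced scores from SPSS alongside these.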