Repeated measures when pre/post groups are neither fully independent nor fully paired


I am running a pre-post study using several different instruments (e.g., 1-5 Likert-scale instruments that each produce a final score). The instruments are administered to a group of employees at a hospital before and after a new intervention approach with patients. However, because some of the questions on the instruments were sensitive (e.g., regarding job satisfaction), the participants insisted that their questionnaire packages be anonymous. I now have ~25 pre packages and ~25 post packages and want to compare certain scales between them. However, I know for a fact that my pre and post groups did not contain all the same employees. That is, some people completed both (though I cannot identify which packages are theirs because they are anonymous), some completed only the pre, and some completed only the post.

My question is: how do I measure the pre-post differences in this group? The independent-samples t-test does not fit, because its independence assumption is violated: the samples are not entirely independent. However, the paired-samples t-test does not fit either, because my pre and post samples do not contain exactly the same people.

Any advice on how to proceed would be greatly appreciated.

Thank you,



This is a not-so-uncommon issue. There are not many conclusions that you can draw. Somebody may come along and propose something, but it is EXTREMELY difficult to attribute changes to the intervention.

Did you take completely random samples each time with 100% response rates?

Thank you hlsmith for your interest.

No, unfortunately. The samples were not completely random, and the response rate was not 100%. It was a sample of nurses; the packages were left out for a week and the nurses completed them at their convenience. There are ~50 nurses in total, and at both pre and post we got around half of the packages back. I would say roughly 65% of the nurses who responded to the pre also responded to the post. (I know this for sure because each package had a ballot on the back that respondents tore off and entered into a jar with their name. So I know who completed a package at each time point, just not which package is theirs.)

Unfortunately, if we had collected identifying data, this research would not have been completed at all. This was our only option.

I know that no single statistical option will be a great fit, but I am open to advice about which one would be the "best" fit.
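
To make the problem concrete, here is a rough sketch (not a recommended analysis) of how much the unlinkable pairing could matter. It uses the standard variance of a difference between two means when some respondents appear in both samples: the covariance term is scaled by the number of repeat respondents. The function name and all inputs are hypothetical. The sample sizes (25, 25, and about 16 repeat respondents) are rough figures taken from this thread, the standard deviations of 1.0 are placeholders rather than study data, and the pre/post correlation r cannot be estimated from anonymous packages, so it is simply varied. The sketch assumes repeat respondents are correlated with coefficient r and everyone else is independent. It does nothing about the non-random sampling and partial response raised above.

```python
import math


def se_diff_partial_overlap(sd_pre, sd_post, n_pre, n_post, n_both, r):
    """Standard error of (mean_post - mean_pre) for partially overlapping samples.

    n_both is the number of people who appear in both samples and r is the
    ASSUMED pre/post correlation among those people (not estimable here,
    because the packages cannot be linked).
    """
    if not 0 <= n_both <= min(n_pre, n_post):
        raise ValueError("n_both must be between 0 and min(n_pre, n_post)")
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and 1")
    # Var(mean_post - mean_pre) = Var(mean_pre) + Var(mean_post) - 2 * Cov.
    # Only the n_both repeat respondents contribute to the covariance.
    variance = (
        sd_pre ** 2 / n_pre
        + sd_post ** 2 / n_post
        - 2.0 * r * sd_pre * sd_post * n_both / (n_pre * n_post)
    )
    return math.sqrt(max(variance, 0.0))


# Approximate figures from the thread; the SDs are placeholders, not data.
N_PRE, N_POST, N_BOTH = 25, 25, 16
SD_PRE, SD_POST = 1.0, 1.0

# Sensitivity check: how does the standard error move as the assumed r changes?
for r in (0.0, 0.3, 0.5, 0.7):
    se = se_diff_partial_overlap(SD_PRE, SD_POST, N_PRE, N_POST, N_BOTH, r)
    print(f"assumed r = {r:.1f}: SE of mean difference = {se:.4f}")
```

With r = 0 the formula reduces to the usual independent-samples standard error, and with full overlap it reduces to the paired one. If the true correlation among repeat respondents is positive, treating the two waves as independent overstates the standard error, which makes that analysis conservative rather than invalid under these assumptions.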