Is there a test...

Hi there!

I have a new work project and I'm finding it challenging to select the appropriate test. I'm not even sure if the right test exists, so I'm hoping you can help. I'm trying to see whether our staff training was effective in improving a specific performance indicator over time. I'm not used to doing this type of analysis.

I'm looking to compare unpaired pre/post data with multiple data collections in the same group. [Essentially, supervisors submit QA data continuously for their worker bees and we want to compare before training intervention vs after.] The variable I am most interested in is dichotomous/nominal.

Oh, and my data is non-normal. :(

Everything I know of wants PAIRED data or ORDINAL/INTERVAL level data.

Am I doomed? What's a girl to do?
I'm confused as to why your data would be unpaired if supervisors provide evaluations of workers before and after training intervention...

Anyway, would a chi-square test not work?
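
For reference, here is a minimal sketch of what a chi-square test on this kind of data could look like, assuming the observations are collapsed into a 2x2 table of period (pre/post) by outcome (present/not present). The counts are made up for illustration and the function name `chi_square_2x2` is hypothetical. Note that this treats every observation as independent, which may not hold when the same workers are rated repeatedly.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts: rows = (pre, post), columns = (present, not present)
counts = [[30, 70], [50, 50]]
stat, p = chi_square_2x2(counts)
print(f"chi2 = {stat:.3f}, p = {p:.4f}")
```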

Also, there's a bit of a cheat that is often employed: a dichotomous variable is in a way an ordinal variable, in that the given category is present vs. not present (so we now have a direction). In the case of a multinomial (3+ category) nominal variable, you can dummy-code the variable into multiple binomial variables, one per category, each recording whether that category is present vs. not present. Again, you're turning them into ordinal variables. From there you can run non-parametric tests like the Mann-Whitney U test.
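
A minimal sketch of the dummy-coding idea, assuming a hypothetical three-category variable. The category labels, the helper names `dummy_code` and `mann_whitney_u`, and the sample values are all made up for illustration. The U statistic is computed by direct pair counting with ties counted as one half; no p-value is calculated here.

```python
def dummy_code(values):
    """Turn one nominal variable into one 0/1 indicator list per category."""
    categories = sorted(set(values))
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}

def mann_whitney_u(x, y):
    """U statistic for x versus y: count pairs where x > y, with ties counted as 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)

# Hypothetical nominal ratings
ratings = ["good", "fair", "poor", "good", "fair"]
indicators = dummy_code(ratings)
for category, column in indicators.items():
    print(category, column)

# Hypothetical present (1) / not present (0) observations before and after training
pre = [0, 0, 1]
post = [1, 1, 0]
u_post = mann_whitney_u(post, pre)
print("U (post vs. pre):", u_post, "out of", len(pre) * len(post), "pairs")
```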
Thanks for your quick reply!

I consider the data unpaired because each workerbee doesn't have the same number of data points pre- and post-training. A workerbee may have 3 from pre-training and 2 from post-training. Does that make sense?

I held off on Chi-Square as I thought it could only be used between two or more independent groups. I figured this would not count as independent?

I may well try your cheat. Thanks for the idea!
By data points, do you mean variables, or one variable that changes from a 3-category variable pre-training to a 2-category post-training?

If the former, you can definitely run a paired test on the 2 variables that are measured both pre- and post, and ignore the 3rd variable.

If the latter, then it depends on whether there are any common categories in both your pre and post. If there are, then I would suggest turning those into ordinal binomial variables (condition present vs. not present).

If the categories have nothing in common between pre and post, then you just have a bunch of apples and oranges, and you therefore have nothing to compare.

Regarding the "cheat" (which it isn't really), think of it this way: it's like turning the gender variable from "male vs. female" to "male vs. not male" or "female vs. not female", depending on whichever way you prefer to look at it.


Omega Contributor
How many data points does a typical person have pre and also post?

Check out Friedman's test. I think it may be the analog to repeated-measures ANOVA, or maybe just non-repeated ANOVA, which regularly gets used with only two groups in the repeated scenario. Also, what is your sample size? Just because you think your data are not normal doesn't mean the model residuals would be non-normal too, or that you can't transform them.
Thank you both.

N=2498 (which isn't bad, but some workerbees are grossly oversampled... but that's what happens when it's program evaluation and you work with whatever data is available, I guess). The typical workerbee has about 10 data points pre and 10 data points post. Some have as many as 50 and others have as few as two. There are about 45 workerbees.

By data point, I am referring to each instance a supervisor records whether the variable in question was PRESENT vs. NOT PRESENT for their workerbee. Workerbees are reviewed constantly but irregularly. One month a supervisor might report whether the variable was present vs. not present every other week, and another month the supervisor might review their workerbee every week and therefore report whether the variable was present or not present 4 times. Some workerbees have 15 data points prior to the training intervention and 10 post-intervention (or vice versa).
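
To make the data layout concrete, here is a rough sketch assuming each record is a (worker, period, present) triple. The worker IDs, the values, and the helper name `per_worker_rates` are hypothetical. It shows one possible way to summarize unequal numbers of observations: reduce each workerbee to a pre-training and a post-training proportion, which gives one pair of numbers per workerbee. This is only an illustration of the structure, not a recommendation made in this thread.

```python
from collections import defaultdict

# Hypothetical records: (worker_id, period, present) where present is 1 or 0
records = [
    ("w1", "pre", 1), ("w1", "pre", 0), ("w1", "pre", 0),
    ("w1", "post", 1), ("w1", "post", 1),
    ("w2", "pre", 0), ("w2", "pre", 1),
    ("w2", "post", 1), ("w2", "post", 0), ("w2", "post", 1), ("w2", "post", 1),
]

def per_worker_rates(rows):
    """Return {worker: (pre_rate, post_rate)} for workers observed in both periods."""
    # For each worker and period, track [number present, number of observations].
    counts = defaultdict(lambda: {"pre": [0, 0], "post": [0, 0]})
    for worker, period, present in rows:
        counts[worker][period][0] += present
        counts[worker][period][1] += 1
    rates = {}
    for worker, c in counts.items():
        # Skip workers missing either period, since they cannot form a pair.
        if c["pre"][1] > 0 and c["post"][1] > 0:
            rates[worker] = (c["pre"][0] / c["pre"][1], c["post"][0] / c["post"][1])
    return rates

rates = per_worker_rates(records)
for worker, (pre_rate, post_rate) in sorted(rates.items()):
    print(f"{worker}: pre = {pre_rate:.2f}, post = {post_rate:.2f}")
```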