Chance level binomial, not 1/n

I have some data on which I am going to perform a mixed ANOVA. However, I think I might need to transform the raw data first, and I am not sure how to. I've asked my dissertation supervisor about this, but he doesn't have a clue, so it's up to me to figure this out!

My data comes from an experiment in which the response is a number between 0 and 4, which will either be correct or incorrect. My problem is that the probability of guessing correctly is not the same for every response, because the correct answer follows a binomial distribution (so if an individual consistently chooses the number '2', they are likely to score quite well).

The ANOVA isn't being performed on this raw data - I need to transform it into some kind of score (i.e. how many trials the individual was correct on), so it's this transformation I'm stuck on.

Is there a simple way of converting my raw data so that it takes chance into account?



Ambassador to the humans
What are your independent variables? It doesn't sound like you really want an ANOVA. Maybe either some sort of generalized linear model or maybe just a simple chi-square test.


New Member
I also don't see what you need and why you need it. It would be better if you elaborated on your experimental design and on what aspect/hypothesis you are looking at.


TS Contributor
Sounds like if you just take the proportion of correct answers for each subject you will be fine.

The mixed ANOVA will work well so long as there are a large number of questions for each subject.

I say this because each subject will provide one observation, say Y, that is distributed as a sample proportion over N questions, with variance pq/N (where p is the probability of a correct answer and q = 1 - p). The mixed ANOVA ought to estimate this variance consistently.
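As a rough check on the pq/N claim, here is a minimal sketch that compares the theoretical variance of a sample proportion with a simulated one. The function names, the values p = 0.5 and N = 25, and the number of simulated subjects are all illustrative assumptions, not figures from the experiment.

```python
import random


def proportion_variance(p, n):
    """Theoretical variance of a sample proportion: p * (1 - p) / n."""
    return p * (1 - p) / n


def simulate_proportion_variance(p, n, n_subjects=20000, seed=0):
    """Simulate n_subjects subjects answering n questions each, where every
    answer is correct with probability p, and return the sample variance of
    the per-subject proportions correct."""
    rng = random.Random(seed)
    props = [
        sum(rng.random() < p for _ in range(n)) / n
        for _ in range(n_subjects)
    ]
    mean = sum(props) / n_subjects
    return sum((x - mean) ** 2 for x in props) / (n_subjects - 1)


if __name__ == "__main__":
    p, n = 0.5, 25  # hypothetical values, chosen only for illustration
    print("theoretical:", proportion_variance(p, n))
    print("simulated:  ", simulate_proportion_variance(p, n))
```

The variance shrinks as N grows, which is why the number of questions per subject matters here. It also depends on p, so it is not exactly constant across conditions with different success rates.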

Another, and probably better, option is just to do some kind of categorical analysis, as Dason suggests. As pyg suggests, the exact test used will depend on what the ANOVA factors are.

Thanks for the replies :)

I think I need to clarify what I'm actually doing, so here goes:

There are 2 between-participants factors that are categorical, and one within-participants factor, also categorical.

The within factor is difficulty, and participants are given multiple tests within each difficulty, marked on a pass/fail basis. I was going to perform an ANOVA on the number of tests they pass for each difficulty.

However, each test is made up of 5 questions, and I thought that my results might be messed up by someone randomly guessing '2' each time, as it's the most likely answer.
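To put numbers on the worry above, here is a minimal sketch of the per-question chance level under two guessing strategies. It ASSUMES the correct answer follows a Binomial(4, 0.5) distribution and that questions are independent; the thread gives neither the binomial parameters nor the pass criterion, so `N_TRIALS`, `P`, `k_correct`, and the chance-correction formula (observed minus chance, rescaled by 1 minus chance) are illustrative choices rather than the experiment's actual design.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# ASSUMPTION: the correct answer (0..4) is Binomial(4, 0.5) distributed.
N_TRIALS, P = 4, 0.5
answer_probs = [binomial_pmf(k, N_TRIALS, P) for k in range(N_TRIALS + 1)]

# Strategy 1: always give the single most likely answer.
best_guess = max(range(N_TRIALS + 1), key=lambda k: answer_probs[k])
chance_fixed_guess = answer_probs[best_guess]

# Strategy 2: pick one of the five answers uniformly at random.
chance_uniform = sum(pr / len(answer_probs) for pr in answer_probs)


def prob_at_least(k_correct, n_questions, p_correct):
    """Probability of at least k_correct right out of n_questions independent
    questions, each answered correctly with probability p_correct."""
    return sum(
        binomial_pmf(k, n_questions, p_correct)
        for k in range(k_correct, n_questions + 1)
    )


def chance_corrected(observed, chance):
    """Rescale a proportion correct so that chance maps to 0 and perfect to 1."""
    return (observed - chance) / (1 - chance)


if __name__ == "__main__":
    print("answer probabilities:", answer_probs)
    print("best fixed guess:", best_guess, "hit rate:", chance_fixed_guess)
    print("uniform guessing hit rate:", chance_uniform)
    # Hypothetical pass rule: all 5 questions of a test must be correct.
    print("P(pass by fixed guess):", prob_at_least(5, 5, chance_fixed_guess))
```

Under these assumptions, always answering '2' is right on 37.5% of questions, against 20% for uniform guessing, so the appropriate chance baseline depends on which guessing strategy one wants to guard against.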