How to test for differences in binary score data

I have data from a human subject study that was set up as a 3x2 (system by task type) experiment with 24 participants. Each participant completed 6 tasks in all, one for each combination of the 3 systems and 2 task types. I have two performance measures: the first is completion time, and the second is a human-assigned completion score. The score is 1 if the task was completed correctly and 0 if it was completed incorrectly.
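To make the layout concrete, here is a minimal sketch of the score data in long format (one row per participant, system, and task type). The system and task type labels are hypothetical, and the scores are random placeholders, not my actual results.

```python
import itertools
import random

N_PARTICIPANTS = 24
SYSTEMS = ["A", "B", "C"]        # hypothetical system labels
TASK_TYPES = ["type1", "type2"]  # hypothetical task type labels


def make_placeholder_scores(seed=0):
    """Build one row per participant x system x task type with a 0/1 score.

    The scores are random placeholders that only illustrate the layout.
    """
    rng = random.Random(seed)
    rows = []
    for participant in range(1, N_PARTICIPANTS + 1):
        for system, task_type in itertools.product(SYSTEMS, TASK_TYPES):
            rows.append(
                {
                    "participant": participant,
                    "system": system,
                    "task_type": task_type,
                    "score": rng.randint(0, 1),  # 1 = correct, 0 = incorrect
                }
            )
    return rows


rows = make_placeholder_scores()
print(len(rows))  # 24 participants x 6 cells each
print(rows[0])
```

Each participant contributes exactly one 0/1 observation per cell, so the observations are repeated measures within participant.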

I am interested in seeing if there are differences between systems overall, and within each of the task types. For the time data I have used a basic RM-ANOVA. However, I am puzzled about how I should treat the score data. On the one hand, it is a binary score, and because there is only one trial in each system-by-task-type combination, the distribution can't be normal. On the other hand, it is score data, like the results from a true/false test, so why should it be treated differently than if the score came from 100 questions (in which case I could run an RM-ANOVA, right?)?
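The contrast I have in mind can be sketched with a small simulation. It assumes, purely for illustration, that every item is answered correctly with probability 0.7 (a made-up value) and compares a 1-item score with a 100-item score for 24 simulated participants.

```python
import random
from collections import Counter


def simulate_scores(n_items, n_participants=24, p_correct=0.7, seed=1):
    """Return one total score per participant, summed over n_items 0/1 items.

    p_correct is an assumed, made-up probability of a correct answer.
    """
    rng = random.Random(seed)
    return [
        sum(rng.random() < p_correct for _ in range(n_items))
        for _ in range(n_participants)
    ]


single = simulate_scores(n_items=1)     # my situation: one trial per cell
hundred = simulate_scores(n_items=100)  # the hypothetical 100-question test

print(Counter(single))   # only the values 0 and 1 can occur
print(sorted(hundred))   # totals spread over many values
```

With one item per cell the score can only be 0 or 1, while the 100-item total takes many values clustered around 70, which is where a normal approximation starts to look plausible.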

I think I am missing a basic assumption here.