*specific* category over the other overall. It seems like I'm maybe looking to test whether there are two modes in the distribution, for picking one category over the other?

- Thread starter lightray

There is a collection of subjects (say 40). Each subject answers 20 questions, each scored (effectively) as A or B according to its category. If the answers were really random, you would expect 10 A and 10 B on average. You suspect that subjects will in fact tend to be more like either 15 A 5 B or 5 A 15 B. How do we test whether it is unreasonable to assume random A and B given the data?

You need a measure of how spread out the data is. For instance, if subjects tended to prefer one category over the other, you might get data like 15 16 4 5 17 3..., whereas if there was no such preference the data might look more like 7 11 9 8 12 ... So a suitable measure might be the variance of the subject scores: a high variance supports your theory, while a low variance suggests the answers are probably random.
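To make the idea concrete, here is a small sketch that computes the sample variance of the two example score lists above. It uses only the numbers quoted in the text (the trailing "..." is not filled in), and the variable names are just for illustration. The reference value of 5 assumes each answer is independently A or B with probability 0.5, which is an assumption and not something the data guarantees.

```python
import numpy as np

# Scores (number of As out of 20) copied from the two examples in the text.
polarized_scores = np.array([15, 16, 4, 5, 17, 3])
random_like_scores = np.array([7, 11, 9, 8, 12])

# Sample variance (ddof=1) of each set of scores.
var_polarized = np.var(polarized_scores, ddof=1)
var_random = np.var(random_like_scores, ddof=1)

# Reference point under an ASSUMED fair coin-flip model:
# variance of a Binomial(20, 0.5) count is n * p * (1 - p).
binomial_reference = 20 * 0.5 * 0.5

print(var_polarized, var_random, binomial_reference)
```

The polarized list gives a variance of about 44, the random-looking list about 4.3, and the coin-flip reference is 5, so the first set is far more spread out than chance alone would suggest.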

The main problem is how high is high?

Some reader may have a variance test to suggest, but here is one way of doing a permutation test:

1 Collate the data, find the score (number of As perhaps) for each subject, find the variance of these scores. Write it down.

2 Now pool all the answers, shuffle them, and deal them back out to the subjects at random (20 per subject). Collate and find the variance of the scores. Record it. Repeat 1000 times or so.

3 Find where the variance from step 1 falls among the variances from step 2. The fraction of shuffled variances that are at least as large as the observed one is your p value.
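The three steps above can be sketched in Python roughly as follows. This is only one possible reading of the procedure: the function name `permutation_variance_test`, the 0/1 coding (1 = A, 0 = B), and the "+1" in the p value (which counts the observed data as one of the arrangements so the p value is never exactly zero) are my assumptions. The demo data is simulated, not real.

```python
import numpy as np

def permutation_variance_test(answers, n_perm=1000, seed=0):
    """Permutation test on the variance of per-subject A counts.

    answers: 2-D array, one row per subject, one column per question,
             coded 1 for A and 0 for B (hypothetical coding).
    """
    rng = np.random.default_rng(seed)
    answers = np.asarray(answers)

    # Step 1: score each subject (number of As) and take the variance.
    observed = answers.sum(axis=1).var(ddof=1)

    # Step 2: pool every answer, shuffle, deal back to subjects, repeat.
    pooled = answers.ravel()
    perm_vars = np.empty(n_perm)
    for i in range(n_perm):
        shuffled = rng.permutation(pooled).reshape(answers.shape)
        perm_vars[i] = shuffled.sum(axis=1).var(ddof=1)

    # Step 3: one-sided p value, the share of shuffled variances that are
    # at least as large as the observed one.
    p_value = (1 + np.sum(perm_vars >= observed)) / (1 + n_perm)
    return observed, perm_vars, p_value

# Simulated demo: 40 subjects, 20 questions, half leaning A, half leaning B.
demo_rng = np.random.default_rng(1)
lean = np.repeat([0.75, 0.25], 20)  # made-up per-subject chance of answering A
demo_answers = (demo_rng.random((40, 20)) < lean[:, None]).astype(int)

observed, perm_vars, p_value = permutation_variance_test(demo_answers)
print(observed, perm_vars.mean(), p_value)
```

Shuffling the pooled answers keeps the overall number of As and Bs fixed and only breaks the link between answers and subjects, which is what "no preference" means here. A small p value says the observed spread is larger than shuffling alone tends to produce.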