A measure of agreement between raters' scores

Hi -

I'm looking for a measure of the level of agreement between a number of raters who each score the performance of something on a 1-4 scale.

So, for example, we have 10 raters, who have each independently assessed something:

1. 4
2. 4
3. 3
4. 3
5. 2
6. 3
7. 4
8. 4
9. 3
10. 3

Any help with how to calculate the level of agreement?


Less is more. Stay pure. Stay poor.
Fleiss' multi-rater kappa seems right. Depending on your background: I have used a version I found for Excel and have also used a SAS macro. I'm sure many other options are available.
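For anyone who would rather see the arithmetic than hunt for a macro, here is a minimal Python sketch of the multi-rater kappa (usually called Fleiss' kappa). It is not the Excel file or the SAS macro mentioned above; the function names `item_agreement` and `fleiss_kappa` and the input layout (one list of scores per rated item) are my own assumptions. Note that kappa estimates chance agreement from the overall spread of scores, so it generally needs several rated items, all scored by the same number of raters, to be meaningful.

```python
from collections import Counter


def item_agreement(scores):
    """Proportion of rater pairs that gave this one item the same score."""
    n = len(scores)
    counts = Counter(scores)
    return (sum(c * c for c in counts.values()) - n) / (n * (n - 1))


def fleiss_kappa(items):
    """Fleiss' kappa for a list of items, each a list of raters' scores.

    Assumes every item was scored by the same number of raters (>= 2).
    """
    n_raters = len(items[0])
    if n_raters < 2 or any(len(scores) != n_raters for scores in items):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    # Observed agreement: mean of the per-item pairwise agreement.
    p_bar = sum(item_agreement(scores) for scores in items) / len(items)

    # Chance agreement: sum of squared overall category proportions.
    totals = Counter(score for scores in items for score in scores)
    n_total = len(items) * n_raters
    p_e = sum((t / n_total) ** 2 for t in totals.values())
    if p_e == 1.0:
        raise ValueError("all ratings fall in one category; kappa is undefined")

    return (p_bar - p_e) / (1 - p_e)


# The single item from the question: 32 of the 90 ordered rater pairs agree.
question_item = [4, 4, 3, 3, 2, 3, 4, 4, 3, 3]
observed = item_agreement(question_item)  # 32 / 90, about 0.356
```

With only the one item from the question you get the raw pairwise agreement (about 0.356); `fleiss_kappa` becomes useful once the same panel has scored several items.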


TS Contributor
Another option is Kendall's Coefficient of Concordance. Note: Kendall's is only used for ordinal ratings, whereas Kappa may be used for nominal or ordinal ratings. The main difference between the two is that Kendall's takes "near misses" into consideration. That is, Kappa only cares whether the categories match exactly, while Kendall's gives "partial credit" if the ratings differ but are close. For example, if two raters rate the same item as 4, both Kappa and Kendall's give "full credit". If the ratings are 1 and 4, neither gives "credit". However, if the ratings are 3 and 4, Kendall's gives "partial credit" while Kappa gives none.
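As a rough illustration of how Kendall's Coefficient of Concordance (W) is computed, here is a minimal Python sketch. The function names `average_ranks` and `kendalls_w` and the input layout (one list of scores per rater, covering the same items in the same order) are assumptions of mine, not something from a particular package. Because a 1-4 scale produces many tied scores, the sketch uses average ranks and the usual tie correction; like kappa, W needs several rated items.

```python
from collections import Counter


def average_ranks(scores):
    """Rank scores in ascending order; tied scores share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def kendalls_w(ratings_by_rater):
    """Kendall's W with tie correction.

    ratings_by_rater: one list per rater, each scoring the same n items.
    """
    m = len(ratings_by_rater)      # number of raters
    n = len(ratings_by_rater[0])   # number of items
    if any(len(r) != n for r in ratings_by_rater):
        raise ValueError("every rater must score the same items")

    ranked = [average_ranks(r) for r in ratings_by_rater]
    rank_sums = [sum(r[i] for r in ranked) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((total - mean_sum) ** 2 for total in rank_sums)

    # Tie correction: sum of (t^3 - t) over each rater's groups of tied scores.
    ties = sum(t ** 3 - t for r in ratings_by_rater for t in Counter(r).values())
    denom = m ** 2 * (n ** 3 - n) - m * ties
    if denom == 0:
        raise ValueError("every rater gave all items the same score")
    return 12 * s / denom
```

W runs from 0 (no agreement in how the raters order the items) to 1 (identical ordering), so two raters who differ by one scale point on an item lose less than two raters who differ by three.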


Less is more. Stay pure. Stay poor.
Unfortunately, with the ever-growing nature of the WWW, I could not find a link to the Excel file. But hopefully this helps in your searches: it was for Fleiss' Generalized Kappa.