# Agreement statistic for 2x2 matrix with one structural zero

#### jwallin

##### New Member
I have two observers who watched a series of videos looking for gestures. They recorded the onset time of any gesture they observed. An agreement was scored when the two observers recorded gesture onsets within 1 second of each other. A disagreement was scored when one observer recorded a gesture onset but the other did not.

The agreement matrix looks like this:

| | Present | Absent |
|---|---|---|
| Present | 290 | 373 |
| Absent | 67 | 0 |

Clearly the observers have differing opinions about what constitutes a gesture!

Percentage agreement is calculated easily enough: 290/(290 + 373 + 67) = 290/730 ≈ 39.7%. (Miserable agreement, yes, but that's the point of this portion of the paper.)

But, I'm trying to include kappa statistics in addition to percentage agreements in this paper. This has worked fine for other tables, and I'm learning a good bit about the statistic, but this table has that structural zero in the bottom right corner and it's giving me grief. It is logically impossible for the value to be anything other than a zero, because a record only exists in the cases where one observer or the other (or both) observed a gesture. Am I right to assume that the traditional kappa statistic would be invalid? It doesn't seem right to calculate an expected value for a cell that logically must be zero.
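To make the concern concrete, here is a minimal Python sketch of what the ordinary Cohen's kappa formula does if it is applied blindly to the table above. The function name `cohen_kappa_2x2` and the cell labels `a`, `b`, `c`, `d` are just illustrative choices, and this is the textbook formula rather than a recommended analysis.

```python
def cohen_kappa_2x2(a, b, c, d):
    """Textbook Cohen's kappa for a 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    p_obs = (a + d) / n                      # observed agreement (main diagonal)
    row1, row2 = a + b, c + d                # row marginals
    col1, col2 = a + c, b + d                # column marginals
    p_exp = (row1 * col1 + row2 * col2) / n ** 2  # chance agreement under independence
    return (p_obs - p_exp) / (1 - p_exp)


# Counts from the table above; d is the structural zero.
a, b, c, d = 290, 373, 67, 0
n = a + b + c + d

# Expected count that independence assigns to the Absent/Absent cell.
expected_d = (c + d) * (b + d) / n
kappa = cohen_kappa_2x2(a, b, c, d)

print(f"expected count in the zero cell: {expected_d:.1f}")
print(f"naive kappa: {kappa:.3f}")
```

On these counts the independence model expects roughly 34 observations in a cell that cannot contain any, and the naive kappa comes out at about -0.18, which is the kind of number that looks interpretable but rests on an impossible expected value.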

Granted, I could consider every single second in the video record where neither observer recorded a gesture as an agreement on "no gesture present," and enter that count into the lower right cell. However, this would artificially inflate agreement, as gestures were relatively rare. [Additionally, it's probably not a sound assumption, as some gestures take more than a single second to execute, so observations over subsequent seconds are not particularly independent of one another.]
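A rough Python sketch of that inflation, for illustration only. The values tried for the lower right cell (1,000 and 10,000 "no gesture" seconds) are made-up placeholders, since the actual video duration isn't given here, and `agreement_stats` is a hypothetical helper name.

```python
def agreement_stats(a, b, c, d):
    """Return (percent agreement, Cohen's kappa) for a 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    p_obs = (a + d) / n
    p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    kappa = (p_obs - p_exp) / (1 - p_exp)
    return p_obs, kappa


# d = 0 is the occurrences-only table; the other values are hypothetical
# counts of seconds in which neither observer recorded a gesture.
for d in (0, 1_000, 10_000):
    p_obs, kappa = agreement_stats(290, 373, 67, d)
    print(f"d={d:>6}: percent agreement={p_obs:.3f}, kappa={kappa:.3f}")
```

Both statistics climb as the lower right cell grows, even though nothing about how the observers judged actual gestures has changed.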

Is there a way to calculate a kappa (or kappa-like) statistic for a 2x2 matrix with a single structural zero?


#### spunky

##### Doesn't actually exist
> Is there a way to calculate a kappa (or kappa-like) statistic for a 2x2 matrix with a single structural zero?

uhmm... just me throwing ideas around here... just how much do you need that agreement statistic to be there in your paper? thing is, structural zeroes are not the easiest ones to handle (especially yours, which is on the main diagonal) so you're probably gonna have to throw some log-linear modeling at it and, in all honesty, i only know how to solve these kinds of problems through monte carlo simulations...

dealing with those things can get so complicated sometimes that i've read books where the authors simply say that the researcher should choose a (small) number and just put it there...

kudos to you, however, for noticing you simply cannot do a cohen's kappa or cramer's V or phi coefficient or whatever people want

#### jwallin

##### New Member
Spunky,

Thanks! I'm glad to know there's not an easy way of doing this staring me in the face.

I don't think it's necessary that I pursue this terribly far. My adviser usually only reports percentage agreement for tasks of this kind anyhow; I was just exploring the kappa statistics to learn a little more about them. For the purposes of this paper, I think it will probably be enough to mention why the kappa is inappropriate and move on.

Someone somewhere in my search last night (Bakeman and Richardson, 1994, maybe) recommended an iterative proportional fitting model that might accommodate contingency tables with zeros, but I'm not sure if it's what I'm looking for.
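For what it's worth, here is a minimal Python sketch of iterative proportional fitting with the structural zero held at zero (a quasi-independence fit). I can't confirm this is the exact procedure that reference describes; the function name `ipf_quasi_independence` and the fixed iteration count are my own assumptions.

```python
def ipf_quasi_independence(table, structural_zeros, iters=200):
    """Fit row and column margins by IPF, keeping structural-zero cells at 0."""
    rows, cols = len(table), len(table[0])
    row_tot = [sum(r) for r in table]
    col_tot = [sum(table[i][j] for i in range(rows)) for j in range(cols)]
    # Start at 1 in every free cell and 0 in every structural-zero cell.
    fit = [[0.0 if (i, j) in structural_zeros else 1.0 for j in range(cols)]
           for i in range(rows)]
    for _ in range(iters):
        # Rescale each row to match the observed row total.
        for i in range(rows):
            s = sum(fit[i])
            if s > 0:
                fit[i] = [v * row_tot[i] / s for v in fit[i]]
        # Rescale each column to match the observed column total.
        for j in range(cols):
            s = sum(fit[i][j] for i in range(rows))
            if s > 0:
                for i in range(rows):
                    fit[i][j] *= col_tot[j] / s
    return fit


observed = [[290, 373], [67, 0]]
fitted = ipf_quasi_independence(observed, structural_zeros={(1, 1)})
for row in fitted:
    print([round(v, 3) for v in row])
```

One thing the sketch makes visible: with only three free cells in a 2x2 table, the margins pin down every cell, so the fitted values simply reproduce the observed counts. The procedure is better suited to larger tables, where the fitted values can actually differ from the data.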

#### spunky

##### Doesn't actually exist
> Someone somewhere in my search last night (Bakeman and Richardson, 1994, maybe) recommended an iterative proportional fitting model that might accommodate contingency tables with zeros, but I'm not sure if it's what I'm looking for.

I think you mean Bakeman and ROBINSON's log-linear analysis book, right? that's actually a really good book if you're interested in this whole contingency table analysis situation... and yeah, it would definitely help you understand how to deal with that zero over there. but it is kind of what i was getting at in my previous post... doing log-linear analysis is substantially more complicated than just calculating the kappa statistic. i'm not saying you shouldn't do it... it's just that i know some people don't feel all that comfy around stats and i don't wanna go around saying "yeaaah.. you just plug everything into SPSS, click here and there and ta-da! there's your solution".
it depends on how deep you wanna get into this or how much stats effort you're willing to put into it.

anyways, those are my 2 cents for your post here...

#### jwallin

##### New Member
Bakeman and Robinson, of course. Thanks for the correction. Just ordered the book from Abebooks. Thanks again for your help!

#### lumhearts

##### New Member
Can I ask - why do you think the kappa is inappropriate? Are you basing that on the chi-square assumption? I can't say I've had a simple kappa situation with a zero cell count.

#### spunky

##### Doesn't actually exist
> I can't say I've had a simple kappa situation with a zero cell count.

the problem is that jwallin's research design is set up so that a zero has to be there. i'm sure he/she can explain his/her research design better (i'm personally a little bit confused by it) but he/she claimed that this is a structural zero, as opposed to a sampling zero, which usually arises just because of sampling variability. the problem with structural zeroes is that, to the best of my understanding, you can't really invoke any parametric properties to try and deal with them... although they can still mask an association (or lack thereof). i'm sure he/she can explain better why this is a structural zero, if it really is one.

thanks!

#### jwallin

##### New Member
Spunky's addressed it pretty well, I think. Here's a little clarification (I hope!).

I had two observers watching several hours of video tapes. Each observer would watch for what they believed to be a gesture, which we'd defined as a non-locomotor movement of the arms or head that consisted of a discrete "excursion" away from a rest position and a return to a rest position.

Whenever they observed a gesture, the observers would record the onset time of that gesture (the start of the excursion away from the rest position).

So I have data that look something like this (data are video timecodes in HH:MM:SS format):

Observer 1:

00:01:30
00:02:54
00:03:01
00:03:23
etc.

Observer 2:
00:00:30
00:01:30
00:02:54
00:03:23
00:03:50
etc.

With these values, we'd have the following agreements/disagreements:

00:00:30 disagreement (only one observer recorded a gesture here)
00:01:30 agreement (both observers recorded a gesture here)
00:02:54 agreement
00:03:01 disagreement
00:03:23 agreement
00:03:50 disagreement

The agreement matrix would look like:

| | Present | Absent |
|---|---|---|
| Present | 3 | 1 |
| Absent | 2 | 0 |
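The tally above can be sketched in Python roughly as follows. The 1-second window comes from the scoring rule described earlier in the thread; the greedy in-order pairing of sorted onsets and the helper names `to_seconds` and `agreement_counts` are assumptions for illustration, not necessarily how the original scoring was done.

```python
def to_seconds(timecode):
    """Convert an HH:MM:SS timecode string to whole seconds."""
    h, m, s = (int(part) for part in timecode.split(":"))
    return 3600 * h + 60 * m + s


def agreement_counts(obs1, obs2, tol=1):
    """Return (both, only_obs1, only_obs2) using greedy in-order matching.

    Two onsets count as an agreement when they are within `tol` seconds.
    """
    t1 = sorted(to_seconds(t) for t in obs1)
    t2 = sorted(to_seconds(t) for t in obs2)
    i = j = both = only1 = only2 = 0
    while i < len(t1) and j < len(t2):
        if abs(t1[i] - t2[j]) <= tol:
            both += 1          # both observers recorded this gesture
            i += 1
            j += 1
        elif t1[i] < t2[j]:
            only1 += 1         # only observer 1 recorded a gesture here
            i += 1
        else:
            only2 += 1         # only observer 2 recorded a gesture here
            j += 1
    only1 += len(t1) - i       # leftover unmatched onsets
    only2 += len(t2) - j
    return both, only1, only2


observer1 = ["00:01:30", "00:02:54", "00:03:01", "00:03:23"]
observer2 = ["00:00:30", "00:01:30", "00:02:54", "00:03:23", "00:03:50"]
print(agreement_counts(observer1, observer2))
```

Nothing in this procedure can ever produce a count for "neither observer recorded a gesture," because only recorded onsets enter the tally.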

That bottom-right cell will always be a zero, because this is an instance of occurrences-only agreement. We only have records of those timecodes when one or both observers claim to have observed a gesture. There can never be an instance where both observers recorded "no gesture."

(Again, I COULD consider every second in which the observers did not record a gesture as an agreement on "no gesture," but that would exaggerate agreement, which is obviously quite poor when it comes to actually identifying the same gestures, and I don't particularly care about those instances where they both see nothing going on.)