Which statistical test to use - is the observed difference in outcomes due to chance?

#1
Hi,
I have results from two very sensitive but different lab tests that try to detect a pathogen in patient samples. For some patients, the results from the two tests are not identical (listed below).

What statistical test should I perform to find out the odds that the difference observed between the two tests is due to chance?

Patient ID   Test1   Test2
 1           Pos.    Neg.
 2           Pos.    Neg.
 3           Pos.    Pos.
 4           Pos.    Neg.
 5           Neg.    Pos.
 6           Pos.    Neg.
 7           Neg.    Neg.
 8           Neg.    Neg.
 9           Neg.    Neg.
10           Neg.    Neg.
11           Neg.    Pos.
12           Pos.    Pos.
13           Pos.    Neg.
14           Pos.    Pos.
15           Pos.    Neg.
16           Neg.    Neg.
17           Pos.    Pos.
18           Pos.    Neg.
19           Pos.    Neg.

Thanks
 

CowboyBear

Super Moderator
#2
What statistical test should I perform to find out the odds that the difference observed between the two tests is due to chance?
Can you expand on what you mean by this? Don't worry about statistical terminology - what is it that you're actually trying to find out here?
 
#3
We are trying to find out whether the differences in results (Pos./Neg. or Neg./Pos.) between the two tests, which are supposed to give the same answer (either both positive or both negative for that pathogen), are due to chance, and what the odds are that this is the case.
 

CowboyBear

Super Moderator
#6
What I'm trying to get you to reflect on here is: What do you mean when you say "by chance"? Remember, even a single mismatched pair is enough to demonstrate that the tests don't always give the same answer. This could not happen "by chance" if the tests actually give identical results. You don't need any inferential test to show that.

You seem to want to do a statistical test here - but what is the hypothesis that you're wanting to test?
 
#7
Hi Matt,
I really appreciate that you are trying to help me here. Actually, below is the question that one of the doctors at work asked me (a scientist). I did not understand exactly what he is trying to prove here, but I thought it might make sense to a biostatistician.

"Can you please calculate p value (0.05 cutoff)?
Assuming tests are equally sensitive, what are odds that difference observed is due to chance?"

If it doesn't make any sense, I can totally understand.

Thanks a lot
D
 

rogojel

TS Contributor
#9
hi,
as CB said, this is fairly little to go on. I would suggest the following:

A. Is there a way to get the "real" status of the patients i.e. whether they are truly negative or positive? You could use this as a kind of standard to judge the two methods against.

B. There is probably some kind of continuous measurement behind the Pos./Neg. decision. If so, can you get those numbers? They would be much easier to analyse.

C. Assuming that each test result were random, with Neg. being slightly more probable (20 of the 38 results), you would have a chance of 0.051 of seeing 13 or more Neg. results in one column, as you do in Test2. That could be interpreted as a weak hint that method 2 has a preference for Neg. results. This could be a good or bad thing, depending on how many true negatives you have in the sample.
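
In Python, the tail probability in C can be sketched roughly as below. The exact figure depends on how you set up the null (which proportion you plug in, which column you look at, one- or two-sided), so treat it as an illustration of the logic rather than an exact reproduction of the 0.051 above.

Code:
# Rough sketch of the binomial tail calculation in point C. Assumption: every
# one of the 19 Test2 results is an independent draw, with P(Neg.) set to the
# pooled share of negative results across both tests, 20/38.
from scipy.stats import binom

n = 19            # results in the Test2 column
p_neg = 20 / 38   # pooled proportion of Neg. results

# Probability of 13 or more Neg. results in one column under this null
p_tail = binom.sf(12, n, p_neg)   # P(X >= 13)
print(p_tail)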

regards
 
#10
Hi rogojel and CB,
1) The real status is that they are all positive by conventional clinical diagnosis.

There are 19 patients. 12 are positive by Test1, but only 6 are positive by Test2. We believe, based on this limited data set, that Test1 is a much better test than Test2 (essentially twice as sensitive, 63% vs 32%). However, what if our belief is mistaken, and the apparent difference in performance is due to chance alone? Can we "disprove" (based on a low p value) equivalent sensitivities?
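
To keep the numbers straight, here is a quick counting sketch in Python based on the table in post #1 (since all 19 patients are positive by the conventional diagnosis, the proportion testing positive is the sensitivity):

Code:
# Counting sketch from the post #1 data: per-test sensitivity and the
# concordant/discordant pair counts that any paired comparison will use.
from collections import Counter

results = [  # (Test1, Test2) per patient, in patient-ID order
    ("Pos", "Neg"), ("Pos", "Neg"), ("Pos", "Pos"), ("Pos", "Neg"),
    ("Neg", "Pos"), ("Pos", "Neg"), ("Neg", "Neg"), ("Neg", "Neg"),
    ("Neg", "Neg"), ("Neg", "Neg"), ("Neg", "Pos"), ("Pos", "Pos"),
    ("Pos", "Neg"), ("Pos", "Pos"), ("Pos", "Neg"), ("Neg", "Neg"),
    ("Pos", "Pos"), ("Pos", "Neg"), ("Pos", "Neg"),
]
n = len(results)                                    # 19 patients

sens1 = sum(t1 == "Pos" for t1, _ in results) / n   # 12/19 ~ 0.63
sens2 = sum(t2 == "Pos" for _, t2 in results) / n   #  6/19 ~ 0.32
pairs = Counter(results)                            # Pos/Pos: 4, Pos/Neg: 8,
                                                    # Neg/Pos: 2, Neg/Neg: 5
print(sens1, sens2, pairs)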
 
#11
Hi Guys,
I tried to solve it on my own. I collected data on the normal patients, with the corresponding Test1 and standard diagnosis results (they were all negative); there is no Test2 data available for the normal patients. I tried Cohen's kappa and McNemar's test for Test1 vs Test2 and for Test1 vs Standard.

1) Test1 vs Test2

Number of observed agreements: 9 (47.37% of the observations)
Number of agreements expected by chance: 8.6 (45.15% of the observations)
Kappa = 0.040 (the strength of agreement is considered to be 'poor')
SE of kappa = 0.186
95% confidence interval: from -0.323 to 0.404


Disagreement statistic (McNemar): chi-square = 3.6, p = 0.057

2) Test1 vs Standard

Number of observed agreements: 22 (75.86% of the observations)
Number of agreements expected by chance: 13.7 (47.32% of the observations)
Kappa = 0.542 (the strength of agreement is considered to be 'moderate')
SE of kappa = 0.134
95% confidence interval: from 0.279 to 0.805
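
For reference, the Test1 vs Test2 numbers in 1) above can be reproduced from the paired 2x2 counts read off the post #1 table (4 both Pos., 8 Pos./Neg., 2 Neg./Pos., 5 both Neg.); a short Python sketch:

Code:
# Cohen's kappa and McNemar's chi-square for Test1 vs Test2 from the 2x2 counts.
import numpy as np
from scipy.stats import chi2

table = np.array([[4, 8],    # rows: Test1 Pos., Test1 Neg.
                  [2, 5]])   # cols: Test2 Pos., Test2 Neg.
n = table.sum()

# Cohen's kappa: observed vs chance-expected agreement
p_obs = np.trace(table) / n                                   # 9/19    = 0.4737
p_exp = (table.sum(axis=1) * table.sum(axis=0)).sum() / n**2  # 163/361 = 0.4515
kappa = (p_obs - p_exp) / (1 - p_exp)                         # ~0.040

# McNemar's test on the discordant pairs (no continuity correction)
b, c = table[0, 1], table[1, 0]    # 8 Pos./Neg. vs 2 Neg./Pos.
chisq = (b - c) ** 2 / (b + c)     # (8 - 2)^2 / 10 = 3.6
p_value = chi2.sf(chisq, df=1)     # ~0.057

print(kappa, chisq, p_value)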
 

rogojel

TS Contributor
#12
hi,
Cohen's kappa is not really helpful in answering your basic question, which, if I understand it correctly, is whether the observed increase in Neg. results in Test2 is due to chance or reflects some systematic effect.

This may look like shameless self-promotion, but this is the answer to THAT question:

C. Assuming that each test result were random, with Neg. being slightly more probable (20 of the 38 results), you would have a chance of 0.051 of seeing 13 or more Neg. results in one column, as you do in Test2. That could be interpreted as a weak hint that method 2 has a preference for Neg. results. This could be a good or bad thing, depending on how many true negatives you have in the sample.
So, you have a 5% chance of seeing this many Negs in Test2 IF the two methods were equivalent. Or, to put it another way: if you repeated this trial 100 times with the same sample size AND there were no difference between the two tests, you would see only about 5 outcomes with this many or more Negs in Test2. Given the small sample size, this is pretty suggestive, though not strong proof, that Test2 is biased towards Neg results.
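
To make the "repeat the trial 100 times" picture concrete, here is a small simulation sketch. It uses a slightly different null than point C, namely that for each of the 10 patients where the two tests disagree it is a coin flip which test comes out negative (essentially the null behind McNemar's test); it lands in the same roughly 5% region.

Code:
# Simulation sketch: keep the concordant patients fixed (5 both Neg., 4 both
# Pos.) and, for each of the 10 discordant patients, flip a fair coin to decide
# which test is the negative one. How often does Test2 get 13 or more Neg.?
import numpy as np

rng = np.random.default_rng(1)
reps = 100_000

n_discordant = 10      # patients where the two tests disagree (8 + 2)
n_both_neg = 5         # patients that are Neg. on both tests

test2_neg = n_both_neg + rng.binomial(n_discordant, 0.5, size=reps)
print(np.mean(test2_neg >= 13))   # ~0.055, i.e. about 5 or 6 trials in 100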

regards