Hi,
I have results from two very sensitive but different lab tests that try to detect a pathogen in patient samples. For some patients, the results of the two tests do not agree (listed below).
Which statistical test should I perform to find the odds that the differences observed between the two tests are due to chance?
Patient ID   Test1   Test2
 1           Pos.    Neg.
 2           Pos.    Neg.
 3           Pos.    Pos.
 4           Pos.    Neg.
 5           Neg.    Pos.
 6           Pos.    Neg.
 7           Neg.    Neg.
 8           Neg.    Neg.
 9           Neg.    Neg.
10           Neg.    Neg.
11           Neg.    Pos.
12           Pos.    Pos.
13           Pos.    Neg.
14           Pos.    Pos.
15           Pos.    Neg.
16           Neg.    Neg.
17           Pos.    Pos.
18           Pos.    Neg.
19           Pos.    Neg.
Thanks
We are trying to find out whether the disagreements we see (Pos/Neg or Neg/Pos) between the two tests, which are supposed to give the same answer for a given sample (both positive or both negative for the pathogen), are due to chance, and what the odds of that are.
In short: what are the odds that the discrepancy in results between the two tests is due to chance?
What I'm trying to get you to reflect on here is: what do you mean by "by chance"? Remember, even a single mismatched pair is enough to demonstrate that the tests don't always give the same answer; if the tests truly gave identical results, no mismatch could occur "by chance" at all. You don't need any inferential test to show that.
You seem to want to do a statistical test here - but what is the hypothesis that you're wanting to test?
Matt aka CB | twitter.com/matthewmatix
Hi Matt,
I really appreciate that you are trying to help me here. The question below was actually asked by one of the doctors at work; I'm a scientist. I did not understand exactly what he is trying to prove, but I thought it might make sense to a biostatistician.
"Can you please calculate p value (0.05 cutoff)?
Assuming tests are equally sensitive, what are odds that difference observed is due to chance?"
If it doesn't make any sense, I can totally understand.
Thanks a lot
D
I would go back to him and ask him to clarify then - that isn't enough to go on I'm afraid :/
Matt aka CB | twitter.com/matthewmatix
hi,
as CB said, this is fairly little to go on. I would suggest the following:
A. Is there a way to get the "real" status of the patients, i.e. whether they are truly negative or positive? You could use this as a kind of gold standard to judge the two methods against.
B. There is probably some kind of continuous measurement behind the Pos/Neg decision. If so, can you get those numbers? They would be much easier to analyse.
C. Assuming each test result was an independent random draw, with Neg slightly more probable (20 of the 38 results are Neg, so P(Neg) = 20/38), you would have a chance of about 0.12 of seeing 13 or more Neg results in one column, as you do in Test2 (and about 0.05 of seeing more than 13). That could be read as a weak hint that method 2 has a preference towards Neg results. Whether that is good or bad depends on how many true negatives you have in the sample.
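If it helps, that tail probability is easy to check with the Python standard library alone, under the same assumption as point C (every one of the 38 results is an independent draw with P(Neg) = 20/38):

```python
from math import comb

def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

p_neg = 20 / 38                        # overall fraction of Neg results
tail13 = binom_tail(13, 19, p_neg)     # 13 or more Negs in one column, ~0.125
tail14 = binom_tail(14, 19, p_neg)     # strictly more than 13, ~0.052
```

Note the off-by-one matters here: including the observed count (13 or more) gives about 0.12, while strictly more than 13 gives about 0.05.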
regards
Hi Rogogel and CB,
1) The real status: they are all positive by conventional clinical diagnosis.
There are 19 patients: 12 are positive by Test1, but only 6 are positive by Test2. Based on this limited data set, we believe that Test1 is a much better test than Test2 (essentially twice as sensitive, 63% vs. 32%). However, what if our belief is mistaken and the apparent difference in performance is due to chance alone? Can we "disprove" equivalent sensitivities (i.e. obtain a low p-value)?
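Since the same 19 patients took both tests, the data are paired, and one standard way to test "equal sensitivities" directly is McNemar's exact test, which conditions on the discordant pairs. A minimal sketch with the counts read off the table above (8 patients positive by Test1 only, 2 by Test2 only):

```python
from math import comb

b, c = 8, 2                    # discordant pairs: Test1-only Pos, Test2-only Pos
n_disc = b + c                 # 10 discordant pairs in total

# Under H0 (equal sensitivity) each discordant pair favours Test1
# with probability 1/2, so b ~ Binomial(10, 0.5).
p_one_sided = sum(comb(n_disc, i) for i in range(b, n_disc + 1)) / 2 ** n_disc
p_two_sided = min(1.0, 2 * p_one_sided)
```

This gives a one-sided p of about 0.055 (two-sided about 0.11), so on these 19 patients the difference falls just short of the conventional 0.05 cutoff.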
Hi Guys,
I tried to solve it on my own. I collected Test1 and standard-diagnosis data on normal patients (they were all negative by both); there is no Test2 data available for the normals. I then ran Cohen's Kappa and McNemar's test on Test1 vs. Test2 and on Test1 vs. the standard.
1) Test1 vs Test2
Number of observed agreements: 9 ( 47.37% of the observations)
Number of agreements expected by chance: 8.6 ( 45.15% of the observations)
Kappa= 0.040 (The strength of agreement is considered to be 'poor')
SE of kappa = 0.186
95% confidence interval: From -0.323 to 0.404
Disagreement statistic (McNemar, no continuity correction): chi-square = 3.6, p = 0.057
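For anyone wanting to reproduce those Test1-vs-Test2 figures, here is a short sketch from the paired 2x2 counts (the SE and confidence interval are omitted):

```python
# Paired 2x2 counts for Test1 vs Test2, read off the patient table:
# a = both Pos, b = Test1 Pos / Test2 Neg, c = Test1 Neg / Test2 Pos, d = both Neg
a, b, c, d = 4, 8, 2, 5
n = a + b + c + d                     # 19 patients

po = (a + d) / n                      # observed agreement, ~0.4737
pe = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2   # chance agreement, ~0.4515
kappa = (po - pe) / (1 - pe)          # ~0.040, 'poor' agreement

chi_sq = (b - c) ** 2 / (b + c)       # McNemar chi-square = 3.6 (no continuity correction)
```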
2) Test1 vs Standard
Number of observed agreements: 22 ( 75.86% of the observations)
Number of agreements expected by chance: 13.7 ( 47.32% of the observations)
Kappa= 0.542 (The strength of agreement is considered to be 'moderate'.)
SE of kappa = 0.134
95% confidence interval: From 0.279 to 0.805
hi,
the Cohen Kappa is not really helpful in answering your basic question, which is, if I understand it correctly, whether the observed excess of Neg results in Test2 is due to chance or reflects some systematic effect.
This may look like shameless self-promotion, but point C from my earlier post is the answer to THAT question.
Under that assumption you have roughly a 12% chance of seeing this many (13 or more) Negs in Test2 if the two methods were equivalent. To put it another way: if you repeated this trial 100 times with the same sample size and there were truly no difference between the two tests, about 12 outcomes would have this many or more Negs in Test2. Given the small sample size, this is a weak hint, not strong proof, that Test2 is biased towards Neg results.
regards
This makes a lot more sense now
You can definitely calculate inter-rater reliability here, but that doesn't really get to the heart of the question the doctor at your work asked. The question is pretty simple: is the sensitivity rate of Test1 higher than that of Test2? That then becomes a simple hypothesis test for two proportions. See http://stattrek.com/hypothesis-test/...oportions.aspx
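A minimal sketch of that two-proportion z-test with the counts from this thread (12/19 vs. 6/19 positives); note it treats the two groups as independent, even though here the same 19 patients took both tests:

```python
from math import sqrt, erf

x1, x2, n = 12, 6, 19                 # positives by Test1 and Test2, out of 19 each
p1, p2 = x1 / n, x2 / n
p_pool = (x1 + x2) / (2 * n)          # pooled proportion under H0: equal sensitivity

z = (p1 - p2) / sqrt(p_pool * (1 - p_pool) * (1 / n + 1 / n))   # ~1.95

# Two-sided p-value from the standard normal CDF
phi = 0.5 * (1 + erf(z / sqrt(2)))
p_two_sided = 2 * (1 - phi)           # ~0.05
```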
Matt aka CB | twitter.com/matthewmatix