I would like to know which test would be appropriate for testing my hypothesis.
I am doing a research project on parasitic infections in snail hosts. Snails are infected when coming into contact with bird faeces.
My study takes samples of snails from 5 sites and dissects them to check for prevalence of infection.
Site 5 is 130 metres from a known bird roosting site.
Site 4 is 300 metres from the roosting site.
Site 3 is 524 metres from the roosting site.
Site 2 is 664 metres from the roosting site.
Site 1 is 786 metres from the roosting site.
At each site I collected 50 snails.
Site 1 had 6 infected snails (out of 50)
Site 2 also had 6 infected snails (out of 50)
Site 3 had 7 infected snails
Site 4 had 9
Site 5 had 14
Hypothesis - More parasite infections would occur at site(s) closest to the known bird roosting site
Any thoughts on which test would be the best to check my hypothesis??
There are several possible approaches; here is one. You can correlate the distance (in metres) with the infection status (which is binary). You should use a Spearman correlation coefficient for that purpose.
"victor is the reviewer from hell" -Jake
"victor is a machine! a publication machine!" -Vinux
I have used Minitab to do a Spearman correlation coefficient. Would I be right in stating that the Pearson's r value is the 'p' value??
Pearson's r is a correlation coefficient, like Spearman's rho (though it makes different assumptions and is calculated in a different way). It is not the p value. The p value is the probability of getting a result at least as extreme as yours if there were really no association, that is, if the result were due to random error alone. You will get a p value with both Spearman's and Pearson's.
"Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995
As noetsi said, no. A correlation analysis (Pearson or Spearman) gives you two values: a correlation coefficient (Pearson's r or Spearman's rho) and a p value.
The correlation coefficient shows the strength and direction of the correlation. For example, you might find r = -0.34, p = 0.008. In this example the correlation between distance and infection has a size of 0.34 (on a scale from 0 to 1; it is not a percentage), and the small p value makes it statistically significant. Note that the sign is negative, meaning that the shorter the distance, the higher the chance of infection.
However, please note that you should use Spearman's coefficient instead of Pearson's. I don't know whether you have SPSS, but if you do, you can follow the steps below to run Spearman. If you have already done Pearson's test in Minitab, I think you won't have difficulty doing Spearman in Minitab. Before that, though, make sure you are dealing with 250 rows in your spreadsheet file (one row per specimen), not with 5 rows (one row per site).
In your SPSS file you have 250 cases, right? (5 sites, each with 50 cases, so a total of 250 cases.) In your raw data file (with at least 250 rows), write the distance value for each site in a new column, beside each of your 250 cases. So, for example, you need to write the number 524 (the third distance) 50 times, beside the corresponding rows. Then make sure your "infection status" column is all 0 and 1. If not, create a new column which contains the infection status of each of the 250 cases as 0 or 1. Now you have two columns of 250 cases each, and each row shows a single snail: its infection status (0 or 1) and its distance. Now go to Analyze -> Correlate -> Bivariate, tick Spearman and select those two columns. The test is now ready to be run.
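For anyone without SPSS, the same one-row-per-snail layout can be sketched in plain Python. This is only an illustration under my own assumptions: the helper names (`average_ranks`, `pearson`, `spearman`) are made up for this post, and Spearman's rho is computed as Pearson's r on tie-averaged ranks rather than by calling any statistics package.

```python
from math import sqrt

# Site data as posted: distance in metres -> infected snails out of 50.
infected_by_distance = {786: 6, 664: 6, 524: 7, 300: 9, 130: 14}
SNAILS_PER_SITE = 50

# Build one row per snail: (distance, infection status coded 0/1).
rows = []
for distance, infected in infected_by_distance.items():
    rows += [(distance, 1)] * infected
    rows += [(distance, 0)] * (SNAILS_PER_SITE - infected)

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rho = Pearson's r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

distances = [d for d, _ in rows]
status = [s for _, s in rows]
rho = spearman(distances, status)
print(len(rows), round(rho, 3))
```

With 250 rows the coefficient comes out negative but small (roughly -0.14 by my arithmetic), far weaker than anything computed from the five site totals, which is why the row layout matters.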
I put my distance figures into column C1 and the number of infected snails into column C2, ran the test and got these results (just checking I am on the right track so far!):
All 2 1 1 1 5
40 20 20 20 100
Cell Contents: Count
% of Total
Pearson's r -0.970143
Spearman's rho -0.974679
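A quick way to see where those two figures come from is to redo the calculation on the five site totals. The sketch below is plain Python with helper names of my own (`pearson`, `average_ranks`, `order_scores`); the idea that the posted Pearson value was computed on category order scores rather than on the raw metres is my guess from the numbers, not something confirmed by the Minitab output.

```python
from math import sqrt

# Five summary rows as posted (sites 1..5).
distances = [786, 664, 524, 300, 130]   # metres from the roosting site
infected = [6, 6, 7, 9, 14]             # infected snails out of 50

def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def order_scores(values):
    """Score each value by the position of its category (1, 2, 3, ...)."""
    categories = sorted(set(values))
    return [categories.index(v) + 1 for v in values]

# Spearman's rho on the five rows: Pearson's r on tie-averaged ranks.
rho_sites = pearson(average_ranks(distances), average_ranks(infected))

# Pearson's r on category order scores (assumption about the posted value).
r_scores = pearson(order_scores(distances), order_scores(infected))

# Pearson's r on the raw metres and raw counts, for comparison.
r_raw = pearson(distances, infected)

print(round(rho_sites, 6), round(r_scores, 6), round(r_raw, 3))
```

The first two numbers match the posted -0.974679 and -0.970143, while Pearson's r on the raw values is about -0.92. Either way these are correlations over 5 site totals, not over 250 snails.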
Apologies Victorxstc, I posted before seeing what you had written. I do not have SPSS (I have tried to download a trial version but keep getting an error message). I will persevere with trying to get it.
If you are doing this work near a university, they commonly have SPSS on their computers these days.
The two values you noted (for Pearson's R and Spearman's Rho) are very close, effectively the same thing.
If either of your variables is coded as a dichotomy (for example, infected / not infected), then neither Pearson's nor Spearman's will work correctly. You need to do polychoric correlations, although I doubt Minitab will do this (even SPSS and SAS won't in the core code; they need special macros, or R code in the case of SPSS).
No problem Justice
They are similar, and Minitab is efficient. However, before any analysis, please make sure you are dealing with your raw data, not a summary of your raw data. In your Minitab file you should have at least 250 rows. If your file is like that, your correlation coefficients are very strong: the closer the coefficient is to 1 or -1, the stronger the correlation.
My University does indeed have SPSS, I am off on Wednesday and intend to go in to use their computer. Just thought I would try and download now or try a different software (Minitab) so I could get crackin' instead of waiting until Wednesday
Thank you for the tip
Thank you Justice! Or what should I call you?
Now we know if distance is significant – or not.
Maybe you could cooperate with Palmer86, because he has got identical data as you!
Oh, maybe he has plagiarized your result? Or maybe you should be careful with him, since I was told that he had not been the most polite person. Or maybe you could cooperate with Mmanuel, a person I tried to help a lot. You two (I mean, you three) seem to have a lot in common.
Justice, if you find a topic difficult, then you see, there is a search engine called Google, that can be very useful. For example I googled “logit model” and saw 690 000 links. You should not expect someone else to write a thesis for you when there already are 690 000 others for you to read before.
Hlsmith suggested Fisher's exact test. Karabiner pointed out that a chi-squared test could be used. Victorxstc literally did the test for you.
When someone is serving the results on a silver plate for you, do you find it embarrassing to say “thank you” then?
If you find it humiliating (“squat”) to say thank you, then I suggest that you don't do that!
I will withdraw from this subject. I have tried to help you in many posts. But please don't thank me!
Each snail has 2 characteristics: a) infected yes/no and b) its distance from the
roosting site. You could try a Mann-Whitney U-test with infected yes/no as
grouping variable and distance as dependent variable. This will show you
whether in the infected group the distances are significantly higher or lower than
in the non-infected group.
With kind regards
K.
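A rough sketch of that Mann-Whitney comparison, in plain Python, for anyone who wants to see the mechanics. Assumptions on my side: the variable names are invented for this post, the p value uses the large-sample normal approximation with a tie correction and no continuity correction, and a statistics package may report slightly different figures.

```python
from math import sqrt, erfc

# Site data as posted: distance in metres -> infected snails out of 50.
infected_by_distance = {786: 6, 664: 6, 524: 7, 300: 9, 130: 14}
SNAILS_PER_SITE = 50

n_infected = sum(infected_by_distance.values())
n_total = SNAILS_PER_SITE * len(infected_by_distance)
n_clear = n_total - n_infected

# Rank all 250 distances. Each site is one block of 50 tied values,
# so every snail at a site gets the mean rank of that block.
rank_sum_infected = 0.0
tie_term = 0
next_rank = 1
for distance in sorted(infected_by_distance):
    block = SNAILS_PER_SITE
    mean_rank = next_rank + (block - 1) / 2
    rank_sum_infected += infected_by_distance[distance] * mean_rank
    tie_term += block ** 3 - block
    next_rank += block

# U statistic for the infected group.
u_infected = rank_sum_infected - n_infected * (n_infected + 1) / 2

# Normal approximation with tie-corrected variance.
mean_u = n_infected * n_clear / 2
var_u = (n_infected * n_clear / 12) * (
    (n_total + 1) - tie_term / (n_total * (n_total - 1))
)
z = (u_infected - mean_u) / sqrt(var_u)
p_two_sided = erfc(abs(z) / sqrt(2))

print(u_infected, round(z, 2), round(p_two_sided, 3))
```

A negative z here means the infected snails tend to sit at the smaller distance ranks, i.e. closer to the roosting site.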
Karabiner, I agree on that, but doesn't a correlation coefficient suffice? Besides, I guess before Mann-Whitney, Justice should do a Kruskal-Wallis to see whether there is any overall difference between the 5 sites' infection rates. Well, a Kruskal-Wallis does not directly show the direction and extent of the "correlation" (and further evaluations would be necessary), at least not as clearly as a correlation coefficient shows the extent and direction of the association.
Besides, when doing Kruskal-Wallis and Mann-Whitney tests, the length of the distance is discarded, because it would be used only as a grouping variable, while in a correlation coefficient the distances (in metres) keep their meaning, which favours the accuracy of the results.
Kind regards
[QUOTE]I agree on that, but doesn't a correlation coefficient suffice.[/QUOTE]
Perhaps. But I feel uneasy with Spearman on binary-versus-rank data.
Maybe some forgotten childhood experience.
[QUOTE]Besides, I guess before Mann-Whitney, Justice should do a Kruskal-Wallis to see if there is any overall difference between the 5 sites' infection rates or not.[/QUOTE]
That is, treat infection yes/no as ordinal? I had rather assumed that this was
categorical, in which case the Chi² could apply (expected frequencies are all
> 5, AFAICS).
[QUOTE]Besides, when doing Kruskal-Wallis and Mann-Whitney tests, the length of the distance is discarded, because it would be used Only as a grouping variable;[/QUOTE]
I would treat it as an ordinal DV, not as grouping variable. I guessed
that since there are 5 fixed distances and none in-between, ordinal
would be appropriate.
With kind regards
K.
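The remark about expected frequencies can be checked by hand. Below is a minimal sketch in plain Python of a 5 x 2 chi-squared test of independence (site by infected / not infected). The variable names are my own, and the p value uses the closed form of the chi-squared tail that holds for exactly 4 degrees of freedom; it is not a general-purpose routine.

```python
from math import exp

# Infected snails per site (sites 1..5), 50 snails collected at each.
infected = [6, 6, 7, 9, 14]
SNAILS_PER_SITE = 50

total = SNAILS_PER_SITE * len(infected)
total_infected = sum(infected)

# Equal sample sizes, so the expected counts are the same at every site.
expected_infected = SNAILS_PER_SITE * total_infected / total
expected_clear = SNAILS_PER_SITE - expected_infected

# Pearson chi-squared statistic summed over all 10 cells.
chi2 = 0.0
for observed_infected in infected:
    observed_clear = SNAILS_PER_SITE - observed_infected
    chi2 += (observed_infected - expected_infected) ** 2 / expected_infected
    chi2 += (observed_clear - expected_clear) ** 2 / expected_clear

# Degrees of freedom: (5 sites - 1) * (2 outcomes - 1) = 4.
# For df = 4 the upper tail is exp(-x/2) * (1 + x/2).
df = (len(infected) - 1) * (2 - 1)
p_value = exp(-chi2 / 2) * (1 + chi2 / 2)

print(expected_infected, expected_clear, round(chi2, 3), round(p_value, 3))
```

The smallest expected count is 8.4, so the "> 5" rule of thumb is met. Note that this test treats the sites as unordered categories, so it ignores the ordering of the distances.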