# Thread: Which test to use for my data

1. ## Which test to use for my data

Hi, I am not sure which kind of test to use to analyse my data.
Can anyone help?
I collected 50 snails from each of 5 different sites and dissected them to find the prevalence of trematode parasite infection.
Results -
Site 1 - 6 out of 50 snails were infected
Site 2 - 6 out of 50 snails were infected
Site 3 - 7 out of 50
Site 4 - 9 out of 50
Site 5 - 14 out of 50

Which test should I use to compare these data? I have worked out the S.D.

2. ## Re: Which test to use for my data

I have quite a similar problem; I am just commenting here so I can easily look this up later in my posts.

3. ## Re: Which test to use for my data

I would use Fisher's exact test to see whether there is a difference between any of the groups (omnibus test). If you get a significant p-value, then perform direct pairwise comparisons between the groups to determine where the difference is. You will need to correct your significance level when you conduct the pairwise comparisons to address the inflated chance of a false positive across multiple comparisons (more or less, lower the p-value cut-off to take chance into account).
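
For anyone wanting to try this, here is a minimal sketch in R of the omnibus test followed by corrected pairwise tests. The counts come from the first post; the object names and the choice of a Bonferroni correction are only assumptions for illustration, and other corrections (for example `method = "holm"`) would work the same way.

```r
# Counts from the first post: rows = infection status, columns = sites 1 to 5
snails <- matrix(c(6, 6, 7, 9, 14,
                   44, 44, 43, 41, 36),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(status = c("infected", "uninfected"),
                                 site = paste0("site", 1:5)))

# Omnibus test: is prevalence the same at all five sites?
omnibus <- fisher.test(snails)
omnibus$p.value

# Pairwise follow-up: one 2x2 Fisher test for each pair of sites
site_pairs <- combn(ncol(snails), 2)   # 10 pairs
raw_p <- apply(site_pairs, 2, function(ix) fisher.test(snails[, ix])$p.value)
names(raw_p) <- apply(site_pairs, 2,
                      function(ix) paste(colnames(snails)[ix], collapse = " vs "))

# Bonferroni: multiply each p-value by the number of comparisons (capped at 1)
adj_p <- p.adjust(raw_p, method = "bonferroni")
round(cbind(raw_p, adj_p), 3)
```

Strictly, the pairwise step is only needed if the omnibus p-value is significant.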

4. ## The Following User Says Thank You to hlsmith For This Useful Post:

palmer86 (04-10-2013)

5. ## Re: Which test to use for my data

Thanks! Now, are there any online tutorials on how to do a Fisher's exact test?? (I'm not mathematically minded at all!)
A step-by-step guide would be brill

6. ## Re: Which test to use for my data

Since no cell has an expected value < 5
a Chi² could be used.
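
The expected counts behind that rule of thumb can be checked with a short R sketch (variable names are just for illustration). Each expected count is row total times column total divided by the grand total.

```r
infected   <- c(6, 6, 7, 9, 14)   # sites 1 to 5
uninfected <- 50 - infected
counts <- rbind(infected, uninfected)

# Expected count = row total * column total / grand total
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)
expected
min(expected)   # 8.4, so every cell is above 5
```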

With kind regards

K.

7. ## The Following User Says Thank You to Karabiner For This Useful Post:

palmer86 (04-10-2013)

8. ## Re: Which test to use for my data

Thanks Karabiner, sorry to be a pain but would you be able to explain how to do a Chi² with my results??
Thanks, S

9. ## Re: Which test to use for my data

Before that, please tell us whether your 5 sites represent different settings, or whether they are all similar and differ in only one variable. What are your independent variables? In any case, you should draw a table with two rows (infected, not infected) and five columns (one per site) in the following format:

Code:
``````
6  - 6  - 7  - 9  - 14
44 - 44 - 43 - 41 - 36
``````
I ran the chi-square test for you (depending on the software you use, there are different ways of feeding the above table to the software). Your result was not significant (P = 0.167), which suggests that the test did not have enough power to detect the difference visible at site #5. You should treat your results as a pilot study and conduct a larger study with a proper sample size determined a priori. The only significant pairwise differences were between site #5 and each of sites #1 and #2 (each P = 0.046), although if you correct for multiple comparisons, those become non-significant as well.
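
As a sketch of how these numbers can be reproduced in R (one possible route, not necessarily the software used for the post): the pairwise P = 0.046 matches a chi-square test without the Yates continuity correction, which is an assumption about how it was computed. R applies that correction to 2x2 tables by default, so it is switched off here.

```r
tab <- rbind(infected   = c(6, 6, 7, 9, 14),
             uninfected = c(44, 44, 43, 41, 36))
colnames(tab) <- paste0("site", 1:5)

# Overall test across the five sites
overall <- chisq.test(tab)
overall              # X-squared = 6.47 on 4 df, p = 0.167

# Site 5 vs site 1 without continuity correction
# (site 2 has the same counts as site 1, so it gives the same result)
s5_vs_s1 <- chisq.test(tab[, c("site5", "site1")], correct = FALSE)
s5_vs_s1$p.value     # about 0.046

# Bonferroni cut-off if all 10 possible pairs of sites are compared
0.05 / choose(5, 2)  # 0.005
```

With a cut-off of 0.005, a pairwise p-value of 0.046 no longer counts as significant.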

10. ## Re: Which test to use for my data

Thanks Victorxstc that's brill, my 5 sites were all on the same part of beach with site 5 being close to a roosting site for birds (the main transmission agent for the dispersal of trematode parasites) and site 1 being furthest away from the roosting site. My hypothesis states that the closer to the roosting site, the more infected snails I will find.
Thanks again, S

11. ## Re: Which test to use for my data

You should have stated your complete research hypothesis.

I suppose that for your hypothesis a Mann-Whitney U-test
could be an option, with infected yes/no as grouping factor
and distance from the roosting site (5 distances) as dependent
variable.
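
A rough sketch of that idea in R, assuming one row per snail and using the site number as a stand-in for distance (site 1 furthest from the roost, site 5 closest, as described earlier in the thread). The data frame and column names are hypothetical, and real distances would replace `site_rank`.

```r
# Infected snails per site, sites 1 to 5 (from the first post)
infected_n <- c(6, 6, 7, 9, 14)

# One row per snail: 50 snails at each site
snail_data <- data.frame(
  site_rank = rep(1:5, each = 50),
  infected  = unlist(lapply(infected_n,
                            function(k) rep(c("yes", "no"), c(k, 50 - k))))
)
snail_data$infected <- factor(snail_data$infected, levels = c("no", "yes"))

# exact = FALSE because the many tied values rule out an exact p-value
mw <- wilcox.test(site_rank ~ infected, data = snail_data, exact = FALSE)
mw$p.value
```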

With kind regards

K.

12. ## Re: Which test to use for my data

> Originally Posted by palmer86
> My hypothesis states that the closer to the roosting site, the more infected snails I will find.

But why didn't you say that from the start?

If we pretend that the sites lie at distances 1 through 5 from the birds (note that I have reversed the scale, so site 5 is at distance 1 and site 1 at distance 5), then there is a statistically significant effect of distance.

Here is some R code and the results from a logit model.
The distance variable is statistically significant (p = 0.0249).

Code:
``````
infect    <- c(6, 6, 7, 9, 14)
noninfect <- c(44, 44, 43, 41, 36)
distance  <- c(5, 4, 3, 2, 1)

mod <- glm(cbind(infect, noninfect) ~ distance, family = binomial)
summary(mod)

Call:
glm(formula = cbind(infect, noninfect) ~ distance, family = binomial)

Deviance Residuals:
      1        2        3        4        5
 0.4852  -0.1439  -0.4108  -0.3984   0.4655

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -0.8135     0.3701  -2.198   0.0279 *
distance     -0.2792     0.1245  -2.243   0.0249 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 6.03434  on 4  degrees of freedom
Residual deviance: 0.80033  on 3  degrees of freedom
AIC: 23.535

Number of Fisher Scoring iterations: 4
``````
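
To help read that output, here is a small follow-up sketch using the same model. It turns the distance coefficient into an odds ratio, lists the fitted prevalence next to the observed one, and runs a likelihood-ratio test from the two deviances. The names `odds_ratio` and `dev_drop` are just illustrative.

```r
infect    <- c(6, 6, 7, 9, 14)
noninfect <- c(44, 44, 43, 41, 36)
distance  <- c(5, 4, 3, 2, 1)
mod <- glm(cbind(infect, noninfect) ~ distance, family = binomial)

# Odds ratio for a one-step increase in distance from the roost
odds_ratio <- exp(coef(mod)["distance"])
odds_ratio   # about 0.76: the odds of infection fall by roughly a quarter per step

# Fitted prevalence at each site next to the observed proportion
data.frame(distance, observed = infect / 50, fitted = round(fitted(mod), 3))

# Likelihood-ratio test: drop in deviance from adding distance (1 df)
dev_drop <- mod$null.deviance - mod$deviance
pchisq(dev_drop, df = 1, lower.tail = FALSE)
```

The deviance drop is 6.034 - 0.800 = 5.234, which can be read directly off the summary output.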

13. ## Re: Which test to use for my data

Well to be honest, statistical analysis is not my strong point, hence why I asked for help in the first place! I did not know what info you needed to help me...
Cheers though

14. ## Re: Which test to use for my data

> Originally Posted by GretaGarbo
> Here is some R code

I used R, which is free, open-source software and maybe the best statistical software in the world. You can just download it.

Nobody is complaining about you, palmer. We are just telling you that it is a good strategy to state the overall purpose, not just a technical question. You have had help from many people with very good knowledge, but we only got to this point gradually.

> Well to be honest, statistical analysis is not my strong point, hence why I asked for help in the first place!

But you have good biological knowledge, and you can tell us about that, about what you are aiming for, and about your purpose.

Now, go and measure the actual distance from the centre of the birds' roosting site to the centre of each site, replace the distance numbers I used above with those, and run the code again.
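
One way this could look in practice, as a sketch only: `fit_distance` is a hypothetical helper, not part of R, and the vector passed to it below is still the 5-to-1 placeholder from the earlier post. Measured distances would go in its place, listed in site order (site 1 first).

```r
infect    <- c(6, 6, 7, 9, 14)   # infected snails at sites 1 to 5
noninfect <- 50 - infect

# Hypothetical helper: refit the logit model for any vector of distances
fit_distance <- function(distance) {
  stopifnot(length(distance) == length(infect))
  glm(cbind(infect, noninfect) ~ distance, family = binomial)
}

# Placeholder distances from the thread; replace with the measured values
mod_rank <- fit_distance(c(5, 4, 3, 2, 1))
summary(mod_rank)$coefficients
```

To run it, paste the lines into the R console (or into a script window and run them from there); nothing needs to be clicked beyond that.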

15. ## Re: Which test to use for my data

Was merely explaining why I did not 'state my complete research hypothesis' or 'say that from the start', reason being, I'm mathematically challenged haha!! My knowledge of statistics is laughable I'm afraid, hence why I am on here.
Cheers

16. ## Re: Which test to use for my data

I have downloaded R 3.0.0 for Windows and worked out the distance of each site from the birds' roosting site. I have no idea how to 'put' these numbers in though. Is there a certain way to type in my data, and what do I click on to do the Fisher's test? I'm confused!!
Told you I am rubbish!
Cheers, S

17. ## Re: Which test to use for my data

But why do you want to do a Fisher's exact test? Was the logit model above not good enough?

And what exactly do you want to test with it?