# Which test to use for my data

Status
Not open for further replies.

#### palmer86

##### Guest
Hi, I am not sure which kind of test to use to analyse my data.
Can anyone help?
I collected 50 snails from each of 5 different sites and dissected them to find the prevalence of trematode parasite infection.
Results -
Site 1 - 6 out of 50 snails were infected
Site 2 - 6 out of 50 snails were infected
Site 3 - 7 out of 50
Site 4 - 9 out of 50
Site 5 - 14 out of 50

Which test should I use to compare these data? I have worked out the S.D.

#### prakoso

##### New Member
I have quite a similar problem; I am just commenting here so I can easily find this thread later in my posts.

#### hlsmith

##### Omega Contributor
I would use a Fisher's exact test to see whether there is a difference between any of the groups (an omnibus test). If you get a significant p-value, then perform direct pairwise comparisons between the groups to determine where the difference is. You will need to correct your significance level when you conduct the pairwise comparisons to account for multiple comparisons (more or less, lower the p-value cut-off to allow for differences that arise by chance).
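
For anyone following along in R, the omnibus-then-pairwise approach described above might be sketched as follows. This is only an illustration using the counts from the first post; the Holm adjustment is an assumption (the post does not name a correction method, and Bonferroni is another common choice).

```r
# Counts from the first post: infected and uninfected snails at sites 1 to 5
infected   <- c(6, 6, 7, 9, 14)
uninfected <- 50 - infected
tab <- rbind(infected, uninfected)
colnames(tab) <- paste0("site", 1:5)

# Omnibus test: is the infection prevalence the same at all five sites?
omnibus <- fisher.test(tab)
omnibus$p.value

# Pairwise 2x2 Fisher's exact tests for every pair of sites
pairs <- combn(ncol(tab), 2)
raw_p <- apply(pairs, 2, function(ix) fisher.test(tab[, ix])$p.value)
names(raw_p) <- apply(pairs, 2, function(ix) {
  paste(colnames(tab)[ix], collapse = " vs ")
})

# Adjust for the 10 comparisons (Holm is an assumed choice, not from the thread)
adj_p <- p.adjust(raw_p, method = "holm")
round(cbind(raw_p, adj_p), 3)
```

The pairwise step is normally only worth reading if the omnibus p-value is below the chosen cut-off.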

#### palmer86

##### Guest
Thanks! Now, are there any online tutorials on how to do a Fisher's exact test?? (I'm not mathematically minded at all!)
A step-by-step guide would be brill

#### Karabiner

##### TS Contributor
Since no cell has an expected value < 5,
a Chi² test could be used.

With kind regards

K.
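
The expected-count check mentioned above can be reproduced with a few lines of R. This is a minimal sketch using the counts from the first post; the "at least 5" threshold is the usual rule of thumb rather than a hard requirement.

```r
# Observed counts: infected and uninfected snails at sites 1 to 5
infected   <- c(6, 6, 7, 9, 14)
uninfected <- 50 - infected
tab <- rbind(infected, uninfected)

# Expected count for each cell = row total * column total / grand total
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
expected

min(expected)        # smallest expected count (8.4 for these data)
all(expected >= 5)   # TRUE, so the chi-square approximation looks reasonable
```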

#### palmer86

##### Guest
Thanks Karabiner, sorry to be a pain but would you be able to explain how to do a Chi² with my results??
Thanks, S

#### victorxstc

##### Pirate
Before that, please tell us whether your 5 sites represent different settings, or whether they are all similar and differ in only one variable. What are your independent variables? In any case, you should draw up a table with two rows (infected and uninfected) and five columns (one per site) in the following format:

Code:
6  - 6  - 7  - 9  - 14
44 - 44 - 43 - 41 - 36
I ran the chi-square test for you (depending on the software you use, there are different ways of feeding the above table to it). Your result was not significant (P = 0.167), which suggests the test did not have enough power to detect the difference visible at site #5. You could treat your results as a pilot study and conduct a larger study with a proper sample size determined a priori. The only significant pairwise differences were between site #5 and each of sites #1 and #2 (P = 0.046 for each), although if you correct for the multiple comparison problem, those become non-significant as well.
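
For readers who want to reproduce these figures, one way to feed the table to R is sketched below. The software used in the post is not stated, so this is an assumption; in particular, the pairwise test is run without the continuity correction, which is what reproduces P = 0.046 here.

```r
# The 2 x 5 table from the post: infected on top, uninfected below
tab <- rbind(infected   = c(6, 6, 7, 9, 14),
             uninfected = c(44, 44, 43, 41, 36))
colnames(tab) <- paste0("site", 1:5)

# Overall chi-square test of independence (4 degrees of freedom)
overall <- chisq.test(tab)
overall$p.value    # about 0.167

# Site 5 versus site 1, without Yates' continuity correction (assumed)
pair_5_1 <- chisq.test(tab[, c("site5", "site1")], correct = FALSE)
pair_5_1$p.value   # about 0.046

# Bonferroni adjustment for the 10 possible pairwise comparisons
min(1, pair_5_1$p.value * choose(5, 2))
```

Site 2 has the same counts as site 1, so the comparison of site 5 with site 2 gives the same p-value.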

#### palmer86

##### Guest
Thanks Victorxstc that's brill, my 5 sites were all on the same part of beach with site 5 being close to a roosting site for birds (the main transmission agent for the dispersal of trematode parasites) and site 1 being furthest away from the roosting site. My hypothesis states that the closer to the roosting site, the more infected snails I will find.
Thanks again, S

#### Karabiner

##### TS Contributor
You should have stated your complete research hypothesis.

I suppose that for your hypothesis a Mann-Whitney U-test
could be an option, with infected yes/no as grouping factor
and distance from the roosting site (5 distances) as dependent
variable.

With kind regards

K.
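
A rough sketch of the Mann-Whitney option in R is given below. It assumes one row per snail and, because the measured distances are not given in the thread, uses the site order as a stand-in distance rank (site 5 nearest the roost); the column names are made up for illustration.

```r
infected_counts <- c(6, 6, 7, 9, 14)   # infected snails at sites 1 to 5
n_per_site <- 50
site <- 1:5

# Expand the counts to one row per snail.
# distance_rank is an assumed ordinal distance: site 5 = 1 (nearest), site 1 = 5.
snails <- data.frame(
  distance_rank = rep(6 - site, each = n_per_site),
  infected = unlist(lapply(infected_counts, function(k) {
    rep(c("yes", "no"), times = c(k, n_per_site - k))
  }))
)
snails$infected <- factor(snails$infected, levels = c("no", "yes"))

# Mann-Whitney U test (called the Wilcoxon rank-sum test in R);
# exact = FALSE because the distance ranks are heavily tied
mw <- wilcox.test(distance_rank ~ infected, data = snails, exact = FALSE)
mw$p.value
```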

#### GretaGarbo

##### Human
palmer86 wrote: "My hypothesis states that the closer to the roosting site, the more infected snails I will find."

But why didn't you say that from the start?

If we pretend that the sites are at distances 1 through 5 from the birds (note that I have reversed the scale, so site 5 is at distance 1), then there is a statistically significant effect.

Here are some R code and the results from a logit model.
The distance variable is statistically significant (p = 0.0249).

Code:
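# Infected and uninfected snails at sites 1 to 5
infect    <- c( 6,  6,  7,  9, 14)
noninfect <- c(44, 44, 43, 41, 36)
# Pretend distance from the roost (site 5 is nearest, so it gets 1)
distance  <- c( 5,  4,  3,  2,  1)

# Logistic regression of infection on distance
mod <- glm(cbind(infect, noninfect) ~ distance, family = binomial)
summary(mod)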

Call:
glm(formula = cbind(infect, noninfect) ~ distance, family = binomial)

Deviance Residuals:
      1        2        3        4        5
 0.4852  -0.1439  -0.4108  -0.3984   0.4655

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -0.8135     0.3701  -2.198   0.0279 *
distance     -0.2792     0.1245  -2.243   0.0249 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 6.03434  on 4  degrees of freedom
Residual deviance: 0.80033  on 3  degrees of freedom
AIC: 23.535

Number of Fisher Scoring iterations: 4
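
To help read the output above, the sketch below refits the same model and turns the distance coefficient into an odds ratio, lists the fitted prevalence per site, and runs a likelihood-ratio test. The distances are still the pretend values 5 to 1, not measured ones, and the Wald interval from `confint.default` is just one of several possible choices.

```r
infect    <- c(6, 6, 7, 9, 14)
noninfect <- 50 - infect
distance  <- c(5, 4, 3, 2, 1)   # pretend distances, as in the post

mod <- glm(cbind(infect, noninfect) ~ distance, family = binomial)

# Odds ratio for one extra unit of distance, with a Wald 95% interval
odds_ratio <- exp(coef(mod)["distance"])
wald_ci    <- exp(confint.default(mod)["distance", ])
odds_ratio   # below 1: the odds of infection fall as distance increases
wald_ci

# Observed and fitted prevalence at each site
fitted_prev <- predict(mod, type = "response")
cbind(distance, observed = infect / 50, fitted = round(fitted_prev, 3))

# Likelihood-ratio test for distance (null deviance minus residual deviance)
lrt <- anova(mod, test = "Chisq")
lrt
```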

#### palmer86

##### Guest
Well to be honest, statistical analysis is not my strong point, hence why I asked for help in the first place! I did not know what info you needed to help me...
Cheers though

#### GretaGarbo

##### Human
GretaGarbo wrote: "Here is some R code"

I used R, which is free, open-source software and maybe the best statistical software in the world. You can just download it.

Nobody is complaining about you, palmer. We are just telling you that it is a good strategy to state the overall purpose, not just a technical question. You have had help from many people with very good knowledge, but we only arrived at this point gradually.

palmer86 wrote: "Well to be honest, statistical analysis is not my strong point, hence why I asked for help in the first place!"

But you have good biological knowledge, so you can tell us about that, where you are aiming, and what your purpose is.

Now, go and get the actual distance from the centre of the birds' roosting place to the centre of each site, replace the distance numbers I used above, and run the code again.
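
Replacing the distances amounts to editing a single line and pasting the code into the R console. A small sketch is below; the helper `fit_distance_model` and the variable `my_distance` are hypothetical names introduced here, and the values shown are still the pretend ranks, to be overwritten with the measured distances in site order 1 to 5.

```r
# Hypothetical helper: refit the logit model for any set of five distances
fit_distance_model <- function(distance, infect = c(6, 6, 7, 9, 14), n = 50) {
  stopifnot(length(distance) == length(infect))
  noninfect <- n - infect
  glm(cbind(infect, noninfect) ~ distance, family = binomial)
}

# PLACEHOLDER values (the pretend ranks from the post).
# Replace them with the measured distances for site 1, site 2, ..., site 5.
my_distance <- c(5, 4, 3, 2, 1)

mod <- fit_distance_model(my_distance)
summary(mod)   # the p-value for distance is in the Pr(>|z|) column
```

Typing or pasting these lines at the `>` prompt and pressing Enter runs them; no menus are needed.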

#### palmer86

##### Guest
Was merely explaining why I did not 'state my complete research hypothesis' or 'say that from the start', reason being, I'm mathematically challenged haha!! My knowledge of statistics is laughable I'm afraid, hence why I am on here.
Cheers

#### palmer86

##### Guest
I have downloaded R 3.0.0 for Windows and worked out the distance of each site from the birds' roosting site. I have no idea how to 'put' these numbers in though. Is there a certain way to type in my data, and what do I click on to do the Fisher's test?? I'm confused!!
Told you I am rubbish!
Cheers, S

#### GretaGarbo

##### Human
But why do you want to do a Fisher's exact test? Was the logit model above not good enough?

And what is it that you want to test with the test?

#### palmer86

##### Guest
hlsmith mentioned in an earlier post that I should do a Fisher's exact test to analyse my results. Is this not the correct test to use to check my null hypothesis?

#### GretaGarbo

##### Human
Then what is your null hypothesis?

(Well, I thought (when you had told us that there was a distance) that I had suggested a more efficient evaluation. But it is your study!)

#### palmer86

##### Guest
What was your evaluation? Sorry, I'm not being funny, I'm quite confused... I thought it was an example of how to do Fisher's using R?? Could you explain your reasoning for the evaluation and how you did it, if it's not too much trouble?? I have worked out distances for each site in relation to the bird roosting site. You mentioned 'plugging them in' but I'm not sure what you mean by this :-s

#### palmer86

##### Guest
Basically, in my report I need to analyse my data (with a p-value) to reject or accept my hypothesis. I need to tell them which statistical software I used and also put in my appendix the whole calculation that the software gave me... my tutor is nuts about statistics and I really want to get this right.
Thanks, S
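
For the reporting requirements just described, R can write the full model output and its own version string to a text file for an appendix. The sketch below assumes the logit model from earlier in the thread with the pretend distances; the file name is made up, and `tempdir()` is used only so the example runs anywhere.

```r
infect    <- c(6, 6, 7, 9, 14)
noninfect <- 50 - infect
distance  <- c(5, 4, 3, 2, 1)   # pretend distances; use the measured ones instead

mod <- glm(cbind(infect, noninfect) ~ distance, family = binomial)

# Save the software version and the complete summary to a text file
out_file <- file.path(tempdir(), "logit_output.txt")   # hypothetical file name
writeLines(c(R.version.string, "", capture.output(summary(mod))), out_file)

# The single p-value to quote in the report text
p_value <- coef(summary(mod))["distance", "Pr(>|z|)"]
round(p_value, 4)

# citation() prints the recommended reference for R itself
```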

#### GretaGarbo

##### Human
Are you serious?

Just plug the correct distances into the code I showed and rerun it!

Code:
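# Infected and uninfected snails at sites 1 to 5
infect    <- c( 6,  6,  7,  9, 14)
noninfect <- c(44, 44, 43, 41, 36)
# Replace these five numbers with the measured distances for sites 1 to 5
distance  <- c( 5,  4,  3,  2,  1)

# Logistic regression of infection on distance
mod <- glm(cbind(infect, noninfect) ~ distance, family = binomial)
summary(mod)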

I must say!!
