Results 1 to 15 of 26

Thread: Data analysis - which test is best

  1. #1
    Banned
    Points: 54, Level: 1
    Level completed: 8%, Points required for next Level: 46

    Posts
    6
    Thanks
    9
    Thanked 0 Times in 0 Posts

    Question Data analysis - which test is best




    I would like to know which test would be appropriate for testing my hypothesis.

    I am doing a research project on parasitic infections in snail hosts. Snails are infected when coming into contact with bird faeces.

    My study takes samples of snails from 5 sites and dissects them to check for prevalence of infection.
    Site 5 is 130 metres from a known bird roosting site.
    Site 4 is 300 metres from known roosting site
    Site 3 is 524 metres from roosting site
    Site 2 is 664 metres " " "
    Site 1 is 786 metres " " "

    At each site I collected 50 snails.

    Site 1 had 6 infected snails (out of 50)
    Site 2 also had 6 infected snails (out of 50)
    Site 3 had 7 infected snails
    Site 4 had 9
    Site 5 had 14

    Hypothesis - More parasite infections would occur at site(s) closest to the known bird roosting site

    Any thoughts on which test would be the best to check my hypothesis??

  2. #2
    Pirate
    Points: 15,159, Level: 79
    Level completed: 62%, Points required for next Level: 191
    victorxstc's Avatar
    Posts
    875
    Thanks
    229
    Thanked 332 Times in 297 Posts

    Re: Data analysis - which test is best

    There are several possible approaches; here is one. You can correlate the distance (in metres) with the infection status (a binary variable), using a Spearman correlation coefficient.
    "victor is the reviewer from hell" -Jake
    "victor is a machine! a publication machine!" -Vinux

  3. The Following 2 Users Say Thank You to victorxstc For This Useful Post:

    GretaGarbo (04-22-2013), Justice! (04-22-2013)

  4. #3
    Banned

    Re: Data analysis - which test is best

    I have used Minitab to do a Spearman correlation coefficient. Would I be right in stating that the Pearson's r value is the 'p' value??

  5. #4
    Fortran must die
    Points: 58,790, Level: 100
    Level completed: 0%, Points required for next Level: 0
    noetsi's Avatar
    Posts
    6,532
    Thanks
    692
    Thanked 915 Times in 874 Posts

    Re: Data analysis - which test is best

    Pearson's r value is a correlation coefficient like Spearman's (but making different assumptions and calculated a different way). It is not the "p" value which is an assessment of how likely that the results you got were entirely due to random error. You will have a p value with both Spearman's and Pearson.
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  6. The Following User Says Thank You to noetsi For This Useful Post:

    Justice! (04-22-2013)

  7. #5
    Pirate
    victorxstc's Avatar

    Re: Data analysis - which test is best

    As noetsi said, no. A correlation analysis (Pearson or Spearman) gives you two values: a correlation coefficient (Pearson's r or Spearman's rho) and a P value.

    The correlation coefficient shows the strength and direction of the correlation. For example, you might find r = -0.34, P = 0.008. In this example, the correlation between distance and infection is -0.34 and it is statistically significant. Note that the sign is negative, meaning that the shorter the distance, the higher the chance of infection.

    However, please note that you should use Spearman's coefficient instead of Pearson's. I don't know whether you have SPSS, but if you do, you can follow the steps below to run the Spearman test. If you have already done Pearson's test in Minitab, I think you won't have difficulty doing Spearman in Minitab. Before that, however, make sure you are dealing with 250 rows in your spreadsheet file (one row per specimen), not 5 rows (one row per site).

    In your SPSS file, you have 250 cases, right? (5 sites, each with 50 cases, so 250 cases in total.) In your raw data file (with at least 250 rows), write the distance value for each site in a new column, beside each of your 250 cases. For example, you need to write the number 524 (the third distance) 50 times, beside the corresponding rows. Then make sure the column for "infection status" contains only 0 and 1. If not, create a new column that codes the infection status of each of the 250 cases as 0 or 1. Now you have two columns, each with 250 cases, and each row describes a single snail: its infection status (0 or 1) and its distance. Now go to Analyze -> Correlate -> Bivariate, select the Spearman test, and select those two columns. The test is now ready to be run.
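
For readers without SPSS or Minitab, the data layout described above (one row per snail, 250 rows in total) can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names (`expand_rows`, `average_ranks`, `pearson`, `spearman`) are made up for this sketch, and Spearman's rho is computed as Pearson's r on tie-averaged ranks rather than through any statistics package.

```python
from math import sqrt

# Site data from the first post: (distance to the roost in metres, infected snails out of 50).
SITES = [(786, 6), (664, 6), (524, 7), (300, 9), (130, 14)]
SNAILS_PER_SITE = 50

def expand_rows(sites, n_per_site):
    """Return one (distance, infected) row per snail, with infected coded 0/1."""
    rows = []
    for distance, n_infected in sites:
        rows += [(distance, 1)] * n_infected
        rows += [(distance, 0)] * (n_per_site - n_infected)
    return rows

def average_ranks(values):
    """Rank values from 1 to n; tied values get the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson's r for two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))

rows = expand_rows(SITES, SNAILS_PER_SITE)
distance = [d for d, _ in rows]
infected = [s for _, s in rows]
rho = spearman(distance, infected)
print(len(rows), sum(infected), round(rho, 3))
```

The sign of `rho` is the thing to read first: a negative value means infection is more common at shorter distances. The sketch does not compute a P value.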
    "victor is the reviewer from hell" -Jake
    "victor is a machine! a publication machine!" -Vinux

  8. The Following User Says Thank You to victorxstc For This Useful Post:

    Justice! (04-22-2013)

  9. #6
    Banned

    Re: Data analysis - which test is best

    I put my distance figures into column C1 and the number of infected snails into column C2, ran the test and got these results (just checking I am on the right track so far!):


    All 2 1 1 1 5
    40 20 20 20 100

    Cell Contents: Count
    % of Total


    Pearson's r -0.970143
    Spearman's rho -0.974679

  10. #7
    Banned

    Re: Data analysis - which test is best

    Apologies Victorxstc, I posted before seeing what you had written. I do not have SPSS (I have tried to download a trial version but keep getting an error message when trying to download). I will persevere with trying to get it.

  11. #8
    Fortran must die
    noetsi's Avatar

    Re: Data analysis - which test is best

    If you are doing this work near a university, they commonly have SPSS on their computers these days.

    The two values you noted (for Pearson's R and Spearman's Rho) are very close, effectively the same thing.

    If either of your variables is coded as a dichotomy (that is for example infected/ not infected) then neither Pearson nor Spearman's will work correctly. You need to do polychoric correlations although I doubt Minitab will do this (even SPSS and SAS won't in the core code, they need special Macros or R code in the case of SPSS).
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  12. The Following User Says Thank You to noetsi For This Useful Post:

    Justice! (04-22-2013)

  13. #9
    Pirate
    victorxstc's Avatar

    Re: Data analysis - which test is best

    No problem Justice

    They are similar, and Minitab is efficient. However, before any analysis, please make sure you are working with your raw data, not a summary of it. In your Minitab file, you should have at least 250 rows. If your file is like that, your correlation coefficients are very good, as the closer the coefficient is to 1 in absolute value (here, to -1), the stronger the correlation.
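
The raw-versus-summary point can be checked with a short Python sketch. It computes Spearman's rho from just the five per-site totals (n = 5). The helper names are hypothetical and the ranks use the usual tie-averaging. The result agrees with the Spearman's rho of -0.974679 posted in #6, which suggests (but does not prove) that that figure came from five summary rows rather than 250 per-snail rows.

```python
from math import sqrt

# Five summary rows, one per site (sites 1..5), as given in the first post.
distance_m = [786, 664, 524, 300, 130]
infected_count = [6, 6, 7, 9, 14]  # infected snails out of 50 at each site

def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    ranked = sorted(values)
    return [ranked.index(v) + (ranked.count(v) + 1) / 2 for v in values]

def pearson(x, y):
    """Pearson's r for two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Spearman's rho on the site-level summary: Pearson's r on the ranks.
rho_sites = pearson(average_ranks(distance_m), average_ranks(infected_count))
print(round(rho_sites, 6))
```

A coefficient this close to -1 describes how five site totals line up with distance. It says nothing about how strongly an individual snail's infection status is tied to distance, which is what the 250-row layout measures.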
    "victor is the reviewer from hell" -Jake
    "victor is a machine! a publication machine!" -Vinux

  14. The Following User Says Thank You to victorxstc For This Useful Post:

    Justice! (04-22-2013)

  15. #10
    Banned

    Re: Data analysis - which test is best

    My University does indeed have SPSS, I am off on Wednesday and intend to go in to use their computer. Just thought I would try and download now or try a different software (Minitab) so I could get crackin' instead of waiting until Wednesday
    Thank you for the tip

  16. #11
    Human
    Points: 12,676, Level: 73
    Level completed: 57%, Points required for next Level: 174
    Awards:
    Master Tagger
    GretaGarbo's Avatar
    Posts
    1,362
    Thanks
    455
    Thanked 462 Times in 402 Posts

    Re: Data analysis - which test is best

    Quote Originally Posted by Justice! View Post
    Site 5 is 130 metres from a known bird roosting site.
    Site 4 is 300 metres from known roosting site
    Site 3 is 524 metres from roosting site
    Site 2 is 664 metres " " "
    Site 1 is 786 metres " " "

    Thank you Justice! Or what should I call you?

    Now we know if distance is significant – or not.

    Maybe you could cooperate with Palmer86, because he has got identical data as you!

    Oh, maybe he has plagiarized your result? Or maybe you should be careful with him since I was told that he had not been the most polite person. Or maybe you could cooperate with Mmanuel, a person I tried to help a lot. You two – I mean, you three – seem to have a lot in common.

    Justice, if you find a topic difficult, then you see, there is a search engine called Google, that can be very useful. For example I googled “logit model” and saw 690 000 links. You should not expect someone else to write a thesis for you when there already are 690 000 others for you to read before.

    Hlsmith suggested Fisher's exact test. Karabiner pointed out that a chi-squared test could be used. Victorxstc literally did the test for you.

    When someone is serving the results on a silver plate for you, do you find it embarrassing to say “thank you” then?

    If you find it humiliating (“squat”) to say thank you, then I suggest that you don't do that!

    I will withdraw from this subject. I have tried to help you in many posts. But please don't thank me!

  17. The Following 2 Users Say Thank You to GretaGarbo For This Useful Post:

    Justice! (04-23-2013), victorxstc (04-23-2013)

  18. #12
    TS Contributor
    Points: 17,775, Level: 84
    Level completed: 85%, Points required for next Level: 75
    Karabiner's Avatar
    Location
    FC Schalke 04, Germany
    Posts
    2,541
    Thanks
    56
    Thanked 640 Times in 602 Posts

    Re: Data analysis - which test is best

    Quote Originally Posted by Justice! View Post
    I would like to know which test would be appropriate for testing my hypothesis.

    I am doing a research project on parasitic infections in snail hosts. Snails are infected when coming into contact with bird faeces.
    My study takes samples of snails from 5 sites and dissects them to check for prevalence of infection.
    Site 5 is 130 metres from a known bird roosting site.
    Site 4 is 300 metres from known roosting site
    Site 3 is 524 metres from roosting site
    Site 2 is 664 metres " " "
    Site 1 is 786 metres " " "

    At each site I collected 50 snails.

    Site 1 had 6 infected snails (out of 50)
    Site 2 also had 6 infected snails (out of 50)
    Site 3 had 7 infected snails
    Site 4 had 9
    Site 5 had 14

    Each snail has 2 characteristics: a) infected yes/no and b) its distance from the
    roosting site. You could try a Mann-Whitney U-test with infected yes/no as
    grouping variable and distance as dependent variable. This will show you
    whether in the infected group the distances are significantly higher or lower than
    in the non-infected group.

    With kind regards

    K.
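
The Mann-Whitney suggestion above can be sketched in plain Python as follows. Assumptions to note: the function name `mann_whitney_u` is hypothetical, the P value uses the normal approximation with a tie correction and no continuity correction, and a statistics package may report slightly different numbers.

```python
from math import sqrt, erfc

# (distance to the roost in metres, infected snails out of 50), from the first post.
SITES = [(786, 6), (664, 6), (524, 7), (300, 9), (130, 14)]
N_PER_SITE = 50

# Distance is the dependent variable; infection status defines the two groups.
infected_d, healthy_d = [], []
for site_distance, n_infected in SITES:
    infected_d += [site_distance] * n_infected
    healthy_d += [site_distance] * (N_PER_SITE - n_infected)

def mann_whitney_u(group1, group2):
    """Return (U for group1, z, two-sided p) using a tie-corrected normal approximation."""
    pooled = sorted(group1 + group2)
    n1, n2, n = len(group1), len(group2), len(pooled)
    rank_of, tie_term = {}, 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1] == pooled[i]:
            j += 1
        t = j - i + 1                        # number of tied observations
        rank_of[pooled[i]] = (i + j) / 2 + 1  # mean rank of this tied block
        tie_term += t ** 3 - t
        i = j + 1
    r1 = sum(rank_of[v] for v in group1)      # rank sum of group1
    u1 = r1 - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u1 - mean_u) / sqrt(var_u)
    p = erfc(abs(z) / sqrt(2))                # two-sided normal tail probability
    return u1, z, p

u, z, p = mann_whitney_u(infected_d, healthy_d)
print(u, round(z, 2), round(p, 3))
```

A negative `z` here means the infected snails tend to have the lower distance ranks, i.e. they were found closer to the roost.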

  19. #13
    TS Contributor
    Karabiner's Avatar

    Re: Data analysis - which test is best

    Quote Originally Posted by noetsi View Post
    the "p" value which is an assessment of how likely that the results you got were entirely due to random error
    Beg your pardon, but wouldn't that mean p(Hypothesis|Data), i.e. Bayesian statistics?
    With the frequentist approach, we obtain p(Data|Hypothesis).

    With kind regards

    K.
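
One way to make the p(Data|Hypothesis) reading concrete is a permutation test, sketched below in Python. Nobody in this thread proposed this test; it is used here only as an illustration. The null hypothesis is that infection is unrelated to distance, so the 0/1 infection labels are shuffled across the 250 snails and the sketch counts how often the shuffled data look at least as extreme as the observed data. The statistic (mean distance of infected snails), the seed, and the number of shuffles are all arbitrary choices.

```python
import random

# (distance to the roost in metres, infected snails out of 50), from the first post.
SITES = [(786, 6), (664, 6), (524, 7), (300, 9), (130, 14)]
N_PER_SITE = 50

distances, labels = [], []
for site_distance, n_infected in SITES:
    distances += [site_distance] * N_PER_SITE
    labels += [1] * n_infected + [0] * (N_PER_SITE - n_infected)

def mean_infected_distance(dist, lab):
    """Mean distance to the roost among snails labelled as infected."""
    return sum(d for d, flag in zip(dist, lab) if flag == 1) / sum(lab)

observed = mean_infected_distance(distances, labels)

rng = random.Random(0)   # fixed seed so the sketch is repeatable
n_perm = 2000
hits = 0
shuffled = labels[:]
for _ in range(n_perm):
    rng.shuffle(shuffled)  # simulate "infection has nothing to do with distance"
    if mean_infected_distance(distances, shuffled) <= observed:
        hits += 1

# One-sided p: share of null datasets at least as extreme as the real one.
p_one_sided = (hits + 1) / (n_perm + 1)
print(round(observed, 1), p_one_sided)
```

The resulting number is the probability of data like these given the null hypothesis, not the probability that the null hypothesis is true given the data.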

  20. The Following User Says Thank You to Karabiner For This Useful Post:

    Justice! (04-23-2013)

  21. #14
    Pirate
    victorxstc's Avatar

    Re: Data analysis - which test is best

    Quote Originally Posted by Karabiner View Post
    Quote Originally Posted by Justice! View Post
    I would like to know which test would be appropriate for testing my hypothesis.

    I am doing a research project on parasitic infections in snail hosts. Snails are infected when coming into contact with bird faeces.


    Each snail has 2 characteristics: a) infected yes/no and b) its distance from the
    roosting site. You could try a Mann-Whitney U-test with infected yes/no as
    grouping variable and distance as dependent variable. This will show you
    whether in the infected group the distances are significantly higher or lower than
    in the non-infected group.

    With kind regards

    K.

    I agree on that, but doesn't a correlation coefficient suffice? Besides, I guess that before Mann-Whitney, Justice should run a Kruskal-Wallis test to see whether there is any overall difference between the 5 sites' infection rates. That said, a Kruskal-Wallis test does not directly show the direction and extent of the association (further evaluations would be necessary), at least not as clearly as a correlation coefficient does.

    Besides, in Kruskal-Wallis and Mann-Whitney tests the actual distance is discarded, because it is used only as a grouping variable, whereas in a correlation coefficient the distances (in metres) keep their meaning, which favors the accuracy of the results.

    Kind regards
    "victor is the reviewer from hell" -Jake
    "victor is a machine! a publication machine!" -Vinux

  22. The Following User Says Thank You to victorxstc For This Useful Post:

    Justice! (04-23-2013)

  23. #15
    TS Contributor
    Karabiner's Avatar

    Re: Data analysis - which test is best


    Quote Originally Posted by victorxstc View Post
    I agree on that, but doesn't a correlation coefficient suffice.
    Perhaps. But I feel uneasy with Spearman on binary-versus-rank data.
    Maybe some forgotten childhood experience.

    Quote Originally Posted by victorxstc View Post
    Besides, I guess before Mann-Whitney, Justice should do a Kruskal-Wallis to see if there is any overall difference between the 5 sites' infection rates or not.
    That is, treat infection yes/no as ordinal? I had rather assumed that this was
    categorical, in which case the Chi² test could apply (expected frequencies are all
    > 5, AFAICS).

    Quote Originally Posted by victorxstc View Post
    Besides, when doing Kruskal-Wallis and Mann-Whitney tests, the length of the distance is discarded, because it would be used Only as a grouping variable;
    I would treat it as an ordinal DV, not as a grouping variable. I guessed
    that since there are 5 fixed distances and none in-between, ordinal
    would be appropriate.

    With kind regards

    K.
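
The Chi² remark above, including the check that all expected frequencies exceed 5, can be sketched in Python on the 5 x 2 table of sites by infected/not infected. This is a minimal sketch with made-up variable names. The P value uses the closed-form chi-squared tail for 4 degrees of freedom, exp(-x/2) * (1 + x/2), so it applies to this table size only.

```python
from math import exp

infected = [6, 6, 7, 9, 14]  # infected snails at sites 1..5, from the first post
n_per_site = 50

# 5 x 2 contingency table: one row per site, columns = (infected, not infected).
observed = [[k, n_per_site - k] for k in infected]
col_totals = [sum(row[c] for row in observed) for c in range(2)]
grand_total = sum(col_totals)

chi2 = 0.0
min_expected = float("inf")
for row in observed:
    row_total = sum(row)
    for c in range(2):
        expected = row_total * col_totals[c] / grand_total
        min_expected = min(min_expected, expected)
        chi2 += (row[c] - expected) ** 2 / expected

df = (len(observed) - 1) * (2 - 1)  # (rows - 1) * (columns - 1) = 4

# Upper-tail probability of a chi-squared variable with exactly 4 degrees of freedom.
half = chi2 / 2
p_value = exp(-half) * (1 + half)
print(round(min_expected, 1), round(chi2, 3), df, round(p_value, 3))
```

This test asks only whether infection rates differ among the five sites in any way. It ignores the ordering of the distances, so it does not test the directional hypothesis on its own.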

  24. The Following User Says Thank You to Karabiner For This Useful Post:

    Justice! (04-23-2013)
