# multiple comparison of chi square tests of binary data

#### Emma_SR

##### New Member
Hello everybody,

The code below illustrates how I examined (significant) differences in certain characteristics between different regions. The data do not meet the assumptions of parametric tests, so I used the Kruskal-Wallis test, and for multiple comparisons between the regions I used the function kruskalmc. I then searched for homogeneous groups among the regions, labelled them with letters (for example a, ab, b...) and plotted the result in a boxplot.

Code:
library(pgirmess)
kruskal.test(characteristic ~ region) # kruskal wallis test
charac <- kruskalmc(characteristic ~ region, probs = 0.05)  # multiple comparison between the regions
# looking for "homogenous groups"
library(multcompView)
test <- charac$dif.com$difference # select logical vector
names(test) <- row.names(charac$dif.com)  # add comparison names
# create a list with "homogenous groups" coded by letter
let <- multcompLetters(test, compare = "<", threshold = 0.05, Letters = c(letters, LETTERS, "."), reversed = FALSE)
boxplot(characteristic ~ region, xlab = "region", ylab = "characteristic")  # boxplot
mtext(side = 3, text = let$Letters, at = 1:length(let$Letters), cex = 0.8)  # text at the top
I'm searching for the equivalent for comparing binary data (0/1) among different regions. For example: a sample (a tree) was present on an old map (dated 1775).
That sample then gets the value 1 for the characteristic "present".
Each region has a different number of samples.
Region 1: present: 0 0 0 1 0 1 1 ...
Region 2 has more samples: present: 1 0 0 0 0 1 0 1 1 1 ...
And so on.
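
To make the data layout concrete, here is a minimal sketch with made-up data. The region names (R1 to R3), the sample sizes and the 0.4 presence probability are assumptions for illustration only, not the real data. It only shows how the 0/1 vector and the region factor line up and how they reduce to a regions-by-presence table.

```r
# Made-up illustrative data (NOT the real samples): one 0/1 value per tree
set.seed(1)                                        # reproducible fake data
region  <- factor(rep(c("R1", "R2", "R3"),         # hypothetical region names
                      times = c(20, 35, 28)))      # unequal sample sizes (assumed)
present <- rbinom(length(region), size = 1, prob = 0.4)

# Contingency table: rows = regions, columns = absent (0) / present (1)
tab <- table(region, present = factor(present, levels = c(0, 1)))
print(tab)

# Share of absent/present trees within each region (each row sums to 1)
props <- prop.table(tab, margin = 1)
print(props)

# Chi-square test on the table; gives the same statistic as
# chisq.test(present, region) on the raw vectors
res <- chisq.test(tab)
print(res)
```

In this form each region is summarised by one proportion of present trees rather than by a distribution of values.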

I used a chi-square test to examine significant differences in the binary data between the regions, but I don't know how to perform multiple comparisons between the regions, as illustrated above. I would also like to find homogeneous groups (among the regions), label them and plot the result in a boxplot in the same way as above.

Code:
chisq.test(present, region)   # chi square test
Does anyone have any idea how this can be done?
Thanks a lot,
Emma
