# dunn.test

#### Lidia_N

##### New Member
Hi everyone!!

I'm quite new with R, so I hope you can help me I'm working with bacteria pyrosequencing data and I'm doing the statistical analysis with R. I have 21 samples and 7 different treatments. I loaded my data into R phyloseq obtaining:

> psR
phyloseq-class experiment-level object
otu_table() OTU Table: [ 7498 taxa and 21 samples ]
sample_data() Sample Data: [ 21 samples by 8 sample variables ]
tax_table() Taxonomy Table: [ 7498 taxa by 6 taxonomic ranks ]
phy_tree() Phylogenetic Tree: [ 7498 tips and 7497 internal nodes ]

Since I found that there are statistically significant differences between the treaments (with the adonis function), I wanted to know which OTUs have a different abundance in the different treatments. To do that I used the function dunn.test (with Kruskal-Wallis test incorporated), swapping first the rows and the columns in the OTU table in order to apply the test:

swap_otu_table <- t(otu_table(psR))
treatment <- c('A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'D', 'D', 'D', 'E', 'E', 'E', 'F', 'F', 'F', 'G', 'G', 'G')
swap_otu_tableDF <- as.data.frame(swap_otu_table)
ncol(swap_otu_tableDF)
 7498
lapply(swap_otu_tableDF[1:7498], function(x) kruskal.test(x ~ treatment, data=swap_otu_tableDF))

The output of this recursive function is quite difficult to read, especially for all 7498 OTUs.

Is there some way to apply a Kruskal-Wallis + Dunn's test in a recursive way that gives a table as an output, preferably in order of significance, and with not only the OTU code but also the taxonomic identification contained in tax_table(psR)?

Thank you very much!

Lidia

#### Lazar

##### Phineas Packard
You will have to look at the elements that make up the kruskal.test:
Code:
x <- c(2.9, 3.0, 2.5, 2.6, 3.2) # normal subjects
y <- c(3.8, 2.7, 4.0, 2.4)      # with obstructive airway disease
z <- c(2.8, 3.4, 3.7, 2.2, 2.0) # with asbestosis
test<- kruskal.test(list(x, y, z))
str(test) # suggests you need to extract to statistic, parameter, and p.value compnents
#e.g.
c(test$statistic, test$parameter, test$p.value) So my guess would be something like: Code: sapply(swap_otu_tableDF[1:7498], function(x) { temp <- kruskal.test(x ~ treatment, data=swap_otu_tableDF) c(temp$statistic, temp$parameter, temp$p.value)
}
)
Might give you what you want but I have not tested it. You can then order the data by p.value if you like.

#### Lidia_N

##### New Member

Unfortunately the code does not work with the function dunn.test... Following your suggestion I tried:

dunn.test.1 <- dunn.test(swap_otu_tableDF[,1], treatment, kw=TRUE, label=TRUE, method='bonferroni', alpha=0.05)
str(dunn.test.1)
List of 4
$chi2 : num 4.46$ Z : num [1:21] -1.08 0 1.08 -1.08 0 ...
$P : num [1:21] 0.14 0.5 0.14 0.14 0.5 ...$ P.adjusted: num [1:21] 1 1 1 1 1 1 1 1 1 1 ...
sapply(swap_otu_tableDF[1:7498], function(x) {
temp <- dunn.test( x ~ treatment, data=swap_otu_tableDF)
c(temp$p.value) } ) The big problem with this recursive dunn.test output is also that is somehow divided into two parts. Firstly there is the table for all the OTUs, without the code: Kruskal-Wallis rank sum test data: x and treatment Kruskal-Wallis chi-squared = 12.0548, df = 6, p-value = 0.06 Comparison of x by treatment (Bonferroni) Row Mean-| Col Mean | 0 B C CA L SO ---------+------------------------------------------------------------------ B | 0.000000 | 1.0000 | C | -1.813193 -1.813193 | 0.7329 0.7329 | CA | -0.604397 -0.604397 1.208795 | 1.0000 1.0000 1.0000 | L | -2.417591 -2.417591 -0.604397 -1.813193 | 0.1640 0.1640 1.0000 0.7329 | SO | 0.000000 0.000000 1.813193 0.604397 2.417591 | 1.0000 1.0000 0.7329 1.0000 0.1640 | TR | -1.510994 -1.510994 0.302198 -0.906596 0.906596 -1.510994 | 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 And then after all these tables (for all the OTUs) there are the values with the OTUs code:$C11||IVN7XNJ04I9G4B
$C11||IVN7XNJ04I9G4B$chi2
 12.05479

$C11||IVN7XNJ04I9G4B$Z
 0.0000000 -1.8131937 -1.8131937 -0.6043979 -0.6043979
 1.2087958 -2.4175915 -2.4175915 -0.6043979 -1.8131937
 0.0000000 0.0000000 1.8131937 0.6043979 2.4175915
 -1.5109947 -1.5109947 0.3021989 -0.9065968 0.9065968
 -1.5109947

$C11||IVN7XNJ04I9G4B$P
 0.500000000 0.034900979 0.034900979 0.272789571 0.272789571
 0.113370658 0.007811802 0.007811802 0.272789571 0.034900979
 0.500000000 0.500000000 0.034900979 0.272789571 0.007811802
 0.065394899 0.065394899 0.381250205 0.182310020 0.182310020
 0.065394899

$C11||IVN7XNJ04I9G4B$P.adjusted
 1.0000000 0.7329206 0.7329206 1.0000000 1.0000000 1.0000000
 0.1640478 0.1640478 1.0000000 0.7329206 1.0000000 1.0000000
 0.7329206 1.0000000 0.1640478 1.0000000 1.0000000 1.0000000
 1.0000000 1.0000000 1.0000000

The two parts are not connected and I can't understand why...