# collapsing rows of a dataframe and "summing" the labels of the old rows

#### gianmarco

##### TS Contributor
Hello.

Let's suppose I have the following dataframe:
Code:
mydata <- structure(list(none = c(4, 4, 25, 18, 10), light = c(2, 3, 10,
24, 6), medium = c(3, 7, 12, 33, 7), heavy = c(2, 4, 4, 13, 2
), clust = structure(c(1L, 1L, 2L, 3L, 1L), .Label = c("1", "2",
"3"), class = "factor")), .Names = c("none", "light", "medium",
"heavy", "clust"), row.names = c("SM", "JM", "SE", "JE", "SC"
), class = "data.frame")

   none light medium heavy clust
SM    4     2      3     2     1
JM    4     3      7     4     1
SE   25    10     12     4     2
JE   18    24     33    13     3
SC   10     6      7     2     1
What I wish to accomplish is (1) to collapse the rows by cluster membership (indicated by the last column on the right) and (2) to give the new rows labels that combine the labels of the collapsed rows.

Point (1) can be accomplished by:
Code:
aggregate(. ~ clust, data=mydata, sum)
which returns what follows:
Code:
  clust none light medium heavy
1     1   18    11     17     8
2     2   25    10     12     4
3     3   18    24     33    13
I would like to have suggestions about point (2). I would like to get something similar to the above, where instead of (say) 1 I would get the names of the categories belonging to cluster 1 (SM-JM-SC).

Thank you
Best
Gm

#### Dason

Code:
> nms <- tapply(rownames(mydata), mydata$clust, paste, collapse = "-")
> out <- aggregate(. ~ clust, data=mydata, sum)
> out
  clust none light medium heavy
1     1   18    11     17     8
2     2   25    10     12     4
3     3   18    24     33    13
> rownames(out) <- nms
> out
         clust none light medium heavy
SM-JM-SC     1   18    11     17     8
SE           2   25    10     12     4
JE           3   18    24     33    13
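
The assignment above works because `tapply` and `aggregate` both return their groups in the order of the factor levels of `clust`, so the labels and the summed rows line up. As a minimal sketch (not part of the original answer), the labels can instead be looked up by cluster value, which does not depend on that ordering. The data frame is rebuilt here with `data.frame()` so the block runs on its own, and dropping the `clust` column at the end is an optional extra; `out_labelled` is just an illustrative name.

```r
# Example data from the question, rebuilt so this block is self-contained
mydata <- data.frame(
  none   = c(4, 4, 25, 18, 10),
  light  = c(2, 3, 10, 24, 6),
  medium = c(3, 7, 12, 33, 7),
  heavy  = c(2, 4, 4, 13, 2),
  clust  = factor(c(1, 1, 2, 3, 1)),
  row.names = c("SM", "JM", "SE", "JE", "SC")
)

# Join the row labels within each cluster; the result is named by cluster level
nms <- tapply(rownames(mydata), mydata$clust, paste, collapse = "-")

# Sum all other columns within each cluster
out <- aggregate(. ~ clust, data = mydata, FUN = sum)

# Look up each label by its cluster value rather than relying on row order
rownames(out) <- as.character(nms[as.character(out$clust)])

# Optional: drop the cluster column, since the row names now carry that information
out_labelled <- out[, names(out) != "clust"]
print(out_labelled)
```

If the cluster id is still needed for later steps, keep `out` rather than `out_labelled`.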