dataframe: proportion of the occurrence of a given ordering

gianmarco

TS Contributor
#1
Hello!
Suppose I have a dataframe like the one below:
Code:
  Geology Biochemistry Chemistry Zoology Physics Engineering
1       6            1         2       5       4           3
2       3            4         6       5       1           2
3       6            2         3       4       5           1
The leftmost number is the row number.

This is just a short example of a larger dataframe (say, with 1000 rows) containing the relative position of the column categories as iteratively calculated by an ordination method.

I am wondering how I can calculate how often (i.e., in what proportion of rows) each ordering of categories occurs in the whole dataframe.
In other words, given a dataframe containing, say, 1000 rows, I would like to end up with something like the following:

-order 4,3,5,6,2,1 > 83%
-order 6,1,3,2,4,5 > 60%
etc etc etc

Any hint?
Thanks
Gm
 
#2
Can't you just join them with "paste0" and compute the frequency with "table"?

Code:
a1 <- c(1,1,3)
a2 <- c(2,2,2)
a3 <- c(3,3,1)

join <- paste0(a1,a2,a3)

join
table(join)
But I had difficulty in understanding your text so I am not sure if I got it right. :)
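
One caveat, in case the real data contains values of 10 or more: pasting without a separator can make two different rows look identical. A minimal sketch of the problem, using made-up values rather than anything from the thread's data:

```r
# Without a separator, two different rows can produce the same string.
paste0(1, 10)   # "110"
paste0(11, 0)   # "110"

# With a separator, the two rows stay distinct.
key_a <- paste(1, 10, sep = ",")   # "1,10"
key_b <- paste(11, 0, sep = ",")   # "11,0"
key_a == key_b                     # FALSE
```

For single-digit values, as in the a1/a2/a3 example, paste0 is fine as it is.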
 

gianmarco

TS Contributor
#3
Thanks Greta.
Given the fictional dataframe I posted, how can I use paste0 row-wise?

I tried, but I end up with, e.g., 6,3,6 instead of 6,1,2,5,4,3.

Cheers
Gm

EDIT
My issue is somewhat similar to this one from another forum, but I do not understand how to apply the solution to my case.
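
A possible explanation for the 6,3,6 result, offered only as a guess since the exact call is not shown: 6,3,6 is the Geology column of the three-row example, which is what you get when a single column is collapsed instead of each row. The sketch below rebuilds that example by hand (the names df, by_column and by_row are made up for illustration) and contrasts the two:

```r
# The three-row example from the first post, rebuilt by hand.
df <- data.frame(
  Geology = c(6, 3, 6), Biochemistry = c(1, 4, 2), Chemistry = c(2, 6, 3),
  Zoology = c(5, 5, 4), Physics = c(4, 1, 5), Engineering = c(3, 2, 1)
)

# Collapsing one column walks down the rows of that column.
by_column <- paste(df$Geology, collapse = ",")

# Collapsing within each row (MARGIN = 1) gives one string per row.
by_row <- unname(apply(df, 1, paste, collapse = ","))

by_column
by_row
```

Here by_column is the single string 6,3,6, while by_row holds one string per row, starting with 6,1,2,5,4,3.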
 
#4
Thanks Greta.
Given the fictional dataframe I posted, how can I use paste0 row-wise?
It does paste row-wise. Look at datf below.

Code:
a1 <- c(1,1,3)
a2 <- c(2,2,2)
a3 <- c(3,3,1)

datf <- data.frame(a1,a2,a3) 
datf

join <- paste0(a1,a2,a3)
join

table(join)
Otherwise, please post your data as a reproducible example.
 

gianmarco

TS Contributor
#5
Otherwise make your data as a reproducible example.
Code:
mydata <-structure(list(Geology = c(6, 5, 4, 4, 1, 2, 2, 3, 2, 5, 7, 3, 
5, 1, 1, 7, 1, 3, 2, 3, 1, 4, 7, 1, 3, 2, 7, 6, 3, 6), Biochemistry = c(9, 
9, 10, 8, 10, 1, 3, 1, 9, 10, 1, 8, 9, 2, 10, 8, 5, 10, 10, 9, 
7, 1, 6, 6, 4, 10, 1, 7, 1, 8), Chemistry = c(5, 3, 7, 7, 2, 
4, 4, 9, 7, 6, 8, 1, 7, 3, 2, 6, 4, 9, 6, 1, 5, 5, 3, 4, 5, 4, 
8, 8, 6, 4), Zoology = c(1, 8, 9, 10, 7, 5, 8, 10, 5, 7, 10, 
6, 8, 7, 6, 9, 3, 1, 4, 5, 4, 6, 2, 8, 10, 5, 4, 5, 7, 3), Physics = c(10, 
10, 3, 3, 4, 6, 5, 2, 3, 8, 5, 2, 1, 5, 7, 3, 9, 5, 8, 4, 3, 
10, 1, 2, 6, 6, 6, 3, 5, 2), Engineering = c(3, 6, 8, 2, 6, 3, 
9, 5, 8, 4, 2, 5, 6, 8, 8, 4, 2, 8, 1, 10, 6, 3, 8, 5, 9, 3, 
3, 10, 8, 1), Microbiology = c(4, 4, 2, 9, 8, 7, 10, 4, 10, 1, 
4, 9, 2, 4, 9, 10, 7, 2, 3, 8, 10, 2, 10, 3, 2, 9, 10, 4, 9, 
10), Botany = c(2, 2, 6, 6, 5, 9, 6, 7, 4, 2, 9, 4, 3, 9, 5, 
5, 8, 7, 7, 6, 2, 9, 4, 10, 7, 8, 9, 1, 4, 7), Statistics = c(8, 
7, 5, 1, 9, 10, 1, 8, 1, 3, 6, 10, 10, 10, 3, 1, 10, 4, 9, 7, 
8, 8, 5, 7, 1, 7, 2, 2, 2, 9), Mathematics = c(7, 1, 1, 5, 3, 
8, 7, 6, 6, 9, 3, 7, 4, 6, 4, 2, 6, 6, 5, 2, 9, 7, 9, 9, 8, 1, 
5, 9, 10, 5)), .Names = c("Geology", "Biochemistry", "Chemistry", 
"Zoology", "Physics", "Engineering", "Microbiology", "Botany", 
"Statistics", "Mathematics"), row.names = c(NA, -30L), class = "data.frame")
 
#6
Well....


Code:
# Paste every column row-wise; sep = "," keeps two-digit values such as 10 unambiguous
join2 <- do.call(paste, c(mydata, sep = ","))

join2
table(join2)
I fear that I don't understand what you want.... good luck! :)
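
If the goal is the percentage of rows showing each ordering, as the first post describes, the paste-and-table idea can be extended with prop.table. This is only a sketch under that reading of the question; the data frame toy and the names ordering and props are hypothetical and not taken from mydata:

```r
# Hypothetical toy data: each row is one ordering of three categories.
toy <- data.frame(
  Geology   = c(1, 1, 3, 1),
  Chemistry = c(2, 2, 2, 3),
  Physics   = c(3, 3, 1, 2)
)

# One comma-separated key per row, built from all columns at once.
ordering <- do.call(paste, c(toy, sep = ","))

# Counts per distinct ordering, turned into proportions, most frequent first.
props <- sort(prop.table(table(ordering)), decreasing = TRUE)

# Show the proportions as percentages.
round(100 * props, 1)
```

In the toy data the ordering 1,2,3 appears in two of the four rows, so it comes first at 50%. The same three steps should carry over to a larger dataframe by replacing toy with it.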