Multi-level sorting with dplyr


I have the following data frame:

tdf <- structure(list(GO = c("Cytokine-cytokine receptor interaction", 
"Cytokine-cytokine receptor interaction|Endocytosis", "I-kappaB kinase/NF-kappaB signaling", 
"NF-kappa B signaling pathway", "NF-kappaB import into nucleus", 
"T cell chemotaxis"), PosCount = c(17, 18, 4, 5, 1, 2), shortgo = structure(c(1L, 
1L, 2L, 2L, 2L, 3L), .Label = c("z", "X", "y"), class = "factor")), .Names = c("GO", 
"PosCount", "shortgo"), row.names = c(NA, 6L), class = "data.frame")
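What I want to do is first sort by shortgo alphabetically (case-insensitive), and then sort by PosCount, in descending order, within each shortgo group. The desired result is: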

                                                  GO PosCount shortgo
                       NF-kappa B signaling pathway        5       X
                I-kappaB kinase/NF-kappaB signaling        4       X
                      NF-kappaB import into nucleus        1       X
                                  T cell chemotaxis        2       y
 Cytokine-cytokine receptor interaction|Endocytosis       18       z
             Cytokine-cytokine receptor interaction       17       z
But why doesn't this work:

library(dplyr)
tdf[order(tdf$shortgo),]
tdf <- tdf %>% group_by(shortgo) %>% arrange(desc(PosCount))

What is the right way to do it?

You just need to combine them into a single call, though you first need to convert shortgo to the character class (see the explanation below).
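The code for this answer did not survive in the text, so what follows is only a sketch of what a single call might look like, assuming dplyr's mutate() and arrange(). The data frame is rebuilt with data.frame() so the example is self-contained, and tolower() is added to make the first key case-insensitive as the question asks; the result name sorted is chosen here for illustration.

```r
library(dplyr)

# Same data as in the question, rebuilt so the example runs on its own.
tdf <- data.frame(
  GO = c("Cytokine-cytokine receptor interaction",
         "Cytokine-cytokine receptor interaction|Endocytosis",
         "I-kappaB kinase/NF-kappaB signaling",
         "NF-kappa B signaling pathway",
         "NF-kappaB import into nucleus",
         "T cell chemotaxis"),
  PosCount = c(17, 18, 4, 5, 1, 2),
  shortgo = factor(c("z", "z", "X", "X", "X", "y"),
                   levels = c("z", "X", "y"))
)

sorted <- tdf %>%
  # Convert the factor to character so sorting uses the labels,
  # not the underlying integer codes.
  mutate(shortgo = as.character(shortgo)) %>%
  # Both sort keys go into one arrange() call: case-insensitive shortgo
  # first, then PosCount from largest to smallest.
  arrange(tolower(shortgo), desc(PosCount))

print(sorted)
```

Passing both keys to one arrange() call is what produces the nested ordering; no group_by() is needed for sorting alone.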


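The reason you need to convert to character is that shortgo is a factor, which is basically an integer vector with a levels attribute. As a result, order sorts the vector by those integers. In your case, the integers do not correspond to the alphabetical order of the levels: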

tdf$shortgo
## [1] z z X X X y
## Levels: z X y
as.numeric(tdf$shortgo)
## [1] 1 1 2 2 2 3
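So you can see that z is coded as 1, X is coded as 2, and y is coded as 3, whereas alphabetically they should be 3, 1, and 2 respectively. That is why sort returns the "wrong" result.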
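To make the point about integer codes concrete, here is a minimal sketch that uses a stand-in factor with the same values and level order as tdf$shortgo. The variable names (codes, ord_factor, ord_char) are chosen here for illustration only.

```r
# Stand-in for tdf$shortgo: same values, same level order (z, X, y).
shortgo <- factor(c("z", "z", "X", "X", "X", "y"),
                  levels = c("z", "X", "y"))

# The integer codes that the factor stores internally.
codes <- as.integer(shortgo)

# order() on the factor ranks by those codes, so nothing moves.
ord_factor <- order(shortgo)

# order() on the lowercased labels ranks alphabetically instead.
ord_char <- order(tolower(as.character(shortgo)))

print(codes)       # 1 1 2 2 2 3
print(ord_factor)  # 1 2 3 4 5 6
print(ord_char)    # 3 4 5 6 1 2
```

Because the codes are already ascending, ordering by the factor leaves the rows where they are, which is why the original attempt appeared to do nothing.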

Compare:

test <- factor(sort(as.character(tdf$shortgo)))
sort(test)
## [1] X X X y z z
## Levels: X y z

You can use order from base R:

with(tdf, tdf[order(tolower(shortgo), -PosCount),])

#                                                  GO PosCount shortgo
#4                       NF-kappa B signaling pathway        5       X
#3                I-kappaB kinase/NF-kappaB signaling        4       X
#5                      NF-kappaB import into nucleus        1       X
#6                                  T cell chemotaxis        2       y
#2 Cytokine-cytokine receptor interaction|Endocytosis       18       z
#1             Cytokine-cytokine receptor interaction       17       z

Note that tolower, besides lowercasing the values, also converts the factor to character.
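As a small check of that note, the sketch below applies tolower() to a stand-in factor and then builds the same two-key ordering used in the base R answer. The vectors shortgo and pos_count mirror the question's columns; the names lowered and idx are illustrative.

```r
# Stand-ins for the two columns used as sort keys.
shortgo <- factor(c("z", "z", "X", "X", "X", "y"),
                  levels = c("z", "X", "y"))
pos_count <- c(17, 18, 4, 5, 1, 2)

# tolower() coerces its argument with as.character() when needed,
# so the result is a plain character vector, not a factor.
lowered <- tolower(shortgo)
print(class(lowered))  # "character"

# First key: lowercased label. Second key: negated count, i.e. descending.
idx <- order(lowered, -pos_count)
print(idx)  # 4 3 5 6 2 1
```

The resulting index matches the row names in the output printed above (4, 3, 5, 6, 2, 1).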
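Sorting the factor directly, without converting it, follows the level order rather than the alphabet:

sort(tdf$shortgo)
## [1] z z X X X y
## Levels: z X y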