Sorting a factor in R with sort() or order()


I am trying to sort a data frame by one column. The structure of my data frame is:

'data.frame':    9194 obs. of  7 variables:
 $ taxonomy_y: Factor w/ 51 levels "Alistipes","Alphaproteobacteria",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ otu1id    : Factor w/ 51 levels "_1","_10","_102",..: 12 12 12 12 12 12 12 12 12 12 ...
 $ taxonomy_x: Factor w/ 51 levels "Alistipes","Alphaproteobacteria",..: 45 50 42 24 17 14 2 7 39 44 ...
 $ otu2id    : Factor w/ 51 levels "_1","_10","_102",..: 23 41 26 51 2 10 25 35 42 5 ...
 $ otu2      : chr  "333" "241" "14" "56" ...
 $ otu1      : chr  "16" "119" "90" "16" ...
 $ CONTROL1  : num  0.0897 0.0864 0.2444 0.1818 0.5976 ...
My data frame looks like this:

     taxonomy_y otu1id  taxonomy_x   otu2id otu2 otu1   
 1  Alistipes    _14    Roseburia      _29  333  16   
 2  Alistipes    _14    Turicibacter   _63  241  119 
 3  Alistipes    _14    Parasutterella _37  14   90 
 4  Alistipes    _14    Dorea          _98  56   16 
 5  Alistipes    _14    Clostridium    _10  178  16 
 6  Alistipes    _14    Clostridium S  _12  155  16 
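I tried the sort() and order() functions on the otu1id column, but the rows do not come out in the right order (note the otu1id column).

Why do I get 10 before 2? I need a sort order like _1, _2, _3, _4, .... How can I achieve this? I am using Ubuntu.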

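Since the otu1id column is a factor, you cannot sort it numerically as it is: its levels are ordered alphabetically as strings, so "_10" comes before "_2".

For example, look at the levels of this factor: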

factor(as.character(1:10))
# [1] 1  2  3  4  5  6  7  8  9  10
#Levels: 1 10 2 3 4 5 6 7 8 9
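
To see how this affects sorting, here is a small sketch. The vector ids is made up for illustration and is not part of the question's data. sort() and order() on a factor follow the integer codes of its levels, and by default the levels are created in string order.

```r
# Hypothetical example vector, not taken from the question's data
ids <- factor(c("_2", "_10", "_1"))

levels(ids)              # levels are sorted as strings: "_1" "_10" "_2"
as.integer(ids)          # integer codes behind the factor: 3 2 1
as.character(sort(ids))  # sorting follows the codes: "_1" "_10" "_2"
```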
We can remove the "_" at the start of the string, convert the data to numeric, and order by that:

df[order(as.numeric(sub("_", "", df$otu1id))), ]
#OR
#df[order(as.numeric(sub("\\D", "", df$otu1id))), ]

#   taxonomy_y otu1id     taxonomy_x otu2id otu2 otu1
#1   Alistipes     _1      Roseburia    _29  333   16
#2   Alistipes     _1   Turicibacter    _63  241  119
#3   Alistipes     _1 Parasutterella    _37   14   90
#9   Alistipes     _2    Clostridium    _12  155   16
#4   Alistipes    _10          Dorea    _98   56   16
#5   Alistipes    _10    Clostridium    _10  178   16
#6   Alistipes    _10    Clostridium    _12  155   16
#10  Alistipes    _23   ClostridiumS    _12  155   16
#7   Alistipes   _100    Clostridium    _12  155   16
#8   Alistipes  _1008    Clostridium    _12  155   16
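
An alternative, sketched here under the assumption that every level has the form "_" followed by digits, is to reorder the factor levels once so that sort() and order() then work on the factor itself. The vector ids and the names lv, numeric_levels and ids_num are illustrative only.

```r
# Hypothetical example vector with the same level format as otu1id
ids <- factor(c("_10", "_2", "_1", "_100", "_2"))

# Order the existing levels by their numeric part
lv <- levels(ids)
numeric_levels <- lv[order(as.numeric(sub("_", "", lv)))]

# Rebuild the factor with its levels in numeric order
ids_num <- factor(ids, levels = numeric_levels)

levels(ids_num)              # "_1" "_2" "_10" "_100"
as.character(sort(ids_num))  # "_1" "_2" "_2" "_10" "_100"
```

Applied to the data frame, the same idea would mean re-levelling df$otu1id this way and then calling df[order(df$otu1id), ]. Any level that does not match the assumed format would become NA in the numeric conversion and be placed last.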

If you convert otu1id to character, you can also use mixedorder directly from the gtools package:

df[gtools::mixedorder(as.character(df$otu1id)), ]
Data

df <- structure(list(taxonomy_y = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L), .Label = "Alistipes", class = "factor"), otu1id = structure(c(1L, 
1L, 1L, 2L, 2L, 2L, 3L, 4L, 5L, 6L), .Label = c("_1", "_10", 
"_100", "_1008", "_2", "_23"), class = "factor"), taxonomy_x = structure(c(5L, 
6L, 4L, 3L, 1L, 1L, 1L, 1L, 1L, 2L), .Label = c("Clostridium", 
"ClostridiumS", "Dorea", "Parasutterella", "Roseburia", "Turicibacter"
), class = "factor"), otu2id = structure(c(3L, 5L, 4L, 6L, 1L, 
2L, 2L, 2L, 2L, 2L), .Label = c("_10", "_12", "_29", "_37", "_63", 
"_98"), class = "factor"), otu2 = c(333L, 241L, 14L, 56L, 178L, 
155L, 155L, 155L, 155L, 155L), otu1 = c(16L, 119L, 90L, 16L, 
16L, 16L, 16L, 16L, 16L, 16L)), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10"))

df