Sorting and grouping in R

I have data like this in a text file:

fd50c4007b68a3737fe052d5a4f78ce8aa117f3d    SOEGIYH12A6D4FC0E3  1
fd50c4007b68a3737fe052d5a4f78ce8aa117f3d    SOFLJQZ12A6D4FADA6  1
fd50c4007b68a3737fe052d5a4f78ce8aa117f3d    SOHTKMO12AB01843B0  1
fd50c4007b68a3737fe052d5a4f78ce8aa117f3d    SODQZCY12A6D4F9D11  1
fd50c4007b68a3737fe052d5a4f78ce8aa117f3d    SOXLOQG12AF72A2D55  1
d7083f5e1d50c264277d624340edaaf3dc16095b    SOUVUHC12A67020E3B  1
d7083f5e1d50c264277d624340edaaf3dc16095b    SOUQERE12A58A75633  1
d7083f5e1d50c264277d624340edaaf3dc16095b    SOIPJAX12A8C141A2D  1
d7083f5e1d50c264277d624340edaaf3dc16095b    SOEFCDJ12AB0185FA0  2
d7083f5e1d50c264277d624340edaaf3dc16095b    SOATCSU12A8C13393A  2
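
I managed to read it into a variable, but:

  • I need to sort this data by the third field.
  • I need to sort the data by the first field, group it by that same field, and then sum the third field within each group.
  • Is this possible in R?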

The output should be:

    fd50c4007b68a3737fe052d5a4f78ce8aa117f3d 5
    d7083f5e1d50c264277d624340edaaf3dc16095b 7
    
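As you say in your question, you have two problems:

  • Calculating the sum of one variable within the groups of another
  • Sorting the data

The first problem can be solved using the plyr package: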

    ##Some dummy data
    library(plyr)
    dd = data.frame(V1 = rep(c("A", "A", "B"), 4), V2 = rep(1:3,each=2 ))
    
    ##The function ddply takes in a data frame dd
    ##Splits the data frame by column V1
    ##Sums the column V2
    dd1 = ddply(dd, "V1", summarise,  V2 = sum(V2))
    
The second problem can be solved by searching for “”.


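Q1: Sorting a data frame by one column is usually done with order(). You do have to name the data frame again inside the call to order(), which may seem redundant to new users. But numeric indexing is very flexible, and numeric vectors built in all sorts of ways can yield useful results, so a specific vector object is needed: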

    > dat[ order(dat$V1), ]
                                             V1                 V2 V3
    6  d7083f5e1d50c264277d624340edaaf3dc16095b SOUVUHC12A67020E3B  1
    7  d7083f5e1d50c264277d624340edaaf3dc16095b SOUQERE12A58A75633  1
    8  d7083f5e1d50c264277d624340edaaf3dc16095b SOIPJAX12A8C141A2D  1
    9  d7083f5e1d50c264277d624340edaaf3dc16095b SOEFCDJ12AB0185FA0  2
    10 d7083f5e1d50c264277d624340edaaf3dc16095b SOATCSU12A8C13393A  2
    1  fd50c4007b68a3737fe052d5a4f78ce8aa117f3d SOEGIYH12A6D4FC0E3  1
    2  fd50c4007b68a3737fe052d5a4f78ce8aa117f3d SOFLJQZ12A6D4FADA6  1
    3  fd50c4007b68a3737fe052d5a4f78ce8aa117f3d SOHTKMO12AB01843B0  1
    4  fd50c4007b68a3737fe052d5a4f78ce8aa117f3d SODQZCY12A6D4F9D11  1
    5  fd50c4007b68a3737fe052d5a4f78ce8aa117f3d SOXLOQG12AF72A2D55  1
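
The question also asks for a sort on the third field. The same order() pattern should cover that, as well as sorting on more than one key. The sketch below is only an illustration: it rebuilds the question's data with read.table(), which is an assumption about how dat was created (it explains the default column names V1, V2, V3), and the names by_count and by_user_then_count are made up for the example.

```r
# Rebuild the question's data. read.table() assigns the default names V1, V2, V3.
# With a real file one would pass the file path instead of text = ...
dat <- read.table(text = "
fd50c4007b68a3737fe052d5a4f78ce8aa117f3d SOEGIYH12A6D4FC0E3 1
fd50c4007b68a3737fe052d5a4f78ce8aa117f3d SOFLJQZ12A6D4FADA6 1
fd50c4007b68a3737fe052d5a4f78ce8aa117f3d SOHTKMO12AB01843B0 1
fd50c4007b68a3737fe052d5a4f78ce8aa117f3d SODQZCY12A6D4F9D11 1
fd50c4007b68a3737fe052d5a4f78ce8aa117f3d SOXLOQG12AF72A2D55 1
d7083f5e1d50c264277d624340edaaf3dc16095b SOUVUHC12A67020E3B 1
d7083f5e1d50c264277d624340edaaf3dc16095b SOUQERE12A58A75633 1
d7083f5e1d50c264277d624340edaaf3dc16095b SOIPJAX12A8C141A2D 1
d7083f5e1d50c264277d624340edaaf3dc16095b SOEFCDJ12AB0185FA0 2
d7083f5e1d50c264277d624340edaaf3dc16095b SOATCSU12A8C13393A 2
", stringsAsFactors = FALSE)

# Sort by the third field, ascending (ties keep their original order)
by_count <- dat[order(dat$V3), ]

# Sort by the first field, then by the third field in descending order
by_user_then_count <- dat[order(dat$V1, -dat$V3), ]
```

Negating a numeric key is a common way to get descending order for that key only; order(..., decreasing = TRUE) would reverse every key at once.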
    
Q2: To sum a vector within categories and return a data frame, use aggregate():

    > aggregate(V3 ~ V1, data = dat, FUN = sum)
                                            V1 V3
    6 d7083f5e1d50c264277d624340edaaf3dc16095b  7
    1 fd50c4007b68a3737fe052d5a4f78ce8aa117f3d  5
    
If ordering is needed:

    > dat2 <- aggregate(V3 ~ V1, data = dat, FUN = sum)
    > dat2[order(dat2$V1), ]
                                            V1 V3
    6 d7083f5e1d50c264277d624340edaaf3dc16095b  7
    1 fd50c4007b68a3737fe052d5a4f78ce8aa117f3d  5
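
If the groups should be ordered by their totals rather than by the grouping key, the aggregated result can be indexed with order() in the same way. The following is a minimal sketch in base R; the compact data frame is a stand-in for the question's data (the second column is omitted), and the names sums, sums_desc and totals are invented for the example.

```r
# Stand-in for the question's data: two users with five rows each
dat <- data.frame(
  V1 = rep(c("fd50c4007b68a3737fe052d5a4f78ce8aa117f3d",
             "d7083f5e1d50c264277d624340edaaf3dc16095b"), each = 5),
  V3 = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2),
  stringsAsFactors = FALSE
)

# Sum V3 within each V1 group; the summary function is passed explicitly as FUN
sums <- aggregate(V3 ~ V1, data = dat, FUN = sum)

# Order the groups by their total, largest first
sums_desc <- sums[order(-sums$V3), ]

# tapply() computes the same totals as a vector named by group
totals <- tapply(dat$V3, dat$V1, sum)
```

aggregate() returns a data frame, which is convenient for further sorting or merging, while tapply() is a lighter option when only the named totals are needed.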
    