R: add up rows when the values in two columns of a df are equal


For the example data frame:

df <- structure(list(
  animal.1 = structure(c(1L, 1L, 2L, 2L, 2L, 4L, 4L, 3L, 1L, 1L),
                       .Label = c("cat", "dog", "horse", "rabbit"), class = "factor"),
  animal.2 = structure(c(1L, 2L, 2L, 2L, 4L, 4L, 1L, 1L, 3L, 1L),
                       .Label = c("cat", "dog", "hamster", "rabbit"), class = "factor"),
  number = c(5L, 3L, 2L, 5L, 1L, 4L, 6L, 7L, 1L, 11L)),
  .Names = c("animal.1", "animal.2", "number"),
  class = "data.frame", row.names = c(NA, -10L))

My df has 400K observations, so it would be great if anyone could recommend something that can cope with a large dataset.
Thanks in advance.
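One option is to use data.table. Convert the 'data.frame' to a 'data.table' with setDT(df). Then, for the rows where 'animal.1' equals 'animal.2', group by the two columns and replace 'number' with the sum of 'number'. Finally, take the unique rows. The factors are compared via as.character because the two columns have different level sets.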
library(data.table)
setDT(df)[as.character(animal.1)==as.character(animal.2), 
               number:=sum(number) ,.(animal.1, animal.2)]
unique(df)
#    animal.1 animal.2 number
#1:      cat      cat     16
#2:      cat      dog      3
#3:      dog      dog      7
#4:      dog   rabbit      1
#5:   rabbit   rabbit      4
#6:   rabbit      cat      6
#7:    horse      cat      7
#8:      cat  hamster      1

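Another option is dplyr. The approach is similar to the data.table one: group by 'animal.1' and 'animal.2', replace 'number' with the sum only where 'animal.1' equals 'animal.2', and then take the unique rows.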
library(dplyr)
df %>%
  group_by(animal.1, animal.2) %>%
  mutate(number = replace(number,
                          as.character(animal.1) == as.character(animal.2),
                          sum(number))) %>%
  unique()
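
For readers without either package, the same logic can be sketched in base R. This is not part of the original answer, only an illustration under a few assumptions: the example data is rebuilt with plain character columns (so no as.character conversion is needed), and the helper names same, group_sum and res are made up for this sketch. It has not been benchmarked on 400K rows.

```r
# Same example data as in the question, rebuilt with character columns
# (an assumption made for this sketch; the original uses factors)
df <- data.frame(
  animal.1 = c("cat", "cat", "dog", "dog", "dog",
               "rabbit", "rabbit", "horse", "cat", "cat"),
  animal.2 = c("cat", "dog", "dog", "dog", "rabbit",
               "rabbit", "cat", "cat", "hamster", "cat"),
  number   = c(5L, 3L, 2L, 5L, 1L, 4L, 6L, 7L, 1L, 11L),
  stringsAsFactors = FALSE
)

# TRUE for rows where both animal columns hold the same value
same <- df$animal.1 == df$animal.2

# For every row, the total of 'number' within its (animal.1, animal.2) pair
group_sum <- ave(df$number, df$animal.1, df$animal.2, FUN = sum)

# Overwrite 'number' with the group total only on the matching rows
res <- df
res$number[same] <- group_sum[same]

# The matching rows are now identical within each pair, so drop the repeats
res <- unique(res)
print(res)
```

On the example data this leaves 8 rows, with cat/cat at 16 and dog/dog at 7, which agrees with the data.table output shown above.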

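Why not simply df %>% group_by(animal.1, animal.2) %>% summarise(number = sum(number))?
@StevenBeaupré This is the same thing I discussed with David Arenburg. That solution works as long as there are no duplicated rows in which the animals in the two columns differ.
@StevenBeaupré Create a new row with cat dog 4, then try it with that approach and with the approach in the post.
It's clear now. Sorry for making you repeat yourself. +1
@StevenBeaupré Thanks for the comment. However, I could also be wrong (about how I interpreted the OP's description), since two smart people suggested the same thing.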
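
The difference discussed in the comments can be reproduced with a small sketch. It uses base R as a stand-in: aggregate() plays the role of the suggested group_by/summarise pipeline, and ave() plus unique() plays the role of the conditional approach from the answer. The extra cat/dog row with number 4 is the one proposed in the comments. The names df2, all_groups and conditional are hypothetical, and the data is rebuilt with character columns for simplicity.

```r
# Example data from the question, with character columns (assumption)
df <- data.frame(
  animal.1 = c("cat", "cat", "dog", "dog", "dog",
               "rabbit", "rabbit", "horse", "cat", "cat"),
  animal.2 = c("cat", "dog", "dog", "dog", "rabbit",
               "rabbit", "cat", "cat", "hamster", "cat"),
  number   = c(5L, 3L, 2L, 5L, 1L, 4L, 6L, 7L, 1L, 11L),
  stringsAsFactors = FALSE
)

# Add the extra row suggested in the comments: a second cat/dog pair
df2 <- rbind(df, data.frame(animal.1 = "cat", animal.2 = "dog",
                            number = 4L, stringsAsFactors = FALSE))

# Approach from the comments: sum every (animal.1, animal.2) group
all_groups <- aggregate(number ~ animal.1 + animal.2, data = df2, FUN = sum)

# Approach from the answer: sum only where the two animals are equal
same <- df2$animal.1 == df2$animal.2
conditional <- df2
conditional$number[same] <-
  ave(df2$number, df2$animal.1, df2$animal.2, FUN = sum)[same]
conditional <- unique(conditional)

# cat/dog is collapsed to a single row in all_groups,
# but stays as two separate rows in conditional
print(all_groups[all_groups$animal.1 == "cat" & all_groups$animal.2 == "dog", ])
print(conditional[conditional$animal.1 == "cat" & conditional$animal.2 == "dog", ])
```

With the extra row, summing every group merges the two cat/dog rows into one row with number 7, while the conditional approach keeps them as 3 and 4. Which behaviour is wanted depends on how the question is read, which is exactly the uncertainty voiced in the comments.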