Aggregate rows in a data.frame by an assignment column
I have an example data.frame:
set.seed(1)
df <- data.frame(id = letters[1:10], a = sample(100, 10), b = sample(100, 10),
                 aggregate_with = c(rep(NA, 6), "y", "b", "b", "e"),
                 aggregate_order = c(rep(NA, 6), "a,b", "a,b", "b,a", "a,b"))
> df
   id  a  b aggregate_with aggregate_order
1   a 27 21           <NA>            <NA>
2   b 37 18           <NA>            <NA>
3   c 57 68           <NA>            <NA>
4   d 89 38           <NA>            <NA>
5   e 20 74           <NA>            <NA>
6   f 86 48           <NA>            <NA>
7   g 97 98              y             a,b
8   h 62 93              b             a,b
9   i 58 35              b             b,a
10  j  6 71              e             a,b
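The printed values may not reproduce on newer R versions, because the default sampling algorithm behind sample() changed in R 3.6.0. As a convenience, here is a sketch that builds the same data frame from the literal values printed above. The stringsAsFactors = FALSE argument is an assumption added here so that the text columns are plain character vectors on any R version.

```r
# Literal copy of the data frame printed above, so the example does not
# depend on the random number generator or the R version.
df <- data.frame(
  id = letters[1:10],
  a = c(27L, 37L, 57L, 89L, 20L, 86L, 97L, 62L, 58L, 6L),
  b = c(21L, 18L, 68L, 38L, 74L, 48L, 98L, 93L, 35L, 71L),
  aggregate_with = c(rep(NA, 6), "y", "b", "b", "e"),
  aggregate_order = c(rep(NA, 6), "a,b", "a,b", "b,a", "a,b"),
  stringsAsFactors = FALSE  # assumption: keep text columns as character
)
```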
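I want to fold each row that has an aggregate_with value into the row whose id equals that value, with aggregate_order giving the order in which its a and b columns are added to the target's a and b. In the desired result, column a of row 2 is the sum of columns a, a, and b of rows 2, 8, and 9 in df, respectively, and vice versa for column b. Row 5 of the result sums columns a and b of rows 5 and 10 in df. Although row 7 in df has an aggregate_with value, that value does not exist as an id in df, so the row is not aggregated.

I'm currently using a for loop, but I think there is a more elegant solution.

You should edit the question to include what you have, so that people don't spend effort getting to where you already are.

I'm using the data.table library:
library(data.table)
dt <- as.data.table(df)
# lookup table whose id is the target row (aggregate_with)
dt2 <- dt[, list(id = aggregate_with, a, b, aggregate_order)]
# swap a and b where aggregate_order is 'b,a'
dt2[, c('a', 'b') := list(ifelse(aggregate_order == 'a,b', a, b), ifelse(aggregate_order == 'a,b', b, a))]
setkey(dt2, id)
# join: for each row of dt, attach the rows of dt2 that aggregate into it
res <- dt2[dt]
# replace NAs (rows with nothing to add) with 0 before summing
for (j in c('a', 'b')) set(res, which(is.na(res[[j]])), j, 0)
# drop rows that were folded into another row, then add the joined values
# to each remaining row's own a and b (i.a and i.b)
res[!aggregate_with %in% id, list(a = sum(a) + i.a[1], b = sum(b) + i.b[1]), by = id]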
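For comparison, the aggregation rule described in the question can also be written as a plain base R loop. This is only a sketch, not the asker's original loop (which is not shown): the helper name aggregate_rows is hypothetical, the data frame is rebuilt from the printed values, and it assumes that no target row is itself folded into another row.

```r
# Data frame rebuilt from the values printed in the question.
df <- data.frame(
  id = letters[1:10],
  a = c(27L, 37L, 57L, 89L, 20L, 86L, 97L, 62L, 58L, 6L),
  b = c(21L, 18L, 68L, 38L, 74L, 48L, 98L, 93L, 35L, 71L),
  aggregate_with = c(rep(NA, 6), "y", "b", "b", "e"),
  aggregate_order = c(rep(NA, 6), "a,b", "a,b", "b,a", "a,b"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: fold each row into the row named by its
# aggregate_with value, honoring aggregate_order.
aggregate_rows <- function(df) {
  res <- df[, c("id", "a", "b")]
  # Only rows whose aggregate_with matches an existing id are folded.
  folded <- !is.na(df$aggregate_with) & df$aggregate_with %in% df$id
  for (i in which(folded)) {
    target <- match(df$aggregate_with[i], df$id)
    # "a,b" adds a to a and b to b; "b,a" swaps the two source columns.
    src <- strsplit(df$aggregate_order[i], ",", fixed = TRUE)[[1]]
    res$a[target] <- res$a[target] + df[[src[1]]][i]
    res$b[target] <- res$b[target] + df[[src[2]]][i]
  }
  # Drop the rows that were folded into another row.
  res[!folded, ]
}

out <- aggregate_rows(df)
```

On the example data, row b of out should hold a = 37 + 62 + 35 = 134 and b = 18 + 93 + 58 = 169, and row g should be left unchanged because its aggregate_with value y is not an id in df.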