dplyr group by RunID: carry a value forward to the next group
I have data that I want to group, run a calculation on, and then take the final result and use it in the calculation for the next group. We use a conditional statement and perform the calculation by group, for example:
# Example Data
condition <- c(0,0,0,1,1,1,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,1,1,0)
col_a <- c(0,0,0,2,3,4,0,0,0,2,4,5,6,0,0,0,0,0,0,0,0,1,2,0)
col_b <- c(0,0,0,10,131,14,0,0,0,22,64,75,96,0,0,0,0,0,0,0,0,41,52,0)
df <- data.frame(condition,col_a,col_b)
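What I want to do: for the first group, the result in the calculation column is 28. I want to carry that value over to the next group and insert it into col_a at row 10 (28 in place of the 2). Then, with that value updated, the second group's calculation becomes 96*28=2688 instead of 96*2=192.
The carried-over value is always inserted into the first row of each group, as in the example above.
Expected output: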
condition col_a col_b calculation
1 0 0 0 0
2 0 0 0 0
3 0 0 0 0
4 1 2 10 0
5 1 3 131 0
6 1 4 14 28
7 0 0 0 0
8 0 0 0 0
9 0 0 0 0
10 1 28 22 0
11 1 4 64 0
12 1 5 75 0
13 1 6 96 2688
14 0 0 0 0
15 0 0 0 0
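The arithmetic behind the expected output can be checked by hand. The sketch below recomputes both groups directly from the example vectors; the row ranges 4:6 and 10:13 are read off the table above, and the names g1, g2, calc1, calc2_plain and calc2_carried are illustrative only.

```r
col_a <- c(0,0,0,2,3,4,0,0,0,2,4,5,6,0,0,0,0,0,0,0,0,1,2,0)
col_b <- c(0,0,0,10,131,14,0,0,0,22,64,75,96,0,0,0,0,0,0,0,0,41,52,0)

g1 <- 4:6     # rows of the first condition == 1 group
g2 <- 10:13   # rows of the second condition == 1 group

# Group 1: first col_a times last col_b
calc1 <- col_a[g1[1]] * col_b[g1[length(g1)]]          # 2 * 14

# Group 2 without the carry-over: uses its own first col_a
calc2_plain <- col_a[g2[1]] * col_b[g2[length(g2)]]    # 2 * 96

# Group 2 with the carry-over: group 1's result replaces the first col_a
calc2_carried <- calc1 * col_b[g2[length(g2)]]         # 28 * 96

c(calc1, calc2_plain, calc2_carried)
```

The three values correspond to the 28, 192 and 2688 quoted in the question.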
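Another solution:
I remove all the 0s, tag the bottom row of each consecutive run with an identifier of 2, and then use a for loop to grab and replace the values. Probably not the most elegant, but it seems to work: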
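library(dplyr)  # for full_join()

# `output` is assumed to be the data frame that already holds the
# calculation, last.tag and index columns (it is not defined in this post)
# Subset to remove all 0
subset.no.zero <- subset(output, condition > 0)
# Start from the original col_a, then overwrite the rows that follow a tagged row
subset.no.zero$new_col_a <- subset.no.zero$col_a
# Loop to move values; start at row 2 because row 1 has no previous row
for (i in seq_len(nrow(subset.no.zero))[-1]) {
  if (isTRUE(subset.no.zero$last.tag[i - 1] == 2)) {
    subset.no.zero$new_col_a[i] <- subset.no.zero$calculation[i - 1]
  }
}
# Re join by index no.
final_out <- full_join(output, subset.no.zero, by = "index")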
I can only offer a data.table solution, but perhaps you can translate the logic to dplyr:
library(data.table)
setDT(df)

# first group: multiply 2 and 14
df[rleid(condition) %in% 1:2 & condition != 0,
   calculation := {
     res <- rep(NA_real_, .N)
     res[.N] <- col_b[.N] * col_a[1]
     res
   }
]

# all groups other than the first: copy col_b
df[, calculation := if (condition[.N] != 0) {
       if (is.na(calculation[.N])) {
         res <- rep(NA_real_, .N)
         res[.N] <- col_b[.N]
         res
       } else calculation
     } else NA_real_,
   by = rleid(condition)
]

# cumulative product
df[!is.na(calculation),
   calculation := cumprod(calculation)]

# copy values into col_a
df[i = df[, .(condition = condition[1], i = .I[1]),
          by = rleid(condition)][condition == 1L][-1, i], # finds rows to replace values
   col_a := head(df[!is.na(calculation), calculation], -1)
]
# condition col_a col_b calculation
# 1: 0 0 0 NA
# 2: 0 0 0 NA
# 3: 0 0 0 NA
# 4: 1 2 10 NA
# 5: 1 3 131 NA
# 6: 1 4 14 28
# 7: 0 0 0 NA
# 8: 0 0 0 NA
# 9: 0 0 0 NA
#10: 1 28 22 NA
#11: 1 4 64 NA
#12: 1 5 75 NA
#13: 1 6 96 2688
#14: 0 0 0 NA
#15: 0 0 0 NA
#16: 0 0 0 NA
#17: 0 0 0 NA
#18: 0 0 0 NA
#19: 0 0 0 NA
#20: 0 0 0 NA
#21: 0 0 0 NA
#22: 1 2688 41 NA
#23: 1 2 52 139776
#24: 0 0 0 NA
# condition col_a col_b calculation
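For readers who want to stay in dplyr, the same idea can be sketched as follows. This is not the answerer's code: it is a minimal translation that assumes, as the cumulative product above does, that each group's result equals the first group's starting col_a multiplied by the running product of every group's last col_b. The helper columns run, is_first, is_last, cum_b and seed are names made up for this sketch.

```r
library(dplyr)

condition <- c(0,0,0,1,1,1,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,1,1,0)
col_a <- c(0,0,0,2,3,4,0,0,0,2,4,5,6,0,0,0,0,0,0,0,0,1,2,0)
col_b <- c(0,0,0,10,131,14,0,0,0,22,64,75,96,0,0,0,0,0,0,0,0,41,52,0)
df <- data.frame(condition, col_a, col_b)

res <- df %>%
  # run id: increases by one every time condition changes
  mutate(run = cumsum(condition != lag(condition, default = first(condition)))) %>%
  group_by(run) %>%
  mutate(
    is_first = condition == 1 & row_number() == 1,   # first row of a condition == 1 run
    is_last  = condition == 1 & row_number() == n()  # last row of a condition == 1 run
  ) %>%
  ungroup() %>%
  mutate(
    # running product of the last col_b of every group seen so far
    cum_b = cumprod(if_else(is_last, col_b, 1)),
    # seed: col_a in the first row of the first condition == 1 group
    seed = first(col_a[is_first]),
    # result goes in the last row of each group, NA elsewhere
    calculation = if_else(is_last, seed * cum_b, NA_real_),
    # carry the previous group's result into the first row of each group
    col_a = if_else(is_first, seed * lag(cum_b, default = 1), col_a)
  ) %>%
  select(condition, col_a, col_b, calculation)

res
```

Replacing NA_real_ with 0 in the calculation line gives the layout shown in the question's expected output instead of the NA layout printed above.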
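Can you show your expected output? How did you get 28? When I run your code, I get 20 (row 6) and 44 (row 13).
I'm not sure; I re-ran it and got the same result as in the example. The statement is first(col_a) * last(col_b), so it should be right?
Oh, you have a mistake. You have first(col_a) * first(col_b) instead of … * last(…).
Ouch, let me fix that!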
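The difference discussed in these comments is easy to reproduce. The sketch below uses only rows 4 to 6 of the example data and dplyr's first() and last(); the data frame g and the column names first_first and first_last are illustrative.

```r
library(dplyr)

# Rows 4 to 6 of the example data (the first condition == 1 group)
g <- data.frame(col_a = c(2, 3, 4), col_b = c(10, 131, 14))

out <- summarise(
  g,
  first_first = first(col_a) * first(col_b),  # 2 * 10, the mistaken version
  first_last  = first(col_a) * last(col_b)    # 2 * 14, the intended version
)
out
```

The first column reproduces the 20 reported by the commenter, the second the 28 from the question.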