R: counting values under a specific condition
I am trying to create variables that count occurrences of a value in the preceding rows. So for count_a in the third row, I need to count the number of "a" values in rows 1~3. In the same way, I want to create count_a, count_b, count_c, count_d, count_e (if the unique values of var1 are c(a, b, c, d, e)).
Data:
var1 count_a count_b count_c ...
a 0 0 0
a 1 0 0
b 2 0 0
b 2 1 0
c 2 2 0
a 2 2 1
d 3 2 1
e 3 2 1
Below is the code for the data.
I would like to implement this with data.table, using the setDT(data) function.
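The data code itself is not reproduced here, so the following is only a minimal sketch of how the input could be set up. It assumes the values of var1 are exactly those in the table above; the object name data is taken from the setDT(data) call, and the rest is an assumption.

```r
library(data.table)

# Assumed reconstruction of the example input from the table above
var1 <- c("a", "a", "b", "b", "c", "a", "d", "e")
data <- data.frame(var1 = var1, stringsAsFactors = FALSE)

# setDT() converts the data.frame to a data.table by reference (no copy)
setDT(data)
```

After this, data is a data.table with a single character column var1, and the vector var1 is also available for the snippets that use it directly.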
count_a = cumsum(var1 == "a")
count_a
[1] 1 2 2 2 2 3 3 3
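This implements "for count_a in the third row, I need to count the number of "a" in rows 1~3", but it differs from the example, where each count covers only the rows before the current one.
A solution using cumsum: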
# OPs data
foo <- c("a", "a", "b", "b", "c", "a", "d", "e")
# Use cumsum to get cumulative sum
# Using dummy variable to get first count as 0
sapply(unique(foo), function(x) cumsum(c("dummy", foo) == x))
# a b c d e
# [1,] 0 0 0 0 0
# [2,] 1 0 0 0 0
# [3,] 2 0 0 0 0
# [4,] 2 1 0 0 0
# [5,] 2 2 0 0 0
# [6,] 2 2 1 0 0
# [7,] 3 2 1 0 0
# [8,] 3 2 1 1 0
# [9,] 3 2 1 1 1
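# Use data.table to join everything (as wanted by OP)
# head(..., -1) drops the extra trailing row added by the dummy value,
# so the counts have the same length as foo
library(data.table)
result <- data.table(foo,
                     sapply(unique(foo), function(x) head(cumsum(c("dummy", foo) == x), -1)))
setnames(result, c("var1", paste0("count_", unique(foo))))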
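The same idea can also be written so that the count columns are added to an existing data.table by reference. This is only a sketch of one possible variant, not the answer's own code: the names DT, lv and cols are made up for illustration, and the current row is excluded by subtracting the row's own match from the running total.

```r
library(data.table)

# Example input (same values as foo above)
var1 <- c("a", "a", "b", "b", "c", "a", "d", "e")
DT <- data.table(var1)

# Hypothetical helper names: one count column per distinct value
lv <- unique(DT$var1)
cols <- paste0("count_", lv)

# cumsum(var1 == x) counts matches up to and including the current row;
# subtracting (var1 == x) removes the current row, so only earlier rows count
DT[, (cols) := lapply(lv, function(x) cumsum(var1 == x) - (var1 == x))]

print(DT)
```

Because := modifies DT in place, no separate setnames() step is needed, and the first row gets 0 in every count column without a dummy value.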
Since the OP explicitly asked for a data.table solution, here are two slightly different approaches. Note that these are alternative implementations.
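I also tried another method, but it needs considerable polishing before it gives the expected result, as shown in the last code block below.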
I have var1 and want to create the second, third, ... columns (count_a, count_b, ...).
Please accept the answer that helped solve your problem.
V2 a b c d e
1: a 0 0 0 0 0
2: a 1 0 0 0 0
3: b 2 0 0 0 0
4: b 2 1 0 0 0
5: c 2 2 0 0 0
6: a 2 2 1 0 0
7: d 3 2 1 0 0
8: e 3 2 1 1 0
CJ(unique(var1), var1, sorted = FALSE)[
, cnt := cumsum(V1 == shift(V2, fill = "")), by = rleid(V1)][
, dcast(.SD, rowid(V1) ~ V1)][, V1 := var1][]
V1 a b c d e
1: a 0 0 0 0 0
2: a 1 0 0 0 0
3: b 2 0 0 0 0
4: b 2 1 0 0 0
5: c 2 2 0 0 0
6: a 2 2 1 0 0
7: d 3 2 1 0 0
8: e 3 2 1 1 0
DT <- data.table(var1)
DT[, rn := .I][DT, on = .(rn < rn), by = .EACHI, .SD[, .(N = .N), by = var1]][
, dcast(.SD, rn ~ var1, fill = 0)][DT, on = "rn"]
rn a b c d NA var1
1: 1 0 0 0 0 1 a
2: 2 1 0 0 0 0 a
3: 3 2 0 0 0 0 b
4: 4 2 1 0 0 0 b
5: 5 2 2 0 0 0 c
6: 6 2 2 1 0 0 a
7: 7 3 2 1 0 0 d
8: 8 3 2 1 1 0 e