R: Error when using a list of functions in dcast.data.table
I am trying to reshape data with dcast.data.table, but when I pass a predefined list of functions, dcast.data.table throws an error:
require(data.table)
require(Hmisc)
n <- 2
contributors <- 1:2
dates <- 2
DT <- data.table(ID = rep(rep(1:n, contributors), each = dates))
DT[, contributor := c(1,1,2,2,2,3)]
DT[, date := c(1,2,1,1,2,2)]
DT[, amount := rnorm(.N)]
DT[, rate := c(1,1,1,3,3,4)]
DT
# ID contributor date amount rate
# 1: 1 1 1 -1.3888607 1
# 2: 1 1 2 -0.2787888 1
# 3: 2 2 1 -0.1333213 1
# 4: 2 2 1 0.6359504 3
# 5: 2 2 2 -0.2842529 3
# 6: 2 3 2 -2.6564554 4
var.list <- as.list(Cs(amount, rate))
collapse <- function(x) paste(x, collapse = ',')
fun.list <- list(sum, collapse)
dcast.data.table(data = DT, ID + contributor ~ date,
fun.aggregate = fun.list,
value.var = var.list, fill = NA)
# Error in aggregate_funs(fun.call, lvals, sep, ...) :
# When 'fun.aggregate' and 'value.var' are both lists, 'value.var' must be either of length =1 or =length(fun.aggregate).
If fun.aggregate is defined directly inside the dcast call, there is no problem:
dcast.data.table(data = DT, ID + contributor ~ date,
fun.aggregate = list(sum, collapse),
value.var = var.list, fill = NA)
# ID contributor amount_sum_1 amount_sum_2 rate_collapse_1 rate_collapse_2
# 1: 1 1 -1.3888607 -0.2787888 1 1
# 2: 2 2 0.5026291 -0.2842529 1,3 3
# 3: 2 3 NA -2.6564554 NA 4
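I would like to know why this happens and how to get around the error, so that I can use a predefined list of functions in dcast.data.table.

For what it's worth, you can build the call to dcast manually, using substitute() to pass the user-supplied list literal through to dcast, like this: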
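z = as.data.table(expand.grid(a = LETTERS[1:3], b = 1:3, c = 5:6, d = 3:4, stringsAsFactors = FALSE))[sample(36, 9)]
# substitute() captures the caller's list literal unevaluated,
# so dcast receives the expression list(sum, mean) rather than a variable name.
myfun = function(DT, fmla, funs, vars)
  do.call("dcast", list(DT, fmla, fun.aggregate = substitute(funs), value.var = vars))
myfun(z, a ~ ., list(sum, mean), list('c', 'd'))
# Output from one run (values depend on the random sample):
#    a c_sum   d_mean
# 1: A    24 3.500000
# 2: B    10 3.500000
# 3: C    18 3.333333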
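However, your users (in this example, whoever calls myfun()) must still supply a list literal. There is no way around this, because of dcast's internals: dcast walks the AST of the argument passed to fun.aggregate, and that requires a list literal. It appears this has already been reported.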
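
To make the point about the AST concrete, here is a minimal base-R sketch of what a function sees when it inspects its unevaluated argument. show_arg is a hypothetical helper written only for this illustration; it is not part of data.table and does not reproduce dcast's actual internals. The last part assumes the functions can be referred to by name, and shows one way an expression equivalent to the literal could be assembled from those names.

```r
# Hypothetical helper (not part of data.table): returns the unevaluated
# expression that the caller wrote for the fun.aggregate argument.
show_arg <- function(fun.aggregate) substitute(fun.aggregate)

collapse <- function(x) paste(x, collapse = ",")
fun.list <- list(sum, collapse)

# A list literal arrives as a call with one element per function.
literal <- show_arg(list(sum, collapse))
is.call(literal)        # TRUE
length(literal) - 1L    # 2 (the functions, excluding the head `list`)

# A predefined variable arrives as a bare symbol; the functions are not
# visible in the expression itself.
variable <- show_arg(fun.list)
is.name(variable)       # TRUE
length(variable)        # 1

# Assumption: the functions are known by name. A call equivalent to the
# literal can then be built programmatically.
fun.names <- c("sum", "collapse")
built <- as.call(c(list(as.name("list")), lapply(fun.names, as.name)))
deparse(built)          # "list(sum, collapse)"
```

Code that counts the functions by walking the expression would find two in the first case and only a single symbol in the second, which is consistent with the length complaint in the error message above. An expression like built could be passed to do.call() in place of substitute(funs) in the workaround, on the assumption that dcast only needs to see a list(...) call.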