R data.table: scope of user-defined functions in j

I have a data table like the one below. For each market, I want to compute the correlation of each signal with the return.

dt = data.table(mkt = rep(letters[1:3], each = 3), rtn = rnorm(9), signal1=rnorm(9), signal2=rnorm(9), signal3 = rnorm(9))
   mkt      rtn    signal1     signal2    signal3
1:   a  0.2488643  0.4110516 -0.04861252 -1.3599824
2:   a  1.3387256 -0.4418436 -0.17055841 -1.2161698
3:   a -1.4058236 -1.2624645 -0.24315048 -1.2722546
4:   b  1.7056606  0.2618591  2.60779232  0.7786226
5:   b  0.7913587 -1.0596116  0.31152541  1.7336651
6:   b -1.8690651  0.1942825  0.95430075 -0.7030462
7:   c -0.4937575 -1.8645226 -0.32312077 -1.7138482
8:   c -0.7153342 -0.5142624 -0.43817789 -1.3637261
9:   c  0.3766730 -0.0954339  0.71159756 -1.2118075

dt[, lapply(.SD, function(x) cor(x, rtn, use = 'c')), .SDcols = 3:5, by = mkt]
Error in is.data.frame(y) : object 'rtn' not found

How can I make the anonymous function in j aware of the rtn column?

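I think one way is to include rtn in .SDcols so that the anonymous function can find it, and then drop the resulting rtn column afterwards (it only holds the value 1, because it is the correlation of rtn with itself). You can then do: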

dt2 <- dt[, lapply(.SD, function(x) cor(x, rtn, use = 'c')), .SDcols = c(2, 3:5), by = mkt]
dt2[, rtn := NULL]
dt2
#   mkt    signal1    signal2    signal3
#1:   a  0.6759421 -0.5037837  0.8605805
#2:   b -0.8494135  0.6720274  0.7832928
#3:   c -0.9425291  0.5683629 -0.9976231

I just found out that cor is vectorized, so lapply is not needed. cor(.SD, rtn, use = 'c') is enough, but rtn still has to be in .SDcols.
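As an alternative to adding rtn to .SDcols, the data can be reshaped to long format first, so that rtn and the signal values are both ordinary columns inside j and no anonymous function is needed. This is only a sketch, not part of the original answer: the seed and the names long, res, wide, signal, value and corr are chosen here for illustration.

```r
library(data.table)

set.seed(1)  # assumed seed, only so that the example is reproducible
dt <- data.table(mkt = rep(letters[1:3], each = 3), rtn = rnorm(9),
                 signal1 = rnorm(9), signal2 = rnorm(9), signal3 = rnorm(9))

# Reshape to long format: one row per market, return and signal observation.
long <- melt(dt, id.vars = c("mkt", "rtn"),
             variable.name = "signal", value.name = "value")

# rtn and value are plain columns here, so j can use both directly.
res <- long[, .(corr = cor(value, rtn, use = "complete.obs")),
            by = .(mkt, signal)]

# Back to one row per market and one column per signal.
wide <- dcast(res, mkt ~ signal, value.var = "corr")
wide
```

The result has the same shape as dt2 above (one row per market, one column per signal), and there is no rtn column to delete afterwards.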