R: Saving results exclusively in another data.table


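I am trying to apply the same function with different arguments to a single column and save the results in a separate data.table, without updating or modifying the original table: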

library(data.table)
library(stringdist)  # provides stringsim()
set.seed(43)
dt <- data.table(
         a = sample(c("aaa","bbb","ccc"),15,replace = TRUE),
         year=sample(c("2015","2018"),15,replace=TRUE),
         b = sample(c("o","r","s","c","d","f"),15,replace = TRUE),
         variant=sample(c("osdcf", "osc", "offsco", "osc", "odfsc", "oc"),15,replace = TRUE)
       )
stringsim_methods=c("lv","osa","dl","lcs","jw","qgram")
# Note: := adds each column to dt by reference, so dt1 is the same table as dt
for (x in stringsim_methods) { 
         dt1=dt[,(x):=stringsim("oscdf",variant, method=x),by=.(variant,year)]
         }
Is there a more elegant way to get this result:

   variant year        lv       osa        dl       lcs        jw    qgram
1:   osdcf 2018 0.6000000 0.8000000 0.8000000 0.8000000 0.9333333 1.0000000
2:  offsco 2015 0.3333333 0.3333333 0.3333333 0.5454545 0.6972222 0.7272727
3:   osdcf 2015 0.6000000 0.8000000 0.8000000 0.8000000 0.9333333 1.0000000
4:   odfsc 2015 0.2000000 0.2000000 0.2000000 0.6000000 0.4666667 1.0000000
5:  offsco 2018 0.3333333 0.3333333 0.3333333 0.5454545 0.6972222 0.7272727
6:   odfsc 2018 0.2000000 0.2000000 0.2000000 0.6000000 0.4666667 1.0000000
7:      oc 2015 0.4000000 0.4000000 0.4000000 0.5714286 0.8000000 0.5714286
8:     osc 2018 0.6000000 0.6000000 0.6000000 0.7500000 0.8666667 0.7500000
9:     osc 2015 0.6000000 0.6000000 0.6000000 0.7500000 0.8666667 0.7500000

Thanks.

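Two changes would make it cleaner:

1. In the first step you do not seem to be summarizing anything, so all you need is the unique combinations of the two variables
2. You can replace the for loop with lapply in j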


Assuming your stringsim function looks like this:

stringsim <- function(x,variant,method) paste(method, variant, sep = ":")

If you only want to "select" original or computed columns and store them in a new data.table, there is no need to use :=
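To make the point about := concrete, here is a minimal sketch contrasting three ways of computing a column. The stand-in `stringsim` below is hypothetical and only pastes its arguments together; the toy table and the names `res_select`, `res_copy` and `dt_alias` are likewise assumptions for illustration, not part of the original question.

```r
library(data.table)

# Hypothetical stand-in for the real similarity function (illustration only)
stringsim <- function(a, b, method) paste(method, b, sep = ":")

dt <- data.table(variant = c("osc", "oc", "osc"),
                 year    = c("2015", "2015", "2018"))

# No `:=`: j returns a list, so data.table builds a new table
res_select <- dt[, .(variant, year,
                     lv = stringsim("oscdf", variant, method = "lv"))]

# `:=` on an explicit copy: the copy gets the column, dt does not
res_copy <- copy(dt)[, lv := stringsim("oscdf", variant, method = "lv")]

# Snapshot of the column names; copy() guards against later by-reference changes
cols_before <- copy(names(dt))

# `:=` directly on dt: dt itself gains the column, and dt_alias is just
# another name for the same table, not a separate result
dt_alias <- dt[, lv := stringsim("oscdf", variant, method = "lv")]
```

Under these assumptions, only the last call changes `dt`. The first two leave it with its original `variant` and `year` columns, which is the behaviour the question asks for.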

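Could you add a minimal implementation of the stringsim function to your question? Thanks!

Perhaps
dt[, lapply(stringsim_methods, function(x) stringsim("oscdf", variant, method = x)), by = .(variant, year)]
is not a data.table approach, but it works as requested :-)

Hehe, OK, let's do the unique inside the call. Now it looks a lot like data.table, doesn't it @RYoda?

Nice. What I really like is "injecting" the desired column names via (stringsim_methods) :=, which saves me the second step of calling setnames.
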
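# Dummy stringsim so the example is self-contained
stringsim <- function(x, variant, method) 1
# Step 1: keep only the unique (variant, year) combinations
dt_red <- dt[, unique(.SD), .SDcols = c("variant", "year")]
# Step 2: lapply in j, with the column names supplied on the left of :=
dt_red[, (stringsim_methods) := lapply(stringsim_methods, function(x)
  stringsim("oscdf", variant, method = x)), by = .(variant, year)]
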
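# Dummy stringsim that makes the method/variant pairing visible
stringsim <- function(x, variant, method) paste(method, variant, sep = ":")
# No := here, so dt is left untouched and dt3 is a new table
dt3 <- dt[,
          lapply(stringsim_methods, function(x) stringsim("oscdf", variant, method = x)),
          by = .(variant, year)]
# The lapply result is unnamed (V1, V2, ...), so name the new columns afterwards
data.table::setnames(dt3, 3:length(dt3), stringsim_methods)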
> dt3
   variant year        lv        osa        dl        lcs        jw        qgram
1:   osdcf 2018  lv:osdcf  osa:osdcf  dl:osdcf  lcs:osdcf  jw:osdcf  qgram:osdcf
2:  offsco 2015 lv:offsco osa:offsco dl:offsco lcs:offsco jw:offsco qgram:offsco
3:   osdcf 2015  lv:osdcf  osa:osdcf  dl:osdcf  lcs:osdcf  jw:osdcf  qgram:osdcf
4:   odfsc 2015  lv:odfsc  osa:odfsc  dl:odfsc  lcs:odfsc  jw:odfsc  qgram:odfsc
5:  offsco 2018 lv:offsco osa:offsco dl:offsco lcs:offsco jw:offsco qgram:offsco
6:   odfsc 2018  lv:odfsc  osa:odfsc  dl:odfsc  lcs:odfsc  jw:odfsc  qgram:odfsc
7:      oc 2015     lv:oc     osa:oc     dl:oc     lcs:oc     jw:oc     qgram:oc
8:     osc 2018    lv:osc    osa:osc    dl:osc    lcs:osc    jw:osc    qgram:osc
9:     osc 2015    lv:osc    osa:osc    dl:osc    lcs:osc    jw:osc    qgram:osc
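
As a possible variation on the setnames step, j can return a list that is already named, so the result is a new table with the right column names and no := is involved. The sketch below relies on base R's `sapply(..., simplify = FALSE)`, which names its result after the character vector it iterates over. The stand-in `stringsim`, the toy table and the name `dt_named` are hypothetical, chosen only for illustration.

```r
library(data.table)

# Hypothetical stand-in for the real similarity function (illustration only)
stringsim <- function(a, b, method) paste(method, b, sep = ":")

dt <- data.table(variant = c("osc", "oc", "osc", "osc"),
                 year    = c("2015", "2015", "2018", "2015"))
stringsim_methods <- c("lv", "osa", "jw")

# sapply(..., simplify = FALSE) returns a list named after stringsim_methods,
# so the computed columns arrive already named and dt is not modified
dt_named <- dt[,
               sapply(stringsim_methods,
                      function(m) stringsim("oscdf", variant, method = m),
                      simplify = FALSE),
               by = .(variant, year)]
```

`setNames(lapply(...), stringsim_methods)` in j should behave the same way. Which of the two reads better is mostly a matter of taste.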