R: save the results exclusively in another data.table
I am trying to apply the same function with different arguments to a single column and save the results in a separate data.table, without updating or modifying the original table:
library(data.table)
set.seed(43)
dt <- data.table(
a = sample(c("aaa","bbb","ccc"),15,replace = T),
year=sample(c("2015","2018"),15,replace=T),
b = sample(c("o","r","s","c","d","f"),15,replace = T),
variant=sample(c("osdcf", "osc", "offsco", "osc", "odfsc", "oc"),15,replace = T)
)
stringsim_methods=c("lv","osa","dl","lcs","jw","qgram")
for (x in stringsim_methods) {
dt1=dt[,(x):=stringsim("oscdf",variant, method=x),by=.(variant,year)]
}
Is there a more elegant way to achieve this:
variant year lv osa dl lcs jw qgram
1: osdcf 2018 0.6000000 0.8000000 0.8000000 0.8000000 0.9333333 1.0000000
2: offsco 2015 0.3333333 0.3333333 0.3333333 0.5454545 0.6972222 0.7272727
3: osdcf 2015 0.6000000 0.8000000 0.8000000 0.8000000 0.9333333 1.0000000
4: odfsc 2015 0.2000000 0.2000000 0.2000000 0.6000000 0.4666667 1.0000000
5: offsco 2018 0.3333333 0.3333333 0.3333333 0.5454545 0.6972222 0.7272727
6: odfsc 2018 0.2000000 0.2000000 0.2000000 0.6000000 0.4666667 1.0000000
7: oc 2015 0.4000000 0.4000000 0.4000000 0.5714286 0.8000000 0.5714286
8: osc 2018 0.6000000 0.6000000 0.6000000 0.7500000 0.8666667 0.7500000
9: osc 2015 0.6000000 0.6000000 0.6000000 0.7500000 0.8666667 0.7500000
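Thanks.
Two changes would make it cleaner:
1. In the first step you don't seem to be summarizing anything, so you only need the unique combinations of the two variables.
2. You can replace the for loop with lapply in j.
Assuming your stringsim function is as follows: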
stringsim <- function(x,variant,method) paste(method, variant, sep = ":")
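If you only want to "select" original or computed columns and store them in a new data.table, you do not need :=.
Could you add a minimal implementation of the stringsim function to your question? Thanks!
Maybe dt[, lapply(stringsim_methods, function(x) stringsim("oscdf", variant, method = x)), by = .(variant, year)]
Not a data.table approach, but it works as requested :-)
Haha, OK, let's do the unique step inside. Now it looks very data.table-like, doesn't it, @RYoda?
Nice. What I really like is "injecting" the desired column names via (stringsim_methods) :=, which saves me the setnames call in my second step!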
stringsim <- function(x,variant,method) 1
dt_red <- dt[,unique(.SD),.SDcols=c("variant","year")]
dt_red[,(stringsim_methods):=lapply(stringsim_methods,function(x)
stringsim("oscdf",variant, method=x)),.(variant,year)]
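The snippet above applies := to dt_red rather than to dt, which is why the original table stays untouched. A minimal sketch of that difference follows; the small dt and the constant stringsim mock are made up for illustration and are not the asker's data or the real function.

```r
library(data.table)

# Hypothetical stand-in for stringsim(); returns a constant like the mock above.
stringsim <- function(x, variant, method) 1

dt <- data.table(a = c("aaa", "bbb", "ccc"),
                 variant = c("osc", "oc", "osc"),
                 year = c("2015", "2015", "2015"))
cols_before <- copy(names(dt))

# unique(.SD) returns a new table, so := below only touches dt_red.
dt_red <- dt[, unique(.SD), .SDcols = c("variant", "year")]
dt_red[, lv := stringsim("oscdf", variant, method = "lv"), by = .(variant, year)]
identical(names(dt), cols_before)   # TRUE: dt is unchanged

# By contrast, := on dt itself adds the column in place;
# dt1 is then just another name for dt, not a separate copy.
dt1 <- dt[, lv := stringsim("oscdf", variant, method = "lv"), by = .(variant, year)]
"lv" %in% names(dt)                 # TRUE: the original table was modified
```

If a modified copy of the full table is wanted instead, calling copy(dt) before := is the usual way to keep the original intact.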
stringsim <- function(x,variant,method) paste(method, variant, sep = ":")
dt3 <- dt[,
lapply(stringsim_methods, function(x) stringsim("oscdf", variant, method = x)),
by = .(variant, year)]
data.table::setnames(dt3, 3:length(dt3), stringsim_methods)
> dt3
variant year lv osa dl lcs jw qgram
1: osdcf 2018 lv:osdcf osa:osdcf dl:osdcf lcs:osdcf jw:osdcf qgram:osdcf
2: offsco 2015 lv:offsco osa:offsco dl:offsco lcs:offsco jw:offsco qgram:offsco
3: osdcf 2015 lv:osdcf osa:osdcf dl:osdcf lcs:osdcf jw:osdcf qgram:osdcf
4: odfsc 2015 lv:odfsc osa:odfsc dl:odfsc lcs:odfsc jw:odfsc qgram:odfsc
5: offsco 2018 lv:offsco osa:offsco dl:offsco lcs:offsco jw:offsco qgram:offsco
6: odfsc 2018 lv:odfsc osa:odfsc dl:odfsc lcs:odfsc jw:odfsc qgram:odfsc
7: oc 2015 lv:oc osa:oc dl:oc lcs:oc jw:oc qgram:oc
8: osc 2018 lv:osc osa:osc dl:osc lcs:osc jw:osc qgram:osc
9: osc 2015 lv:osc osa:osc dl:osc lcs:osc jw:osc qgram:osc
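As a possible variation on the setnames step, a named list returned in j carries its names into the result, so naming the methods vector up front should give the same column names in one call. This is a sketch with a small made-up table and the same hypothetical stringsim mock, not the asker's data.

```r
library(data.table)

# Hypothetical mock of stringsim(), same shape as the one assumed in the answer.
stringsim <- function(x, variant, method) paste(method, variant, sep = ":")

dt <- data.table(variant = c("osc", "oc", "osc", "oc"),
                 year    = c("2015", "2015", "2018", "2015"))
stringsim_methods <- c("lv", "osa", "dl")

# Naming the vector makes lapply() return a named list, so the result
# columns are named after the methods without a later setnames() call.
named_methods <- setNames(stringsim_methods, stringsim_methods)
dt3 <- dt[, lapply(named_methods, function(m) stringsim("oscdf", variant, method = m)),
          by = .(variant, year)]
names(dt3)
```

Because no := is involved, dt keeps its original two columns and dt3 is a separate table with one row per (variant, year) combination.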