R: How can I pass column names from another list as input and reference another data.table to sum up scores?
I have two data.tables, DT1 and RF, where DT1 is the main table and RF is a crosswalk of variable values:
library(data.table)

DT1 <- data.table(id   = 1:10,
                  Var1 = c(1,0,0,0,1,0,1,1,0,0),
                  Var2 = c(0,0,0,0,1,0,1,0,0,1),
                  Var3 = c(1,1,1,0,0,0,1,1,0,0),
                  Var4 = c(1,1,0,0,1,0,0,0,0,0),
                  Var5 = c(0,0,0,0,1,0,1,1,0,0))
RF <- data.table(Variable = c("Var1","Var2","Var3","Var4","Var5",
                              "Var6","Var7","Var8","Var9","Var10"),
                 CO = c(1.1,2.3,1.4,1.5,1.0,3.8,2.5,3.7,2.1,2.0),
                 IN = c(2.1,1.3,1.9,2.5,1.7,2.8,2.9,1.7,1.1,2.0))
I have tried two methods:
METHOD 1:
L1 <- length(List1)
y <- 0
DT1 <- DT1[, Score_CO := {for(i in 1:L1){
    x <- parse(text = List1[i])
    if(DT1[, eval(x)] == 1){
      x <- RF[which(RF[,'Variable'] == List1[i],), CO]}
    else{as.numeric(0.0)}
    y = y + x }
  return(y)}]
METHOD 2:
Score_Calc <- function(DT, RF, List, model = 'CO'){
  pvar <- 0
  pvar <- for(i in 1:nrow(DT)){
    for(j in 1:length(List)){
      x <- parse(text = List[j])
      ifelse(DT[i, eval(x)] == 1, RF[which(RF[,'Variable'] == List[j],), model], 0)
    }
    pvar <- pvar + pvar
    DT[, paste0('Score_', model) := pvar]
  }
  return(DT)
}
Score_Calc(DT = DT1, RF = RF, List = List1, model = 'CO')
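Note: Score = Var1 + Var2 + Var3 (using the values from the CO column of the RF table)
Please take a look and help me figure out what I am doing wrong. Any help is much appreciated.

Here is a matrix multiplication version: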
# Relies on rows 1:5 of RF being Var1..Var5, in the same order as DT1's columns
as.matrix(DT1[, -1, with = FALSE]) %*% as.matrix(RF[1:5, -1, with = FALSE])
# CO IN
# [1,] 4.0 6.5
# [2,] 2.9 4.4
# [3,] 1.4 1.9
# [4,] 0.0 0.0
# [5,] 5.9 7.6
# [6,] 0.0 0.0
# [7,] 5.8 7.0
# [8,] 3.5 5.7
# [9,] 0.0 0.0
# [10,] 2.3 1.3
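And a data.table version. Just be careful: once you set a key on RF, the matrix multiplication above will give you a different answer, because setting the key reorders the rows of RF.
EDIT: an alternative that does both calculations: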
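setkey(RF, Variable)
# For one row of DT1 (.SD), look up each column's weight in RF by name
# and sum the products
fun <- function(DT, col) sum(RF[names(DT), ][, col, with = FALSE] * unlist(DT))
DT1[, list(CO = fun(.SD, "CO"), IN = fun(.SD, "IN")), by = id]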
# id CO IN
# 1: 1 4.0 6.5
# 2: 2 2.9 4.4
# 3: 3 1.4 1.9
# 4: 4 0.0 0.0
# 5: 5 5.9 7.6
# 6: 6 0.0 0.0
# 7: 7 5.8 7.0
# 8: 8 3.5 5.7
# 9: 9 0.0 0.0
# 10: 10 2.3 1.3
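The caveat about keys can be checked directly. The following is a minimal sketch, not part of the original answer: it rebuilds only the Variable and CO columns of RF and compares the first five rows before and after setkey. The names before, after and w are illustrative.

```r
library(data.table)

# Same Variable/CO values as RF in the question
RF <- data.table(Variable = paste0("Var", 1:10),
                 CO = c(1.1, 2.3, 1.4, 1.5, 1.0, 3.8, 2.5, 3.7, 2.1, 2.0))

before <- RF[1:5, Variable]   # rows in the order they were entered
setkey(RF, Variable)          # sorts RF by Variable as character strings
after  <- RF[1:5, Variable]   # "Var10" now sorts right after "Var1"

print(before)
print(after)

# Looking rows up by name keeps the weights aligned with DT1's columns,
# whatever the row order of RF is (keyed join on Variable)
w <- RF[c("Var1", "Var2", "Var3", "Var4", "Var5"), CO]
print(w)
```

After keying, RF[1:5, ] holds Var1, Var10, Var2, Var3, Var4, so the matrix product would pair DT1's Var2 column with Var10's weights. Indexing RF by name avoids that.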
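In the DT2 version (code and output at the end), the first sapply loops through each of the Var# columns in the data table, finds the corresponding value in the CO column of RF, and multiplies the column by that value (this produces the modified Var1-5 values shown in the DT2 output). The CO:=apply(… bit just computes the sum of Var1-5 for each row and saves it as the CO column in DT2.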
列。谢谢它看起来不错,我只是想问我如何才能得到DT1的子集,以及列表1中的变量。因为对于CO,我只想检查列表1中的变量(Var1,Var3,Var5)谢谢,我真的很感激。@ NSDSATASI,如果这回答了你的问题,请考虑通过点击复选框来标记它。谢谢。当然,非常感谢。我正准备这样做。用你给出的解决方案,我只添加了一个步骤,我用列对应的数据对数据进行子集,然后应用你的解决方案。继续。再次感谢。我不能投票支持这个解决方案,因为它说它需要超过15个声誉。但我真的想投票支持这个。
Expected output (from the question):
id  Var1  Var2  Var3  Var4  Var5  Score_CO
 1     1     0     1     1     0       2.5
 2     0     0     1     1     0       1.4
 3     0     0     1     0     0       1.4
 4     0     0     0     0     0       0
 5     1     1     0     1     1       2.1
 6     0     0     0     0     0       0
 7     1     1     1     0     1       3.5
 8     1     0     1     0     1       3.5
 9     0     0     0     0     0       0
10     0     1     0     0     0       0
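The DT2 version (the sapply approach explained above):
setkey(RF, Variable)
DT2 <- DT1[, c(
  list(id = id),
  # multiply each Var# column by its CO weight, looked up in RF by column name
  sapply(
    names(.SD[, -1, with = FALSE]),
    function(x) unlist(.SD[, x, with = FALSE] * RF[x, ][, CO]),
    simplify = FALSE
  )
) ][, CO := apply(.SD[, -1, with = FALSE], 1, sum)]  # row-wise sum of the weighted columns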
DT2
# id Var1 Var2 Var3 Var4 Var5 CO
# 1: 1 1.1 0.0 1.4 1.5 0 4.0
# 2: 2 0.0 0.0 1.4 1.5 0 2.9
# 3: 3 0.0 0.0 1.4 0.0 0 1.4
# 4: 4 0.0 0.0 0.0 0.0 0 0.0
# 5: 5 1.1 2.3 0.0 1.5 1 5.9
# 6: 6 0.0 0.0 0.0 0.0 0 0.0
# 7: 7 1.1 2.3 1.4 0.0 1 5.8
# 8: 8 1.1 0.0 1.4 0.0 1 3.5
# 9: 9 0.0 0.0 0.0 0.0 0 0.0
# 10: 10 0.0 2.3 0.0 0.0 0 2.3