R: How do I compute row sums grouped by another variable?


Here is the sample data:

df <- data.frame("ID1" = c("A","A","B","C"), 
            "Wt1" = c(0.8,0.6,0.4,0.5),
            "ID2" = c("B","A","C","B"),
            "Wt2" = c(0.1,0.4,0.5,0.5),
            "ID3" = c("C",NA,"C",NA), 
            "Wt3" = c(0.1,NA,0.1,NA))

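First, a table in this format is hard to work with. This is not the output you asked for, but I am concerned you may get stuck along the way.

One suggestion is to reshape the table into a long format so that information can be retrieved from it easily.

Assign an id to each observation (each row of the original table):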

df$obs <- 1:nrow(df)
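
The reshaping step itself is not shown in the answer as preserved here; the later code simply works on a long table with the columns ID, Wt and obs. A minimal sketch of one way to build such a table in base R follows. The name df1 is taken to mean this long table, and stacking the column pairs by hand is an assumption, not necessarily what the answer's author did.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps the IDs as
# character vectors on R versions before 4.0 as well.
df <- data.frame(ID1 = c("A", "A", "B", "C"), Wt1 = c(0.8, 0.6, 0.4, 0.5),
                 ID2 = c("B", "A", "C", "B"), Wt2 = c(0.1, 0.4, 0.5, 0.5),
                 ID3 = c("C", NA, "C", NA),   Wt3 = c(0.1, NA, 0.1, NA),
                 stringsAsFactors = FALSE)
df$obs <- seq_len(nrow(df))

# Stack the three (ID, Wt) pairs on top of each other and repeat obs so
# every stacked row still knows which original row it came from.
df1 <- data.frame(
  ID  = c(df$ID1, df$ID2, df$ID3),
  Wt  = c(df$Wt1, df$Wt2, df$Wt3),
  obs = rep(df$obs, times = 3),
  stringsAsFactors = FALSE
)
```

With more than a handful of ID/Wt pairs, a reshaping function such as data.table's melt would likely be less error-prone than listing the columns by hand.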
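
We sum the votes (the weights) by obs and ID:

library(data.table)
dt <- as.data.table(df1)  # df1: the long-format table with columns ID, Wt and obs
dt[, total := sum(Wt, na.rm = TRUE), by = .(obs, ID)]

After that, retrieving the information is easy: within each obs, take the ID with the largest total.

dt[, vote := ID[which.max(total)], by = obs]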

#dt
#    ID  Wt obs total vote
# 1:  A 0.8   1   0.8    A
# 2:  A 0.6   2   1.0    A
# 3:  B 0.4   3   0.4    C
# 4:  C 0.5   4   0.5    C
# 5:  B 0.1   1   0.1    A
# 6:  A 0.4   2   1.0    A
# 7:  C 0.5   3   0.6    C
# 8:  B 0.5   4   0.5    C
# 9:  C 0.1   1   0.1    A
# 10: NA  NA   2   0.0    A
# 11:  C 0.1   3   0.6    C
# 12: NA  NA   4   0.0    C
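
In the output above, observation 4 has C and B tied at 0.5, and which.max returns the position of the first maximum, so C wins only because its row comes first. A minimal sketch of an explicit tie-break follows, assuming a hypothetical rule that ties go to the alphabetically first ID; the rule and the names totals and winners are illustrative and not part of the original answer.

```r
library(data.table)

# Long-format table copied from the printed output (columns ID, Wt, obs).
dt <- data.table(
  ID  = c("A", "A", "B", "C",  "B", "A", "C", "B",  "C", NA, "C", NA),
  Wt  = c(0.8, 0.6, 0.4, 0.5,  0.1, 0.4, 0.5, 0.5,  0.1, NA, 0.1, NA),
  obs = rep(1:4, times = 3)
)

# Total weight per (obs, ID); rows without an ID cannot win, so drop them.
totals <- dt[!is.na(ID), .(total = sum(Wt, na.rm = TRUE)), by = .(obs, ID)]

# Within each obs, put the largest total first; equal totals fall back to
# alphabetical ID order (the assumed tie-break rule).
setorder(totals, obs, -total, ID)

# The first row of each obs is now the winner.
winners <- totals[, .(vote = ID[1]), by = obs]
```

Under this assumed rule observation 4 goes to B instead of C, which shows that the result for tied rows depends entirely on the rule chosen.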

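df[is.na(df)]

Thanks @PierreLafortune. This solution works well and is very concise... Could you explain it further?

You didn't specify a tie breaker for equal values.

Thanks @DJJ. I realize that adding a row label and converting to long format is a very good idea. I can left join the original table with the result table by obs.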
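
The left join by obs mentioned in the last comment could look roughly like the sketch below, using base R's merge. The result table res is hypothetical and simply copies the vote values from the printed output; in practice it would be derived from the long table.

```r
# Original wide table with the row label added.
df <- data.frame(ID1 = c("A", "A", "B", "C"), Wt1 = c(0.8, 0.6, 0.4, 0.5),
                 ID2 = c("B", "A", "C", "B"), Wt2 = c(0.1, 0.4, 0.5, 0.5),
                 ID3 = c("C", NA, "C", NA),   Wt3 = c(0.1, NA, 0.1, NA),
                 stringsAsFactors = FALSE)
df$obs <- seq_len(nrow(df))

# Hypothetical result table: one winning ID per obs.
res <- data.frame(obs = 1:4, vote = c("A", "A", "C", "C"),
                  stringsAsFactors = FALSE)

# all.x = TRUE makes this a left join: every row of df is kept and vote is
# attached wherever obs matches.
out <- merge(df, res, by = "obs", all.x = TRUE)
```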