R: iterative calculations within subsets of a data.table

I have a problem with the following data table:

library(data.table)

DT <- data.table(
  A = c(rep("aa", 2), rep("bb", 2), rep("aa", 2)),
  B = c(rep("H", 2), rep("Na", 2), rep("H", 2)),
  C = c(1, 1, 1, 1, 1, 2),
  Conc = c(1.5, 5, 5, 10, 10, 10),
  Area = c(100.25, 500, 1089, 6000.02, 1200, 10.564),
  Area_UT = c(90.54, 488, 1010, 5999, 1099, 8)
)
Is there a solution to this problem in base R, data.table, or dplyr?

Thanks a lot in any case.


Yasel

An alternative to doing a Cartesian merge and then filtering is to use unique combinations:

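newcols <- c("R_Conc","R_Area_T","R_Area_UT")
res <- DT[, {
  # each column of unique_comb holds the row indices of one unique pair within the group
  unique_comb <- combn(.SD[, .I], 2)
  # put both members of each pair side by side; check.names = TRUE makes duplicated names unique (Conc, Conc.1, ...)
  data.table(.SD[unique_comb[1, ]], .SD[unique_comb[2, ]], check.names = TRUE)[
    , (newcols) := .(Conc/Conc.1, Area/Area.1, Area_UT/Area_UT.1)]
}, A]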
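
As a rough illustration of why combn avoids the filtering step: for a group of n rows it returns each unordered pair of row indices exactly once, one pair per column. The sketch below uses n = 4, the size of group "aa" in the example data. The name idx is only for illustration.

```r
# Index pairs for a group of 4 rows; each column is one unique pair
idx <- combn(4, 2)

idx[1, ]   # first member of each pair
idx[2, ]   # second member of each pair
ncol(idx)  # number of unique pairs, i.e. choose(4, 2)
```

Note that combn(x, 2) needs at least two rows in a group, so a group with a single row would have to be handled separately.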
A non-equi join is well suited to this problem:

#copy Area column to be used as joining key as it will be overwritten after join
DT[, Aid := Area]

newcols <- c("R_Conc","R_Area_T","R_Area_UT")

#perform non-equi inner join
output <- DT[DT, on=.(A, Aid<Aid), nomatch=0L][, 
    #calculate ratios and update by reference
    (newcols) := .(Conc/i.Conc, Area/i.Area, Area_UT/i.Area_UT)]
The output is shown first, followed for comparison by setorder(DT_output, A_1, Conc_1, Area_T_1)[]:

    A  B C Conc     Area Area_UT     Aid i.B i.C i.Conc  i.Area i.Area_UT   R_Conc    R_Area_T   R_Area_UT
1: aa  H 1  1.5  100.250   90.54  500.00   H   1    5.0  500.00    488.00 0.300000 0.200500000 0.185532787
2: aa  H 1  1.5  100.250   90.54 1200.00   H   1   10.0 1200.00   1099.00 0.150000 0.083541667 0.082383985
3: aa  H 1  5.0  500.000  488.00 1200.00   H   1   10.0 1200.00   1099.00 0.500000 0.416666667 0.444040036
4: aa  H 2 10.0   10.564    8.00  100.25   H   1    1.5  100.25     90.54 6.666667 0.105376559 0.088358736
5: aa  H 2 10.0   10.564    8.00  500.00   H   1    5.0  500.00    488.00 2.000000 0.021128000 0.016393443
6: aa  H 2 10.0   10.564    8.00 1200.00   H   1   10.0 1200.00   1099.00 1.000000 0.008803333 0.007279345
7: bb Na 1  5.0 1089.000 1010.00 6000.02  Na   1   10.0 6000.02   5999.00 0.500000 0.181499395 0.168361394
   A_1 B_1 C_1 Conc_1 Area_T_1 Area_UT_1 A_2 B_2 C_2 Conc_2 Area_T_2 Area_UT_2   R_Conc    R_Area_T   R_Area_UT
1:  aa   H   1    1.5  100.250     90.54  aa   H   1    5.0   500.00    488.00 0.300000 0.200500000 0.185532787
2:  aa   H   1    1.5  100.250     90.54  aa   H   1   10.0  1200.00   1099.00 0.150000 0.083541667 0.082383985
3:  aa   H   1    5.0  500.000    488.00  aa   H   1   10.0  1200.00   1099.00 0.500000 0.416666667 0.444040036
4:  aa   H   2   10.0   10.564      8.00  aa   H   1    1.5   100.25     90.54 6.666667 0.105376559 0.088358736
5:  aa   H   2   10.0   10.564      8.00  aa   H   1    5.0   500.00    488.00 2.000000 0.021128000 0.016393443
6:  aa   H   2   10.0   10.564      8.00  aa   H   1   10.0  1200.00   1099.00 1.000000 0.008803333 0.007279345
7:  bb  Na   1    5.0 1089.000   1010.00  bb  Na   1   10.0  6000.02   5999.00 0.500000 0.181499395 0.168361394
If needed, you can use data.table::setnames to update the column names of output. Basically, the i. prefix corresponds to your _2 suffix.
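
A minimal sketch of that renaming step, assuming the target names are the _1/_2 names from the comparison table above. The vectors old and new are illustrative helpers, not part of the original answer. Since A is a join key it appears only once in the join result, so A_2 is added here as a copy of A_1.

```r
library(data.table)

DT <- data.table(
  A = c(rep("aa", 2), rep("bb", 2), rep("aa", 2)),
  B = c(rep("H", 2), rep("Na", 2), rep("H", 2)),
  C = c(1, 1, 1, 1, 1, 2),
  Conc = c(1.5, 5, 5, 10, 10, 10),
  Area = c(100.25, 500, 1089, 6000.02, 1200, 10.564),
  Area_UT = c(90.54, 488, 1010, 5999, 1099, 8)
)

# Same non-equi join as in the answer
DT[, Aid := Area]
newcols <- c("R_Conc", "R_Area_T", "R_Area_UT")
output <- DT[DT, on = .(A, Aid < Aid), nomatch = 0L][,
  (newcols) := .(Conc / i.Conc, Area / i.Area, Area_UT / i.Area_UT)]

# Map the join's column names onto the _1 / _2 naming (assumed target names)
old <- c("A", "B", "C", "Conc", "Area", "Area_UT",
         "i.B", "i.C", "i.Conc", "i.Area", "i.Area_UT")
new <- c("A_1", "B_1", "C_1", "Conc_1", "Area_T_1", "Area_UT_1",
         "B_2", "C_2", "Conc_2", "Area_T_2", "Area_UT_2")
setnames(output, old, new)

output[, Aid := NULL]   # the helper join key is no longer needed
output[, A_2 := A_1]    # both rows of a pair share the same A
```

The new A_2 column is appended at the end; data.table::setcolorder can move it next to the other _2 columns if the exact column order matters.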
