Subsetting data in a for loop and running a series of calculations

This is my dataset:

C1     C2     C3      C4     C5    C6   C7  C8
ATOM    1   -4.794  -7.29   6.756   C   12  1
ATOM    1   -4.357  -6.181  6.473   O   16  1
ATOM    2   -5.279  -7.475  5.986   C   12  1
ATOM    2   -7.564  -8.809  6.984   C   12  1
ATOM    2   -5.822  -7.105  7.238   C   12  1
ATOM    1   -7.515  -10.402 -0.621  C   12  2
ATOM    1   -7.26   -11.716 -0.22   O   16  2
ATOM    1   -8.163  -9.682  0.566   C   12  2
ATOM    2   -6.347  -9.475  -1.255  C   12  2
ATOM    1   -7.302  -8.048  7.702   C   12  3
ATOM    1   -7.676  -8.93   6.667   C   12  3
ATOM    2   -6.864  -9.118  5.529   C   12  3
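My goal is to subset the data by the contents of column C8 and to run a series of calculations on each subset using a loop. At the moment I do this manually by running the following commands: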

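# Manual approach: one subset and one within() call per value of C8
sub.1 <- subset(data, C8 == 1)
result.1 <- within(sub.1, {   # operate on the subset rather than the full data
    multiply.z <- C5 * C7     # coordinate times weight (C7)
    multiply.y <- C4 * C7
    multiply.x <- C3 * C7
    # note: multiply.z is used for all three centers below
    Center.z <- sum(multiply.z)/sum(C7)
    Center.y <- sum(multiply.z)/sum(C7)
    Center.x <- sum(multiply.z)/sum(C7)
    #rm(multiply.z,multiply.y,multiply.x)
})

sub.2 <- subset(data, C8 == 2)
# result.2 <- same within() block as above, applied to sub.2
sub.3 <- subset(data, C8 == 3)
# result.3 <- same within() block as above, applied to sub.3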

You should definitely take a look at library(plyr); it contains all the tools you need for this kind of task. Its ddply function lets you do this:

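library(plyr)

# dat is the data frame from the question; ddply() calls .fun once per C8 value
mySubs <- ddply(dat, .(C8), .fun = function(x) {
    multiply.z <- x$C5 * x$C7
    multiply.y <- x$C4 * x$C7
    multiply.x <- x$C3 * x$C7
    Center.z <- sum(multiply.z)/sum(x$C7)
    # As in the question, y and x below also use multiply.z;
    # use multiply.y and multiply.x here to get per-axis centers.
    Center.y <- sum(multiply.z)/sum(x$C7)
    Center.x <- sum(multiply.z)/sum(x$C7)
    ##rm(multiply.z,multiply.y,multiply.x)
    # return one summary row per group
    data.frame(z = Center.z, y = Center.y, x = Center.x)
})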

And so on.
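For comparison, the same per-group calculation can be sketched in base R, without plyr. This is only an illustration: it assumes that C7 holds the weights and that each center should use its own coordinate column (C5 for z, C4 for y, C3 for x). The helper name center_of_mass is made up for this example, and the table from the question is rebuilt inline so the snippet runs on its own.

```r
# Rebuild the question's table (the name `dat` follows the ddply call).
dat <- read.table(header = TRUE, text = "
C1 C2 C3 C4 C5 C6 C7 C8
ATOM 1 -4.794 -7.29 6.756 C 12 1
ATOM 1 -4.357 -6.181 6.473 O 16 1
ATOM 2 -5.279 -7.475 5.986 C 12 1
ATOM 2 -7.564 -8.809 6.984 C 12 1
ATOM 2 -5.822 -7.105 7.238 C 12 1
ATOM 1 -7.515 -10.402 -0.621 C 12 2
ATOM 1 -7.26 -11.716 -0.22 O 16 2
ATOM 1 -8.163 -9.682 0.566 C 12 2
ATOM 2 -6.347 -9.475 -1.255 C 12 2
ATOM 1 -7.302 -8.048 7.702 C 12 3
ATOM 1 -7.676 -8.93 6.667 C 12 3
ATOM 2 -6.864 -9.118 5.529 C 12 3
")

# Hypothetical helper: C7-weighted mean of each coordinate within one group.
# weighted.mean(x, w) is sum(x * w) / sum(w), the formula used in the question.
center_of_mass <- function(g) {
  c(z = weighted.mean(g$C5, g$C7),
    y = weighted.mean(g$C4, g$C7),
    x = weighted.mean(g$C3, g$C7))
}

# split() gives one data frame per C8 value; sapply() returns a matrix with
# one column per group, so t() turns it into one row per group.
centers <- t(sapply(split(dat, dat$C8), center_of_mass))
print(round(centers, 6))
```

The rows of centers are named after the C8 values, so centers["2", "z"] picks out a single value.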

Here is a data.table solution:

library(data.table)

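DT <- data.table(df)   # df: the data frame from the question

# For each C8 group, compute the C7-weighted mean of C5, C4 and C3
DT[,
   structure(
       lapply(list(C5, C4, C3), function(x) sum(x * C7) / sum(C7) ),
       names = c("z", "y", "x")
       )
   , by = C8]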

##    C8         z          y         x
## 1:  1  6.674000  -7.297562 -5.487813
## 2:  2 -0.370000 -10.426231 -7.316538
## 3:  3  6.632667  -8.698667 -7.280667
On each iteration of your loop you are overwriting result.i.
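One way to avoid the overwriting is to store each group's result in a list slot and bind the pieces together after the loop. The sketch below rebuilds the question's table so that it runs on its own. The names groups, results and all_results are illustrative, and it assumes each center is meant to use its own coordinate column.

```r
# Rebuild the question's table so the example is self-contained.
dat <- read.table(header = TRUE, text = "
C1 C2 C3 C4 C5 C6 C7 C8
ATOM 1 -4.794 -7.29 6.756 C 12 1
ATOM 1 -4.357 -6.181 6.473 O 16 1
ATOM 2 -5.279 -7.475 5.986 C 12 1
ATOM 2 -7.564 -8.809 6.984 C 12 1
ATOM 2 -5.822 -7.105 7.238 C 12 1
ATOM 1 -7.515 -10.402 -0.621 C 12 2
ATOM 1 -7.26 -11.716 -0.22 O 16 2
ATOM 1 -8.163 -9.682 0.566 C 12 2
ATOM 2 -6.347 -9.475 -1.255 C 12 2
ATOM 1 -7.302 -8.048 7.702 C 12 3
ATOM 1 -7.676 -8.93 6.667 C 12 3
ATOM 2 -6.864 -9.118 5.529 C 12 3
")

# One list slot per distinct C8 value, so no iteration overwrites another.
groups <- sort(unique(dat$C8))
results <- vector("list", length(groups))
names(results) <- groups

for (i in seq_along(groups)) {
  grp <- dat[dat$C8 == groups[i], ]   # rows belonging to the current group
  results[[i]] <- data.frame(
    C8 = groups[i],
    z  = sum(grp$C5 * grp$C7) / sum(grp$C7),
    y  = sum(grp$C4 * grp$C7) / sum(grp$C7),
    x  = sum(grp$C3 * grp$C7) / sum(grp$C7)
  )
}

# Stack the per-group rows into a single data frame.
all_results <- do.call(rbind, results)
print(all_results)
```

Individual groups remain available as results[["1"]], results[["2"]] and so on, which replaces the separate result.1, result.2 and result.3 objects.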
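Because the ddply call computes all three centers from multiply.z, the z, y and x columns of mySubs are identical:

> mySubs
  C8         z         y         x
1  1  6.674000  6.674000  6.674000
2  2 -0.370000 -0.370000 -0.370000
3  3  6.632667  6.632667  6.632667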