Subsetting data in a for loop and running a series of calculations
This is my dataset:
C1 C2 C3 C4 C5 C6 C7 C8
ATOM 1 -4.794 -7.29 6.756 C 12 1
ATOM 1 -4.357 -6.181 6.473 O 16 1
ATOM 2 -5.279 -7.475 5.986 C 12 1
ATOM 2 -7.564 -8.809 6.984 C 12 1
ATOM 2 -5.822 -7.105 7.238 C 12 1
ATOM 1 -7.515 -10.402 -0.621 C 12 2
ATOM 1 -7.26 -11.716 -0.22 O 16 2
ATOM 1 -8.163 -9.682 0.566 C 12 2
ATOM 2 -6.347 -9.475 -1.255 C 12 2
ATOM 1 -7.302 -8.048 7.702 C 12 3
ATOM 1 -7.676 -8.93 6.667 C 12 3
ATOM 2 -6.864 -9.118 5.529 C 12 3
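My goal is to subset the data by the contents of column C8 and run a series of calculations in a loop. At the moment I do this manually by running the following commands: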
sub.1 <- subset(data, C8 == 1)
result.1 <- within(sub.1, {
  # coordinates weighted by C7
  multiply.z <- C5 * C7
  multiply.y <- C4 * C7
  multiply.x <- C3 * C7
  # weighted center along each axis
  Center.z <- sum(multiply.z) / sum(C7)
  Center.y <- sum(multiply.y) / sum(C7)
  Center.x <- sum(multiply.x) / sum(C7)
  # rm(multiply.z, multiply.y, multiply.x)
})
sub.2 <- subset(data, C8 == 2)
# result.2 <- same within() call as above, applied to sub.2
sub.3 <- subset(data, C8 == 3)
# result.3 <- same within() call as above, applied to sub.3
You should definitely take a close look at library(plyr); it contains all the tools you need for this kind of task. It lets you do this with the function ddply:
library(plyr)
# dat is the data frame shown in the question
mySubs <- ddply(dat, .(C8), .fun = function(x) {
  # coordinates weighted by C7, computed within one C8 group
  multiply.z <- x$C5 * x$C7
  multiply.y <- x$C4 * x$C7
  multiply.x <- x$C3 * x$C7
  # weighted center along each axis
  Center.z <- sum(multiply.z) / sum(x$C7)
  Center.y <- sum(multiply.y) / sum(x$C7)
  Center.x <- sum(multiply.x) / sum(x$C7)
  # one row per group
  data.frame(z = Center.z, y = Center.y, x = Center.x)
})
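For readers who want the explicit loop the question asks about, here is a minimal base R sketch. It assumes the data frame is called dat (as in the ddply call) and that C7 holds the weights; the names groups, results and centers are made up for this illustration. Each iteration writes into its own list slot, so earlier results are kept rather than overwritten.

```r
# Data from the question, read from inline text
dat <- read.table(header = TRUE, text = "
C1 C2 C3 C4 C5 C6 C7 C8
ATOM 1 -4.794 -7.29 6.756 C 12 1
ATOM 1 -4.357 -6.181 6.473 O 16 1
ATOM 2 -5.279 -7.475 5.986 C 12 1
ATOM 2 -7.564 -8.809 6.984 C 12 1
ATOM 2 -5.822 -7.105 7.238 C 12 1
ATOM 1 -7.515 -10.402 -0.621 C 12 2
ATOM 1 -7.26 -11.716 -0.22 O 16 2
ATOM 1 -8.163 -9.682 0.566 C 12 2
ATOM 2 -6.347 -9.475 -1.255 C 12 2
ATOM 1 -7.302 -8.048 7.702 C 12 3
ATOM 1 -7.676 -8.93 6.667 C 12 3
ATOM 2 -6.864 -9.118 5.529 C 12 3
")

groups  <- unique(dat$C8)                   # hypothetical name: the distinct C8 values
results <- vector("list", length(groups))   # one slot per group

for (i in seq_along(groups)) {
  sub <- dat[dat$C8 == groups[i], ]         # rows belonging to the current group
  results[[i]] <- data.frame(
    C8 = groups[i],
    z  = sum(sub$C5 * sub$C7) / sum(sub$C7),
    y  = sum(sub$C4 * sub$C7) / sum(sub$C7),
    x  = sum(sub$C3 * sub$C7) / sum(sub$C7)
  )
}

# Stack the per-group rows into a single data frame
centers <- do.call(rbind, results)
print(centers)
```

The loop does the same work as the ddply call, just spelled out; ddply mainly saves the bookkeeping of splitting and recombining.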
Here is a data.table solution:
library(data.table)
DT <- data.table(df)
DT[,
structure(
lapply(list(C5, C4, C3), function(x) sum(x * C7) / sum(C7) ),
names = c("z", "y", "x")
)
, by = C8]
## C8 z y x
## 1: 1 6.674000 -7.297562 -5.487813
## 2: 2 -0.370000 -10.426231 -7.316538
## 3: 3 6.632667 -8.698667 -7.280667
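The expression sum(x * C7) / sum(C7) is a weighted mean, so the same per-group centers can also be sketched in base R with weighted.mean(). This is only an illustration under the assumption that the data frame is called dat and that C7 holds the weights; the name centers is made up here.

```r
# Data from the question, read from inline text
dat <- read.table(header = TRUE, text = "
C1 C2 C3 C4 C5 C6 C7 C8
ATOM 1 -4.794 -7.29 6.756 C 12 1
ATOM 1 -4.357 -6.181 6.473 O 16 1
ATOM 2 -5.279 -7.475 5.986 C 12 1
ATOM 2 -7.564 -8.809 6.984 C 12 1
ATOM 2 -5.822 -7.105 7.238 C 12 1
ATOM 1 -7.515 -10.402 -0.621 C 12 2
ATOM 1 -7.26 -11.716 -0.22 O 16 2
ATOM 1 -8.163 -9.682 0.566 C 12 2
ATOM 2 -6.347 -9.475 -1.255 C 12 2
ATOM 1 -7.302 -8.048 7.702 C 12 3
ATOM 1 -7.676 -8.93 6.667 C 12 3
ATOM 2 -6.864 -9.118 5.529 C 12 3
")

# Split by C8, take the C7-weighted mean of each coordinate column,
# then transpose so that each row corresponds to one C8 group
centers <- t(sapply(split(dat, dat$C8), function(g) {
  c(z = weighted.mean(g$C5, g$C7),
    y = weighted.mean(g$C4, g$C7),
    x = weighted.mean(g$C3, g$C7))
}))
print(centers)
```

Here centers is a numeric matrix whose row names are the C8 values; as.data.frame(centers) converts it if a data frame is preferred.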
On each iteration of the loop you are overwriting result.i.
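Output printed with the ddply answer as originally posted, where Center.y and Center.x were also computed from multiply.z, so all three columns repeat the z value:
> mySubs
  C8         z         y         x
1  1  6.674000  6.674000  6.674000
2  2 -0.370000 -0.370000 -0.370000
3  3  6.632667  6.632667  6.632667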