What are some ways to optimize the current for loop in R?
I have the code below that I would like to optimize, but at the moment I do not know how to go about it. First, let me introduce the problem.
test.data contains about 200,000 rows, which makes the implementation below very slow in R. The first thing I tried was to optimize the function and remove as many tests (if statements) as possible, but there are two instances in the code below where I could not do that.
library(data.table)
test.data <- data.table(person = c("A", "B", "C"),
                        duration = c(120,50,30),
                        time = c(159, 231, 312),
                        savings = c(140000, 200000, 300000),
                        ren = c(0.0037, 0.0011, 0.0015),
                        res = c(55, 10, 30))
set.seed(35)
# Deduction series, note that in this example, they are arbitrary.
# They do not follow a pattern. I believe, this is the core of the problem.
# Which makes it extremely difficult to vectorise, since this would result in
# no closed solution.
c_a <- round(runif(max(test.data$duration)), 2) / 10
# Put in as a constant, but it can vary arbitrary.
c_b <- rep(round((8.5 / 12)/100, digits = 4), max(test.data$duration))
rnew <- 0.25
result <- matrix(0, nrow = 6, ncol = 120)
for(j in 1:nrow(test.data)){
  savings <- test.data$savings[j]
  duration <- test.data$duration[j]
  time <- test.data$time[j]
  res <- test.data$res[j]
  m <- matrix(nrow = 6, ncol = duration)
  for(i in 1:duration){
    m[1,i] <- ifelse(i == 1, savings, m[6, i-1])
    m[2,i] <- -m[1,i] * c_a[i]
    m[3,i] <- -(m[1,i] + m[2,i]) * c_b[i]
    m[4,i] <- ifelse(i == duration, -(m[1,i] + m[2,i] + m[3,i]), -(m[1,i] + m[2,i]) / (time + 1 - i))
    if(i == res & res < time){
      m[5, i] <- -(m[1,i] + m[2,i]) * (1 - rnew)
    } else {
      m[5, i] <- 0
    }
    m[6, i] <- m[1,i] + m[2,i] + m[3,i] + m[4,i] + m[5,i]
  }
  m <- cbind(m, matrix(0, ncol = ncol(result) - ncol(m), nrow = nrow(result)))
  result <- matrix(mapply(sum, result, m, MoreArgs=list(na.rm=T)),ncol=ncol(result))
}
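The only thing I can suggest is to initialize m just once, with the same dimensions as result, and to replace the last two lines of the outer loop as shown below. This avoids reallocating m and the element-wise sum performed with mapply.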
result <- matrix(0, nrow = 6, ncol = 120)
m <- result
for (j in 1:nrow(test.data)) {
  savings <- test.data$savings[j]
  duration <- test.data$duration[j]
  time <- test.data$time[j]
  res <- test.data$res[j]
  for (i in 1:duration) {
    m[1,i] <- ifelse(i == 1, savings, m[6, i-1])
    m[2,i] <- -m[1,i] * c_a[i]
    m[3,i] <- -(m[1,i] + m[2,i]) * c_b[i]
    m[4,i] <- ifelse(i == duration, -(m[1,i] + m[2,i] + m[3,i]), -(m[1,i] + m[2,i]) / (time + 1 - i))
    if (i == res & res < time) {
      m[5, i] <- -(m[1,i] + m[2,i]) * (1 - rnew)
    } else {
      m[5, i] <- 0
    }
    m[6, i] <- m[1,i] + m[2,i] + m[3,i] + m[4,i] + m[5,i]
  }
  result[, 1:duration] <- result[, 1:duration] + m[, 1:duration]
}
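To see why swapping mapply(sum, ...) for a plain matrix addition is safe, here is a minimal sketch on two small made-up matrices (a and b are hypothetical stand-ins for result and m, not values from the question). It only checks that both approaches give the same numbers; it is not a benchmark.

# Two small matrices standing in for `result` and `m` (arbitrary values).
a <- matrix(1:6, nrow = 2)
b <- matrix(c(10, 20, 30, 40, 50, 60), nrow = 2)

# Approach from the question: one R-level call to sum() per cell.
slow <- matrix(mapply(sum, a, b, MoreArgs = list(na.rm = TRUE)), ncol = ncol(a))

# Approach from this answer: a single vectorized matrix addition.
fast <- a + b

# Both give the same 2 x 3 matrix.
stopifnot(isTRUE(all.equal(slow, fast)))

One difference to keep in mind: sum(..., na.rm = TRUE) treats NA as zero, whereas + propagates NA. In the loop above every cell of m[, 1:duration] is assigned before it is added, so this should not matter there.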
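Here is a possible approach for computing the sum of the outstanding amounts (i.e., row 1 of the OP's result). If needed, all the intermediate values (m[2,j], m[3,j], m[4,j], m[5,j]) can easily be computed as well. Note: I have not timed this with the actual dimensions.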
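The code that follows is built around Reduce(..., accumulate = TRUE). As a minimal sketch of that pattern on its own, the example here carries a balance forward through a few periods. The names S0, rates and step and the numbers are made up for illustration and are much simpler than the amortize function in the answer.

# Hypothetical inputs: a starting balance and one deduction rate per period.
S0    <- 1000
rates <- c(0.10, 0.20, 0.50)

# Step function: the balance after applying one period's rate.
step <- function(S, r) S - r * S

# accumulate = TRUE returns the initial value followed by every
# intermediate balance, so the result has length(rates) + 1 elements.
balances <- Reduce(step, rates, accumulate = TRUE, init = S0)

# Dropping the last element leaves the opening balance of each period.
opening <- balances[-length(balances)]

Here balances works out to 1000, 900, 720, 360 (up to floating-point rounding), and opening to the first three of those. This mirrors the [-(dur+1L)] step in the answer's code.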
library(data.table)
calcAmor <- function(ca, cb, rnew, dur, S0, tau, res) {
  amortize <- function(S, ca.t) S - ca.t[1L]*S - (1-ca.t[1L])*cb*S - (S - ca.t[1L]*S) / (tau + 1 - ca.t[2L])
  ans <- Reduce(amortize,
                split(cbind(ca, seq_along(ca)), seq_along(ca)),
                init=S0,
                accumulate=TRUE)[-(dur+1L)]
  ix <- min(res+1L, dur):dur
  tmp <- Reduce(amortize,
                split(cbind(ca[ix], ix), seq_along(ix)),
                init=amortize(ans[res], c(ca[res], res)) - (ans[res] - ans[res]*ca[res])*(1-rnew),
                accumulate=TRUE)
  ans[ix] <- tmp[-length(tmp)]
  ans
}
set.seed(35)
test.data <- data.table(person = c("A", "B", "C"),
                        duration = c(120,50,30),
                        time = c(159, 231, 312),
                        savings = c(140000, 200000, 300000),
                        res = c(55, 10, 30))
maxd <- test.data[, max(duration)]
c_a <- round(runif(maxd), 2) / 10
rnew <- 0.25
cb <- round((8.5 / 12)/100, digits = 4)
test.data[, .(
    dur=seq_len(duration),
    S=calcAmor(ca=c_a[seq_len(duration)], cb, rnew, dur=duration, S0=savings, tau=time, res=res)),
  by=.(person)][, sum(S), by=.(dur)]
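     dur           V1
  1:   1 6.400000e+05
  2:   2 5.783318e+05
  3:   3 5.711966e+05
  4:   4 5.336450e+05
  5:   5 4.774502e+05
 ---
116: 116 7.075169e+00
117: 117 6.788631e+00
118: 118 6.339002e+00
119: 119 5.639335e+00
120: 120 5.297898e+00
Another possibility is to convert test.data to a matrix (dropping the first column, which is not used). Matrices are processed faster in loops, but you would have to rewrite the code to refer to columns by number (or by name, e.g. test.mat[, "duration"]).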
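The suggestion of converting test.data to a matrix could look roughly like the sketch here. It uses a plain data.frame named test.df as a stand-in so the example does not depend on data.table (that substitution is an assumption of this sketch), and it shows only the row lookups, not the full inner loop.

# Stand-in for test.data as a plain data.frame (an assumption of this
# sketch); the values are those from the question.
test.df <- data.frame(person   = c("A", "B", "C"),
                      duration = c(120, 50, 30),
                      time     = c(159, 231, 312),
                      savings  = c(140000, 200000, 300000),
                      res      = c(55, 10, 30))

# Drop the character column first; otherwise as.matrix() would return a
# character matrix instead of a numeric one.
test.mat <- as.matrix(test.df[, -1])

for (j in seq_len(nrow(test.mat))) {
  # Look up the values for row j by column name.
  savings  <- test.mat[j, "savings"]
  duration <- test.mat[j, "duration"]
  time     <- test.mat[j, "time"]
  res      <- test.mat[j, "res"]
  # ... the inner loop over 1:duration would go here, unchanged ...
}

Whether this actually speeds things up at 200,000 rows would need to be timed on the real data.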