R parallel processing error: `Error in checkForRemoteErrors(val) : 6 nodes produced errors; first error: subscript out of bounds`

Tags: r, for-loop, parallel-processing, parallel.foreach, doparallel

I am learning parallel processing as a way to handle large datasets.

I predefined some variables as follows:

CV <- function(mean, sd) {(sd / mean) * 100} 
distThreshold <- 5 # Distance threshold 
CVThreshold <- 20 # CV threshold 

LocalCV <- list()
Num.CV <- list()
Then I pass the cluster object `clust_cores` to `parSapply`:

for (i in seq(YieldData2rd)) {
  LocalCV[[i]] = parSapply(clust_cores, X = 1:length(YieldData2rd[[i]]), 
                   FUN = function(pt) {
                     d = spDistsN1(YieldData2rd[[i]], YieldData2rd[[i]][pt,])
                     ret = CV(mean = mean(YieldData2rd[[i]][d < distThreshold, ]$yield), 
                              sd = sd(YieldData2rd[[i]][d < distThreshold, ]$yield))
                     return(ret)
                   }) # calculate CV in the local neighbour 
}

stopCluster(clust_cores) 
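One likely cause of the "subscript out of bounds" / "object not found" errors is that the worker processes never received the objects the function body references (`CV`, `distThreshold`, the data list). Workers start with empty workspaces, so every global must be shipped over with `clusterExport()` before calling `parSapply()`. A minimal sketch, using a hypothetical stand-in data frame since the full `YieldData2rd` is not shown:

```r
library(parallel)

# Stand-ins for the objects in the question
CV <- function(mean, sd) (sd / mean) * 100
distThreshold <- 5
dat <- data.frame(yield = rnorm(30, mean = 10))

clust_cores <- makeCluster(2)

# Every global the worker function touches must be exported explicitly,
# otherwise checkForRemoteErrors() reports "n nodes produced errors"
clusterExport(clust_cores, varlist = c("CV", "distThreshold", "dat"))

LocalCV <- parSapply(clust_cores, seq_len(nrow(dat)), function(pt) {
  CV(mean = mean(dat$yield), sd = sd(dat$yield))
})

stopCluster(clust_cores)
```

The same principle applies to the real code: export `CV`, `distThreshold`, and `YieldData2rd` before the loop.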

Thanks to @Omry Atia's comment, I started looking into the `foreach` package and made a first attempt:

library(foreach)
library(doParallel)

#setup parallel backend to use many processors
cores=detectCores()
clust_cores <- makeCluster(cores[1]-1) #not to overload your computer
registerDoParallel(clust_cores)

LocalCV = foreach(i = seq(YieldData2rd), .combine = list, .multicombine = TRUE) %dopar% {
  LocalCV[[i]] = sapply(X = 1:length(YieldData2rd[[i]]),
                        FUN = function(pt) {
                          d = spDistsN1(YieldData2rd[[i]], YieldData2rd[[i]][pt, ])
                          ret = CV(mean = mean(YieldData2rd[[i]][d < distThreshold, ]$yield),
                                   sd = sd(YieldData2rd[[i]][d < distThreshold, ]$yield))
                          return(ret)
                        }) # calculate CV in the local neighbourhood
}

stopCluster(clust_cores)
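Note that `foreach` collects the value of each iteration on its own, so assigning to `LocalCV[[i]]` inside the `%dopar%` body is redundant: the assignment happens in the worker's copy of the environment and is discarded, and only the value of the braced block is returned. A small sketch with the sequential `%do%` operator, which has the same collection semantics:

```r
library(foreach)

# The value of the braced expression is what foreach collects;
# no explicit assignment into an outer list is needed.
res <- foreach(i = 1:3) %do% {
  i^2
}

str(res)  # a list of length 3 holding 1, 4, 9
```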

Could you please provide a sample of `YieldData2rd`? The code cannot run without it. Also, export `i` to the cluster before defining the cluster.

Please see my edited question. The small sample dataset provided as a reproducible example runs fine in the original `for` loop. I exported `i` to the cluster because otherwise it complained that object `i` could not be found. When I run the code as in the revised version above, I get a different error: `Error in checkForRemoteErrors(val) : 4 nodes produced errors; first error: object 'i' not found`.

The reason is that `parApply` is not well suited to indexed data: have a look at the `foreach` function, which is a parallel version of the `for` loop. Hope this helps.

@OmryAtia Thanks, I tried rewriting the code with the `foreach` package and posted an answer below, which seems to work for me. Just one question: how can I tell that it is actually faster than the `for` loop?
library('rgdal')

Yield1 <- data.frame(yield=rnorm(460, mean = 10), x1=rnorm(460, mean = 1843235), x2=rnorm(460,mean = 5802532))
Yield2 <- data.frame(yield=rnorm(408, mean = 10), x1=rnorm(408, mean = 1843235), x2=rnorm(408, mean = 5802532))
Yield3 <- data.frame(yield=rnorm(369, mean = 10), x1=rnorm(369, mean = 1843235), x2=rnorm(369, mean = 5802532))

coordinates(Yield1) <- c('x1', 'x2')
coordinates(Yield2) <- c('x1', 'x2')
coordinates(Yield3) <- c('x1', 'x2')

YieldData2rd <- list(Yield1, Yield2, Yield3)
library(foreach)
library(doParallel)

#setup parallel backend to use many processors
cores=detectCores()
clust_cores <- makeCluster(cores[1]-1) #not to overload your computer
registerDoParallel(clust_cores)

LocalCV = foreach(i = seq(YieldData2rd), .combine = list, .multicombine = TRUE) %dopar% {
  LocalCV[[i]] = sapply(X = 1:length(YieldData2rd[[i]]),
                        FUN = function(pt) {
                          d = spDistsN1(YieldData2rd[[i]], YieldData2rd[[i]][pt, ])
                          ret = CV(mean = mean(YieldData2rd[[i]][d < distThreshold, ]$yield),
                                   sd = sd(YieldData2rd[[i]][d < distThreshold, ]$yield))
                          return(ret)
                        }) # calculate CV in the local neighbourhood
}

stopCluster(clust_cores)
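As for the comment's question of how to tell whether the `foreach` version is actually faster than the `for` loop: time both versions on the same input with `system.time()` (or the microbenchmark package). A sketch, using a hypothetical `slow_task()` so the per-iteration cost is visible:

```r
library(foreach)
library(doParallel)

clust_cores <- makeCluster(2)
registerDoParallel(clust_cores)

slow_task <- function(x) { Sys.sleep(0.05); x^2 }  # hypothetical workload

t_serial   <- system.time(for (i in 1:20) slow_task(i))["elapsed"]
t_parallel <- system.time(foreach(i = 1:20) %dopar% slow_task(i))["elapsed"]

stopCluster(clust_cores)

cat("serial:", t_serial, "s; parallel:", t_parallel, "s\n")
# For tiny tasks the communication overhead can make %dopar% slower,
# so always measure on a realistic workload.
```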