R: Problem using a custom summary function in parallel execution (caret)

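I am trying to use MAPE as the metric for evaluating model performance.

With LOOCV and parallel execution everything works fine, but with any other resampling method I get the following error:

    Error in { : task 1 failed - "could not find function "mape""

In serial execution, by contrast, the problem disappears.

The code below provides an example: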

    library(caret)
    library(doParallel)

    data("environmental")

    registerDoParallel(makeCluster(detectCores(), outfile = ''))



    mape <- function(y, yhat) mean(abs((y - yhat)/y))

    mapeSummary <- function (data, lev = NULL, model = NULL) {

                       out <- mape(data$obs, data$pred)
                       names(out) <- "MAPE"

                       out
                     }



    #LOOCV - parallel
    trControlLoocvPar <- trainControl(allowParallel = T,
                                      verboseIter = T, 
                                      method = "LOOCV",
                                      summaryFunction = mapeSummary)

    #LOOCV - serial
    trControlLoocvSer <- trainControl(allowParallel = F,
                                      verboseIter = T, 
                                      method = "LOOCV",
                                      summaryFunction = mapeSummary)

    #Bootstrapping - parallel
    trControlBootPar <- trainControl(allowParallel = T,
                                      verboseIter = T, 
                                      method = "boot",
                                      summaryFunction = mapeSummary)

    #Bootstrapping - serial
    trControlBootSer <- trainControl(allowParallel = F,
                                      verboseIter = T, 
                                      method = "boot",
                                      summaryFunction = mapeSummary)


    trControlList <- list(trControlLoocvSer, 
                          trControlLoocvPar,
                          trControlBootSer,
                          trControlBootPar)


    models <- lapply(trControlList, 
                     function(control) {

                       train(y = environmental$ozone,
                       x = environmental[, -1], 
                       method = "glmnet", 
                       trControl = control, 
                       metric = "MAPE", 
                       maximize = FALSE)
                     })

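As the message says, the parallel worker processes cannot find the mape function.

The simplest solution is to define mape inside the mapeSummary function, as shown below. The summary function then carries its helper with it, and the parallel run works.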

    mapeSummary <- function (data, lev = NULL, model = NULL) {
      mape <- function(y, yhat) mean(abs((y - yhat)/y))
      out <- mape(data$obs, data$pred)
      names(out) <- "MAPE"

      out
    }

Alternatively, keep mape as a separate function and define it on every worker with clusterEvalQ before registering the cluster:

    cl <- makePSOCKcluster(detectCores()-1)
    # Define mape in the global environment of each worker
    clusterEvalQ(cl, mape <- function(y, yhat) mean(abs((y - yhat)/y)))
    registerDoParallel(cl)

    mapeSummary <- function (data, lev = NULL, model = NULL) {
      out <- mape(data$obs, data$pred)
      names(out) <- "MAPE"
      out
    }

    #Bootstrapping - parallel
    trControlBootPar <- trainControl(allowParallel = T,
                                     verboseIter = T, 
                                     method = "boot",
                                     summaryFunction = mapeSummary)

    train(y = environmental$ozone,
          x = environmental[, -1], 
          method = "glmnet", 
          trControl = trControlBootPar, 
          metric = "MAPE", 
          maximize = FALSE)

    # Shut the workers down and fall back to sequential execution
    stopCluster(cl)
    registerDoSEQ()
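
To see the underlying cause in isolation, here is a minimal sketch that uses only the base parallel package. The two-worker cluster and the sample numbers are assumptions for illustration and do not come from the question. Each PSOCK worker starts as a fresh R session, so a function defined in the master session is not visible there until it is sent over, for example with clusterExport:

```r
library(parallel)

# Same helper as in the question
mape <- function(y, yhat) mean(abs((y - yhat) / y))

# Hypothetical small cluster, just for the demonstration
cl <- makePSOCKcluster(2)

# Each worker is a fresh R session, so mape is not defined there yet
before <- unlist(clusterEvalQ(cl, exists("mape")))

# Copy mape from the current environment into every worker's global environment
clusterExport(cl, "mape", envir = environment())
after <- unlist(clusterEvalQ(cl, exists("mape")))

# The workers can now call the function (illustrative numbers)
result <- clusterCall(cl, function() mape(c(100, 200), c(110, 180)))

print(before)  # FALSE on both workers
print(after)   # TRUE on both workers

stopCluster(cl)
```

In the caret setting, calling clusterExport(cl, "mape") before registerDoParallel(cl) should have the same effect as the clusterEvalQ call above, since both leave mape defined in each worker's global environment.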