Parallelizing rfcv from the randomForest package in R


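I am trying to use the rfcv function to do feature selection with a multivariate random forest. I got the ordinary randomForest call (which builds the random forest model) to run in parallel like this: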

library(randomForest)
library(doMC)  # also attaches foreach and parallel (for detectCores)
nCores <- detectCores()  # number of cores on the machine
registerDoMC(nCores)
# Grow about 510 trees in total, split evenly across the cores, then merge them into one forest
rf.model <- foreach(ntree = rep(round(510 / nCores), nCores), .combine = combine,
                    .multicombine = TRUE, .packages = "randomForest") %dopar% {
  randomForest(y = outcome, x = predictor, ntree = ntree, mtry = 4,
               norm.votes = FALSE, importance = TRUE)
}

# Attempt to run rfcv the same way (rfcv names its data arguments trainx and trainy)
rf.model <- foreach(1:nCores, .packages = "randomForest") %dopar% {
  rfcv(trainx = predictor, trainy = outcome, scale = 4)
}

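randomForest runs in parallel seamlessly because the randomForest::combine function reduces the 4 rf objects (one per worker, taking nCores to be 4) into a single object. So in the first code example you simply train 4 forest models, each with a different random seed. With .combine=combine (implicitly .combine=randomForest::combine) you tell foreach to reduce the output list of 4 models using this dedicated combine function from the randomForest package.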
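To illustrate what that reduction does, here is a minimal sketch that leaves foreach out and calls randomForest::combine directly. The built-in iris data, the seed, and the 25 trees per forest are assumptions made for the example; they are not taken from the question.

```r
library(randomForest)

set.seed(42)
# Train 4 small forests of 25 trees each; they stand in for the
# per-worker forests that the foreach loop would return.
forests <- lapply(1:4, function(i) {
  randomForest(x = iris[, 1:4], y = iris$Species, ntree = 25, norm.votes = FALSE)
})

# Reduce the list of 4 forests to a single randomForest object.
merged <- do.call(randomForest::combine, forests)
merged$ntree  # total number of trees across the 4 forests
```

The merged object can be used with predict like any other forest, which is why growing the trees on separate workers and combining them afterwards works for randomForest.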
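rfcv has no combine function, and it would not make sense to simply combine four of its outputs either. In your code, foreach just runs the function 4 times and returns the outputs in a list. If you want to run rfcv in parallel, you could modify the function along these lines: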
my.rfcv <- randomForest::rfcv  # copy the function from the package into the global environment
fix(my.rfcv)  # inspect the function, and perhaps copy the entire function into your own source script

# rewrite the for-loop at lines 35-57 into a foreach loop
# write a reducer to combine the test results of each fold
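
The two rewrite steps above might look roughly like the sketch below. It is not the package's code: rfcv_foreach is a hypothetical, simplified variant written for illustration (classification only, no recursive selection, an explicit ntree argument instead of "...", variables ranked by MeanDecreaseAccuracy). It assumes a foreach backend has been registered; registerDoSEQ() is used here only so the example runs anywhere, and registerDoMC(nCores) from the question could take its place.

```r
library(randomForest)
library(foreach)

# Hypothetical helper: cross-validated error for shrinking predictor sets,
# with the loop over folds expressed as a foreach loop.
rfcv_foreach <- function(trainx, trainy, cv.fold = 5, step = 0.5, ntree = 500) {
  stopifnot(is.factor(trainy))
  n <- nrow(trainx)
  p <- ncol(trainx)

  # Predictor counts to evaluate: p, p * step, p * step^2, ..., down to 1
  k <- floor(log(p, base = 1 / step))
  n.var <- unique(pmax(1, c(round(p * step^(0:k)), 1)))

  # Assign every row to a fold, stratified by class
  fold <- integer(n)
  for (lv in levels(trainy)) {
    rows <- which(trainy == lv)
    fold[rows] <- sample(rep(seq_len(cv.fold), length.out = length(rows)))
  }

  # One task per fold: rank variables on the training part, then refit on
  # the top-m variables and predict the held-out part for each m in n.var
  fold.pred <- foreach(i = seq_len(cv.fold), .packages = "randomForest") %dopar% {
    test <- fold == i
    full <- randomForest(trainx[!test, , drop = FALSE], trainy[!test],
                         ntree = ntree, importance = TRUE)
    ranked <- order(full$importance[, "MeanDecreaseAccuracy"], decreasing = TRUE)
    preds <- lapply(n.var, function(m) {
      keep <- ranked[seq_len(m)]
      fit <- randomForest(trainx[!test, keep, drop = FALSE], trainy[!test],
                          ntree = ntree)
      predict(fit, trainx[test, keep, drop = FALSE])
    })
    list(test = test, preds = preds)
  }

  # Reducer: write each fold's held-out predictions back into one
  # full-length vector per predictor count
  predicted <- lapply(seq_along(n.var), function(j) {
    out <- trainy
    for (res in fold.pred) out[res$test] <- res$preds[[j]]
    out
  })
  error.cv <- sapply(predicted, function(x) mean(x != trainy))
  names(error.cv) <- names(predicted) <- n.var
  list(n.var = n.var, error.cv = error.cv, predicted = predicted)
}

registerDoSEQ()  # sequential backend; register a parallel one to use several cores
set.seed(1)
res <- rfcv_foreach(iris[, 1:4], iris$Species, cv.fold = 5, ntree = 100)
res$error.cv  # cross-validated error rate for each predictor count
```

The returned list mirrors the shape of rfcv's output (n.var, error.cv, predicted). With this layout the parallelism is capped at cv.fold tasks, so registering more workers than folds brings no further speedup.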