mclapply: all scheduled cores encountered errors in user code


r parallel-processing


Below is my code. I am trying to get a list of all files ending in .idat (~20000) and read each one with the function illuminaio::readIDAT.

library(illuminaio)
library(parallel)
library(data.table)

# number of cores to use
ncores = 8

# this gets all the files with .idat extension ~20000 files
files <- list.files(path = './',
                    pattern = "*.idat",
                    full.names = TRUE)

# function to read the idat file and create a data.table of filename, and two more columns
# write out as csv using fwrite
get.chiptype <- function(x)
{
  idat <- readIDAT(x)
  res <- data.table(filename = x, nSNPs = nrow(idat$Quants), Chip = idat$ChipType)
  fwrite(res, file.path = 'output.csv', append = TRUE)
}

# using mclapply call the function get.chiptype on all 20000 files.
# use 8 cores at a time
mclapply(files, FUN = function(x) get.chiptype(x), mc.cores = ncores)

How can I fix this?

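In some cases, a call to mclapply() requires you to specify a random number generator that supports multiple random-number streams. R version 2.14.0 implemented Pierre L'Ecuyer's multiple-stream pseudo-random number generator.

Try adding the following before the call to mclapply(), with a value assigned to my.seed beforehand: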

set.seed( my.seed, kind = "L'Ecuyer-CMRG" );
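
As a minimal sketch of what that call does, the snippet below seeds the L'Ecuyer-CMRG generator twice with the same value and compares two mclapply() runs. The seed value 42 and the helper draw() are hypothetical and appear only for illustration. The sketch shows reproducible random streams across workers; it does not establish whether seeding alone resolves the error in the question.

```r
library(parallel)

# Hypothetical seed value, chosen only for this illustration
my.seed <- 42

# Hypothetical worker: draws one uniform random number per task
draw <- function(i) runif(1)

# Forking is unavailable on Windows, where mc.cores must be 1
set.seed(my.seed, kind = "L'Ecuyer-CMRG")
first <- mclapply(1:4, draw, mc.cores = 2)

# Re-seed with the same value and repeat the identical call
set.seed(my.seed, kind = "L'Ecuyer-CMRG")
second <- mclapply(1:4, draw, mc.cores = 2)

# Compare the two runs element by element
same <- identical(first, second)
print(same)
```

With the default mc.preschedule = TRUE and mc.set.seed = TRUE, the two runs should match, because each worker's stream is derived from the seed set in the parent process.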

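What are destdir and destfile?

They are just the directory and file name that the data.table is written to. I will remove them.

Do you still get the error?

This may not be the problem, but I would be careful about appending to a single file from parallel processes. I am no expert, but it seems like a recipe for trouble. Do you know whether the file is locked somehow so that only one process can write at a time?

That is not the problem. Those are just paths to my source and destination files.

That is the problem with parallel processing.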
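
One way to sidestep the concern about concurrent appends, and to see the per-file error messages that the summary warning hides, is to have each worker return its row and write once from the parent process. The sketch below is an assumption-laden illustration, not the asker's code: summarise.one(), safe.one(), and the input names are hypothetical stand-ins for get.chiptype() and the real .idat files. It passes the output path through fwrite's documented file argument; the thread does not establish whether the file.path = argument in the question is the actual cause of the error.

```r
library(parallel)
library(data.table)

# Hypothetical stand-in for get.chiptype(): returns a one-row data.table
# instead of writing to disk, and fails on purpose for one input.
summarise.one <- function(x) {
  if (x == "bad.idat") stop("cannot read ", x)
  data.table(filename = x, nchars = nchar(x))
}

# Catch errors inside the worker so each failure comes back as a
# condition object for that element only.
safe.one <- function(x) tryCatch(summarise.one(x), error = function(e) e)

# Hypothetical input names, used only for illustration
inputs <- c("a.idat", "bad.idat", "ccc.idat")

# Forking is unavailable on Windows, where mc.cores must be 1
results <- mclapply(inputs, safe.one, mc.cores = 2)

# Flag the elements that came back as errors and report their messages
failed <- vapply(results, inherits, logical(1), what = "error")
for (e in results[failed]) message(conditionMessage(e))

# Combine the successful rows and write once, from the parent process only
combined <- rbindlist(results[!failed])
out.file <- file.path(tempdir(), "output.csv")
fwrite(combined, file = out.file)
```

Because only the parent process touches the output file, no file locking is needed, and the printed messages show which inputs failed and why.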
