
A faster way to run an operation on a large number of files in R


I have to run this operation on 10,200 text files:

s[s$POS == sample[tail(which(sample$obs_pval == min(sample$obs_pval)), 1), 1], ]
and, for each file, write one line of output like this:

        ID            CHROM      POS
20_49715203_T_C_b37    20      49715203
so that I end up with a single file containing 10,200 rows like the one above.

Right now my code looks like this:

fileNames <- Sys.glob("ENSG*.txt")   # file names only; each file is read inside the loop
s <- read.table("snpPos", header = TRUE)

for (fileName in fileNames) {

  # read original data:
  sample <- read.table(fileName,
                       header = TRUE,
                       sep = ",")

  # create new data based on contents of original file:
  allEQTLs <- data.frame(
    File = fileName,
    EQTLs = s[s$POS == sample[tail(which(sample$obs_pval == min(sample$obs_pval)), 1), 1], ])

  # write new data to separate file:
  write.table(allEQTLs,
              "EQTLs.txt",
              append = TRUE,
              sep = ",",
              row.names = FALSE,
              col.names = FALSE)
}

If reading and writing take up most of the run time, try fread and fwrite from the data.table package. (You can check whether that is actually the case with R's profiling tools, e.g. the Rprof function.) A sketch of the loop rewritten this way follows below.
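For illustration only, here is a minimal sketch of the loop above using fread/fwrite; the file names, separators, and column names are taken from the question, and data.table = FALSE keeps ordinary data.frame indexing so the subsetting line behaves as before:

library(data.table)

# Optional: profile one run first to confirm that I/O dominates, e.g.
# Rprof("prof.out"); <loop>; Rprof(NULL); summaryRprof("prof.out")

fileNames <- Sys.glob("ENSG*.txt")
s <- fread("snpPos", data.table = FALSE)   # separator and header are auto-detected

for (fileName in fileNames) {
  # fread is a much faster drop-in for read.table on many/large files
  sample <- fread(fileName, sep = ",", data.table = FALSE)

  allEQTLs <- data.frame(
    File  = fileName,
    EQTLs = s[s$POS == sample[tail(which(sample$obs_pval == min(sample$obs_pval)), 1), 1], ])

  # fwrite appends each result quickly without rewriting the whole file
  fwrite(allEQTLs, "EQTLs.txt", append = TRUE, sep = ",", col.names = FALSE)
}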

The readr package can also improve the performance of reading and writing text files; a similar sketch follows.
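As a sketch under the same assumptions about the input files (and assuming snpPos is whitespace-separated, matching the read.table defaults in the question); note that readr returns tibbles, which do not drop to a plain vector under [i, j] indexing, hence the [[ extraction:

library(readr)

fileNames <- Sys.glob("ENSG*.txt")
s <- read_table("snpPos")          # whitespace-separated with a header row

for (fileName in fileNames) {
  sample <- read_csv(fileName)     # comma-separated with a header row

  # tibbles keep [i, j] results as tibbles, so pull the POS value out with [[
  bestPos <- sample[[1]][tail(which(sample$obs_pval == min(sample$obs_pval)), 1)]

  allEQTLs <- data.frame(File = fileName, EQTLs = s[s$POS == bestPos, ])

  # append = TRUE also suppresses the header on subsequent writes
  write_csv(allEQTLs, "EQTLs.txt", append = TRUE)
}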