Performing an operation on a large number of files in R, faster
I have to run this operation on 10,200 text files:
s[s$POS == sample[tail(which(sample$obs_pval == min(sample$obs_pval)), 1), 1], ]
and, for each file, write one line of output per operation, like this:
ID CHROM POS
20_49715203_T_C_b37 20 49715203
So I will end up with a single file containing 10,200 rows like the ones above.
Right now, my code looks like this:
fileNames <- Sys.glob("ENSG*.txt")
s <- read.table("snpPos", header = TRUE)

for (fileName in fileNames) {
  # read original data:
  sample <- read.table(fileName,
                       header = TRUE,
                       sep = ",")
  # create new data based on contents of original file:
  allEQTLs <- data.frame(
    File = fileName,
    EQTLs = s[s$POS == sample[tail(which(sample$obs_pval == min(sample$obs_pval)), 1), 1], ])
  # append new data to a single output file:
  write.table(allEQTLs,
              "EQTLs.txt",
              append = TRUE,
              sep = ",",
              row.names = FALSE,
              col.names = FALSE)
}
If reading and writing the files takes most of the time, try fread and fwrite from the data.table package. (You can check whether that is the case with R's profiling tools, e.g. the Rprof function.) The readr package can also improve the performance of reading and writing text files.
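To make the suggestion concrete, here is a minimal, self-contained sketch of the same loop using data.table's fread/fwrite. The column names (obs_pval, POS) and the comma-separated layout follow the question; the tiny demo files written to a temp directory (a stand-in snpPos and one ENSG*.txt) are hypothetical, only there so the example runs on its own:

```r
library(data.table)

dir <- tempdir()
# hypothetical demo inputs standing in for snpPos and the ENSG*.txt files
fwrite(data.table(ID    = c("20_49715203_T_C_b37", "20_100_A_G_b37"),
                  CHROM = c(20L, 20L),
                  POS   = c(49715203L, 100L)),
       file.path(dir, "snpPos"))
fwrite(data.table(POS = c(49715203L, 100L), obs_pval = c(1e-8, 0.5)),
       file.path(dir, "ENSG0001.txt"))

s   <- fread(file.path(dir, "snpPos"), header = TRUE)
out <- file.path(dir, "EQTLs.txt")

for (fileName in Sys.glob(file.path(dir, "ENSG*.txt"))) {
  smp <- fread(fileName, header = TRUE)
  # last row index attaining the minimum p-value, as in the original code
  idx <- tail(which(smp$obs_pval == min(smp$obs_pval)), 1)
  # look up that position in s (data.table evaluates POS inside s)
  hit <- s[POS == smp[[1]][idx]]
  fwrite(cbind(File = basename(fileName), hit),
         out, append = TRUE, col.names = FALSE)
}
```

The structure mirrors the original loop one-to-one, so swapping read.table/write.table for fread/fwrite on the real 10,200 files requires no other changes; fread also auto-detects the separator, so the explicit sep = "," becomes optional.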