R: Efficiently splitting a data frame

Tags: r, split, dataframe, apply

I have this data frame:

set.seed(1)
n=20
df <- data.frame(s1 = paste(sample(0:3, n, replace = TRUE),sample(0:3, n, replace = TRUE),sep="/"),
                  s2 = paste(sample(0:3, n, replace = TRUE),sample(0:3, n, replace = TRUE),sep="/"),
                  s3 = paste(sample(0:3, n, replace = TRUE),sample(0:3, n, replace = TRUE),sep="/"),
                  stringsAsFactors = FALSE)

I want to split every value on "/", but I wonder whether there is a more efficient way to do it.
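For reference, a plain base R baseline might look like the sketch below. It is not the asker's original code; the helper name `split_part` and the small two-column input are hypothetical, chosen only to show the kind of per-cell `strsplit` work that the answers try to speed up.

```r
# Small hand-made input (hypothetical values, same "a/b" shape as df above)
df <- data.frame(s1 = c("0/1", "2/3"),
                 s2 = c("3/0", "1/1"),
                 stringsAsFactors = FALSE)

# Split every cell on "/" and keep the i-th piece of each cell.
# split_part is a hypothetical helper, not part of any package.
split_part <- function(d, i) {
  pieces <- lapply(d, function(col) {
    vapply(strsplit(col, "/", fixed = TRUE), function(p) p[i], character(1))
  })
  as.data.frame(pieces, stringsAsFactors = FALSE)
}

left  <- split_part(df, 1)  # parts before the "/"
right <- split_part(df, 2)  # parts after the "/"
```

Both results keep the column names and row count of the input, so they can be compared directly against the output of faster approaches.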

Here is a slightly unusual approach:

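library(data.table)

# Write the data frame with "/" as the field separator, so each "a/b" cell
# becomes two adjacent fields in the file
fwrite(df, sep = "/", quote = FALSE,
       col.names = FALSE, file = "df.txt")

# Each original column now spans two fields
NN <- 2L * ncol(df)

# Odd-numbered fields hold the part before "/", even-numbered the part after
DT1 <- fread("df.txt", sep = "/", select = seq(from = 1L, to = NN, by = 2L))
DT2 <- fread("df.txt", sep = "/", select = seq(from = 2L, to = NN, by = 2L))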

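Suggestion: use stri_split_fixed. Some benchmarks are shown below. (The code assumes you read the data in as a matrix, convert it to a character vector, split on "/", and then reshape the result with matrix(prevOutput, nrow=origNrow, ncol=2*origNcol).)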
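The matrix, vector, split, reshape sequence described in the parenthetical can be sketched roughly as follows. This is one possible reading of it, not the answerer's exact code: the small input is made up, and the column order of the final wide matrix (all first parts, then all second parts) is an assumption. It relies on `simplify = TRUE` so that `stri_split_fixed` returns a two-column character matrix.

```r
library(stringi)

# Small hand-made input (hypothetical values, same "a/b" shape as the question)
df <- data.frame(s1 = c("0/1", "2/3"),
                 s2 = c("3/0", "1/1"),
                 stringsAsFactors = FALSE)

m <- as.matrix(df)   # character matrix, origNrow x origNcol
v <- as.vector(m)    # column-major character vector

# One row per original cell, one column per side of the "/"
parts <- stri_split_fixed(v, "/", simplify = TRUE)

# Reshape each side back to the original layout
first  <- matrix(parts[, 1], nrow = nrow(m), dimnames = dimnames(m))
second <- matrix(parts[, 2], nrow = nrow(m), dimnames = dimnames(m))

# origNrow x (2 * origNcol); this column order is an assumption
wide <- cbind(first, second)
```

Splitting the whole vector in one vectorized call avoids a per-cell R-level loop.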

# Benchmark three ways of splitting "a/b" strings on "/"
options(stringsAsFactors = FALSE)
library(rbenchmark)
library(stringi)
library(tidyr)

set.seed(1)
ncols <- 1
nrows <- 10 * 1000
strdat <- paste(sample(0:3, nrows * ncols, replace = TRUE),
                sample(0:3, nrows * ncols, replace = TRUE), sep = "/")

benchmark(
    strsplitMtd = lapply(strdat, function(x) strsplit(x, "/")[[1]]),
    striMtd = stri_list2matrix(stri_split_fixed(strdat, "/"), byrow = TRUE),
    tidyrMtd = separate(data.frame(S = strdat), S, c("S1", "S2"), "/"))

Any chance you are working with genotype data?

You got me. It is not in the traditional VCF format, though. It only has CHROM.POS and GT fields. Any suggestions?