Convert a specific number of rows into columns in R, and repeat the process for a large dataset


I have a dataset with 15 million rows and only one column. It looks like this:

x_raw
A1
A2
A3
A4
B1
B2
B3
B4
C1
C2

I want to convert it to

A1 A2 A3 A4
B1 B2 B3 B4
C1 C2 C3 C4

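I tried a "for" loop that transposes every 4 rows and appends them to a "final" data frame, but because the dataset is so large it would iterate nearly 2.7 million times, which is inefficient. Is there a more efficient way to do this?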

Here is an option using tidyverse, where we separate "x_raw" into two columns and then spread it into "wide" format:

library(dplyr)
library(tidyr)
separate(df1, x_raw, into = c('x', 'rn'), sep="(?=\\d+)", remove = FALSE) %>%
       spread(rn, x_raw) %>% 
       select(-x)
#   1  2    3    4
#1 A1 A2   A3   A4
#2 B1 B2   B3   B4
#3 C1 C2 <NA> <NA>
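
As a hedged variant of the same idea, the sketch below derives the block and position of each value from its row number instead of parsing the string, and uses `pivot_wider()`, which supersedes `spread()` in tidyr 1.0.0 and later. The column names `grp` and `pos` and the `V` prefix are illustrative assumptions, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Sample data matching the question (10 values, so the last block is incomplete)
df1 <- data.frame(
  x_raw = c("A1", "A2", "A3", "A4", "B1", "B2", "B3", "B4", "C1", "C2"),
  stringsAsFactors = FALSE
)

wide <- df1 %>%
  mutate(
    grp = (row_number() - 1) %/% 4,     # 0-based index of the 4-row block
    pos = (row_number() - 1) %% 4 + 1   # position 1..4 inside the block
  ) %>%
  pivot_wider(names_from = pos, values_from = x_raw, names_prefix = "V") %>%
  select(-grp)

print(wide)
```

Because the grouping depends only on position, this works even when the values do not follow the letter-plus-digit pattern; missing cells in the last block are filled with NA.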

If you just want to convert it into a four-column data frame:

as.data.frame(matrix(df$x_raw, ncol = 4, byrow = TRUE))
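
A minimal sketch of this reshape on sample data follows. The helper name `to_wide` is hypothetical, and the length check is an added assumption: when the vector length is not a multiple of the column count, `matrix()` recycles values and only issues a warning, so the sketch stops instead.

```r
# Hypothetical helper (not from the original answer): reshape a vector into
# k columns, filling row by row, in one vectorized step instead of a loop.
to_wide <- function(x, k = 4) {
  # Refuse lengths that would make matrix() recycle values
  stopifnot(length(x) %% k == 0)
  as.data.frame(matrix(x, ncol = k, byrow = TRUE), stringsAsFactors = FALSE)
}

x_raw <- c("A1", "A2", "A3", "A4", "B1", "B2", "B3", "B4", "C1", "C2", "C3", "C4")
res <- to_wide(x_raw)
print(res)
```

The reshape itself is a single call on the whole vector, which is why it scales far better than transposing four rows at a time in a loop.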

Extend the length to the next multiple of 4 and put the values into a matrix:

# Pad with NA up to the next multiple of 4 (no extra row if already a multiple), then fill by row
matrix(`length<-`(dat$x_raw, ceiling(nrow(dat) / 4) * 4), ncol=4, byrow=TRUE)

#     [,1] [,2] [,3] [,4]
#[1,] "A1" "A2" "A3" "A4"
#[2,] "B1" "B2" "B3" "B4"
#[3,] "C1" "C2" NA   NA
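
The padding idea can also be sketched as a small helper. The name `pad_to_matrix` is hypothetical; the target length is computed with `ceiling()`, so a vector whose length is already a multiple of 4 is left unpadded.

```r
# Hypothetical helper: pad a vector with NA to a multiple of k, then reshape.
pad_to_matrix <- function(x, k = 4) {
  target <- ceiling(length(x) / k) * k  # next multiple of k (or the same length)
  length(x) <- target                   # extending a vector pads it with NA
  matrix(x, ncol = k, byrow = TRUE)
}

x10 <- c("A1", "A2", "A3", "A4", "B1", "B2", "B3", "B4", "C1", "C2")
x12 <- c(x10, "C3", "C4")

m10 <- pad_to_matrix(x10)  # last row is padded with two NA values
m12 <- pad_to_matrix(x12)  # already a multiple of 4, nothing is added
print(m10)
print(m12)
```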

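There is no C3 or C4 in your initial example.
@akrun The data continues like that, with C, D, and so on, up to 15 million rows.
If it is every four rows, wouldn't it be faster to convert the vector into a matrix?
@zacdav Yeah, I got that after seeing the comment. I'm new to this, so :)
I'm not sure the effort of fitting it into one line makes it easy to understand.
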
x_raw <- c("A1","A2","A3","A4","B1","B2","B3","B4","C1","C2","C3","C4","D1","D2","D3","D4")
# Fill a 4-column matrix row by row, then print it as a table without dimnames
x <- as.table(matrix(x_raw, ncol = 4, byrow = TRUE))
rownames(x) <- NULL
colnames(x) <- NULL
print(x)
     [,1] [,2] [,3] [,4]
[1,] A1   A2   A3   A4
[2,] B1   B2   B3   B4  
[3,] C1   C2   C3   C4 
[4,] D1   D2   D3   D4