Concatenate every 3 columns in a data frame in R
I have a large data frame with 713 columns and 10 rows, and I want to concatenate every 3 columns, starting from column 6. The variable names run from V1 to V713. The data looks like this:
> chr1[,1:10]
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14........
1 1 rs1 116 T G 1 0 0 0 1 0 0 1 0
2 1 rs2 118 G A 1 0 0 1 0 0 0 1 0
3 1 rs3 230 A G 1 0 0 1 0 0 0 1 0
Desired result:
V1 V2 V3 V4 V5 V6 V7 V8..........
1 1 rs1 116 T G 100 010 010
2 1 rs2 118 G A 100 100 010
3 1 rs3 230 A G 100 100 010
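How can I achieve this in R?
Assuming the columns to concatenate start at position 6, we subset them into a separate object ('df2'), split it into groups of three columns using a grouping variable created with gl, paste the elements of each row together (do.call(paste0, ...)) by looping over the resulting list of data.frames, cbind the result with the first 5 columns, and update the column names.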
df2 <- df1[6:ncol(df1)]
dfN <- cbind(df1[1:5],
             sapply(split.default(df2, as.integer(gl(ncol(df2), 3, ncol(df2)))),
                    function(x) do.call(paste0, x)))
colnames(dfN) <- paste0("V", seq_along(dfN))
dfN
# V1 V2 V3 V4 V5 V6 V7 V8
#1 1 rs1 116 T G 100 010 010
#2 1 rs2 118 G A 100 100 010
#3 1 rs3 230 A G 100 100 010
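To make the grouping step easier to follow, here is a minimal sketch on a hypothetical 6-column toy data frame. The object `toy` and its column names are made up for illustration and are not part of the original data; the calls themselves are the same base R functions used above.

```r
# Hypothetical toy data: two rows, six 0/1 columns (names are illustrative)
toy <- data.frame(a = c(1L, 1L), b = c(0L, 0L), c = c(0L, 1L),
                  d = c(0L, 1L), e = c(1L, 0L), f = c(0L, 0L))

# gl() builds a factor whose levels each repeat 3 times; truncated to
# ncol(toy) values it gives one group id per column: 1 1 1 2 2 2
grp <- as.integer(gl(ncol(toy), 3, ncol(toy)))

# split.default() splits the data frame by column, so each list element
# is a 3-column data frame
pieces <- split.default(toy, grp)

# paste0() is applied across the columns of each piece, row by row;
# sapply() collects the results into a character matrix
pasted <- sapply(pieces, function(x) do.call(paste0, x))
pasted   # row 1: "100" "010"; row 2: "101" "100"
```

Note that `split.default` is needed here: calling `split` on a data frame dispatches to the data frame method, which splits by rows rather than by columns.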
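Another option uses tidyverse: unite the columns V6 to V14 into a single string, then separate it again at character positions 3 and 6 (the data df1 is listed at the end). The column names and split positions here are written out for the 9-column example.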
library(tidyverse)
df1 %>%
unite(VNew, V6:V14, sep="") %>%
separate(VNew, into = c("V6", "V7", "V8"), sep=c(3, 6))
# V1 V2 V3 V4 V5 V6 V7 V8
#1 1 rs1 116 T G 100 010 010
#2 1 rs2 118 G A 100 100 010
#3 1 rs3 230 A G 100 100 010
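The `into` names and `sep` positions in the unite/separate call are written out by hand for the 9-column example. Below is a rough sketch of how they might be generated for a wider data frame. It assumes every value is exactly one character wide (as with the 0/1 columns here) and that the number of columns after the fifth is a multiple of 3. The names `n_groups`, `into` and `sep_pos` are illustrative, and the tidyr step only runs if the package is installed.

```r
# Same example data as in the question, rebuilt compactly
df1 <- data.frame(
  V1 = c(1L, 1L, 1L), V2 = c("rs1", "rs2", "rs3"), V3 = c(116L, 118L, 230L),
  V4 = c("T", "G", "A"), V5 = c("G", "A", "G"),
  V6 = c(1L, 1L, 1L), V7 = c(0L, 0L, 0L), V8 = c(0L, 0L, 0L),
  V9 = c(0L, 1L, 1L), V10 = c(1L, 0L, 0L), V11 = c(0L, 0L, 0L),
  V12 = c(0L, 0L, 0L), V13 = c(1L, 1L, 1L), V14 = c(0L, 0L, 0L),
  stringsAsFactors = FALSE
)

# Assumption: columns 6..ncol(df1) come in complete groups of three
n_groups <- (ncol(df1) - 5) / 3
stopifnot(n_groups == round(n_groups))

# New column names V6, V7, ... and the character positions to split at
into    <- paste0("V", 5 + seq_len(n_groups))
sep_pos <- seq(3, by = 3, length.out = n_groups - 1)

# Only attempt the tidyr step when the package is available
if (requireNamespace("tidyr", quietly = TRUE)) {
  united <- tidyr::unite(df1, "VNew", 6:ncol(df1), sep = "")
  result <- tidyr::separate(united, "VNew", into = into, sep = sep_pos)
  print(result)
}
```

Position-based splitting only works because each original value occupies a single character; with wider values the base R approach of pasting per group is the safer choice.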
Data:
df1 <- structure(list(V1 = c(1L, 1L, 1L), V2 = c("rs1", "rs2", "rs3"
), V3 = c(116L, 118L, 230L), V4 = c("T", "G", "A"), V5 = c("G",
"A", "G"), V6 = c(1L, 1L, 1L), V7 = c(0L, 0L, 0L), V8 = c(0L,
0L, 0L), V9 = c(0L, 1L, 1L), V10 = c(1L, 0L, 0L), V11 = c(0L,
0L, 0L), V12 = c(0L, 0L, 0L), V13 = c(1L, 1L, 1L), V14 = c(0L,
0L, 0L)), .Names = c("V1", "V2", "V3", "V4", "V5", "V6", "V7",
"V8", "V9", "V10", "V11", "V12", "V13", "V14"), class = "data.frame",
row.names = c("1", "2", "3"))