Collapsing duplicate columns with dplyr [r, dplyr]


I read the following data from a spreadsheet:

structure(list(x = c("a", NA, NA, "b", NA, NA, "c", NA), y = c(1, 
   NA, NA, 7, NA, NA, 13, NA), z = c(2, NA, NA, 8, NA, NA, 14, NA
), x.1 = c(NA, "a", "a", NA, "b", "b", NA, "c"), y.1 = c(NA, 
3, 5, NA, 9, 11, NA, 15), z.1 = c(NA, 4, 6, NA, 10, 12, NA, 16
)), .Names = c("x", "y", "z", "x.1", "y.1", "z.1"), row.names = c(NA, 
-8L), class = "data.frame")
When printed, it looks like this:

     x  y  z  x.1 y.1 z.1
1    a  1  2 <NA>  NA  NA
2 <NA> NA NA    a   3   4
3 <NA> NA NA    a   5   6
4    b  7  8 <NA>  NA  NA
5 <NA> NA NA    b   9  10
6 <NA> NA NA    b  11  12
7    c 13 14 <NA>  NA  NA
8 <NA> NA NA    c  15  16

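Sometimes there is more than one of these repeated blocks of three columns. If I don't know how many blocks there will be, but I do know that the columns will be named the same way, differing only in a (sequentially increasing) numeric suffix, how can I merge all of the data into the first 3 columns? Can this be done with dplyr?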

Using dplyr/tidyr:

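library(dplyr)
library(tidyr)
# add_rownames() is deprecated; rownames_to_column() keeps a row id for spread()
tibble::rownames_to_column(dfN, "rowname") %>%
         gather(Var, Val, -1) %>%                # long format; values become character
         mutate(Var=sub('\\..*$', '', Var)) %>%  # strip the ".1", ".2", ... suffix
         na.omit() %>%                           # drop the empty cells
         spread(Var, Val, convert=TRUE) %>%      # back to wide; restore numeric types
         select(-rowname) 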
#  x  y  z
#1 a  1  2
#2 a  3  4
#3 a  5  6
#4 b  7  8
#5 b  9 10
#6 b 11 12
#7 c 13 14
#8 c 15 16
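
For readers on a recent tidyr, the same reshape can probably be written in one step with pivot_longer(), which supersedes gather()/spread(). The following is only a sketch, not part of the original answer: the helper column name `block` and the regular expression are my own assumptions, and the data frame is rebuilt from the question so the snippet is self-contained. Because no character coercion takes place, y and z stay numeric.

```r
library(dplyr)
library(tidyr)

# Example data copied from the question
dfN <- data.frame(
  x   = c("a", NA, NA, "b", NA, NA, "c", NA),
  y   = c(1, NA, NA, 7, NA, NA, 13, NA),
  z   = c(2, NA, NA, 8, NA, NA, 14, NA),
  x.1 = c(NA, "a", "a", NA, "b", "b", NA, "c"),
  y.1 = c(NA, 3, 5, NA, 9, 11, NA, 15),
  z.1 = c(NA, 4, 6, NA, 10, 12, NA, 16),
  stringsAsFactors = FALSE
)

collapsed <- dfN %>%
  pivot_longer(
    everything(),
    # ".value" keeps the part before the dot as the output column name;
    # "block" (hypothetical helper name) receives the numeric suffix, if any
    names_to = c(".value", "block"),
    names_pattern = "^([^.]+)\\.?(.*)$",
    # drop the rows of a block in which x, y and z are all NA
    values_drop_na = TRUE
  ) %>%
  select(-block)

collapsed
```

The pattern does not depend on how many suffixed blocks exist, so additional columns such as x.2, y.2, z.2 should be handled without changes, provided the base names themselves contain no dot.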
Or using base R:

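# Group the column names by base name (suffix such as ".1" removed). Within each
# group, pmax(..., na.rm=TRUE) returns the single non-NA value per row
# (this relies on at most one block being filled in any given row).
dfN[c('x', 'y', 'z')] <- lapply(split(colnames(dfN), sub('\\..*$', '', 
            colnames(dfN))), function(nm) 
                  do.call(pmax, c(dfN[nm], na.rm=TRUE)) )
dfN[1:3]  # equivalently: dfN[c('x', 'y', 'z')]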
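
Since the question asks specifically about dplyr, a variant of the grouping idea above can be sketched with dplyr::coalesce(), which returns the first non-missing value across several vectors. This is an illustration under my own assumptions rather than the answerer's code: the names `base_names` and `collapsed` are hypothetical, and the data frame is rebuilt from the question so the snippet runs on its own.

```r
library(dplyr)

# Example data copied from the question
dfN <- data.frame(
  x   = c("a", NA, NA, "b", NA, NA, "c", NA),
  y   = c(1, NA, NA, 7, NA, NA, 13, NA),
  z   = c(2, NA, NA, 8, NA, NA, 14, NA),
  x.1 = c(NA, "a", "a", NA, "b", "b", NA, "c"),
  y.1 = c(NA, 3, 5, NA, 9, 11, NA, 15),
  z.1 = c(NA, 4, 6, NA, 10, 12, NA, 16),
  stringsAsFactors = FALSE
)

# Base name of every column: strip the ".1", ".2", ... suffix
base_names <- sub("\\..*$", "", names(dfN))

# For each base name, take the first non-missing value across its columns
collapsed <- lapply(unique(base_names), function(b) {
  do.call(coalesce, unname(as.list(dfN[base_names == b])))
})
collapsed <- as.data.frame(setNames(collapsed, unique(base_names)),
                           stringsAsFactors = FALSE)

collapsed
```

Unlike pmax(), coalesce() picks the first non-missing value rather than the largest, so the result does not depend on value ordering if a row ever has more than one block filled in.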

Data:

dfN <- structure(list(x = c("a", NA, NA, "b", NA, NA, "c", NA), y = c(1, 
 NA, NA, 7, NA, NA, 13, NA), z = c(2, NA, NA, 8, NA, NA, 14, NA
), x.1 = c(NA, "a", "a", NA, "b", "b", NA, "c"), y.1 = c(NA, 
 3, 5, NA, 9, 11, NA, 15), z.1 = c(NA, 4, 6, NA, 10, 12, NA, 16
)), .Names = c("x", "y", "z", "x.1", "y.1", "z.1"), row.names = c(NA, 
-8L), class = "data.frame")