R: Spread data.frame/tibble with shared keys and missing data


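I have a two-column table that I want to spread. I know this is a popular and thoroughly discussed topic, but I have tried several approaches and none gave me what I want. Any suggestions and criticism are welcome.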

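My table holds data for three women. There are 5 categories in total, and normally each category has a value. For some women, however, data are missing, and then the whole row is absent. Note that `Jane` has no `weight` entry.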

a = data.frame(categories = c("name", "sex", "age", "weight", "high", 
                              "name", "sex", "age", "high", 
                              "name", "sex", "age", "weight", "high"),
               values = c("Emma", "female", "32", "72", "175",
                          "Jane", "female", "28", "165",
                          "Emma", "female", "42", "63", "170")) 

   categories values
1        name   Emma
2         sex female
3         age     32
4      weight     72
5        high    175
6        name   Jane
7         sex female
8         age     28
9        high    165
10       name   Emma
11        sex female
12        age     42
13     weight     63
14       high    170
I want the entries of the `categories` column to become the columns, with one row per woman. There are two main problems:

1) The keys are shared: there are two Emmas (so I can't simply use `spread` or `reshape`).

2) Some categories may be missing, such as Jane's weight (so I can't simply use `pivot` or `split`).

In the end, I want to reshape the data into a table like this:

     name  sex    age  weight  high
     Emma  female 32   72      175
     Jane  female 28   NA      165
     Emma  female 42   63      170

Assuming `'name'` is always present for each entry, we can create an identifier column and reshape with `pivot_wider`:

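library(dplyr)

a %>%
  # number the records: the counter increases at every 'name' row
  group_by(grp = cumsum(categories == 'name')) %>%
  # one column per category; a category absent from a record becomes NA
  tidyr::pivot_wider(names_from = categories, values_from = values) %>%
  ungroup() %>%
  select(-grp)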

#  name  sex    age   weight high 
#  <chr> <chr>  <chr> <chr>  <chr>
#1 Emma  female 32    72     175  
#2 Jane  female 28    NA     165  
#3 Emma  female 42    63     170  
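
To see why the grouping works, it can help to look at the identifier on its own. The sketch below uses base R only and rebuilds `a` from the question; `grp` is the name used in the answer above, everything else is just for illustration.

```r
a <- data.frame(
  categories = c("name", "sex", "age", "weight", "high",
                 "name", "sex", "age", "high",
                 "name", "sex", "age", "weight", "high"),
  values = c("Emma", "female", "32", "72", "175",
             "Jane", "female", "28", "165",
             "Emma", "female", "42", "63", "170")
)

# TRUE wherever a new record starts; the running sum then numbers the records
grp <- cumsum(a$categories == "name")
grp
# [1] 1 1 1 1 1 2 2 2 2 3 3 3 3 3

# Rows per record: the second record (Jane) has only four rows,
# because her `weight` row is absent
table(grp)
```

Because the identifier depends only on row position, the two Emmas end up in different groups (1 and 3), which is what resolves the shared-key problem.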

Assuming every entry starts with `name`, here is a base R version that uses `magrittr` for readability:

library(magrittr)
split(a, cumsum(a$categories == "name")) %>% 
  lapply(function(x) setNames(x[[2L]], x[[1L]])[unique(a$categories)]) %>% 
  do.call(rbind, .) %>% 
  data.frame()

  name    sex age weight high
1 Emma female  32     72  175
2 Jane female  28   <NA>  165
3 Emma female  42     63  170
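
The step that fills in the missing category is the indexing of a named vector by the full set of category names: a name that is not present yields `NA`. A minimal sketch for Jane's block alone, assuming character values; the names `jane`, `all_cats` and `jane_row` are illustrative and not part of the answer.

```r
# Jane's record as a named character vector (no `weight` element)
jane <- c(name = "Jane", sex = "female", age = "28", high = "165")

# Full set of categories, in the order they first appear in the data
all_cats <- c("name", "sex", "age", "weight", "high")

# Indexing by name returns NA for any name that is not found
jane_row <- jane[all_cats]

# The missing element also loses its name, so set the names again
names(jane_row) <- all_cats
jane_row
```

In the answer above, the column names of the final result come from `rbind()`, which appears to take them from the first record; that record happens to be complete in this data set.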
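
Or with `data.table` (the pipe still comes from `magrittr`):

library(data.table)
library(magrittr)  # provides %>%
split(a, cumsum(a$categories == "name")) %>% 
  # transpose each record, using `categories` as the column names
  lapply(transpose, make.names = "categories") %>% 
  # stack the records; fill = TRUE pads missing columns with NA
  rbindlist(fill = TRUE)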
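
Once a per-record identifier exists, the shared-key problem from point 1 of the question goes away, so base `reshape()` can be used as well. This is an alternative sketch that does not appear in the answers above; the column name `id` and the object names `a2` and `wide` are illustrative.

```r
a <- data.frame(
  categories = c("name", "sex", "age", "weight", "high",
                 "name", "sex", "age", "high",
                 "name", "sex", "age", "weight", "high"),
  values = c("Emma", "female", "32", "72", "175",
             "Jane", "female", "28", "165",
             "Emma", "female", "42", "63", "170")
)

# Add a record identifier that increases at every "name" row
a2 <- transform(a, id = cumsum(categories == "name"))

# Long to wide: one row per id, one column per category
wide <- reshape(a2, idvar = "id", timevar = "categories", direction = "wide")

# Drop the helper column and the "values." prefix that reshape() adds
wide$id <- NULL
names(wide) <- sub("^values\\.", "", names(wide))
rownames(wide) <- NULL
wide
```

As in the other solutions, this relies on every record starting with a `name` row.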