R: unnest related list columns of different sizes


After parsing an XML file, I get data that looks like this:

example_df <-  
  tibble(id = "ABC",
         wage_type = "salary",
         name = c("Description","Code","Base",
                  "Description","Code","Base",
                  "Description","Code"),
         value = c("wage_element_1","51B","600",
                   "wage_element_2","51C","740",
                   "wage_element_3","51D"))

example_df 

# A tibble: 8 x 4
  id    wage_type name        value         
  <chr> <chr>     <chr>       <chr>         
1 ABC   salary    Description wage_element_1
2 ABC   salary    Code        51B           
3 ABC   salary    Base        600           
4 ABC   salary    Description wage_element_2
5 ABC   salary    Code        51C           
6 ABC   salary    Base        740           
7 ABC   salary    Description wage_element_3
8 ABC   salary    Code        51D      
So when I try to unnest the columns, it throws an error:

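example_df %>%
  # collapse each name into a list column first; unnest() needs these columns to exist
  pivot_wider(names_from = name, values_from = value, values_fn = list) %>%
  unnest(cols = c(Description, Code, Base))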

Error: Can't recycle `Description` (size 3) to match `Base` (size 2).
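I know this happens because tidyr functions do not recycle vectors of different lengths, but I could not find a way around it, nor a base R solution. I also tried building a data frame with an unlist(strsplit(as.character(x))) approach, but ran into the same column-length problem.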
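
To see where the size mismatch comes from, the list columns can be inspected directly. This is only a diagnostic sketch: it assumes the list columns are produced by a pivot_wider() call with values_fn = list, and the names wide and sizes are made up for illustration.

library(tibble)
library(tidyr)

example_df <-
  tibble(id = "ABC",
         wage_type = "salary",
         name = c("Description", "Code", "Base",
                  "Description", "Code", "Base",
                  "Description", "Code"),
         value = c("wage_element_1", "51B", "600",
                   "wage_element_2", "51C", "740",
                   "wage_element_3", "51D"))

# Assumption: one list column per name, one row per id / wage_type
wide <- pivot_wider(example_df,
                    names_from = name,
                    values_from = value,
                    values_fn = list)

# Count the elements held in each list column
sizes <- sapply(wide[c("Description", "Code", "Base")], lengths)
sizes

Description and Code each hold three values while Base holds two, which is the mismatch unnest() refuses to recycle.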

The desired output is as follows:

desired_df <- 
  tibble(
    id=c("ABC","ABC","ABC"),
    wage_type=c("salary","salary","salary"),
    Description = c("wage_element_1","wage_element_2","wage_element_3"),
    Code = c("51B","51C","51D"),
    Base = c("600","740",NA))

desired_df

# A tibble: 3 x 5
  id    wage_type Description    Code  Base 
  <chr> <chr>     <chr>          <chr> <chr>
1 ABC   salary    wage_element_1 51B   600  
2 ABC   salary    wage_element_2 51C   740  
3 ABC   salary    wage_element_3 51D   NA  

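I would suggest an approach using tidyverse functions. The issue you ran into comes from how the functions handle the repeated rows: every combination of id, wage_type and name occurs more than once, so the reshaped columns end up as lists. By creating an id variable such as id2, you can avoid list output in the final reshaped data: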

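library(tidyverse)

example_df %>% 
  arrange(name) %>%                    # sort by name; the new columns come out alphabetically
  group_by(id, wage_type, name) %>%
  mutate(id2 = row_number()) %>%       # number the repeats of each name
  ungroup() %>%
  pivot_wider(names_from = name, values_from = value) %>%
  select(-id2)                         # drop the helper column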
Output:

# A tibble: 3 x 5
  id    wage_type Base  Code  Description   
  <chr> <chr>     <chr> <chr> <chr>         
1 ABC   salary    600   51B   wage_element_1
2 ABC   salary    740   51C   wage_element_2
3 ABC   salary    NA    51D   wage_element_3
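
The id2 counter above pairs the k-th Base with the k-th Description, which works here because only the last element lacks a Base. If an element in the middle could lack one, a variant along the following lines may be safer. It is only a sketch: it assumes (this is not stated in the question) that every element starts with a Description row, and the helper column element_id is a made-up name.

library(dplyr)
library(tidyr)
library(tibble)

example_df <-
  tibble(id = "ABC",
         wage_type = "salary",
         name = c("Description", "Code", "Base",
                  "Description", "Code", "Base",
                  "Description", "Code"),
         value = c("wage_element_1", "51B", "600",
                   "wage_element_2", "51C", "740",
                   "wage_element_3", "51D"))

result <- example_df %>%
  group_by(id, wage_type) %>%
  # Assumption: each element begins with a "Description" row,
  # so a running count of those rows labels the element.
  mutate(element_id = cumsum(name == "Description")) %>%
  ungroup() %>%
  pivot_wider(names_from = name, values_from = value) %>%
  select(-element_id)   # drop the helper column

result

Because the original row order is kept, the columns come out as Description, Code, Base, matching desired_df, and the missing Base of the third element is filled with NA.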