
Merge rows when there is a match between two columns in R


I have a data frame such as:

Species   Family  Events  Groups
Monkey    A       6,7     G1,G2
Monkey    A,B     6,8,9   G1,G2,G4,G8,G12
Elephant  B       7,8     G6,G7
Elephant  C       9,10    G6
Dog       K       10      G90
Dog       L,M,N   8,10,9  G90,G91 
The idea is to merge rows within each Species when there is at least one match between their Events columns.

For example, in Monkey:

Species   Family  Events  Groups
Monkey    A       6,7     G1,G2
Monkey    A,B     6,8,9   G1,G2,G4,G8,G12
Event 6 and group G1 from row 1 also appear in row 2, so I merge them:

Species   Family  Events  Groups
Monkey    A,B     6,7,8,9 G1,G2,G4,G8,G12
Finally, the expected output is:

Species   Family  Events  Groups
Monkey    A,B     6,7,8,9 G1,G2,G4,G8,G12
Elephant  B       7,8     G6,G7
Elephant  C       9,10    G6
Dog       K,L,M,N 8,9,10  G90,G91
I did not merge Elephant because there is no match in the Events column.

Does anyone know the code for this? Thanks.

Here is the data:

structure(list(Species = structure(c(3L, 3L, 2L, 2L, 1L, 1L), .Label = c("Dog", 
"Elephant", "Monkey"), class = "factor"), Family = structure(1:6, .Label = c("A", 
"A,B", "B", "C", "K", "L,M,N"), class = "factor"), Events = structure(c(2L, 
3L, 4L, 6L, 1L, 5L), .Label = c("10", "6,7", "6,8,9", "7,8", 
"8,10,9", "9,10"), class = "factor"), Groups = structure(c(1L, 
3L, 2L, 4L, 5L, 6L), .Label = c(" G1,G2", " G6,G7", "G1,G2,G4,G8,G12", 
"G6", "G90", "G90,G91"), class = "factor")), class = "data.frame", row.names = c(NA, 
-6L))
Follow this strategy:

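library(tidyverse)

# Step 1: find the species whose rows have at least one event and one group in common.
df1 <- df %>% 
  group_by(Species) %>% 
  mutate(across(c(Family, Events, Groups), ~as.character(.))) %>%
  # intersect the comma-separated values across the rows of each species
  summarise(across(c(Events, Groups), ~ toString(Reduce(intersect, strsplit(., ','))))) %>%
  # an empty string means the rows have nothing in common
  filter(Events != "" & Groups != "") %>%
  select(Species) 

# Step 2: collapse the matching species with union, then add back the untouched rows.
df1 %>%
  left_join(df %>% mutate(across(c(Family, Events, Groups), ~as.character(.))), by = "Species") %>%
  group_by(Species) %>%
  summarise(across(c(Family, Events, Groups), ~ toString(Reduce(union, strsplit(., ','))))) %>%
  rbind(df %>% anti_join(df1, by = "Species"))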

# A tibble: 4 x 4
  Species  Family     Events     Groups             
  <fct>    <chr>      <chr>      <chr>              
1 Dog      K, L, M, N 10, 8, 9   G90, G91           
2 Monkey   A, B       6, 7, 8, 9 G1, G2, G4, G8, G12
3 Elephant B          7,8        G6,G7              
4 Elephant C          9,10       G6
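
To see why the first step keeps Monkey and Dog but drops Elephant, the intersect/union logic can be tried on the Events strings alone. This is only a minimal base R sketch, not part of the answer above; the helper names split_tokens, shared and combined are made up for illustration.

```r
# Hypothetical helper: split a comma-separated string into trimmed tokens.
split_tokens <- function(x) trimws(strsplit(as.character(x), ",")[[1]])

# Events values copied from the question's data.
monkey_events   <- c("6,7", "6,8,9")
elephant_events <- c("7,8", "9,10")

# Tokens that occur in every row of the group.
shared <- function(rows) Reduce(intersect, lapply(rows, split_tokens))

# All distinct tokens across the group, in order of first appearance.
combined <- function(rows) Reduce(union, lapply(rows, split_tokens))

shared(monkey_events)              # "6": the rows match, so Monkey is merged
shared(elephant_events)            # character(0): no match, Elephant is left alone
toString(combined(monkey_events))  # "6, 7, 8, 9"
```

Two things worth noting. The trimws() call is there because the posted data has a leading space in the labels " G1,G2" and " G6,G7", which would otherwise make " G1" and "G1" count as different values. Also, Reduce(intersect, ...) asks for a value common to all rows of a species, which is the same as a pairwise match only when a species has exactly two rows, as in this example.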

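1) separate, 2) pivot, 3) filter the records where max(n()) > 1, 4) string those records together, 5) merge them with the unfiltered records.

That was a mistake, sorry.

Wow, thank you so much for doing this and taking the time!!! This is very helpful. I have another similar question, maybe you can help with that too? If you have time, here is the post: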