R: Compare each row of one data frame with multiple rows of another data frame and get the result


I have two datasets, df1 and df2.

df1
c1  match   c3      c4
AA1 AB      cat     dog
AA1 CD      dfs     abd
AA1 EF      js      hn
AA1 GH      bsk     jtd
AA2 AB      cat     mouse
AA2 CD      adb     mop
AA2 EF      powas   qwert
AA2 GH      sms     mms
AA3 AB      i       j
AA3 CD      fgh     ejk
AA3 EF      mib     loi
AA3 GH      revit   roger

df2
match   d2      result
AB      cat     friendly
AB      mouse   enemy
CD      dfs     r1
CD      adb     r1
CD      fgh     r2
CD      ejk     r3
EF      mib     some_result
GH      sms     sent
GH      mms     sent
IJ      xxx     yyy
KL      crt     zzz
KL      rrr     qqq
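I want to match df1 and df2 on the column "match" and add two new columns, "result_c1" and "result_c2", to df1. result_c1 takes the corresponding result from df2 by first matching on the match column and then matching c3 in df1 against d2 in df2. result_c2 is obtained the same way, except that c4 in df1 is matched against d2 in df2. If there is no match, return "no_match". Is there an efficient way to do this?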

result
c1  match   c3      c4      result_c1   result_c2   
AA1 AB      cat     dog     friendly    no_match    
AA1 CD      dfs     adb     r1          r1          
AA1 EF      js      hn      no_match    no_match    
AA1 GH      bsk     jtd     no_match    no_match    
AA2 AB      cat     mouse   friendly    enemy       
AA2 CD      adb     mop     r1          no_match    
AA2 EF      powas   qwert   no_match    no_match    
AA2 GH      sms     mms     sent        sent        
AA3 AB      i       j       no_match    no_match    
AA3 CD      fgh     ejk     r2          r3          
AA3 EF      mib     loi     some_result no_match    
AA3 GH      revit   roger   no_match    no_match    
The data is attached below:

df1 <- data.frame(list(c1 = c("AA1", "AA1", "AA1", "AA1", "AA2", "AA2", "AA2", "AA2",
                      "AA3", "AA3", "AA3", "AA3"), match = c("AB", "CD", "EF", "GH", 
                                                             "AB", "CD", "EF", "GH", 
                                                             "AB", "CD", "EF", "GH"),
                      c3 = c("cat", "dfs", "js", "bsk", "cat", "adb", "powas", "sms", "i",
                      "fgh", "mib", "revit"), c4 = c("dog", "abd", "hn", "jtd", "mouse",
                                                     "mop", "qwert", "mms", "j", "ejk", "loi", "roger")))

df2 <- data.frame(list(match = c("AB", "AB", "CD", "CD", "CD", "CD", "EF", "GH", "GH", "IJ", "KL", "KL"), 
                       d2 = c("cat", "mouse", "dfs", "adb", "fgh", "ejk", "mib", "sms", "mms", "xxx", "crt", "rrr"),
                       result = c("friendly", "enemy", "r1", "r1", "r2", "r3", "some_result", "sent", "sent", "yyy", "zzz", "qqq")))

One way, using a custom function:

apply_fun <- function(x, y, r) {
   inds <- x %in% y
   if (any(inds)) r[match(x[which.max(inds)], y)] else "no_match"
}

library(dplyr)
df1 %>%
  left_join(df2, by = "match") %>%
  mutate_all(as.character) %>%
  group_by(c1, match) %>%
  summarise(result_c1 = apply_fun(c3, d2, result), 
            result_c2 = apply_fun(c4, d2, result))

#   c1    match result_c1   result_c2
#   <chr> <chr> <chr>       <chr>    
# 1 AA1   AB    friendly    no_match 
# 2 AA1   CD    r1          no_match 
# 3 AA1   EF    no_match    no_match 
# 4 AA1   GH    no_match    no_match 
# 5 AA2   AB    friendly    enemy    
# 6 AA2   CD    r1          no_match 
# 7 AA2   EF    no_match    no_match 
# 8 AA2   GH    sent        sent     
# 9 AA3   AB    no_match    no_match 
#10 AA3   CD    r2          r3       
#11 AA3   EF    some_result no_match 
#12 AA3   GH    no_match    no_match 
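The summarise() step above returns one row per group and drops c3 and c4. If the original columns should be kept, the same lookup can be sketched as two joins, one per column. This is a minimal sketch, not the answerer's method: it assumes dplyr 1.0 or later (for across()) and assumes every (match, d2) pair in df2 is unique, otherwise the joins would duplicate rows of df1. The name out is only illustrative.

```r
library(dplyr)

# Rebuild the question's data; stringsAsFactors = FALSE keeps the columns as character
df1 <- data.frame(
  c1    = rep(c("AA1", "AA2", "AA3"), each = 4),
  match = rep(c("AB", "CD", "EF", "GH"), times = 3),
  c3    = c("cat", "dfs", "js", "bsk", "cat", "adb", "powas", "sms",
            "i", "fgh", "mib", "revit"),
  c4    = c("dog", "abd", "hn", "jtd", "mouse", "mop", "qwert", "mms",
            "j", "ejk", "loi", "roger"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  match  = c("AB", "AB", "CD", "CD", "CD", "CD", "EF", "GH", "GH", "IJ", "KL", "KL"),
  d2     = c("cat", "mouse", "dfs", "adb", "fgh", "ejk", "mib", "sms", "mms",
             "xxx", "crt", "rrr"),
  result = c("friendly", "enemy", "r1", "r1", "r2", "r3", "some_result",
             "sent", "sent", "yyy", "zzz", "qqq"),
  stringsAsFactors = FALSE
)

out <- df1 %>%
  # first lookup: match on "match" and c3 == d2
  left_join(df2, by = c("match", "c3" = "d2")) %>%
  rename(result_c1 = result) %>%
  # second lookup: match on "match" and c4 == d2
  left_join(df2, by = c("match", "c4" = "d2")) %>%
  rename(result_c2 = result) %>%
  # rows without a partner in df2 come back as NA; replace them with "no_match"
  mutate(across(c(result_c1, result_c2), ~ coalesce(as.character(.x), "no_match")))
```

Here out has the same 12 rows as df1, with c3 and c4 intact and the two result columns appended.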

Here is a solution using base R:

df1$result_c1 = with(df1,ifelse(is.na(match(paste(match,c3),with(df2,paste(match,d2)))),
                                "no match",
                                as.character(df2$result[match(paste(match,c3),with(df2,paste(match,d2)))])))
df1$result_c2 = with(df1,ifelse(is.na(match(paste(match,c4),with(df2,paste(match,d2)))),
                                "no match",
                                as.character(df2$result[match(paste(match,c4),with(df2,paste(match,d2)))])))
which gives

> df1
    c1 match    c3    c4   result_c1 result_c2
1  AA1    AB   cat   dog    friendly  no match
2  AA1    CD   dfs   abd          r1        r1
3  AA1    EF    js    hn    no match  no match
4  AA1    GH   bsk   jtd    no match  no match
5  AA2    AB   cat mouse    friendly     enemy
6  AA2    CD   adb   mop    no match  no match
7  AA2    EF powas qwert    no match  no match
8  AA2    GH   sms   mms        sent      sent
9  AA3    AB     i     j    no match  no match
10 AA3    CD   fgh   ejk          r2        r3
11 AA3    EF   mib   loi some_result  no match
12 AA3    GH revit roger    no match  no match
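The base R answer computes the same match() call twice per column. As a minimal sketch of the same idea, the lookup can be wrapped in a small helper so that the index is computed once. The helper name lookup_result and its default argument are hypothetical, not part of the original answer, and the pasted key assumes that none of the values contain a space.

```r
# Rebuild the question's data; stringsAsFactors = FALSE keeps the columns as character
df1 <- data.frame(
  c1    = rep(c("AA1", "AA2", "AA3"), each = 4),
  match = rep(c("AB", "CD", "EF", "GH"), times = 3),
  c3    = c("cat", "dfs", "js", "bsk", "cat", "adb", "powas", "sms",
            "i", "fgh", "mib", "revit"),
  c4    = c("dog", "abd", "hn", "jtd", "mouse", "mop", "qwert", "mms",
            "j", "ejk", "loi", "roger"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  match  = c("AB", "AB", "CD", "CD", "CD", "CD", "EF", "GH", "GH", "IJ", "KL", "KL"),
  d2     = c("cat", "mouse", "dfs", "adb", "fgh", "ejk", "mib", "sms", "mms",
             "xxx", "crt", "rrr"),
  result = c("friendly", "enemy", "r1", "r1", "r2", "r3", "some_result",
             "sent", "sent", "yyy", "zzz", "qqq"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up (key, value) pairs in a table that has the
# columns match, d2 and result; pairs that are not found get `default`.
lookup_result <- function(keys, values, table, default = "no_match") {
  idx <- match(paste(keys, values), paste(table$match, table$d2))
  out <- as.character(table$result[idx])
  out[is.na(idx)] <- default
  out
}

df1$result_c1 <- lookup_result(df1$match, df1$c3, df2)
df1$result_c2 <- lookup_result(df1$match, df1$c4, df2)
```

With the data exactly as posted, row 2 has c4 = "abd", which has no counterpart under "CD" in df2, so this sketch returns no_match for its result_c2.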

I selected this as the accepted solution because it is faster than the other answers.