R: match strings and return the non-matching words

r string

I want to match a string of words between two columns and return the words that do not match.

Example data frame:

data = data.frame(animal1 = c("cat, dog, horse, mouse", "cat, dog, horse", "mouse, frog", "cat, dog, frog, cow"), animal2 = c("dog, horse, mouse", "cat, horse", "frog", "cat, dog, frog"))

I would like to add a new column, "unique_animal", so that the resulting data frame looks like this:

                 animal1           animal2 unique_animal
1 cat, dog, horse, mouse dog, horse, mouse           cat
2        cat, dog, horse        cat, horse           dog
3            mouse, frog              frog         mouse
4    cat, dog, frog, cow    cat, dog, frog           cow

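I have tried the code from this question:

The commas are not the problem; I can easily remove them. But the code stops working when the non-matching word is at the end of the string. For some reason, in that case it does not count the total number of elements. Do you know how to modify the code so that it does not do this? Or is there another approach?

Thank you all!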

After splitting the columns at ",\\s*", we can use map2 with setdiff:

library(dplyr)
library(purrr)
library(stringr)

data %>%
  mutate(unique_animal = map2_chr(
    # split each cell on a comma followed by optional whitespace
    strsplit(as.character(animal1), ",\\s*"),
    strsplit(as.character(animal2), ",\\s*"),
    # keep the words of animal1 that are absent from animal2, joined into one string
    ~ str_c(setdiff(.x, .y), collapse = ", ")
  ))
#                 animal1           animal2 unique_animal
#1 cat, dog, horse, mouse dog, horse, mouse           cat
#2        cat, dog, horse        cat, horse           dog
#3            mouse, frog              frog         mouse
#4    cat, dog, frog, cow    cat, dog, frog           cow
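
The same split-then-setdiff idea can also be sketched in base R, in case the tidyverse packages are not available. This is only a minimal sketch: the intermediate names words1 and words2 are made up for illustration, and it assumes both columns hold comma-separated words as in the sample data.

```r
# Same sample data as in the question
data <- data.frame(
  animal1 = c("cat, dog, horse, mouse", "cat, dog, horse", "mouse, frog", "cat, dog, frog, cow"),
  animal2 = c("dog, horse, mouse", "cat, horse", "frog", "cat, dog, frog"),
  stringsAsFactors = FALSE
)

# Split each cell into a character vector of words
# (words1 / words2 are illustrative names, not from the original answer)
words1 <- strsplit(data$animal1, ",\\s*")
words2 <- strsplit(data$animal2, ",\\s*")

# Row by row, keep the words of animal1 that do not occur in animal2
data$unique_animal <- mapply(
  function(x, y) paste(setdiff(x, y), collapse = ", "),
  words1, words2
)

print(data)
```

Because the comparison is done on whole words rather than on positions in the string, it should not matter whether the non-matching word sits at the start, middle, or end. Note that setdiff is one-directional: words that appear only in animal2 are not reported, and a row with no differences yields an empty string.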