Subtracting strings in R


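Is there a simple way to subtract strings between columns in a tibble or data.frame?

For example, in the tibble below, is there an easy way to create column b from columns a and c, similar to how I created c from a and b? (That is, c = a + b, so b = c - a.)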
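The example tibble itself is not shown in the question. A minimal reconstruction, inferred from the printed results in the answers (the exact construction is an assumption, not the asker's original code), might look like this:

```r
library(dplyr)  # provides tibble(), mutate() and %>%

# Reconstructed example data (assumption: inferred from the outputs in the answers)
ex1 <- tibble(
  a = rep(c("orange", "green", "grey"), times = 2),
  b = rep(c("ball", "hockey puck"), each = 3)
) %>%
  mutate(c = paste(a, b))  # c = a + b, joined with a single space
```

Here `paste()` plays the role of "addition", so the question is how to invert it.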


Any ideas?

You can write a function to do this:

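# Infix operator: remove from x the first match of any value in y,
# together with any surrounding whitespace
`%-%` <- function(x, y) sub(paste0("\\s*", y, "\\s*", collapse = "|"), "", x)

ex1$c %-% ex1$a  # to obtain b, i.e. c - a
[1] "ball"        "ball"        "ball"        "hockey puck" "hockey puck" "hockey puck"
ex1$c %-% ex1$b  # to obtain a, i.e. c - b
[1] "orange" "green"  "grey"   "orange" "green"  "grey"  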

Either of these two approaches should work:

ex1 %>% 
  rowwise() %>% 
  mutate( b = sub(a, "", c) %>% str_trim() )

# # A tibble: 6 x 3
#        a            b                  c
#    <chr>        <chr>              <chr>
# 1 orange         ball        orange ball
# 2  green         ball         green ball
# 3   grey         ball          grey ball
# 4 orange  hockey puck orange hockey puck
# 5  green  hockey puck  green hockey puck
# 6   grey  hockey puck   grey hockey puck

ex1 %>% mutate( b = str_replace(ex1$c, ex1$a, "") %>% str_trim() )

# # A tibble: 6 x 3
#        a           b                  c
#    <chr>       <chr>              <chr>
# 1 orange        ball        orange ball
# 2  green        ball         green ball
# 3   grey        ball          grey ball
# 4 orange hockey puck orange hockey puck
# 5  green hockey puck  green hockey puck
# 6   grey hockey puck   grey hockey puck

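This seems good, but what happens if the values in column "a" do not work as regular expressions? For example, suppose the value were "grey" instead of "green". I guess rowwise() avoids this, but what happens if the values in a contain special characters?

As I wrote in my answer, sub and str_replace both try to use the value in a to match the exact substring to be replaced in c. If you want to remove the first word on a non-exact match (for example, any colour in the set "orange|green|grey"), you need to use a regular expression pattern instead.

If there are special characters, you should wrap the pattern in fixed().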
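
As a sketch of the fixed() suggestion above: the data below is hypothetical (the tibble `ex2` and the value "a+b (x)" are made up for illustration), and `str_remove()` is used here as stringr's shorthand for `str_replace(..., "")`. With `fixed()`, the value in a is matched literally rather than interpreted as a regular expression, and stringr vectorises over the pattern, so rowwise() is not needed.

```r
library(dplyr)
library(stringr)

# Hypothetical data: the second value in `a` contains regex metacharacters
ex2 <- tibble(
  a = c("orange", "a+b (x)"),
  c = c("orange ball", "a+b (x) ball")
)

ex2 <- ex2 %>%
  mutate(
    # fixed() matches `a` as a literal string, so "+", "(" and ")"
    # are not treated as a quantifier or a group
    b = str_trim(str_remove(c, fixed(a)))
  )

ex2$b
```

Without fixed(), the second pattern would be read as a regular expression and would no longer match the literal text in c.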