If-else statement to subset a data frame if a value is found in a variable in R


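I have a data frame of three-word and two-word phrases, along with the count of how many times each phrase was found in a text. Here is some dummy data: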

   trig <- c("took my dog", "took my cat", "took my hat", "ate my dinner", "ate my lunch")
   trig_count <- c(3, 2, 1, 3, 1)
   big <- c("took my", "took my", "took my", "ate my", "ate my")
   big_count <- c(6,6,6,4,4)
   df <- data.frame(trig, trig_count, big, big_count)
   df$trig <- as.character(df$trig)
   df$big <- as.character(df$big)

          trig    trig_count   big    big_count
   1  took my dog          3  took my         6
   2  took my cat          2  took my         6
   3  took my hat          1  took my         6
   4  ate my dinner        3  ate my          4
   5  ate my lunch         1  ate my          4
But for strings that do match, it does not work. For example:

    match_test("looked for")
    match_test("took my")
returns

    "no match"
    "took my dog" "took my cat" "took my hat"
What I am looking for is:

           trig    trig_count   big    big_count
    1  took my dog          3  took my         6
    2  took my cat          2  took my         6
    3  took my hat          1  took my         6

Is there something about %in% that I am not understanding, or is it something else? Thanks very much for any guidance.

We can use str_detect:

library(stringr)
library(dplyr)
df %>% 
     filter(str_detect(big, "took my"))
#        trig trig_count     big big_count
#1 took my dog          3 took my         6
#2 took my cat          2 took my         6
#3 took my hat          1 took my         6

You don't need ifelse; you can simply subset the original df, as @Ronak Shah suggested:

df[grep("took my", df$big), ]
If you want to turn this into a function that still reports when there is no match, you can do the following:

match_test <- function(match_string) {

  subset_df <- df[grep(match_string, df$big), ]

  if (nrow(subset_df) < 1) {
    warning("no match")
  } else {
    subset_df
  }  

}

match_test("took my")
#          trig trig_count     big big_count
# 1 took my dog          3 took my         6
# 2 took my cat          2 took my         6
# 3 took my hat          1 took my         6
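On passing the pattern programmatically: grep() and grepl() accept any character value as the pattern, so a variable holding the string behaves the same as a quoted literal. The sketch below rebuilds the df from the question; the variable name pattern and the choice of fixed = TRUE (which treats the string literally rather than as a regular expression) are illustrative assumptions, not part of the answer above.

```r
# Rebuild the example data from the question
df <- data.frame(
  trig = c("took my dog", "took my cat", "took my hat", "ate my dinner", "ate my lunch"),
  trig_count = c(3, 2, 1, 3, 1),
  big = c("took my", "took my", "took my", "ate my", "ate my"),
  big_count = c(6, 6, 6, 4, 4),
  stringsAsFactors = FALSE
)

# The pattern lives in a variable (hypothetical name); no extra quoting is needed
pattern <- "took my"

# fixed = TRUE matches the string literally instead of as a regex
hits <- df[grepl(pattern, df$big, fixed = TRUE), ]
hits
```

grepl() returns a logical vector, so the subset is an empty data frame when nothing matches, which is what the nrow() check in the function above relies on.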
We can also try this:

library(stringr)
match_test <- function(x){
  res <- df[which(!is.na(str_match(df$big,x))),]
  if(nrow(res) == 0) return('no match')
  return(res)
}
match_test("looked for")
#[1] "no match"
match_test("took my")
#         trig trig_count     big big_count
#1 took my dog          3 took my         6
#2 took my cat          2 took my         6
#3 took my hat          1 took my         6
match_test("ate my")
#           trig trig_count    big big_count
#4 ate my dinner          3 ate my         4
#5  ate my lunch          1 ate my         4
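One thing to keep in mind with the pattern-based answers is that they match substrings, whereas == matches the whole value. The following is a small sketch of the difference, assuming the df from the question; the input "took" and the variable names are only illustrative.

```r
# Rebuild the example data from the question
df <- data.frame(
  trig = c("took my dog", "took my cat", "took my hat", "ate my dinner", "ate my lunch"),
  trig_count = c(3, 2, 1, 3, 1),
  big = c("took my", "took my", "took my", "ate my", "ate my"),
  big_count = c(6, 6, 6, 4, 4),
  stringsAsFactors = FALSE
)

x <- "took"

# Substring match: "took" occurs inside "took my", so three rows are kept
substring_rows <- df[grepl(x, df$big, fixed = TRUE), ]

# Exact match: no value of big equals "took", so no rows are kept
exact_rows <- df[df$big == x, ]

nrow(substring_rows)
nrow(exact_rows)
```

Which behavior is wanted depends on whether partial input should count as a match.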

Maybe this:
df[grep("took my", df$big), ]
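Thanks for the quick response, Ronak. I would like to use grep() instead, but I don't know how to use that function programmatically, i.e. grep(x, df$big) doesn't work because quotes are needed. Any ideas?

It will. Try match_test …

Thank you all for the input. With your help I have got the function doing what I need it to do, but I would still like to understand why my code didn't work, if anyone has any ideas...

I do need the result inside a function that returns the string "no match" (rather than a warning) when there is no match; otherwise I would just do df[df$big == x, ], although grep works too. Thank you, Phil.

In that case, replace warning() with return().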
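As for why the original attempt may have failed: the asker's function is not shown in this thread, so the following is only a guess at its shape. Assuming it wrapped the subset in ifelse(), the likely culprit is that ifelse() is vectorized and returns something shaped like its test. With a length-one test it cannot hand back a whole data frame. A plain if/else returns the chosen value intact. The function names match_ifelse and match_if are hypothetical.

```r
# Rebuild the example data from the question
df <- data.frame(
  trig = c("took my dog", "took my cat", "took my hat", "ate my dinner", "ate my lunch"),
  trig_count = c(3, 2, 1, 3, 1),
  big = c("took my", "took my", "took my", "ate my", "ate my"),
  big_count = c(6, 6, 6, 4, 4),
  stringsAsFactors = FALSE
)

# Hypothetical reconstruction of the failing approach (an assumption)
match_ifelse <- function(x) {
  ifelse(x %in% df$big, df[df$big == x, ], "no match")
}

# Plain if/else tests one condition and returns the selected value whole
match_if <- function(x) {
  if (x %in% df$big) df[df$big == x, ] else "no match"
}

bad <- match_ifelse("took my")
good <- match_if("took my")

is.data.frame(bad)   # the ifelse() result is not a data frame
is.data.frame(good)  # the if/else result is the full subset
```

Under this assumption, %in% itself behaves as expected; it is the ifelse() wrapper that truncates the result.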