R: keep the rows that contain both of two strings


Given a data frame like this, keep only the rows whose text contains both "hi" and "name":

df <- data.frame(id = c(1,2,3), text = c("hi my name is E","hi what's your name","name here"))

One simple answer and two more complex ones, which you only really need when you have more than two words to check:

library(tidyverse)

df %>% 
  filter(str_detect(text, 'hi') & str_detect(text, 'name'))

df %>% 
  filter(rowSums(outer(text, c('hi', 'name'), str_detect)) == 2)

df %>% 
  filter(reduce(c('hi', 'name'), ~ .x & str_detect(text, .y), .init = TRUE))
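To illustrate why the more general forms matter once there are more than two words, here is a minimal base R sketch of the same idea as the reduce() version above. The helper name `has_all_words` is hypothetical (it is not part of dplyr or stringr), and it treats each word as a fixed substring rather than a regular expression.

```r
# Hypothetical helper: TRUE for each element of `text` that contains every
# string in `words` (matched as fixed substrings, not regular expressions).
has_all_words <- function(text, words) {
  Reduce(
    function(acc, w) acc & grepl(w, text, fixed = TRUE),
    words,
    rep(TRUE, length(text))  # start from all TRUE, like .init = TRUE
  )
}

df <- data.frame(
  id = c(1, 2, 3),
  text = c("hi my name is E", "hi what's your name", "name here")
)

# Two words: keeps the rows containing both "hi" and "name"
two_words <- df[has_all_words(df$text, c("hi", "name")), ]

# Three words: only the vector of words changes
three_words <- df[has_all_words(df$text, c("hi", "name", "your")), ]
```

With the example data, `two_words` holds rows 1 and 2 and `three_words` holds only row 2, since "your" appears only there.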

We can also use a regular expression that matches either "hi" followed by "name" or ("|") "name" followed by "hi":

library(dplyr)
library(stringr)
df %>% 
     filter(str_detect(text, 'hi\\b.*\\bname|name\\b.*\\bhi'))
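The effect of the `\\b` word boundaries can be checked on a few strings. The sketch below uses base R `grepl()` with `perl = TRUE` instead of `str_detect()`, and it puts a boundary on both sides of each word, which is slightly stricter than the pattern above. The test strings are made up for illustration.

```r
# Illustrative strings; the last one contains "hi" and "name" only as
# parts of a longer word.
x <- c("hi my name is E", "name here hi", "hihonames")

# No boundaries: "hihonames" matches because "hi" is followed by "name"
loose <- grepl("hi.*name|name.*hi", x, perl = TRUE)

# \\b on both sides of each word: only standalone "hi" and "name" count
strict <- grepl("\\bhi\\b.*\\bname\\b|\\bname\\b.*\\bhi\\b", x, perl = TRUE)
```

Here `loose` is TRUE for all three strings, while `strict` is FALSE for "hihonames".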

In base R:

df[grep(".*hi.*name.*", df$text), ]

Output:

  id                text
1  1     hi my name is E
2  2 hi what's your name
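Note that the `grep()` pattern above only matches rows where "hi" comes before "name". If the order should not matter, one possible base R sketch uses two lookaheads with `perl = TRUE`. Row 4 is added here purely for illustration and is not part of the original data.

```r
df <- data.frame(
  id = c(1, 2, 3, 4),
  text = c("hi my name is E", "hi what's your name", "name here",
           "my name, hi")  # row 4: illustrative, "name" comes before "hi"
)

# Ordered pattern: requires "hi" somewhere before "name"
ordered <- df[grep(".*hi.*name.*", df$text), ]

# Lookaheads: each (?=.*word) only asserts that the word occurs somewhere,
# so the two words may appear in either order
either_order <- df[grepl("^(?=.*hi)(?=.*name)", df$text, perl = TRUE), ]
```

With this data, `ordered` keeps rows 1 and 2, while `either_order` also keeps row 4.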

Another dplyr and stringr option could be:

df %>%
 filter(lengths(str_match_all(text, "name|hi")) == 2)

  id                text
1  1     hi my name is E
2  2 hi what's your name

A caveat raised in the comments: a pattern without word boundaries also matches something like "hihonames", which of course does not contain "hi" and "name" as separate words. The pattern with \\b boundaries shown earlier avoids this.

Or:

df %>%
 rowwise() %>%
 filter(all(c("name", "hi") %in% unlist(str_extract_all(text, "name|hi"))))

Or:

df %>%
 filter(str_count(text, "name|hi") == 2)
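One thing worth checking before relying on a count of exactly 2: a string that repeats one word also produces two matches. The sketch below reproduces the count in base R with `gregexpr()` and `regmatches()` (an assumed stand-in for `str_count()`, which should give the same counts here) and compares it with a check that each word is present. The test strings are illustrative.

```r
x <- c("hi my name is E", "hi hi there", "name here")

# Count-based check: number of matches of "name|hi" in each string
n_matches <- lengths(regmatches(x, gregexpr("name|hi", x)))
by_count <- n_matches == 2

# Presence-based check: each word must occur at least once
by_presence <- grepl("hi", x, fixed = TRUE) & grepl("name", x, fixed = TRUE)
```

"hi hi there" passes the count-based check although it never contains "name". The presence-based check rejects it. Conversely, a string such as "hi hi name" has three matches, so a count of exactly 2 would drop it.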