The AND operator in regular expressions in R

r regex


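I'm trying to write an expression that takes a large block of paragraphs and finds the lines that contain two specific words on the same line, so I'm looking for an AND operator. Is there a way to do this?

For example: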

c <- ("She sold seashells by the seashore, and she had a great time while doing so.")

Any ideas?

Thanks a lot.

Assuming the order of the words doesn't matter, you can create two capture groups:

grep("(sold|great)(?:.+)(sold|great)", c, value = TRUE)
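One caveat with the two-group pattern: because each group accepts either word, it also matches a line in which the same word simply appears twice (sold ... sold). A lookahead-based pattern is a common alternative. The following is only a minimal sketch, assuming the PCRE engine via perl = TRUE; the vector x and its sample lines are made up for illustration.

```r
# Hypothetical sample lines; only the first contains both words
x <- c(
  "She sold seashells by the seashore, and she had a great time while doing so.",
  "She sold seashells and then sold some more.",
  "She had a great time."
)

# Each (?=.*word) lookahead is tested from the start of the string,
# so a line must contain both words, in either order
both <- grepl("^(?=.*sold)(?=.*great)", x, perl = TRUE)

# The two-group pattern, run on the same engine for comparison:
# it also accepts the second line, where "sold" occurs twice
two_groups <- grepl("(sold|great)(?:.+)(sold|great)", x, perl = TRUE)

x[both]
```

Lookaheads are not available in R's default regex engine, which is why perl = TRUE is needed for the first pattern.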

The linked duplicate post may get you started, but I don't think it directly answers your question.

You can combine stringr::str_detect with all:

pos <- ("She sold seashells by the seashore, and she had a great time while doing so.") # contains sold and great
neg <- ("She bought seashells by the seashore, and she had a great time while doing so.") # contains great

pattern <- c("sold", "great")

library(stringr)
all(str_detect(pos,pattern))
# [1] TRUE

all(str_detect(neg,pattern))
# [1] FALSE
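Note that all() collapses the result to a single TRUE or FALSE, so the call above suits one string at a time. Since the question is about picking lines out of a larger text, a per-line check may be closer to what is needed. Here is a minimal base R sketch; the names txt, hits and has_all are illustrative, and fixed = TRUE is an assumption that the words should be treated as literal text rather than as regular expressions.

```r
# Two sample lines, one per element
txt <- c(
  "She sold seashells by the seashore, and she had a great time while doing so.",
  "She bought seashells by the seashore, and she had a great time while doing so."
)
pattern <- c("sold", "great")

# One logical vector per word, then combine them element-wise with &
hits <- lapply(pattern, function(p) grepl(p, txt, fixed = TRUE))
has_all <- Reduce(`&`, hits)

# Keep only the lines that contain every word
txt[has_all]
```

If the text arrives as one long string, it could first be split into lines, for example with strsplit(text, "\n")[[1]].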

Although in most cases I would use the stringr package, as already suggested in CPak's answer, there is also a solution using grep:
# create the sample string
c <- ("She sold seashells by the seashore, and she had a great time while doing so.")

# match any sold and great string within the text
# ignore case so that Sold and Great are also matched
grep("(sold.*great|great.*sold)", c, value = TRUE, ignore.case = TRUE)

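Note that this pattern also matches longer words that merely contain sold, such as soldier. You may therefore want to use word boundaries, i.e. match whole words only:
# \\b matches a word boundary (the start or end of a word)
grep("(\\bsold\\b.*\\bgreat\\b|\\bgreat\\b.*\\bsold\\b)", c, value = TRUE, ignore.case = TRUE)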

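\\b matches at a position rather than a character: before the first character of the string, after the last character, or between two characters where one is a word character and the other is not.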
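To make the effect of the word boundary concrete, here is a small sketch; the test words are made up for illustration.

```r
# Hypothetical test words
words <- c("sold", "soldier", "unsold", "sold!")

# Without boundaries, every element containing "sold" matches
plain <- grepl("sold", words)

# With \\b on both sides, only the standalone word matches;
# "sold!" still matches because "!" is not a word character
bounded <- grepl("\\bsold\\b", words)
```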
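Thanks, but I'm actually looking for a line that contains both words, not either word. If the line doesn't contain sold, I don't want it returned.

@intern14, sorry, I misunderstood. See my edit above.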
# set up an alternative string that contains "soldier" but not "sold"
d <- ("She saw soldier eating seashells by the seashore, and she had a great time while doing so.")
# without word boundaries, "soldier" is matched, so d is returned:
grep("(sold.*great|great.*sold)", d, value = TRUE, ignore.case = TRUE)

# \\b matches a word boundary, so "soldier" no longer counts as "sold"
# and nothing is returned (character(0)):
grep("(\\bsold\\b.*\\bgreat\\b|\\bgreat\\b.*\\bsold\\b)", d, value = TRUE, ignore.case = TRUE)
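The alternation approach needs every ordering of the words spelled out, which grows quickly with more than two words. As a sketch only, a small helper can check each word separately with word boundaries and combine the results. The name contains_all_words is hypothetical (it is not part of base R or stringr), and the helper assumes the words contain no regex metacharacters.

```r
# Hypothetical helper: TRUE for each element of x that contains every
# word in `words` as a whole word, in any order
contains_all_words <- function(x, words, ignore.case = TRUE) {
  # Assumption: the words contain no regex metacharacters
  patterns <- paste0("\\b", words, "\\b")
  hits <- lapply(patterns, function(p) {
    grepl(p, x, ignore.case = ignore.case, perl = TRUE)
  })
  # A line qualifies only if every word matched
  Reduce(`&`, hits)
}

c_str <- "She sold seashells by the seashore, and she had a great time while doing so."
d_str <- "She saw soldier eating seashells by the seashore, and she had a great time while doing so."

res <- contains_all_words(c(c_str, d_str), c("sold", "great"))
```

Checking the words one at a time keeps the pattern for each word simple, at the cost of scanning the text once per word.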