AND operator in regular expressions in R
I'm trying to write an expression that takes a large number of paragraphs and finds the lines that contain two specific words, so I'm looking for an AND operator. Is there a way to do this? For example:
c <- ("She sold seashells by the seashore, and she had a great time while doing so.")
Any ideas?
Thanks a lot.

Assuming the order of the words doesn't matter, you can create two capture groups:
grep("(sold|great)(?:.+)(sold|great)", c, value = TRUE)
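One caveat with the alternation-based pattern above is that it also accepts a line in which the same word appears twice (sold ... sold). A common way to express a true AND is a pair of lookaheads. The following is only a minimal sketch: the lines vector and the both variable are made up for illustration, and it assumes the PCRE engine selected by perl = TRUE.

```r
# Hypothetical sample lines; only the first contains both words.
lines <- c(
  "She sold seashells by the seashore, and she had a great time while doing so.",
  "She bought seashells by the seashore, and she had a great time while doing so.",
  "She sold seashells by the seashore."
)

# Each (?=...) is a lookahead that is checked from the start of the line,
# so the line must contain both words, in either order.
both <- "^(?=.*sold)(?=.*great)"

# perl = TRUE selects the PCRE engine, which supports lookaheads.
grep(both, lines, value = TRUE, perl = TRUE)
```

Adding a third required word only needs one more (?=.*word) group, which is easier to maintain than listing every possible ordering.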
The duplicate post may get you started, but I don't think it directly addresses your problem.

You can use stringr::str_detect together with all:
pos <- ("She sold seashells by the seashore, and she had a great time while doing so.") # contains sold and great
neg <- ("She bought seashells by the seashore, and she had a great time while doing so.") # contains great
pattern <- c("sold", "great")
library(stringr)
all(str_detect(pos,pattern))
# [1] TRUE
all(str_detect(neg,pattern))
# [1] FALSE
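The all() check above collapses the result to a single TRUE or FALSE, so it suits one string at a time. When the input is a vector of lines, a per-line answer is usually what is wanted. Here is a rough sketch in base R; the names lines, words, hits and has_all are made up for this example, and fixed = TRUE assumes the words are literal text rather than patterns.

```r
# Hypothetical input: one element per line of text.
lines <- c(
  "She sold seashells by the seashore, and she had a great time while doing so.",
  "She bought seashells by the seashore, and she had a great time while doing so.",
  "She sold seashells by the seashore."
)
words <- c("sold", "great")

# One logical vector per word: does each line contain that word?
hits <- lapply(words, function(w) grepl(w, lines, fixed = TRUE))

# Combine the vectors element-wise with &, giving one TRUE/FALSE per line.
has_all <- Reduce(`&`, hits)

# Keep only the lines that contain every word.
lines[has_all]
```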
Although in most cases I would use the stringr package, as already suggested in CPak's answer, there is also a base R solution:
# create the sample string
c <- ("She sold seashells by the seashore, and she had a great time while doing so.")
# match any sold and great string within the text
# ignore case so that Sold and Great are also matched
grep("(sold.*great|great.*sold)", c, value = TRUE, ignore.case = TRUE)
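This pattern also matches longer words that merely contain sold or great, such as soldier, so you may want to use word boundaries, i.e. match whole words only: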
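# alternative string that contains "soldier" but not the word "sold"
d <- ("She saw soldier eating seashells by the seashore, and she had a great time while doing so.")
# \\b matches a word boundary, so only the whole words sold and great count
grep("(\\bsold\\b.*\\bgreat\\b|\\bgreat\\b.*\\bsold\\b)", d, value = TRUE, ignore.case = TRUE)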
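\\b matches a position rather than a character: before the first character of the string or after the last one (when that character is a word character), or between two characters where one is a word character and the other is not.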
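To make the effect of the boundaries concrete, here is a small illustration. The test vector x and the name whole are made up for this example.

```r
# Hypothetical test strings: the word alone, inside longer words, and in a sentence.
x <- c("sold", "soldier", "unsold", "She sold it.")

# TRUE only where "sold" stands as a whole word.
whole <- grepl("\\bsold\\b", x)
whole
# [1]  TRUE FALSE FALSE  TRUE
```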
For details on the \b metacharacter, see:
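Thanks, but I'm actually looking for lines that contain both words, not either one of them. If a line doesn't contain sold, I don't want that line returned.
@intern14, sorry, I misunderstood. See my edit above.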
# set up alternative string
d <- ("She saw soldier eating seashells by the seashore, and she had a great time while doing so.")
# even soldier is matched here:
grep("(sold.*great|great.*sold)", d, value = TRUE, ignore.case = TRUE)
# \\b matches a word boundary, so soldier is no longer matched here:
grep("(\\bsold\\b.*\\bgreat\\b|\\bgreat\\b.*\\bsold\\b)", d, value = TRUE, ignore.case = TRUE)