R regex to match between two strings or up to the end of the string
I have a string like this:
a = "USER 2020-02-09 01:13SOMETHING INTERESTING HERE.USER 2020-02-10 08:30and something else comes here"
I want to extract everything between the HH:MM time and the keyword "USER".
If I use a regex with stringr::str_extract_all to find what lies between the time and USER, I get:
str_extract_all(a, pattern = '([0-9]{2,}:[0-9]{2,})(.*)(?=USER)')
# [[1]]
# [1] "01:13SOMETHING INTERESTING HERE."
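What can I add to the regex so that it searches either between HH:MM and USER or between HH:MM and the end of the string, so that I also get "08:30and something else comes here"?
We can use regex lookarounds for this: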
library(stringr)
str_extract(a, "(?<=\\b\\d{2}:\\d{2}).*(?=USER)")
#[1] "SOMETHING INTERESTING HERE."
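You can either match USER or assert the end of the string with
(?:\bUSER|$)
and use a capture group (.*?) for the text between the time and that point.
For example, after loading the package
library(stringr)
and taking the first capture group from str_match_all, the output is
[1] "SOMETHING INTERESTING HERE." "and something else comes here"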
Brilliant, thanks a lot. Especially for the str_extract_all() variants:
str_extract_all(a, "(?<=\\b\\d{2}:\\d{2})[^0-9]+(?=(USER)|$)")
#[[1]]
#[1] "SOMETHING INTERESTING HERE." "and something else comes here"
str_extract_all(a, "\\b\\d{2}:\\d{2}[^0-9]+(?=(USER)|$)")
#[[1]]
#[1] "01:13SOMETHING INTERESTING HERE." "08:30and something else comes here"
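One caveat with the two calls above: `[^0-9]+` cannot cross a digit, so an entry whose text contains a number is dropped. The sketch below illustrates this on a made-up string `b` (not from the original question) and compares it with a lazy `.*?`, which is one possible alternative rather than the only fix.

```r
library(stringr)

# Hypothetical input: same layout as `a`, but the first message contains a digit.
b <- "USER 2020-02-09 01:13ROOM 4 IS FREE.USER 2020-02-10 08:30and something else"

# [^0-9]+ stops at the "4", the lookahead then fails, so the first entry is lost.
res_class <- str_extract_all(b, "(?<=\\b\\d{2}:\\d{2})[^0-9]+(?=(USER)|$)")[[1]]
print(res_class)
# [1] "and something else"

# A lazy .*? keeps expanding until the next "USER" or the end of the string.
res_lazy <- str_extract_all(b, "(?<=\\b\\d{2}:\\d{2}).*?(?=USER|$)")[[1]]
print(res_lazy)
# [1] "ROOM 4 IS FREE."    "and something else"
```

The lazy version assumes the literal text "USER" never appears inside a message; if it can, a stricter delimiter such as "USER" followed by a date would be needed.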
The full pattern for the capture-group approach, followed by the example:
[0-9][0-9]:[0-9][0-9](.*?)(?:\bUSER|$)
a = "USER 2020-02-09 01:13SOMETHING INTERESTING HERE.USER 2020-02-10 08:30and something else comes here"
str_match_all(a, "[0-9][0-9]:[0-9][0-9](.*?)(?:\\bUSER|$)")[[1]][, 2]
[1] "SOMETHING INTERESTING HERE." "and something else comes here"
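If the time is needed alongside the text, the same idea can be extended with a second capture group. This is only a sketch: the data frame name `entries` and its column names are made up for illustration, and a lookahead is used here instead of the non-capturing group so that "USER" is not consumed.

```r
library(stringr)

a <- "USER 2020-02-09 01:13SOMETHING INTERESTING HERE.USER 2020-02-10 08:30and something else comes here"

# Group 1 captures the HH:MM time; group 2 lazily captures the text
# up to the next "USER" or the end of the string.
m <- str_match_all(a, "\\b(\\d{2}:\\d{2})(.*?)(?=USER|$)")[[1]]

# Column 1 of `m` is the full match, columns 2 and 3 are the two groups.
entries <- data.frame(time = m[, 2], text = m[, 3], stringsAsFactors = FALSE)
print(entries)
```

Each row of `entries` then pairs one time with the text that follows it.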