Extracting day numbers from a string in R


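I want to extract the day numbers that appear in a string into a list. I would appreciate it if someone could suggest a simple way to do this.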

x<- 'At 02:04 AM, 09:04 AM, 03:04 PM and 08:04 PM, on day 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10 of the month'

We can try matching the following pattern:

\b\d{1,2}\b(?!:\d{2})

Sample script:

x <- "At 02:04 AM, 09:04 AM, 03:04 PM and 08:04 PM, on day 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10 of the month"
m <- gregexpr(" \\b\\d{1,2}\\b(?!:\\d{2})", x, perl=TRUE)
regmatches(x, m)[[1]]

[1] " 21" " 22" " 23" " 24" " 25" " 26" " 27" " 28" " 29" " 30" " 31" " 1" 
[13] " 2"  " 3"  " 4"  " 5"  " 6"  " 7"  " 8"  " 9"  " 10"

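Note that the negative lookahead (?!:\d{2}) is essential here, because it keeps the hour part of each timestamp (the 02 in 02:04) from being matched. The script also puts a space in front of the pattern, which keeps the minutes after each colon out of the result; that is why every match carries a leading space.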
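
If the leading spaces are unwanted, one possible variation is to replace the literal space with a negative lookbehind and convert the matches to integers. This is only a sketch, not part of the answer above: the lookbehind (?<!:) and the as.integer() conversion are additions, and it assumes the only colons in the string are those inside the timestamps.

```r
x <- "At 02:04 AM, 09:04 AM, 03:04 PM and 08:04 PM, on day 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10 of the month"

# (?<!:) rejects numbers that directly follow a colon (the minutes),
# (?!:\d{2}) rejects numbers that are followed by ":dd" (the hours).
m <- gregexpr("(?<!:)\\b\\d{1,2}\\b(?!:\\d{2})", x, perl = TRUE)

# regmatches() returns a list with one element per input string
days <- as.integer(regmatches(x, m)[[1]])
days
```

Because nothing but digits is matched, the result can be converted to a numeric type directly, without trimming whitespace first.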

I would do it like this:

library(stringr)
days <- c(
  # numbers followed by a comma; str_extract_all() returns a list, so take [[1]] first
  as.numeric(str_extract(str_extract_all(x, ' \\d+,')[[1]], '\\d+')),
  # the number in the 'and {day_num} of' text
  as.numeric(str_extract(str_extract_all(x, 'and \\d+ of')[[1]], '\\d+'))
)
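
For comparison, roughly the same context-based idea can be sketched in base R without stringr. This is an illustrative alternative, not the answerer's method, and it assumes the day list always follows the literal text "on day" and that no other numbers appear after it.

```r
x <- "At 02:04 AM, 09:04 AM, 03:04 PM and 08:04 PM, on day 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10 of the month"

# Assumption: everything up to and including "on day" can be discarded
day_part <- sub("^.*on day", "", x)

# Every remaining run of digits is taken to be a day number
days <- as.integer(regmatches(day_part, gregexpr("\\d+", day_part))[[1]])
days
```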

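Read up on lookarounds in regex; they are very powerful (and very much needed here). I add a comment whenever I feel the answer may not be obvious to someone with at least a basic understanding of regular expressions :-)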