Regex: regular expression to extract a time from a string (e.g. 7:30pm, 8:pm, 9.05)


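I'm working on a Rails application that uses an external feed for some event data and, annoyingly, the feed only provides a string containing the times. For example:

Doors open 7:30pm, show starts 9pm

My goal is to extract the first time from each of these strings and put it into a datetime field. The system needs to capture values like these: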

11am

12pm

1pm

2:15pm

3.30pm

4.45

5:30

06:15

7:30pm

8:30pm

9.15pm

But not these:

105

250

下午三点半

下午4点15分

74pm

上午8点40分

I think the best approach is a regular expression, and after some searching I came up with the following:

[0-9]{1,2}(:|.)?[0-9]{0,2}\s?(am|pm|AM|PM)

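It partly works, but it doesn't exclude any of the values I don't want, and it seems to capture only the first character of the am/pm in 2 and 3.

Is this possible with a regular expression?

Thanks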

Perhaps something like this:

^[01]?[0-9]([:.][0-9]{2})?(\s?[ap]m)?$

Note that this does not handle 24-hour times, and it is not very specific about 12-hour times either; for example, it will match

19pm

If you want to be more specific, you could try:

^((0?[0-9]|1[012])([:.][0-9]{2})?(\s?[ap]m)|([01]?[0-9]|2[0-3])([:.][0-9]{2})?)$
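As a rough illustration of the difference between the two anchored patterns above, the following Ruby sketch runs both against a few of the sample values. The constant names LOOSE and STRICT are made up for this example. Note that in Ruby ^ and $ anchor to line boundaries rather than to the whole string, so \A and \z would be the stricter choice for multi-line input.

```ruby
# Looser pattern: accepts hours 0-19 with an optional am/pm suffix.
LOOSE  = /^[01]?[0-9]([:.][0-9]{2})?(\s?[ap]m)?$/
# Stricter pattern: 12-hour values must carry am/pm, 24-hour values must not.
STRICT = /^((0?[0-9]|1[012])([:.][0-9]{2})?(\s?[ap]m)|([01]?[0-9]|2[0-3])([:.][0-9]{2})?)$/

samples = ["11am", "2:15pm", "4.45", "06:15", "19pm", "105", "74pm"]

# Print one row per sample showing which pattern accepts it.
samples.each do |s|
  puts format("%-8s loose=%-5s strict=%s", s, LOOSE.match?(s), STRICT.match?(s))
end
```

Both patterns accept the first four samples and reject 105 and 74pm; only the looser one accepts 19pm. Because both are anchored, neither will find a time embedded in a longer sentence.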

Or, to match it as part of a larger piece of text, you could use the following:

\b((0?[1-9]|1[012])([:.][0-5][0-9])?(\s?[ap]m)|([01]?[0-9]|2[0-3])([:.][0-5][0-9]))\b
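A minimal Ruby sketch of how the word-boundary pattern above might be used to pull the first time out of a feed string. The helper name first_time and the constant TIME_IN_TEXT are hypothetical, and the /i flag is an addition so that upper-case AM/PM is also accepted.

```ruby
# Word-boundary pattern from the answer, with /i added (an assumption) for "PM"/"AM".
TIME_IN_TEXT = /\b((0?[1-9]|1[012])([:.][0-5][0-9])?(\s?[ap]m)|([01]?[0-9]|2[0-3])([:.][0-5][0-9]))\b/i

# Hypothetical helper: returns the first time-like substring, or nil if none is found.
def first_time(text)
  # String#[] with a Regexp returns the leftmost match as a String, or nil.
  text[TIME_IN_TEXT]
end

first_time("Doors open 7:30pm, show starts 9pm")  # => "7:30pm"
first_time("Room 105, capacity 250")              # => nil
```

Bare numbers such as 105 or 250 are skipped because the second alternative requires a minutes part and the first requires am/pm.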

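It does not support the 24-hour format, but it does enforce valid times. Add the case-insensitive flag in your regex engine, whatever the language; if your engine supports it, you can also wrap the regular expression in (?i:...).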
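To illustrate the two ways of getting case-insensitive matching mentioned above, here is a small Ruby sketch. The pattern is a deliberately simplified, hypothetical one (an hour followed by am/pm), not the full expression from this answer.

```ruby
# Case-sensitive baseline: only lower-case am/pm is accepted.
PLAIN   = /\b\d{1,2}\s?[ap]m\b/
# Option 1: the /i flag makes the whole pattern case-insensitive.
FLAGGED = /\b\d{1,2}\s?[ap]m\b/i
# Option 2: an inline (?i:...) group limits case-insensitivity to the am/pm part.
INLINE  = /\b\d{1,2}\s?(?i:[ap]m)\b/

text = "Show at 9 PM"
text[PLAIN]    # => nil
text[FLAGGED]  # => "9 PM"
text[INLINE]   # => "9 PM"
```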

Regular expression:

\b((?:0?[1-9]|1[0-2])(?!\d| (?![ap]))[:.]?(?:(?:[0-5][0-9]))?(?:\s?[ap]m)?)\b

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    (?:                      group, but do not capture:
--------------------------------------------------------------------------------
      0?                       '0' (optional (matching the most
                               amount possible))
--------------------------------------------------------------------------------
      [1-9]                    any character of: '1' to '9'
--------------------------------------------------------------------------------
     |                        OR
--------------------------------------------------------------------------------
      1                        '1'
--------------------------------------------------------------------------------
      [0-2]                    any character of: '0' to '2'
--------------------------------------------------------------------------------
    )                        end of grouping
--------------------------------------------------------------------------------
    (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
      \d                       digits (0-9)
--------------------------------------------------------------------------------
     |                        OR
--------------------------------------------------------------------------------
                               ' '
--------------------------------------------------------------------------------
      (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
        [ap]                     any character of: 'a', 'p'
--------------------------------------------------------------------------------
      )                        end of look-ahead
--------------------------------------------------------------------------------
    )                        end of look-ahead
--------------------------------------------------------------------------------
    [:.]?                    any character of: ':', '.' (optional
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    (?:                      group, but do not capture (optional
                             (matching the most amount possible)):
--------------------------------------------------------------------------------
      (?:                      group, but do not capture:
--------------------------------------------------------------------------------
        [0-5]                    any character of: '0' to '5'
--------------------------------------------------------------------------------
        [0-9]                    any character of: '0' to '9'
--------------------------------------------------------------------------------
      )                        end of grouping
--------------------------------------------------------------------------------
    )?                       end of grouping
--------------------------------------------------------------------------------
    (?:                      group, but do not capture (optional
                             (matching the most amount possible)):
--------------------------------------------------------------------------------
      \s?                      whitespace (\n, \r, \t, \f, and " ")
                               (optional (matching the most amount
                               possible))
--------------------------------------------------------------------------------
      [ap]                     any character of: 'a', 'p'
--------------------------------------------------------------------------------
      m                        'm'
--------------------------------------------------------------------------------
    )?                       end of grouping
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
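As a sketch only, the nodes listed in the table above can be reassembled into a single pattern and tried in Ruby. The constant name EXPLAINED is made up for this example, and the /i flag is added on the assumption that upper-case AM/PM should match too.

```ruby
# Pattern reassembled node by node from the explanation table; /i is an added assumption.
EXPLAINED = /\b((?:0?[1-9]|1[0-2])(?!\d| (?![ap]))[:.]?(?:(?:[0-5][0-9]))?(?:\s?[ap]m)?)\b/i

# String#[] with a capture index returns the text of group 1, or nil when nothing matches.
"Doors open 7:30pm, show starts 9pm"[EXPLAINED, 1]  # => "7:30pm"
"Room 105"[EXPLAINED, 1]                            # => nil
"74pm"[EXPLAINED, 1]                                # => nil
"19:30"[EXPLAINED, 1]                               # => nil
```

The negative look-ahead is what rejects 105 and 74pm: once the hour has matched, the next character must not be another digit. Hours above 12 are not matched at all, which is consistent with the note that the 24-hour format is not supported.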

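??? A zero-or-one of the preceding zero-or-one?

Not sure. I cobbled it together from different bits I found, so I'm not entirely sure how it works.

@MarcB It's actually a non-greedy zero-or-one. It will match the preceding item if it is present, but won't consume it unless necessary. In practice I have never found a real use for it.

You probably don't want the anchors, since the OP seems to be trying to extract the time from a larger string.

Thanks for coming back so quickly, but it needs to extract the time from within a string. It looks very close to what I had.

+1 for the detailed explanation.

@p.s.w.g I had nested negative lookaheads; I think your answer would be better without them! Your cheat sheet is useful. Thanks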