Regex: trying to capture values inside a captured value (Regex, Ruby, Fluent Bit)


I am trying to parse data from a line like this:

"Lorem ipsum dolor sit amet, IP: 111.111.111.111, 222.222.222.222, 333.333.333.333\r\n adipiscing elit, sed do eiusmod\r\n tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud"

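I am trying to capture the following values:

message:

"Lorem ipsum dolor sit amet, IP: 111.111.111.111, 222.222.222.222, 333.333.333.333\r\n adipiscing elit, sed do eiusmod\r\n tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud"

ip: "111.111.111.111, 222.222.222.222, 333.333.333.333"

There can be any number of IPs, including zero.
I am using Fluent Bit with a regex. Here is an example of a Fluent Bit parser definition: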
[PARSER]
Name syslog-rfc3164
Format regex
Regex /^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$/
Time_Key    time
Time_Format %b %d %H:%M:%S
Time_Format %Y-%m-%dT%H:%M:%S.%L
Time_Keep   On

I tried applying * (zero or more) to the ip named group, but could not get it to work. Do you know how I could do this?
You can use /([0-9]\u\.)+/ as a very basic regexp (there are better IPv4 regexps), then call .scan(...) on the string; you will get the results as an array.
str = 'Lorem, IP: 111.111.111.111, 222.222.222.222, 333.333.333.333\r\n adipiscing'
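A minimal sketch of the scan approach. The pattern `IPV4` is an illustrative assumption, not the one from the answer: it accepts any four dot-separated groups of 1-3 digits and does not check that each octet is at most 255.

```ruby
# Hypothetical pattern (an assumption for illustration):
# four groups of 1-3 digits separated by dots, no range check on octets.
IPV4 = /\d{1,3}(?:\.\d{1,3}){3}/

str = 'Lorem, IP: 111.111.111.111, 222.222.222.222, 333.333.333.333\r\n adipiscing'

# The pattern has no capture groups, so String#scan returns an array
# of the matched substrings.
ips = str.scan(IPV4)
p ips

# A string without any IP yields an empty array.
no_ips = 'Lorem ipsum dolor sit amet'.scan(IPV4)
p no_ips
```

This only works where Ruby code can be run; it does not help when a single regex is all that can be supplied.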


Written conventionally, the regular expression is:
/\A(?<whole>.*?(?<ip>(?<four_threes>\d{1,3}(?:\.\d{1,3}){3})(?:, \g<four_threes>)*).*)/


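When a regular expression is defined in free-spacing mode, spaces must be protected in some way; otherwise they are removed before the expression is parsed. I used \p{Space}, but [[:space:]], \s and [ ] (a space in a character class) would also work. (All but the last match any whitespace character.) When the regex is written conventionally, a plain space can be used, as shown above.

\g<four_threes> is a subexpression call (search for "subexpression calls" in the Ruby Regexp documentation). Subexpression calls save typing and reduce the chance of errors. If the third named capture group is not wanted, it can of course be substituted out.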
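The two points above can be checked with a small sketch. The patterns and the group name `pair` below are made up for illustration and are not part of the answer.

```ruby
# In free-spacing (x) mode an unprotected literal space is dropped,
# so this pattern behaves like the pattern a,b with no space.
unprotected = /a, b/x

# Four ways of protecting the space, as listed in the answer.
protected_variants = [
  /a,\p{Space}b/x,
  /a,[[:space:]]b/x,
  /a,\sb/x,
  /a,[ ]b/x
]

space_ignored = !unprotected.match?('a, b') && unprotected.match?('a,b')
all_protected = protected_variants.all? { |re| re.match?('a, b') }
p space_ignored
p all_protected

# \g<pair> runs the subpattern of the named group 'pair' a second time,
# so the digits-dot-digits part is written only once.
dotted = /\A(?<pair>\d{1,3}\.\d{1,3})\.\g<pair>\z/
p dotted.match?('111.111.111.111')
```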
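Thanks for your answer, localhostdotdev, but I forgot to mention that I am not writing code for this, so I cannot call functions... I am using this regex in Fluent Bit, so I am limited to the regex.
Thanks Cary, that covers the multiple-IPs part, but I also need to capture the whole message in another tag. Is there a way to do that with subexpression calls?
I don't understand. You do have the string in a variable. Isn't it simple?
I cannot use a string variable... I updated the post, which may make things a bit clearer. The message is the full text, but I need to extract the IPs.
\d{3} → \d{1,3} if you really want to match _IP_s.
@AlekseiMatiushkin, good point about \d{1,3}. I was waiting for the OP to confirm your suspicion, but since that did not happen I have made the change.
Your initial string is not one line but several lines.
@sawa OK, let's say it is a string. I want to capture the whole string and also capture the IPs within that string, but I need them in two different values/tags. Does that make more sense?
Off topic: @nomad, I had some ideas, but the question was deleted and I cannot comment.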
str = 'Lorem, IP: 111.111.111.111, 222.222.222.222, 333.333.333.333\r\n adipiscing'

r = /
    \A                     # match the beginning of the string
    (?<whole>              # begin named group 'whole' 
      .*?                  # match >= 0 characters 
      (?<ip>               # begin named group 'ip'
        (?<four_threes>    # begin a named group 'four_threes'
          \d{1,3}          # match 1-3 digits
          (?:              # begin a non-capture group
            \.             # match a period
            \d{1,3}        # match 1-3 digits
          ){3}             # close non-capture group and execute same 3 times
        )                  # close capture group 'four_threes'
        (?:                # begin a non-capture group
          ,\p{Space}       # match ', '
          \g<four_threes>  # execute subexpression named 'four_threes'
        )*                 # close non-capture group and execute same >= 0 times
      )                    # close capture group 'ip'
      .*                   # match >= 0 characters
    )                      # close capture group 'whole'
    /x                     # free-spacing regex definition mode

m = str.match(r)
m[:whole] 
  #=> "Lorem, IP: 111.111.111.111, 222.222.222.222, 333.333.333.333\\r\\n adipiscing" 
m[:ip]
  #=> "111.111.111.111, 222.222.222.222, 333.333.333.333" 
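The question allows zero IPs, while the pattern above needs at least one to match. One possible variant, sketched here as an assumption rather than as part of the answer, wraps the lazy prefix and the 'ip' group in an optional non-capture group. `R_OPTIONAL_IP` and the sample strings are hypothetical names. Fluent Bit's regex parser uses Ruby-style (Onigmo) regular expressions, but whether it accepts this exact pattern should be verified there.

```ruby
# Hypothetical variant: the prefix plus the IP list is optional as a unit,
# so a string without any IP still matches and 'ip' is simply unset.
R_OPTIONAL_IP = /
  \A
  (?<whole>
    (?:                          # optional unit: lazy prefix, then the IP list
      .*?
      (?<ip>
        (?<four_threes>\d{1,3}(?:\.\d{1,3}){3})
        (?:,\p{Space}\g<four_threes>)*
      )
    )?
    .*                           # the rest, or everything when no IP is present
  )
/x

with_ips    = 'Lorem, IP: 111.111.111.111, 222.222.222.222\r\n adipiscing'
without_ips = 'Lorem ipsum dolor sit amet'

m1 = with_ips.match(R_OPTIONAL_IP)
m2 = without_ips.match(R_OPTIONAL_IP)

p m1[:ip]     # the comma-separated IP list
p m1[:whole]  # the full input string
p m2[:ip]     # nil, because the optional unit was skipped
p m2[:whole]  # the full input string
```

Making only the 'ip' group optional would not work: the lazy prefix would match nothing, the optional group would be skipped at position 0, and the trailing `.*` would consume the IPs.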
