Ruby: Determine whether a line is within the result of a regex match


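I have a fairly complex regular expression that matches the parts of a document located between ASCII separators (e.g. ====================). I need to tell whether a given line of the document falls within a match of this regex. So far, my approach has been to store the MatchData returned by matching my document against the regex, convert it to an array, and iterate over it looking for a match with the given line: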

    between_separators = lambda { |ln, context| 
        body = context[:body]
        rx = /^(([=\*_]){23,}\2{3}(?:\2|[\r\n])+)([\s\S]+?)\1/
        matchdata = body.match(rx)
        matched_lines = matchdata.to_a.map { |m| m.split("\n") }.flatten
        matched_lines.each { |ml| return 1 if ml.match(ln) }
        return 0
    }
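Although it is uncommon, in some cases I end up with false positives, because the line I am checking is identical to another line inside the result of the first match.

Is there a smarter way to solve this?

Edit:

Let me provide more context.

I receive a plain-text document containing blocks of text enclosed by "separator" lines. I want to check whether a line of the document lies between such separators.

Here is an example of what I am dealing with: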

    This is some text that should not be matched. As you can see, it is not enclosed
    by separator lines.

    ===========================================================
    This part should be matched as it is between two separator lines. Note that the
    opening and closing separators are composed of the exact same number of the same
    character.
    ===========================================================
    This block should not be matched as it is not enclosed by its own separators,
    but rather the closing separator of the previous block and the opening
    separator of the next block.
    ===========================================================
    It is tricky to distinguish between an enclosed and non-enclosed blocks, because
    sometimes a matching pair of separators appears to be legal, while it is really
    the closing separator of the previous block and the opening separator of the
    next one (e.g. the block obove this one).
    ===========================================================
    ==================================
    =====
    This block is enclosed by multiline separators.
    ==================================
    =====
    Some more text that should not be matched by the regex.
    ***************************************



    A separator can use one of the following characters: '=' or '*' or '_'.


    ***************************************
    ***************************************
    *******************
    Another example of a multiline separated block.
    ***************************************
    *******************

    >Even more text not to be matchedby the regex. This time, preceeded by a
    >variable number of '>'.
    >>__________________________________________
    >>And another type of separator. The block is now also a part of a reply section
    >>of the email.
    >>__________________________________________

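My goal is to be able to call between_separators["This block is enclosed by multiline separators.", context] and get 1. Although the approach I provided succeeds in most cases, it is unreliable, and I would like to improve it so that it does not produce false positives.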

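Sounds like a problem with the underlying logic. Please explain the problem with the help of sample text and the expected behavior.

@WiktorStribiżew I edited the question to include more information about the context of the problem.

I think your approach is correct: although you could use a regex to return the "paragraph" containing a particular line, that would be either very cumbersome or very inefficient. I would, however, use Ruby to obtain those paragraphs. I am not sure about the 23; adjust it as needed.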
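
One way to sidestep the false positives is to compare positions rather than line contents. The following is a minimal sketch under stated assumptions, not the code from the answer: it reuses the regex from the question unchanged, it assumes the caller knows the zero-based index of the line being tested, and the helpers `enclosed_ranges` and `line_enclosed?` are hypothetical names introduced only for illustration.

```ruby
# Hypothetical helper: collect the character ranges of every enclosed block.
# The default regex is copied verbatim from the question; capture group 3 is
# the text between an opening separator and its identical closing separator.
def enclosed_ranges(body, rx = /^(([=\*_]){23,}\2{3}(?:\2|[\r\n])+)([\s\S]+?)\1/)
  ranges = []
  pos = 0
  # Regexp#match accepts a start position, so each search resumes after the
  # previous closing separator. That separator is therefore never reused as
  # the opening separator of the following block.
  while (m = rx.match(body, pos))
    ranges << (m.begin(3)...m.end(3))
    pos = m.end(0)
  end
  ranges
end

# Hypothetical helper: true if the line at zero-based line_index starts
# inside one of the enclosed ranges, false otherwise (including when the
# index is past the end of the document).
def line_enclosed?(body, line_index)
  ranges = enclosed_ranges(body)
  offset = 0
  body.each_line.with_index do |line, i|
    return ranges.any? { |r| r.cover?(offset) } if i == line_index
    offset += line.length # each_line keeps the trailing newline
  end
  false
end

# Two identical lines, one inside a block and one after it.
sep = "=" * 30
doc = ["intro", sep, "same", sep, "same"].join("\n") + "\n"
line_enclosed?(doc, 2) # => true  (the "same" between the separators)
line_enclosed?(doc, 4) # => false (the "same" after the closing separator)
```

Because the check depends on where a line starts rather than on what it says, duplicate lines no longer interfere with each other. If only the line text is available, the index has to be recorded at the point where the lines are first enumerated.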