Regex: matching sections of text with a regular expression

I have a heading at which I can start matching text. To find the end of a section, I use a backreference to the heading:

Sample data:

Section 1
sub-header here:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam sed interdum erat. Donec sed felis sit amet sem mattis aliquet non in turpis. 

sub-section with one newline above
option A
option B


sub-section 2 with two newline above
setting1: value of setting1
setting2: value of setting2


Section 2
sub-header here:
Nulla maximus mollis urna, in lobortis est auctor a. Ut erat enim, volutpat id tortor eget, elementum fermentum nisi.

sub-section with one newline above
option A
option B


sub-section 2 with two newline above
setting1: value of setting1
setting2: value of setting2


Section 3
sub-header here:
Sed suscipit eleifend arcu fringilla pulvinar. Maecenas ullamcorper efficitur fringilla.

sub-section with one newline above
option A
option B


sub-section 2 with two newline above
setting1: value of setting1
setting2: value of setting2

My regex looks like this:

(?:^|\n)((Section\s*)(\d+))$([\s\S]*?)(?=\2)

This matches the first two sections, but not the last one.
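The behaviour can be reproduced with a short Python sketch. The question does not name the regex flavor or its flags, so both Python's re module and the re.MULTILINE flag are assumptions here (the $ right after the heading can only match in the middle of the text in multiline mode), and the text is a trimmed stand-in for the sample data:

```python
import re

# Trimmed, hypothetical stand-in for the sample data (no trailing newline).
text = "\n".join([
    "Section 1", "sub-header here:", "Lorem ipsum", "", "",
    "Section 2", "sub-header here:", "Nulla maximus", "", "",
    "Section 3", "sub-header here:", "Sed suscipit",
])

# Pattern from the question. re.MULTILINE is an assumption: without it,
# the `$` after the heading could not match before a newline mid-text.
original = re.compile(
    r"(?:^|\n)((Section\s*)(\d+))$([\s\S]*?)(?=\2)",
    re.MULTILINE,
)

found = [m.group(1) for m in original.finditer(text)]
print(found)  # ['Section 1', 'Section 2']
```

The third heading is never reported because no further "Section " follows it, so the lookahead (?=\2) has nothing to succeed on.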

Try this regex:

(Section\s*\d+)([\s\S]*?)(?=\s*Section\s*\d+|$)

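Explanation:

(Section\s*\d+) - matches the text Section, followed by 0+ whitespace characters, followed by 1+ digits, and captures the whole heading in group 1
([\s\S]*?) - lazily matches 0+ occurrences of any character (including newlines) and captures them in group 2
(?=\s*Section\s*\d+|$) - positive lookahead that requires the text matched so far to be followed either by the end of the string, or by 0+ whitespace characters, then Section, then 0+ whitespace characters, then 1+ digits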
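A minimal sketch of the suggested pattern in Python, offered as an illustration only. The choice of Python's re module is an assumption (the thread does not name a flavor), the text is a shortened stand-in for the sample data, and no flags are used, so $ only matches at the end of the string:

```python
import re

# Shortened, hypothetical stand-in for the sample data (no trailing newline).
text = (
    "Section 1\n"
    "sub-header here:\n"
    "Lorem ipsum\n"
    "\n"
    "sub-section with one newline above\n"
    "option A\n"
    "\n"
    "\n"
    "Section 2\n"
    "sub-header here:\n"
    "Nulla maximus\n"
    "\n"
    "\n"
    "Section 3\n"
    "sub-header here:\n"
    "Sed suscipit"
)

# Group 1: the heading. Group 2: the body, matched lazily up to the next
# heading (with its leading whitespace) or the end of the string.
pattern = re.compile(r"(Section\s*\d+)([\s\S]*?)(?=\s*Section\s*\d+|$)")

# Pair each heading with its body, trimming surrounding whitespace.
sections = [(m.group(1), m.group(2).strip()) for m in pattern.finditer(text)]
headers = [heading for heading, _ in sections]

for heading, body in sections:
    print(heading, "->", repr(body))
```

Note that the lowercase "sub-section" lines do not end a section early, because the pattern is case-sensitive and only a capitalised Section followed by digits satisfies the lookahead.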

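Replace (?=\2) with (?=\2|$), but do not use the m modifier, and also remove the $ you have.

It does not select Section 2.

@WiktorStribiżew Right, there has to be a \n before the \2.

I think it can be simplified to Section\s*\d+[\s\S]*?(?=\s*\2\d+|$).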
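The exchange in the comments can be checked with a small Python sketch. Python's re module and the trimmed text are assumptions made for illustration. The first pattern applies the suggested replacement as written; the second adds the \n in front of the backreference, which is one reading of the follow-up comment:

```python
import re

# Trimmed, hypothetical stand-in for the sample data (no trailing newline).
text = "\n".join([
    "Section 1", "sub-header here:", "Lorem ipsum", "", "",
    "Section 2", "sub-header here:", "Nulla maximus", "", "",
    "Section 3", "sub-header here:", "Sed suscipit",
])

# First suggestion: allow end of string, drop the `$` and the m flag.
first_try = re.compile(r"(?:^|\n)((Section\s*)(\d+))([\s\S]*?)(?=\2|$)")

# Follow-up: keep the newline before the next heading out of the match,
# so the next match can still start with (?:^|\n).
with_newline = re.compile(r"(?:^|\n)((Section\s*)(\d+))([\s\S]*?)(?=\n\2|$)")

skipped = [m.group(1) for m in first_try.finditer(text)]
complete = [m.group(1) for m in with_newline.finditer(text)]

print(skipped)   # ['Section 1', 'Section 3']
print(complete)  # ['Section 1', 'Section 2', 'Section 3']
```

In the first variant the lazy group consumes every newline before Section 2, so neither ^ (without the m flag) nor \n can match in front of that heading and it is skipped. Keeping one newline unconsumed is what lets the second variant find all three headings.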