Regex: Perl regular expression for multi-pattern grouping across multiple lines


I have an input txt file that contains multiple lines in the format below:

JMOD_01 :: This is starting of grouping 2nd KFGJHFG RTIRT DFB SFJKF ERIEFF FJDKF OIOIISD SDJKD 
last line ______________ 5564 numerical digits.

This is second starting of grouping 2nd KFGJHFG RTIRT FSFJKF  
ERIEFF FJDKF OIOIISD SDJKD 
till this end ___________ 021542 some random digits.

I am trying to read this file and extract the searched patterns as groups.

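Below is what I have tried. Grouping the first match works, and it is captured correctly. The problem shows up when looking for the second group, because the match does not take the elements on the following lines into account.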

open(IFH, '<', "file.txt") or die "Cannot open file.txt: $!";

while ($line = <IFH>) {    # reads one line per iteration
    if ($line =~ /^\s*(\w+\_\d*.*)\s*::(.*)/s) {
        print "$1\n";
        print "$2\n";
    }
}
close(IFH);

When printing $2, it should give me:

"This is starting of grouping 2nd KFGJHFG RTIRT DFB SFJKF ERIEFF FJDKF OIOIISD SDJKD last line"

"This is second starting of grouping 2nd KFGJHFG RTIRT FSFJKF  
ERIEFF FJDKF OIOIISD SDJKD till this end"

When printing $3, it should give:

"5564 numerical digits"
"021542 some random digits"

But the actual output for the second group is different. Actual output when printing $2:

"This is first starting of grouping 2nd KFGJHFG RTIRT DFB SFJKF"

"This is second starting of grouping 2nd KFGJHFG RTIRT FSFJKF"

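The shortened output follows from how the loop reads the file: `<IFH>` returns one line per iteration, so `.*` never sees the continuation lines, even with the `/s` modifier. A minimal sketch of one way around this is shown here, assuming the records are separated by a blank line as in the sample. It uses Perl's paragraph mode (`$/ = ""`) and an in-memory filehandle in place of `file.txt`; the variable names (`@records`, `$text`, `$tail`) and the pattern itself are illustrative and not taken from the original post.

```perl
use strict;
use warnings;

# Sample input in the question's format; the blank line separates the two records.
my $data = <<'END';
JMOD_01 :: This is starting of grouping 2nd KFGJHFG RTIRT DFB SFJKF ERIEFF FJDKF OIOIISD SDJKD
last line ______________ 5564 numerical digits.

This is second starting of grouping 2nd KFGJHFG RTIRT FSFJKF
ERIEFF FJDKF OIOIISD SDJKD
till this end ___________ 021542 some random digits.
END

# In-memory filehandle keeps the sketch self-contained;
# with a real file, open 'file.txt' here instead.
open my $fh, '<', \$data or die "Cannot open in-memory data: $!";

my @records;
{
    local $/ = "";    # paragraph mode: each read returns one blank-line-separated block
    while ( my $record = <$fh> ) {
        # Optional "ID ::" prefix, text up to the underscore run, tail up to the final period.
        next unless $record =~ /\A\s*(?:(\S+)\s*::\s*)?(.*?)\s*_{2,}\s*(.*?)\.\s*\z/s;
        my ( $id, $text, $tail ) = ( $1, $2, $3 );
        $text =~ s/\s+/ /g;    # fold line breaks into single spaces
        push @records, { id => $id, text => $text, tail => $tail };
    }
}
close $fh;

for my $r (@records) {
    print "id:   ", ( defined $r->{id} ? $r->{id} : '(none)' ), "\n";
    print "text: $r->{text}\n";
    print "tail: $r->{tail}\n";
}
```

Because the second record in the sample has no `::` prefix, the identifier group is optional and comes back undefined for it.
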
If I understand the problem correctly, we could use two simple expressions to extract the desired data:

([A-Z_0-9]+)\s+::\s+([\s\S]+)

For extracting our digits:

([0-9]+\snumerical digits|[0-9]+\ssome random digits)
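
As a rough illustration of how the two expressions behave, the sketch below applies both to a shortened sample in the question's format (the sample text and variable names are assumptions made for the example). Note that `[\s\S]+` is greedy, so group 2 of the first expression runs to the end of the input rather than stopping at the first record, which is why the second expression is needed to pick out the numeric tails.

```perl
use strict;
use warnings;

# Shortened sample in the question's format (assumed for illustration).
my $str = join "\n",
    'JMOD_01 :: This is starting of grouping 2nd KFGJHFG',
    'last line ______________ 5564 numerical digits.',
    '',
    'This is second starting of grouping 2nd KFGJHFG',
    'till this end ___________ 021542 some random digits.',
    '';

# First expression: group 1 is the identifier, group 2 is everything after "::".
my ( $id, $rest ) = $str =~ /([A-Z_0-9]+)\s+::\s+([\s\S]+)/;

# Second expression: /g in list context returns every numeric tail at once.
my @tails = $str =~ /([0-9]+\snumerical digits|[0-9]+\ssome random digits)/g;

print "id: $id\n";
print "group 2 length: ", length($rest), " characters\n";
print "tail: $_\n" for @tails;
```
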

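Yes, please ignore that. Consider the input below:

JMOD_01 :: This is starting of grouping 2nd KFGJHFG RTIRT DFB SFJKF ERIEFF FJDKF OIOIISD SDJKD last line ______________ 5564 numerical digits.
fdgh_6765_546/456 :: This is second starting of grouping 2nd KFGJHFG RTIRT FSFJKF ERIEFF FJDKF OIOIISD SDJKD till this end ___________ 021542 some random digits.

Thank you for pointing it out. I combined the mentioned regexes in my code, and it now handles all three groupings correctly. Thank you, Emma.

Test code for the first expression:
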
use strict;

my $str = 'JMOD_01 :: This is starting of grouping 2nd KFGJHFG RTIRT DFB SFJKF ERIEFF FJDKF OIOIISD SDJKD 
last line ______________ 5564 numerical digits.

This is second starting of grouping 2nd KFGJHFG RTIRT FSFJKF  
ERIEFF FJDKF OIOIISD SDJKD 
till this end ___________ 021542 some random digits.

';
my $regex = qr/([A-Z_0-9]+)\s+::\s+([\s\S]+)/mp;

if ( $str =~ /$regex/g ) {
  print "Whole match is ${^MATCH} and its start/end positions can be obtained via \$-[0] and \$+[0]\n";
  # print "Capture Group 1 is $1 and its start/end positions can be obtained via \$-[1] and \$+[1]\n";
  # print "Capture Group 2 is $2 ... and so on\n";
}

# ${^POSTMATCH} and ${^PREMATCH} are also available with the use of '/p'
# Named capture groups can be called via $+{name}

Test code for the second expression:

([0-9]+\snumerical digits|[0-9]+\ssome random digits)

use strict;

my $str = 'JMOD_01 :: This is starting of grouping 2nd KFGJHFG RTIRT DFB SFJKF ERIEFF FJDKF OIOIISD SDJKD 
last line ______________ 5564 numerical digits.

This is second starting of grouping 2nd KFGJHFG RTIRT FSFJKF  
ERIEFF FJDKF OIOIISD SDJKD 
till this end ___________ 021542 some random digits.

';
my $regex = qr/([0-9]+\snumerical digits|[0-9]+\ssome random digits)/mp;

while ( $str =~ /$regex/g ) {    # loop so every numeric tail is reported, not only the first
  print "Whole match is ${^MATCH} and its start/end positions can be obtained via \$-[0] and \$+[0]\n";
  # print "Capture Group 1 is $1 and its start/end positions can be obtained via \$-[1] and \$+[1]\n";
  # print "Capture Group 2 is $2 ... and so on\n";
}

# ${^POSTMATCH} and ${^PREMATCH} are also available with the use of '/p'
# Named capture groups can be called via $+{name}
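
The closing comment mentions named capture groups via `$+{name}`. A minimal sketch of that idea follows, combining the identifier, the text, and the numeric tail into a single pattern. The group names (`id`, `text`, `tail`), the lazy `[\s\S]+?`, and the `_{2,}` separator are assumptions chosen for this example, and the sample string is a shortened version of the question's input.

```perl
use strict;
use warnings;

# Shortened single record in the question's format (assumed for illustration).
my $str = join "\n",
    'JMOD_01 :: This is starting of grouping 2nd KFGJHFG',
    'last line ______________ 5564 numerical digits.',
    '';

my %rec;
if ( $str =~ /(?<id>[A-Z_0-9]+)\s+::\s+(?<text>[\s\S]+?)\s+_{2,}\s+(?<tail>[0-9]+[^.]*)\./ ) {
    # Copy the named captures out of %+ before running another match.
    %rec = ( id => $+{id}, text => $+{text}, tail => $+{tail} );
    $rec{text} =~ s/\s+/ /g;    # fold the line break into a single space
    print "$_: $rec{$_}\n" for qw(id text tail);
}
```

The lazy quantifier stops the text group at the first run of underscores, so it does not swallow the numeric tail the way the greedy `[\s\S]+` does.
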