Regex: How do I set a variable to a recursive regular expression in Perl?
I am writing a simple translator for De Bruijn-notation lambda calculus, so that I can understand how his lambda files work in his system.
Here is a sample of the language before translation, primes.blc:
00010001100110010100011010000000010110000010010001010111110111101001000110100001110011010000000000101101110011100111111101111000000001111100110111000000101100000110110
I am having trouble with the nested regex in the commented-out line just before the part of Bruijn.pl that saves primes.txt:
#!/usr/bin/env perl
#use strict;
use warnings;
use IO::File;
use Cwd; my $originalCwd = getcwd()."/";
#primes.blc as argument for test conversion
#______________________________________________________________________open file
my ($name) = @ARGV;
$FILE = new IO::File;
$FILE->open("< ".$originalCwd."primes.blc") || die("Could not open file!");
#$FILE->open("< ".$name) || die("Could not open file!");
while (<$FILE>){ $field .= $_; }
$FILE->close;
#______________________________________________________________________Translate
$field =~ s/(00|01|(1+0))/$1 /gsm;
$field =~ s/00 /\\ /gsm;
$field =~ s/01 /(a /gsm;
$field =~ s/(1+)0 /length($1)." "/gsme;
$RecursParenthesesRegex = m/\(([^()]+|(??{$RecursParenthesesRegex}))*\)/;
#$field =~ 1 while s/(\(a){1}(([\s\\]+?(\d+|$RecursParenthesesRegex)){2})/\($2\)/sm;
#______________________________________________________________________save file
#$fh = new IO::File "> ".$name;
$fh = new IO::File "> ".$originalCwd."primes.txt";
if (defined $fh) { print $fh $field; $fh->close; }
Currently, with that line commented out, it translates into an almost readable format like this:
\ (a \ (a 1 (a 1 (a (a \ (a 1 1 \ \ \ (a (a 1 \ \ 1 (a \ (a (a (a 4 4 1 (a \ (a 1 1 \ (a 2 (a 1 1 \ \ \ \ (a (a 1 3 (a 2 (a 6 4 \ \ \ (a 4 (a 1 3 \ \ (a (a 1 \ \ 2 2
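The substitutions in the script amount to a small tokenizer: 00 becomes a lambda, 01 opens an application, and a run of n ones followed by a zero becomes the index n. A minimal sketch of the same step as a function, assuming the input holds only 0 and 1 characters (the name tokenize_blc is made up for illustration):

```perl
use strict;
use warnings;

# Split a string of bits into the tokens used above:
#   00    -> "\"   (abstraction)
#   01    -> "(a"  (application, not yet closed)
#   1..10 -> the number of ones (De Bruijn index)
sub tokenize_blc {
    my ($bits) = @_;
    my @tokens;
    # \G with /gc continues from where the previous token ended;
    # the loop stops at the first character that is not part of a token.
    while ($bits =~ /\G(00|01|(1+)0)/gc) {
        push @tokens, $1 eq '00' ? '\\'
                    : $1 eq '01' ? '(a'
                    :              length $2;
    }
    return @tokens;
}

print join(' ', tokenize_blc('0000110')), "\n";   # prints: \ \ 2
```

Joining the tokens with spaces gives the same intermediate text that the chain of s/// calls produces, apart from the trailing space.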
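It needs to find the innermost application (a followed by 2 operands, each either a number or a matched pair of parentheses with all its contents, insert a trailing ) after them, and remove the a, repeating this all the way out to the outermost application.

Although I don't understand your algorithm, this line is suspicious: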
$RecursParenthesesRegex = m/\(([^()]+|(??{$RecursParenthesesRegex}))*\)/
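Because m/.../ performs a match immediately, you are setting $RecursParenthesesRegex according to whether the pattern that contains it matches $_.
Errors like this are what use strict is there to catch, but instead of fixing the errors you switched it off. That is unwise.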
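The difference between the two operators can be seen in a small sketch (the helper names match_now and compile_only are invented for this illustration):

```perl
use strict;
use warnings;

# m// runs a match right away (against $_ here) and returns true or false.
sub match_now {
    local $_ = shift;
    return m/\(/ ? 1 : 0;
}

# qr// only compiles the pattern and returns a reusable regex object.
sub compile_only {
    return qr/\(/;
}

print match_now('no parentheses'), "\n";   # prints 0
print ref(compile_only()), "\n";           # prints Regexp
```

So a variable assigned from m// holds a match result, not a pattern that can be interpolated later.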
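My guess is that you are trying to define a recursive pattern, in which case you need to use qr/ instead of m/, and use (?0) or (?R) inside the pattern. Let's call it $re, okay? Like this: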
my $re = qr/\(([^()]+|(?R))*\)/
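A short sketch of how such a pattern behaves when it is used on its own. One caveat: (?R) recurses into the whole pattern, so if the qr// object is interpolated into a larger regex, the recursion will cover that larger regex too. The helper name first_balanced is made up, and the possessive quantifiers are an addition to limit backtracking:

```perl
use strict;
use warnings;

# One balanced group: "(", then any mix of non-parenthesis runs and
# nested balanced groups (via recursion into the whole pattern), then ")".
my $balanced = qr/\((?:[^()]++|(?R))*+\)/;

# Return the first balanced group found in the text, or undef.
sub first_balanced {
    my ($text) = @_;
    return $text =~ $balanced ? $& : undef;
}

print first_balanced('x (a (b c) d) y'), "\n";   # prints (a (b c) d)
```

For use inside a bigger pattern, a named group called with (?&name) keeps the recursion confined to that group.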
Also, this line is strange:
$field =~ 1 while s/(\(a){1}(([\s\\]+?(\d+|$RecursParenthesesRegex)){2})/\($2\)/sm
It matches the value of $field against the regex pattern 1 for as long as the substitution keeps changing $_.
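The usual form of the idiom binds the variable to the substitution and uses a bare 1 as the loop body. A minimal sketch on an unrelated toy input (the function name is hypothetical):

```perl
use strict;
use warnings;

# Remove "()" pairs repeatedly until no substitution succeeds.
sub strip_empty_parens {
    my ($text) = @_;
    # s/// returns true while it still changes $text, so the loop
    # body (the constant 1) runs until nothing is left to replace.
    1 while $text =~ s/\(\)//;
    return $text;
}

print strip_empty_parens('a((()))b(())'), "\n";   # prints ab
```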
Beyond that, without a description of the algorithm and of how your code relates to it, I cannot help you.

You probably need a regex like this:
# (\(a)(([\s\\]*?(?:\d+|(?&RecursParens))){2})(?(DEFINE)(?<RecursParens>(?>\((?>(?>[^()]+)|(?:(?=.)(?&RecursParens)|))+\))))
( \(a ) # (1)
( # (2 start)
( # (3 start)
[\s\\]*?
(?:
\d+
|
(?&RecursParens)
)
){2} # (3 end)
) # (2 end)
(?(DEFINE)
(?<RecursParens> # (4 start)
(?>
\(
(?>
(?> [^()]+ )
| (?:
(?= . )
(?&RecursParens)
|
)
)+
\)
)
) # (4 end)
)
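The expanded layout can be used almost as written by compiling it with the /x modifier. The following is a simplified sketch rather than the exact pattern above: it keeps only one capture for the two operands, renames the subpattern to group, and wraps the loop in a hypothetical function close_applications:

```perl
use strict;
use warnings;

my $application = qr{
    \(a                             # an application that is still open
    (                               # capture 1: the two operands
        (?: [\s\\]*?                # optional spaces and lambdas
            (?: \d+ | (?&group) )   # an index or a balanced group
        ){2}
    )
    (?(DEFINE)
        (?<group> \( (?: [^()]++ | (?&group) )*+ \) )
    )
}x;

sub close_applications {
    my ($text) = @_;
    # Each pass closes one application whose operands are already complete.
    1 while $text =~ s/$application/($1)/;
    $text =~ s/\( /(/g;             # tidy the space left behind by "a"
    return $text;
}

print close_applications('(a 1 (a 2 3'), "\n";   # prints (1 (2 3))
```

Comments inside a /x pattern are still scanned for variables and for the closing delimiter, so they should avoid $, @ and unbalanced braces.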
use strict;
use warnings;
use feature qw{say};
my $field = "00010001100110010100011010000000010110000010010001010111110111101001000110100001110011010000000000101101110011100111111101111000000001111100110111000000101100000110110";
$field =~ s/(00|01|(1+0))/$1 /g;
$field =~ s/00 /\\ /g;
$field =~ s/01 /(a /g;
$field =~ s/(1+)0 /length($1)." "/ge;
1 while $field =~ s/(\(a)(([\s\\]*?(?:\d+|(?&RecursParens))){2})(?(DEFINE)(?<RecursParens>(?>\((?>(?>[^()]+)|(?:(?=.)(?&RecursParens)|))+\))))/\($2\)/g;
$field =~ s/\( /\(/g;
say $field;
Output:
\ (\ (1 (1 ((\ (1 1) \ \ \ ((1 \ \ 1) (\ (((4 4) 1) (\ (1 1) \ (2 (1 1)))) \ \ \ \ ((1 3) (2 (6 4)))))) \ \ \ (4 (1 3))))) \ \ ((1 \ \ 2) 2))
Can be formatted like this:
\
( # (1 start)
\
( # (2 start)
1
( # (3 start)
1
( # (4 start)
( # (5 start)
\
( 1 1 ) # (6)
\ \ \
( # (7 start)
( 1 \ \ 1 ) # (8)
( # (9 start)
\
( # (10 start)
( # (11 start)
( 4 4 ) # (12)
1
) # (11 end)
( # (13 start)
\
( 1 1 ) # (14)
\
( # (15 start)
2
( 1 1 ) # (16)
) # (15 end)
) # (13 end)
) # (10 end)
\ \ \ \
( # (17 start)
( 1 3 ) # (18)
( # (19 start)
2
( 6 4 ) # (20)
) # (19 end)
) # (17 end)
) # (9 end)
) # (7 end)
) # (5 end)
\ \ \
( # (21 start)
4
( 1 3 ) # (22)
) # (21 end)
) # (4 end)
) # (3 end)
) # (2 end)
\ \
( # (23 start)
( 1 \ \ 2 ) # (24)
2
) # (23 end)
) # (1 end)
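As an alternative to the repeated substitution, the same parenthesized text can be built directly from the bits with a small recursive-descent parser. This is only a sketch under the encoding used in the question (00 for a lambda, 01 for an application of two terms, ones followed by a zero for an index); the function names are invented and it is not part of either answer:

```perl
use strict;
use warnings;

# Parse one term starting at the current pos() of the referenced
# string and return it in the "\ (f x)" text form used above.
sub parse_term {
    my ($bits_ref) = @_;
    if ($$bits_ref =~ /\G00/gc) {              # abstraction
        return '\\ ' . parse_term($bits_ref);
    }
    if ($$bits_ref =~ /\G01/gc) {              # application of two terms
        my $func = parse_term($bits_ref);
        my $arg  = parse_term($bits_ref);
        return "($func $arg)";
    }
    if ($$bits_ref =~ /\G(1+)0/gc) {           # De Bruijn index
        return length $1;
    }
    die "Unexpected input at offset " . (pos($$bits_ref) // 0) . "\n";
}

sub blc_to_text {
    my ($bits) = @_;
    pos($bits) = 0;
    return parse_term(\$bits);
}

print blc_to_text('0100100010'), "\n";   # prints (\ 1 \ 1)
```

Because every application reads exactly two terms, the closing parenthesis falls in the right place without any pattern for balanced parentheses.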
Comments:

Rather than commenting out use strict, make your code pass with strict enabled. You also have a number of lines that are not needed. Please show only one.

Maybe: $RecursParenthesesRegex=m/(?\((?>[^()]+++)(?&rec))*\)/

@simbabque It came from my template Perl file, my bad. I will edit the post.

@MJSuriya: If the OP were to add line numbers to the source code for you, everyone else who wants to run the code would have to remove them. Line 22 starts with $field =~ 1 while.

Why are you using IO::File rather than just open?

Don't use the /m and /s regex modifiers if they have no effect on the pattern. Your code is complicated enough as it is.

Thanks @sln! I think you remember me from before. You are a regex monster (in a good way).

@GlassGhost - Glad it worked for you. That blc looks like keyed cryptography; same principle.

I truly believe this is the language of the gods.