Perl regex: match n digits, but only if they are not all the same

Using a Perl regular expression, I need to match a series of 8 digits, for example 12345678, but only if they are not all the same. 00000000 and 99999999 are typical patterns that should not match. I am trying to weed out obviously invalid values from existing database records.

I have this:
my ($match) = /(\d{8})/;
but I can't work out how to arrange the backreference.

How about:
^(\d)(?!\1{7})\d{7}$
This matches 8-digit strings whose digits are not all identical.

Sample code:
use strict;
use warnings;
use feature 'say';   # say() is not enabled by default

my $re = qr/^(\d)(?!\1{7})\d{7}$/;
while (<DATA>) {
    chomp;
    say( /$re/ ? "OK : $_" : "KO : $_" );
}
__DATA__
12345678
12345123
123456
11111111

Output:

OK : 12345678
OK : 12345123
KO : 123456
KO : 11111111
Explanation:

The regular expression:
(?-imsx:^(\d)(?!\1{7})\d{7}$)
matches as follows:
NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    \d                       digits (0-9)
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  (?!                      look ahead to see if there is not:
----------------------------------------------------------------------
    \1{7}                    what was matched by capture \1 (7 times)
----------------------------------------------------------------------
  )                        end of look-ahead
----------------------------------------------------------------------
  \d{7}                    digits (0-9) (7 times)
----------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
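Since the title asks about n digits rather than exactly 8, the same lookahead idea can be parameterised. The following is only a sketch: the helper name not_all_same_re is made up for illustration, and it uses \z instead of $ so that a trailing newline is not tolerated, which is a small departure from the pattern above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): build a pattern that
# matches exactly $n digits unless all $n digits are identical.
sub not_all_same_re {
    my ($n) = @_;
    die "n must be at least 2\n" if $n < 2;
    my $rest = $n - 1;    # digits remaining after the captured first digit
    # (\d) captures the first digit; the negative lookahead rejects the
    # case where the remaining $rest characters all repeat that digit.
    return qr/^(\d)(?!\1{$rest})\d{$rest}\z/;
}

my $re = not_all_same_re(8);
for my $value (qw(12345678 00000001 11111111 123456)) {
    printf "%s : %s\n", ( $value =~ $re ? "OK" : "KO" ), $value;
}
```

Building the pattern once with qr// and reusing it keeps the per-record check cheap when looping over many database rows.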
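This answer is based on the title of the question: match n digits, but only if they are not all the same.

So I came up with the following expression: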
(\d)\1+\b(*SKIP)(*FAIL)|\d+
What does this mean?

(\d)            # Match a digit and put it in group 1
\1+             # Match what was matched in group 1 and repeat it one or more times
\b              # Word boundary, we could use (?!\d) to be more specific
(*SKIP)(*FAIL)  # Skip & fail, we use this to exclude what we just have matched
|               # Or
\d+             # Match a digit one or more times

The advantage of this regex is that you don't need to edit it every time n changes. Of course, if you only want to match exactly n digits, you can replace the \d+ in the last alternative with \d{n}\b.
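To see the (*SKIP)(*FAIL) approach pulling values out of free text, here is a rough sketch using the n-digit variant. The helper name find_valid_runs and the sample text are invented for illustration. One deliberate change from the expression above: a leading \b is added to the second alternative, because without it \d{n}\b can match the last n digits of a longer digit run.

```perl
use strict;
use warnings;

# Hypothetical helper: return every run of exactly $n digits in $text
# that is not one digit repeated.
sub find_valid_runs {
    my ($text, $n) = @_;
    my @found;
    # First alternative: a repeated-digit run is consumed, then discarded
    # via (*SKIP)(*FAIL). Second alternative: a standalone n-digit run.
    # /p makes the matched text available in ${^MATCH}.
    while ( $text =~ /(\d)\1+\b(*SKIP)(*FAIL)|\b\d{$n}\b/gp ) {
        push @found, ${^MATCH};
    }
    return @found;
}

my $text  = "11111111 12345678 00001111 123456789 99999999 7";
my @valid = find_valid_runs( $text, 8 );
print "@valid\n";    # prints: 12345678 00001111
```

A while loop with ${^MATCH} is used rather than a list-context m//g, since the pattern contains a capture group and list-context m//g would return the group contents instead of the whole match.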
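I would do this with two regular expressions: one to match what you want, and one to filter out what you don't want.

Inspired by HamZa's answer, I have also provided a single-regex solution.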
use strict;
use warnings;
while (my $num = <DATA>) {
    chomp $num;

    # Single Regex Solution - Inspired by HamZa's code
    if ($num =~ /^.*(\d).*\1.*$(*SKIP)(*FAIL)|^\d{8}$/) {
        print "Yes - ";
    } else {
        print "No - ";
    }

    # Two Regex Solution
    if ($num =~ /^\d{8}$/ && $num !~ /(\d).*\1/) {
        print "Yes - ";
    } else {
        print "No - ";
    }

    print "$num\n";
}
__DATA__
12345678
12345674
00001111
00000000
99999999
87654321
87654351
123456789
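The thread turns on two different readings of the requirement, so it may help to see them side by side. This is only a sketch with made-up sub names: not_all_same follows the asker's clarification in the comments (00000001 is fine, 11111111 is not), while all_unique mirrors the two-regex check above, which rejects any value containing a repeated digit.

```perl
use strict;
use warnings;

# Reading 1: exactly eight digits that are not one digit repeated eight times.
sub not_all_same {
    my ($s) = @_;
    return $s =~ /^\d{8}\z/ && $s !~ /^(\d)\1{7}\z/ ? 1 : 0;
}

# Reading 2: exactly eight digits with no digit occurring twice.
sub all_unique {
    my ($s) = @_;
    return $s =~ /^\d{8}\z/ && $s !~ /(\d).*\1/ ? 1 : 0;
}

for my $value (qw(12345678 12345674 00001111 00000000 87654321 123456789)) {
    printf "%-10s not_all_same=%d all_unique=%d\n",
        $value, not_all_same($value), all_unique($value);
}
```

Values such as 00001111 and 12345674 are where the two readings disagree: they pass the first check and fail the second.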
Do they all have to be different?

@HunterMcMillen They don't all have to be different, they just can't all be the same. 00000001 is fine, 11111111 is not. What happened is that a database field was required on a form, so the data entry people just held down one key to fill the field. Some months later, my script has to deal with those records. I need to weed out the obviously invalid values.

Is this an XY problem? Do you have to use a regex? I would write a short subroutine, e.g. sub allsame { my %s; $s{$_}++ for split //, $_[0]; my $count = keys %s; return $count == 1; }

@TLP No sir, I don't have to use a regex at all. There was a regex in place that worked fine until I started seeing all these bogus numbers.

This is wrong. All you have done is post a regex and a dump of its YAPE::Regex::Explain output. You haven't explained why or how it works correctly, and your pattern doesn't meet the requirement. Please show test cases and results. I don't believe automated analysis is of any help to people who can't read a regex pattern.

What the regex asks for is "any digit, followed by seven digits that are not all the same as the first digit". The only strings it filters out are those made of a single repeated digit. For example, 00001111 passes this regex.

@Miller: That is the requirement: reject when all the digits are the same.

@M42 The requirement is the exact opposite. Only when all eight digits are completely unique...

This also only filters out 8 repeated digits. 00001111 passes your regex.

@Miller Isn't that exactly what the OP wants?

As I read the question, the OP wants to filter out all 8-digit numbers that are not completely unique: "unless they are not all the same". However, I took your solution and modified it into a single-regex solution that does work. See my answer for those results.

Nevermind, M42 pointed out how I was making the problem more complicated than the OP intended.

@Miller To be honest, I shouldn't have answered such an unclear question. I won't argue about what the OP's "real intent" was, since he doesn't seem to have replied after almost a day. I'll give you a +1 for the explanation.
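The non-regex subroutine suggested in the comments can be written out in runnable form roughly as follows. This is a sketch: the name all_same and the sample values are chosen here for illustration, and it assumes the input is a plain string of characters.

```perl
use strict;
use warnings;

# Returns true when every character in the string is the same.
sub all_same {
    my ($str) = @_;
    my %seen;
    $seen{$_}++ for split //, $str;    # count each distinct character
    return keys(%seen) == 1;           # exactly one distinct character
}

for my $value (qw(11111111 00000001 12345678)) {
    printf "%s : %s\n", $value, ( all_same($value) ? "all same" : "mixed" );
}
```

This only answers the "all the same" half of the check; validating that the value is exactly eight digits would still need a separate test such as /^\d{8}\z/.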