In Perl, how do I push multiple matched groups as a single element of an array?
I need to push all of the matched groups into an array.
#!/usr/bin/perl
use strict;
open (FILE, "/home/user/name") || die $!;
my @lines = <FILE>;
close (FILE);
open (FH, ">>/home/user/new") || die $!;
foreach $_(@lines){
    if ($_ =~ /AB_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_W.+txt/){
        print FH "$1 $2 $3 $4 $5 $6 $7\n"; #needs to be first element of array
    }
    elsif ($_ =~ /CD_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_W.+txt/){
        print FH "$1 $2 $3 $4 $5 $6\n"; #needs to be second element of array
    }
}
close (FH);
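Instead of writing the matched groups to a file, I want to put them into an array. I can't think of any command other than push, but that function accepts only one argument. What is the best way to do this? After the matched groups have been pushed into the array, the output should look like this:
_output_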
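Take a look at grep: it returns a list of all the elements of a list that match some condition.
By the way, push does take more than one argument: see perldoc -f push.
However, you did not include the code that causes this problem. Some of what you did post is a bit odd, such as what looks like using $_ as a function call. Are you sure you want to do that?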
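As a minimal sketch of the grep suggestion, the snippet below filters a list down to the lines that start with one of the two prefixes from the question. The sample lines are hypothetical; the real input format is only known from the regexes in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample lines, made up for illustration.
my @lines = (
    "AB_a_b_c_d_e_f_g_W1.txt",
    "ZZ_not_a_match",
    "CD_1_2_3_4_5_6_W2.txt",
);

# grep returns every element for which the block evaluates to true.
my @wanted = grep { /^(?:AB|CD)_/ } @lines;

print "$_\n" for @wanted;
```

Note that grep only selects whole elements; it does not extract the captured groups, so it covers the filtering half of the problem rather than the pushing half.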
Use the same argument for push that you used for print: a string in double quotes.
push @array, "$1 $2 $3 $4 $5 $6 $7";
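To make the difference concrete, here is a small sketch comparing three ways of pushing the same captures: as separate elements, as one joined string, and as one array reference. The sample line and the array names @flat, @joined and @nested are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample line in the shape the question's AB regex expects.
my $line = "AB_a_b_c_d_e_f_g_W1.txt";

my ( @flat, @joined, @nested );

if ( my @fields = $line =~ /AB_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_W.+txt/ ) {
    push @flat,   @fields;      # push accepts a list: seven separate elements
    push @joined, "@fields";    # one element: the fields joined by spaces
    push @nested, [@fields];    # one element: a reference to an array of fields
}

printf "flat=%d joined=%d nested=%d\n",
    scalar @flat, scalar @joined, scalar @nested;
```

The joined string is the simplest when the result is only going to be printed; the array reference keeps the individual fields accessible later.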
Remember that a capturing match in list context returns the captured fields, if any:
#!/usr/bin/perl
use strict; use warnings;
my $file = '/home/user/name';
open my $in, '<', $file
    or die "Cannot open '$file': $!";
my @matched;
while ( <$in> ) {
    my @fields;
    if (@fields = /AB_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_W.+txt/
            or @fields = /CD_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_W.+txt/)
    {
        push @matched, "@fields";
    }
}
use Data::Dumper;
print Dumper \@matched;
It depends on what you intend to do with the matches.
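The point about context can be checked in isolation. The following sketch, using a hypothetical sample line, shows what the same match yields in list context, in scalar context, and when it fails.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample line in the shape the question's CD regex expects.
my $line = "CD_1_2_3_4_5_6_W2.txt";
my $re   = qr/CD_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_W.+txt/;

my @fields = $line =~ $re;              # list context: the six captures
my $ok     = $line =~ $re;              # scalar context: only true or false
my @none   = "no match here" =~ $re;    # failed match: the empty list

print "fields: @fields\n";
print "matched: ", ( $ok ? "yes" : "no" ), "\n";
print "captures from failed match: ", scalar(@none), "\n";
```

Because a failed match returns the empty list, the assignment itself can be used as the condition of an if, which is what the loop above relies on.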
If you are using Perl 5.10.1 or later, this is how I would write it:
#!/usr/bin/perl
use strict;
use warnings;
use 5.10.1; # or use 5.010;
use autodie;
my @lines = do{
    # don't need to check for errors, because of autodie
    open( my $file, '<', '/home/user/name' );
    # chomp the whole list at once; grep {chomp} would silently drop a
    # final line without a trailing newline, because chomp returns 0 for it
    chomp( my @read = <$file> );
    @read;
    # $file is automatically closed
};
# use 3 arg form of open
open( my $file, '>>', '/home/user/new' );
my @matches;
for( @lines ){
    if( /(?:AB|CD)( (?:_[^_]+)+ )_W .+ txt/x ){
        my @match = "$1" =~ /_([^_]+)/g;
        say {$file} "@match";
        push @matches, \@match;
        # or
        # push @matches, [ "$1" =~ /_([^_]+)/g ];
        # if you don't need to print it in this loop.
    }
}
close $file;
This is a bit more permissive about the input, but the regex should be more correct than the original.
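Since this version stores array references rather than strings, reading the result back takes one extra dereference. The sketch below runs the same two-step match on two hypothetical sample lines held in memory, instead of reading a file, and then accesses the nested data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @matches;

# Hypothetical sample lines standing in for the contents of the input file.
for my $line ( "AB_a_b_c_d_e_f_g_W1.txt", "CD_1_2_3_4_5_6_W2.txt" ) {
    # Step 1: capture the whole run of "_field" pieces before "_W".
    if ( $line =~ /(?:AB|CD)( (?:_[^_]+)+ )_W .+ txt/x ) {
        # Step 2: a /g match in list context returns every field.
        push @matches, [ "$1" =~ /_([^_]+)/g ];
    }
}

# Each element of @matches is an array reference.
for my $i ( 0 .. $#matches ) {
    my @fields = @{ $matches[$i] };
    print "row $i has ", scalar(@fields), " fields: @fields\n";
}
print "second field of the first row: $matches[0][1]\n";
```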
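I wonder whether using push and a giant regex is really the right approach.
The OP says he wants lines beginning with AB at index 0 and lines beginning with CD at index 1.
Also, those regexes look to me like an inside-out split.
In the code below I have added some didactic comments that point out why I do things differently from the OP and from the other solutions offered here.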
#!/usr/bin/perl
use strict;
use warnings; # best use warnings too. strict doesn't catch everything
my $filename = "/home/user/name";
# Using 3 argument open addresses some security issues with 2 arg open.
# Lexical filehandles are better than global filehandles, they prevent
# most accidental filehandle name collisions, among other advantages.
# Low precedence or operator helps prevent incorrect binding of die
# with open's args
# Expanded error message is more helpful
open( my $inh, '<', $filename )
    or die "Error opening input file '$filename': $!";
my @file_data;
# Process file with a while loop.
# This is VERY important when dealing with large files.
# for will read the whole file into RAM.
# for/foreach is fine for small files.
while( my $line = <$inh> ) {
    chomp $line;
    # Simple regex captures the data type indicator and the data.
    if( $line =~ /(AB|CD)_(.*)_W.+txt/ ) {
        # Based on the type indicator we set variables
        # used for validation and data access.
        my( $index, $required_fields ) = $1 eq 'AB' ? ( 0, 7 )
                                       : $1 eq 'CD' ? ( 1, 6 )
                                       : ();
        next unless defined $index;
        # Why use a complex regex when a simple split will do the same job?
        my @matches = split /_/, $2;
        # Here we validate the field count, since split won't check that for us.
        unless( @matches == $required_fields ) {
            warn "Incorrect field count found in line '$line'\n";
            next;
        }
        # Warn if we have already seen a line with the same data type.
        if( defined $file_data[$index] ) {
            warn "Overwriting data at index $index: '@{$file_data[$index]}'\n";
        }
        # Store the data at the appropriate index.
        $file_data[$index] = \@matches;
    }
    else {
        warn "Found non-conformant line: $line\n";
    }
}
Be warned, I just typed this into my browser window. So while the code should be correct, there may be typos or missing semicolons lurking in the untested code; use it at your own risk.
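The split-and-validate idea can also be tried without an input file. This is a sketch under stated assumptions, not the answer's own code: the helper name parse_line, the %spec lookup table, and the sample lines are all made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical lookup table: type indicator => [ array index, required field count ].
my %spec = ( AB => [ 0, 7 ], CD => [ 1, 6 ] );

# Hypothetical helper: returns ( index, array ref of fields ) for a valid
# line, or the empty list for a line that does not conform.
sub parse_line {
    my ($line) = @_;
    my ( $type, $data ) = $line =~ /(AB|CD)_(.*)_W.+txt/ or return;
    my ( $index, $required ) = @{ $spec{$type} };
    my @fields = split /_/, $data;
    return unless @fields == $required;    # split does not check the count
    return ( $index, \@fields );
}

my @file_data;
for my $line (
    "CD_1_2_3_4_5_6_W2.txt",
    "AB_a_b_c_d_e_f_g_W1.txt",
    "AB_too_few_W3.txt",                   # wrong field count, skipped
  )
{
    my ( $index, $fields ) = parse_line($line) or next;
    $file_data[$index] = $fields;
}

print "index $_: @{ $file_data[$_] }\n" for 0 .. $#file_data;
```

Storing by index rather than pushing means the AB data ends up at index 0 and the CD data at index 1 regardless of the order in which the lines appear.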
What do you expect to find in $a? And why are you printing to a file that you closed a few lines earlier?
@Manni: $a was part of my code. I forgot to remove it when pasting here.
Maybe you should make your question a little clearer.
push @array, [$1, …] might be better.