Perl regular expression pattern matching
Given a GMF file as input:
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package|3126|GB| |
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Smartphone Package|3126|GB| |
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package - Charged|3126|GB|7500000|234446
In my Perl code, I use the following to extract the strings from each line:
if($line=~m/^(CUSTEVSUMMROW_GPRS|CUSTEVSUMMROW).*?\s(.*?)\|(\d+)\|.*\|(.*?)$/)
{
$tag=$1;
$lineTxt=$2;
$usage = $3;
$amt = $4;
}
Output:
tag :: CUSTEVSUMMROW_GPRS lineTxt :: GPRS - Nova Subscriber Non-Smartphone Package usage :: 3126 amt ::
tag :: CUSTEVSUMMROW_GPRS lineTxt :: GPRS - Nova Subscriber Smartphone Package usage :: 3126 amt ::
tag :: CUSTEVSUMMROW_GPRS lineTxt :: GPRS - Nova Subscriber Non-Smartphone Package - Charged usage :: 3126 amt :: 234446
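How can I retrieve/print whether the usage unit is MB or GB? Can anyone help me?

You are not capturing the column after \d+. Add parentheses around that part of the pattern to capture it.
* is greedy, i.e. it matches as much as possible. Add a ? after it to make it frugal (non-greedy), so that it matches as little as possible.
You can also rewrite the alternation, if you like, as
(CUSTEVSUMMROW(?:_GPRS)?)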
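To make the greedy versus frugal difference concrete, here is a minimal sketch run against one of the sample lines from the question. The variable names $greedy and $lazy are only illustrative, and the full pattern is one possible way to apply the advice above, not necessarily the exact pattern the answerer had in mind. It assumes at least one more column sits between the unit and the final column, which holds for the sample data.

```perl
use strict;
use warnings;

# Sample line copied from the input in the question.
my $line = 'CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package - Charged|3126|GB|7500000|234446';

# Greedy: (.*) takes as much as it can, so it swallows the next column too.
my ($greedy) = $line =~ m/\|\d+\|(.*)\|/;

# Frugal: (.*?) stops at the first pipe, leaving only the unit.
my ($lazy) = $line =~ m/\|\d+\|(.*?)\|/;

# Full pattern: extra capture group for the unit, rewritten alternation,
# and a greedy .*\| that skips ahead so the last group is the final column.
my ( $tag, $lineTxt, $usage, $units, $amt ) =
    $line =~ m/^(CUSTEVSUMMROW(?:_GPRS)?).*?\s(.*?)\|(\d+)\|(.*?)\|.*\|(.*?)$/;

print "greedy :: $greedy\n";
print "lazy :: $lazy\n";
print "tag :: $tag lineTxt :: $lineTxt usage :: $usage units :: $units amt :: $amt\n";
```

On this line the greedy capture yields GB|7500000, while the frugal one yields just GB.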

Given what you have:
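# (.*?) after the usage captures the unit; .*\| then skips ahead so that $5 is still the last column.
if($line=~m/^(CUSTEVSUMMROW_GPRS|CUSTEVSUMMROW).*?\s(.*?)\|(\d+)\|(.*?)\|.*\|(.*?)$/)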
{
$tag=$1;
$lineTxt=$2;
$usage = $3;
$units = $4;
$amt = $5;
}
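But I don't think this is the best way to solve the problem. I would consider using split and handling the first field separately.
Perhaps something like this: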
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my @fields = qw ( tag lineTxt usage units amt );
while (<DATA>) {
    chomp;    # drop the trailing newline so it cannot end up in a field
    my ( $first_field, @record ) = split /\|/;
    # split the first field on _just_ the first space.
    unshift( @record, $first_field =~ m/^(\w+) (.*)$/ );
    # use a hash slice to put that record into a hash of named keys;
    # any columns beyond the five named ones are discarded.
    my %data;
    @data{@fields} = @record;
    print Dumper \%data;
    # can of course, make this an array of hashes quite easily.
}
__DATA__
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package|3126|GB| |
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Smartphone Package|3126|GB| |
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package - Charged|3126|GB|7500000|234446
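The closing comment in that script mentions collecting the records into an array of hashes. A rough sketch of that idea follows, answering the original question by printing the unit for each line. It makes a few assumptions that are not in the answer: the sample lines are held in a hypothetical @lines array rather than read from DATA, split is given a limit of -1 so that trailing empty columns are kept, and amt is taken from the last column to match the output shown in the question.

```perl
use strict;
use warnings;

my @fields = qw( tag lineTxt usage units );

# Sample lines copied from the input in the question.
my @lines = (
    'CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package|3126|GB| |',
    'CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Smartphone Package|3126|GB| |',
    'CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package - Charged|3126|GB|7500000|234446',
);

my @records;
for my $line (@lines) {
    # A limit of -1 keeps trailing empty fields, so a blank last column survives.
    my ( $first_field, @rest ) = split /\|/, $line, -1;

    # Split the first field on just the first space; skip lines that do not fit.
    my ( $tag, $lineTxt ) = $first_field =~ m/^(\w+) (.*)$/ or next;

    my %data;
    @data{@fields} = ( $tag, $lineTxt, @rest[ 0, 1 ] );
    $data{amt} = $rest[-1];    # assumption: amt is the last column, as in the question
    push @records, \%data;
}

# Print the usage together with its unit (GB or MB) for every record.
print "$_->{lineTxt}: $_->{usage} $_->{units}\n" for @records;
```

Keeping the records in @records means later code can filter or sum by $_->{units} without parsing each line again.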