Perl regular expression pattern matching

I have the following GMF file as input:

CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package|3126|GB| | 
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Smartphone Package|3126|GB| | 
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package -  Charged|3126|GB|7500000|234446
In my Perl code, I use the following to extract the strings from each line:

if($line=~m/^(CUSTEVSUMMROW_GPRS|CUSTEVSUMMROW).*?\s(.*?)\|(\d+)\|.*\|(.*?)$/)
{
    $tag=$1;
    $lineTxt=$2;
    $usage = $3;
    $amt = $4;
}
Output:

tag :: CUSTEVSUMMROW_GPRS  lineTxt :: GPRS - Nova Subscriber Non-Smartphone Package  usage :: 3126  amt ::
tag :: CUSTEVSUMMROW_GPRS  lineTxt :: GPRS - Nova Subscriber Smartphone Package  usage :: 3126  amt ::
tag :: CUSTEVSUMMROW_GPRS  lineTxt :: GPRS - Nova Subscriber Non-Smartphone Package - Charged usage :: 3126 amt :: 234446

How can I retrieve and print whether the usage unit is MB or GB? Can anyone help me?

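You are not capturing the column after \d+. Add parentheses around that part of the pattern to capture it.

.* is greedy, i.e. it matches as much as possible. Add a ? after it to make it non-greedy, so the capture stops at the next | instead of swallowing the remaining columns.

If you like, you can also rewrite the alternation as: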

(CUSTEVSUMMROW(?:_GPRS)?)
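
Putting these suggestions together, a minimal sketch might look like the following. The `parse_line` helper and the hash keys are hypothetical names chosen for illustration, and the sample lines are taken from the question. The greedy `.*\|` that skips ahead to the last column is an assumption about the layout (at least two more pipes after the units column), not something stated in the answer.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: returns a hash ref of the captured fields,
# or undef when the line does not match.
sub parse_line {
    my ($line) = @_;
    chomp $line;

    # (.*?) after the usage column is the newly captured, non-greedy units field.
    # The following greedy .*\| skips any middle columns (assumed layout),
    # so the last capture is the final field on the line.
    if ( $line =~ m/^(CUSTEVSUMMROW(?:_GPRS)?)\s(.*?)\|(\d+)\|(.*?)\|.*\|(.*?)$/ ) {
        return { tag => $1, lineTxt => $2, usage => $3, units => $4, amt => $5 };
    }
    return;
}

my @lines = (
    'CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Smartphone Package|3126|GB| | ',
    'CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package -  Charged|3126|GB|7500000|234446',
);

for my $line (@lines) {
    my $rec = parse_line($line) or next;
    print "tag :: $rec->{tag}  lineTxt :: $rec->{lineTxt}  "
        . "usage :: $rec->{usage}  units :: $rec->{units}  amt :: $rec->{amt}\n";
}
```

Returning a hash ref rather than setting package variables keeps each record self-contained, which makes it easier to collect several lines later.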

Given what you have:

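# $4 is the units column; the optional greedy (?:.*\|)? skips any middle
# columns so that $5 is the last field rather than everything after the units.
if($line=~m/^(CUSTEVSUMMROW_GPRS|CUSTEVSUMMROW).*?\s(.*?)\|(\d+)\|(.*?)\|(?:.*\|)?(.*?)$/)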
{
    $tag=$1;
    $lineTxt=$2;
    $usage = $3;
    $units = $4;
    $amt = $5;
}
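But I don't think this is the best way to approach the problem. I would consider using split and handling the first field separately.

Something like this: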

#!/usr/bin/env perl
use strict;
use warnings;

use Data::Dumper;

my @fields = qw ( tag lineTxt usage units amt );

while (<DATA>) {
    my ( $first_field, @record )  = split '\|';

    #split the first field on _just_ the first space.
    unshift( @record, $first_field =~ m/^(\w+) (.*)$/ );

    #use a hash slice to put that record into a hash of named keys.
    my %data;
    @data{@fields} = @record;
    print Dumper \%data;

    # can of course, make this an array of hashes quite easily. 
}


__DATA__
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package|3126|GB| | 
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Smartphone Package|3126|GB| | 
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package -  Charged|3126|GB|7500000|234446
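
The closing comment in the script mentions collecting the records into an array of hashes. A minimal sketch of that idea follows. The `parse_records` helper is a hypothetical name, and it assumes every line has the same pipe-delimited layout as the sample data. As in the script above, columns beyond the five named fields are ignored, so `amt` receives the fifth column.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my @fields = qw( tag lineTxt usage units amt );

# Hypothetical helper: turns a list of lines into a list of hash refs,
# one per line that has a recognisable "tag text" first field.
sub parse_records {
    my @records;
    for my $line (@_) {
        chomp( my $copy = $line );    # work on a copy, leave the caller's data alone
        my ( $first_field, @record ) = split /\|/, $copy;
        next unless defined $first_field;

        # Split the first field on just the first space.
        my @head = $first_field =~ m/^(\w+) (.*)$/;
        next unless @head;

        # Hash slice: name the columns, then keep a reference to the hash.
        my %data;
        @data{@fields} = ( @head, @record );
        push @records, \%data;
    }
    return @records;
}

my @records = parse_records(
    'CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Smartphone Package|3126|GB| | ',
    'CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package -  Charged|3126|GB|7500000|234446',
);

# Print the usage and its unit for every record.
printf "%s: %s %s\n", $_->{lineTxt}, $_->{usage}, $_->{units} for @records;
```

With the records held in an array, the unit (MB or GB) is simply `$record->{units}`, and later filtering or sorting does not need to touch the parsing code.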
