Perl: pushing the lines between regex matches into a single array element


This is the log file I am working with:

|
blah1a
blah1b
blah1c
|
****blahnothing1
|
blah2a
blah2b
blah2c
|
blahnothing2
|
blah3a
blah3b
blah3c
|
blahnothing3

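The information I need sits between two pipe characters. There are many lines that start with asterisks, which I skip. Every line has Windows line endings. The data between the pipe characters is continuous, but when it is read on a Linux host it gets chopped up by the Windows newlines. The Perl script I wrote uses the range operator between two lines, in the hope that everything from one pipe delimiter onward would be pushed into a single array element, stop at the next pipe delimiter, and then start over. Each array element should hold all the lines between two pipe characters.

Ideally the array would look like this, with no Windows control characters: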

$lines[0] blah1a blah1b blah1c
$lines[1] blah2a blah2b blah2c
$lines[2] blah3a blah3b blah3c

However, the array elements do not come out like that. Here is the script:

#!/usr/bin/perl

use strict ;
use warnings ;

my $delimiter = "|";
my $filename = $ARGV[0] ;
my @lines ;
open(my $fh, '<:encoding(UTF-8)' , $filename) or die "could not open file $filename $!";

while (my $line = readline $fh) {
    next if ($line =~/^\*+/) ;
    if ($line =~ /$delimiter/ ... $line =~/$delimiter/) {
    push (@lines, $line) ;
    }


}

print  $lines[0] ;
print  $lines[1] ;
print  $lines[2] ;

It appears that you want to merge the lines between | into one string, which is then placed in an array.

One way is to set | as the input record separator, $/, so that each read returns one chunk between pipes.

{  # localize the change to $/

    local $/ = "|";
    open(my $fh, '<:encoding(UTF-8)' , $filename) 
        or die "could not open file $filename $!";

    my @records;
    while (my $section = <$fh>)
    {
        next if $section =~ /^\s*\*/;  
        chomp $section;                # remove the record separator (| here)
        $section =~ s/\R/ /g;          # clean up newlines
        $section =~ s/^\s*//;          # clean up leading spaces
        push @records, $section if $section;
    }
    print "$_\n" for @records;
}


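The approach in the question is fine, but you need to merge the lines within a "section" (between pipes) and place each such string in the array. For that you need a flag that tracks when you enter and leave a section.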
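
A flag-based version could look like the sketch below. It is only an illustration under stated assumptions, not the code from the original answer: the helper name collect_sections is made up, each delimiter is assumed to sit alone on its line, and the pipes are assumed to strictly alternate between opening and closing a section. Under that reading, blahnothing2 and blahnothing3 fall outside a section and are dropped, which matches the expected output in the question. The delimiter is passed through quotemeta because a bare | in a regex means alternation and would match every line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: join the lines found between an opening and a closing
# delimiter line into one string, one string per section.
# Assumption: delimiter lines strictly alternate open/close.
sub collect_sections {
    my ($delimiter, @raw_lines) = @_;
    my $delim_re = quotemeta $delimiter;   # escape |, which is alternation in a regex

    my @sections;
    my @current;
    my $in_section = 0;                    # flag: are we between two pipes?

    for my $line (@raw_lines) {
        $line =~ s/\R\z//;                 # strip an LF, CR, or CRLF line ending
        next if $line =~ /^\*/;            # skip lines that start with asterisks

        if ($line =~ /^\s*$delim_re\s*$/) {
            # A closing delimiter ends the section: store the joined lines.
            push @sections, join ' ', @current if $in_section && @current;
            @current    = ();
            $in_section = !$in_section;    # toggle on every delimiter line
            next;
        }
        push @current, $line if $in_section;
    }
    return @sections;
}

# Sample data mirroring the log in the question, with Windows line endings.
my @log = map { "$_\r\n" } (
    '|', 'blah1a', 'blah1b', 'blah1c',
    '|', '****blahnothing1',
    '|', 'blah2a', 'blah2b', 'blah2c',
    '|', 'blahnothing2',
    '|', 'blah3a', 'blah3b', 'blah3c',
    '|', 'blahnothing3',
);

my @lines = collect_sections('|', @log);
print "$_\n" for @lines;
```

With a real file, the list of lines could come straight from the handle, for example collect_sections('|', <$fh>). If the sections in the actual log do not alternate this way, the toggle would need a different rule for deciding which chunks to keep.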
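This seems to do what you ask.

I have kept the two lines blahnothing2 and blahnothing3, because I see no reason to remove them.

The \R regex pattern is a generic newline; it matches the newline sequence of any platform (i.e. CR, LF, or CRLF).

use strict;
use warnings 'all';

my $data = do {
    open my $fh, ...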
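
As a small illustration of the \R remark (a sketch, not the code from this answer), the snippet below flattens the same three words joined with LF, CR, and CRLF line endings. The helper name flatten_newlines is made up for the example, and \R requires Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every newline sequence, whatever its
# platform, with a single space.
sub flatten_newlines {
    my ($text) = @_;
    $text =~ s/\R/ /g;     # \R treats CRLF as one newline, not two
    $text =~ s/\s+\z//;    # drop the trailing space left by the final newline
    return $text;
}

# Unix (LF), classic Mac (CR), and Windows (CRLF) line endings.
for my $eol ("\n", "\r", "\r\n") {
    my $chunk = join $eol, 'blah1a', 'blah1b', 'blah1c', '';
    print flatten_newlines($chunk), "\n";
}
```

All three iterations print blah1a blah1b blah1c, because \R consumes a CRLF pair as a single unit instead of leaving a stray carriage return behind.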

Why is blahnothing2 (and likewise blahnothing3) missing from the expected output? It does not start with *.

There are many lines that start with asterisks, and some that do not. I really should have excluded it. Only the information between the two pipes is needed; the rest is nothing.

OK, fair enough. The asterisks are handled below as well, along with the rest (by both answers, as far as I can tell). Please let us know if there is a problem with either of them.

@capser: But blahnothing2 is "between two pipes". Is there some other way to tell the chunks to keep from the chunks to discard?