Combine two columns of a text file and find counts with reference to a third column in Perl (tags: regex, perl, hash-of-hashes)


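I am trying to read the file in.txt and generate an output file out.txt using Perl. I tried using a hash, but I am not getting the exact output.

Is there a way to do this in Perl?

I need to combine two columns and provide counts based on the third column.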

in.txt

Template,Account,Active
123456,123,N
123456,456,Y
321478,456,Y
123456,123,N
321478,456,Y

out.txt

Account,Template,Active,NotActive
123,123456,0,2
456,321478,2,0
456,123456,1,0

This is not a Perl solution, but it works with awk:

AWK one-liner:

awk 'BEGIN{FS=OFS=",";print "Account,Template,Active,NotActive"}NR>1{if($3=="Y"){a[$2 FS $1]++}else{b[$2 FS $1]++}}END{for(i in a){print i OFS a[i] OFS b[i]+0}for(u in b){if(b[u] && !a[u]){print u OFS a[u]+0 OFS b[u]}}}' input_file | sort -n

AWK script:

# BEGIN rule(s)

BEGIN {
        FS = OFS = ","  # use a comma as the input and output field separator
        print "Account,Template,Active,NotActive"  # print the header
}

# Rule(s)

NR > 1 {  # skip the header: process from the 2nd line of the file onward
        if ($3 == "Y") {  # the 3rd field is Y
                a[$2 FS $1]++  # count active rows, keyed by "Account,Template"
        } else {
                b[$2 FS $1]++  # count all other rows (N) under the same key
        }
}

# END rule(s)

END {
        for (i in a) {  # every key that has at least one active row
                print i OFS a[i] OFS (b[i] + 0)  # + 0 turns a missing count into 0
        }
        for (u in b) {
                if (b[u] && ! a[u]) {  # keys with only non-active rows; avoids printing a key twice
                        print u OFS (a[u] + 0) OFS b[u]
                }
        }
}
# Pipe the output through sort -n to order the rows numerically

Demo:

Input:

$ cat input_file 
Template,Account,Active
123456,123,N
123456,456,Y
321478,456,Y
123456,123,N
321478,456,Y
123457,125,N
123457,125,Y

Output:

Account,Template,Active,NotActive
123,123456,0,2
125,123457,1,1
456,123456,1,0
456,321478,2,0

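use strict;
use warnings;

my $filename = 'input.txt';
my %yhash;    # count of Y rows, keyed by template . account
my %nhash;    # count of N rows, keyed by template . account

# NOTE: the read loop was lost from the original post. This reconstruction is
# an assumption, inferred from how the keys are unpacked with substr below.
if (open(my $ifh, '<', $filename)) {
    my $header = <$ifh>;    # skip the header line
    while (my $line = <$ifh>) {
        chomp $line;
        next unless length $line;
        my ($template, $account, $active) = split /,/, $line;
        if ($active eq 'Y') { $yhash{$template . $account}++ }
        else                { $nhash{$template . $account}++ }
    }
    close($ifh);
}

open(FH, '>', 'out.txt') or die $!;
print FH "Account,Template,Active,NotActive\n";
while (my ($key, $value) = each(%nhash)) {
    my $account   = substr $key, 6, 3;    # assumes 3-digit accounts
    my $template  = substr $key, 0, 6;    # assumes 6-digit templates
    my $active    = $yhash{$key} // 0;    # include Y rows for the same key, if any
    my $notactive = "$value";
    print FH "$account,$template,$active,$notactive\n";
}
while (my ($key, $value) = each(%yhash)) {
    next if exists $nhash{$key};          # already printed in the loop above
    my $account   = substr $key, 6, 3;
    my $template  = substr $key, 0, 6;
    my $active    = "$value";
    my $notactive = "0";
    print FH "$account,$template,$active,$notactive\n";
}
close(FH);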

This works as well:

use strict;
use warnings;

my %data;    # $data{account}{template}{Y|N} = count
open my $fh, "<", "in.txt" or die $!;
while (my $line = <$fh>) {
    chomp $line;
    next if $line =~ /Account/;     # skip the header line
    next unless length $line;       # skip blank lines
    my @line = split ',', $line;    # (template, account, active)
    # Make sure both counters exist so that a missing one prints as 0
    $data{$line[1]}{$line[0]}{'Y'} = 0 if !defined $data{$line[1]}{$line[0]}{'Y'};
    $data{$line[1]}{$line[0]}{'N'} = 0 if !defined $data{$line[1]}{$line[0]}{'N'};
    $data{$line[1]}{$line[0]}{$line[2]}++;
}
close $fh;

open my $FH, ">", "out.txt" or die $!;
print $FH "Account,Template,Active,NotActive\n";
foreach my $key (sort keys %data) {                  # accounts
    foreach my $key2 (sort keys %{$data{$key}}) {    # templates within each account
        print $FH "$key,$key2,$data{$key}{$key2}{'Y'},$data{$key}{$key2}{'N'}\n";
    }
}
close $FH;

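Comments:

Can you show the code you have tried?

Do you accept solutions other than Perl?

If you know a little SQL, you can solve this easily with sqlite. That may be easier than writing a program. Learn how to import a CSV file, then how to group columns together and count a row only when a column has a particular value.

Advice for newcomers: if an answer solves your problem, accept it by clicking the large check mark (✓) next to it, and optionally upvote it (upvoting requires at least 15 reputation points). If you find other answers helpful, please upvote them too. Accepting and upvoting helps future readers. See the relevant help center article.

Please check the answer... I have written it and it works as expected.

Hi Rajneesh... please provide feedback on the answers provided.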
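The sqlite suggestion in the comments can be sketched roughly as follows. This is only an illustration, written in Python (not Perl) because its standard library ships both `csv` and `sqlite3`. The table name `records`, the function name `count_by_account_template`, and the in-memory database are assumptions made for the example, and the sample data is copied from the question's in.txt.

```python
import csv
import io
import sqlite3

# Sample data copied from the question's in.txt
IN_TXT = """Template,Account,Active
123456,123,N
123456,456,Y
321478,456,Y
123456,123,N
321478,456,Y
"""


def count_by_account_template(csv_text):
    """Return (account, template, active, not_active) rows, sorted by key."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table; the columns mirror the CSV header
    conn.execute(
        "CREATE TABLE records (template TEXT, account TEXT, active TEXT)"
    )
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    conn.executemany(
        "INSERT INTO records VALUES (?, ?, ?)",
        (row for row in reader if row),  # ignore blank lines
    )
    # Group on the two combined columns; count a row only when the
    # third column has the wanted value.
    rows = conn.execute(
        """
        SELECT account,
               template,
               SUM(CASE WHEN active = 'Y' THEN 1 ELSE 0 END),
               SUM(CASE WHEN active = 'N' THEN 1 ELSE 0 END)
        FROM records
        GROUP BY account, template
        ORDER BY account, template
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print("Account,Template,Active,NotActive")
    for row in count_by_account_template(IN_TXT):
        print(",".join(str(col) for col in row))
```

With the sample data this yields the same three rows as the expected out.txt, but ordered by account and then template, so 456,123456 comes before 456,321478.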