Regex: Combine two columns of a text file and count values by a third column in Perl


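I am trying to read the file in.txt and generate the output file out.txt using Perl. I tried using hashes, but I am not getting the exact output.

Is there a way to do this in Perl?

I need to combine two columns and report counts based on the third column.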

in.txt

Template,Account,Active
123456,123,N
123456,456,Y
321478,456,Y
123456,123,N
321478,456,Y

out.txt

Account,Template,Active,NotActive
123,123456,0,2
456,321478,2,0
456,123456,1,0

This is not a perl solution, but it works with awk:

AWK one-liner:

awk 'BEGIN{FS=OFS=",";print "Account,Template,Active,NotActive"}NR>1{if($3=="Y"){a[$2 FS $1]++}else{b[$2 FS $1]++}}END{for(i in a){print i OFS a[i] OFS b[i]+0}for(u in b){if(b[u] && !a[u]){print u OFS a[u]+0 OFS b[u]}}}' input_file | sort -n

AWK script:

# BEGIN rule(s)

BEGIN {
        FS = OFS = "," # set the input/output field separator to a comma
        print "Account,Template,Active,NotActive" # print the header
}

# Rule(s)

NR > 1 { # from the 2nd line of the file onward
        if ($3 == "Y") { # if the 3rd field is Y
                a[$2 FS $1]++ # increment the array entry indexed by $2 FS $1
        } else {
                b[$2 FS $1]++ # otherwise do the same with the other array
        }
}

# END rule(s)

END {
        for (i in a) { # loop over all keys of the active array and print the counts
                print i OFS a[i] OFS (b[i] + 0)
        }
        for (u in b) {
                if (b[u] && ! a[u]) { # do the same for the non-active array, skipping keys already printed
                        print u OFS (a[u] + 0) OFS b[u]
                }
        }
} # pipe the output to a numerical sort to get the proper ordering

Demo:

Input:

$ cat input_file
Template,Account,Active
123456,123,N
123456,456,Y
321478,456,Y
123456,123,N
321478,456,Y
123457,125,N
123457,125,Y

Output:

Account,Template,Active,NotActive
123,123456,0,2
125,123457,1,1
456,123456,1,0
456,321478,2,0

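my $filename = 'input.txt';
my %yhash;    # counts of active (Y) rows, keyed by template . account
my %nhash;    # counts of non-active (N) rows, same key

# NOTE: the reading loop was lost from the original post. The block below is
# a reconstruction (an assumption) inferred from how the hashes are used later.
if (open(my $ifh, '<', $filename)) {
    my $header = <$ifh>;    # skip the header line
    while (my $row = <$ifh>) {
        chomp $row;
        my ($template, $account, $active) = split /,/, $row;
        if ($active eq 'Y') { $yhash{"$template$account"}++ }
        else                { $nhash{"$template$account"}++ }
    }
    close($ifh);
}

open(FH, '>', 'out.txt') or die $!;
print FH "Account,Template,Active,NotActive\n";
# The substr offsets assume a 6-digit template followed by a 3-digit account.
# A pair that has both Y and N rows is printed once by each loop.
while (my ($key, $value) = each(%nhash)) {
    my $account   = substr $key, 6, 3;
    my $template  = substr $key, 0, 6;
    my $active    = "0";
    my $notactive = "$value";
    print FH "$account,$template,$active,$notactive\n";
}
while (my ($key, $value) = each(%yhash)) {
    my $account   = substr $key, 6, 3;
    my $template  = substr $key, 0, 6;
    my $active    = "$value";
    my $notactive = "0";
    print FH "$account,$template,$active,$notactive\n";
}
close(FH);
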
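The two-hash approach above splits its keys with substr, which relies on fixed-width IDs, and it writes a pair twice when that pair has both Y and N rows. Here is a minimal sketch of one way around both, assuming the hashes are keyed by "account,template" instead. The helper name merge_counts is hypothetical, and the counts are typed in by hand to match the demo input shown earlier.

```perl
use strict;
use warnings;

# Hypothetical helper: merge two count hashes keyed by "account,template"
# into output rows, filling in 0 where a pair is missing from one hash.
sub merge_counts {
    my ($yes, $no) = @_;
    my %seen = map { $_ => 1 } keys %$yes, keys %$no;
    return map { join ',', $_, $yes->{$_} // 0, $no->{$_} // 0 } sort keys %seen;
}

# Counts as they would look after reading the demo input (assumed, not read from a file).
my %yhash = ('456,123456' => 1, '456,321478' => 2, '125,123457' => 1);
my %nhash = ('123,123456' => 2, '125,123457' => 1);

print "Account,Template,Active,NotActive\n";
print "$_\n" for merge_counts(\%yhash, \%nhash);
```

Because the key already has the "account,template" shape of the output, no substr is needed, and each pair appears exactly once whether it was seen as Y, as N, or as both.
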
This also works:

use strict;
use warnings;

my %data;
open my $fh, "<", "in.txt" or die $!;
while (my $line = <$fh>) {
    chomp $line;
    next if($line =~ /Account/);
    my @line = split ',', $line;
    $data{$line[1]}{$line[0]}{'Y'} = 0 if(!defined $data{$line[1]}{$line[0]}{'Y'});
    $data{$line[1]}{$line[0]}{'N'} = 0 if(!defined $data{$line[1]}{$line[0]}{'N'});
    $data{$line[1]}{$line[0]}{$line[2]} ++;
}
close $fh;
open my $FH, ">", "out.txt" or die $!;
    print $FH "Account,Template,Active,NotActive\n";
    foreach my $key (sort keys %data) {
        foreach my $key2 (sort keys %{$data{$key}}) {
            print $FH "$key,$key2,$data{$key}{$key2}{'Y'},$data{$key}{$key2}{'N'}\n";
        }
    }
close $FH;
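One detail worth checking in the code above: a plain `sort keys` compares strings, which matches numeric order only while all IDs have the same number of digits. The sketch below uses made-up account IDs of different lengths (an assumption for illustration, not data from the question) to show the difference between the default sort and a numeric one.

```perl
use strict;
use warnings;

# Hypothetical account IDs of different lengths, for illustration only.
my @accounts = (456, 123, 1000, 99);

my @as_strings = sort @accounts;                 # default: string comparison
my @as_numbers = sort { $a <=> $b } @accounts;   # numeric comparison

print "string order:  @as_strings\n";    # 1000 123 456 99
print "numeric order: @as_numbers\n";    # 99 123 456 1000
```

If the IDs can vary in length, replacing `sort keys %data` with `sort { $a <=> $b } keys %data` would give the same ordering that the awk answer gets from `sort -n`.
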


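Can you show the code you tried? Would you accept solutions other than perl?

If you know a bit of SQL, you can solve this easily with sqlite. It may be easier than writing a program. Learn how to import a CSV file, then how to group columns together and sum only when a column has a specific value.

Advice for newcomers: if an answer solves your problem, please accept it by clicking the large check mark (✓) next to it, and optionally also upvote it (upvoting requires at least 15 reputation points). If you find other answers helpful, please upvote them. Accepting and upvoting help future readers. See [the relevant help-center article][1].

Please check the answer... I have written it, and it works as expected.

Hi Rajneesh... please provide feedback on the answers provided.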