Can Perl count duplicates in two columns using a single hash?


perl hash


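My input data is shown below. From this data, I want to take the unique values p1, p2 ... p5 together with the first column, and get their counts.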

ID  M   N 
cc1 1   p1
cc1 10  p2
cc1 10  p2
cc2 1   p1
cc2 2   p5
cc3 2   p1
cc3 2   p4

I expect the result to look like this:

ID  M   p1  p2  p3  p4  p5 
cc1 3   1   2   0   0   0   
cc3 2   1   0   0   1   0   
cc2 2   1   0   0   0   1

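To do this I tried a plain hash together with a hash of hashes, and I got the output I expected. But I wonder whether the same thing can be achieved with a single hash, because the same data is being stored in two different hashes.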

my (%hash, %hash2);
<$fh>;    # skip the header row
while (<$fh>)
{
    chomp;
    my ($first, $second, $third) = split("\t");
    $hash{$first}{$third}++; # I tried $hash{$first}++{$third}++ but it throws a syntax error
    $hash2{$first}++; # is it possible to get rid of this hash?
}
my @ar = qw(p1  p2  p3  p4  p5);
$, = "\t"; 
print 'ID', 'M', @ar, "\n";
foreach (keys %hash)
{
    print "$_\t$hash2{$_}\t";
    foreach my $ary(@ar)
    {
        if(!$hash{$_}{$ary})
        {
            print "0\t"; 
        }
        else
        {
            print "$hash{$_}{$ary}\t";
        }
    }
    print "\n";
}

There is no need to use two hashes. The hash of hashes alone is enough. I have only modified your code slightly; see below.

use strict;
use warnings;
use List::Util qw(sum);

my %hash;
<DATA>;    # skip the header row
while (<DATA>)
{
    chomp;
    my ($first, $second, $third) = split("\t");
    $hash{$first}{$third}++;    # count each (ID, N) pair
}
my @ar = qw(p1  p2  p3  p4  p5);
$, = "\t"; 
print 'ID', 'M', @ar, "\n";
foreach (keys %hash)
{
    # The inner counts add up to the per-ID total, so %hash2 is not needed.
    my $cnt = sum(values %{ $hash{$_} });
    print "$_\t$cnt\t";
    foreach my $ary(@ar)
    {
        if(!$hash{$_}{$ary})
        {
            print "0\t"; 
        }
        else
        {
            print "$hash{$_}{$ary}\t";
        }
    }
    print "\n";
}

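You already have a hash of hashes that stores the data: the first key is the ID and the second key is N. Simply sum the values stored under an ID and you get the total you want.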
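
As a minimal sketch of that idea, the per-ID total can be derived from the nested hash alone. The helper name `total_for` and the inline sample rows are illustrative assumptions, not part of the original answer; `sum0` comes from the core `List::Util` module and returns 0 for an empty list.

```perl
use strict;
use warnings;
use List::Util qw(sum0);

# Hypothetical helper: total number of rows seen for one ID,
# computed by summing the inner per-N counts.
sub total_for {
    my ($counts, $id) = @_;
    return sum0( values %{ $counts->{$id} // {} } );
}

# Sample rows taken from the question's input (ID, M, N).
my @rows = (
    [qw(cc1 1 p1)], [qw(cc1 10 p2)], [qw(cc1 10 p2)],
    [qw(cc2 1 p1)], [qw(cc2 2 p5)],
    [qw(cc3 2 p1)], [qw(cc3 2 p4)],
);

# Build the hash of hashes: first key is the ID, second key is N.
my %hash;
$hash{ $_->[0] }{ $_->[2] }++ for @rows;

# Print each ID with its total, derived from the single structure.
printf "%s\t%d\n", $_, total_for(\%hash, $_) for sort keys %hash;
```

For the sample rows this prints 3 for cc1 and 2 for both cc2 and cc3, which matches the M column of the expected output, so a second hash is not required.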

I would probably do it like this:

#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;

my %count_of;

#read the header row 
chomp( my @header = split ' ', <DATA> );

while (<DATA>) {
   my ( $ID, $M, $N ) = split;
   $count_of{ $ID }{ $N }++;
}
#print Dumper \%count_of;

#setup the output headers. We could autodetect, but some of these (p3) are entirely empty. 
my @p_headers = qw ( p1 p2 p3 p4 p5 );
#if you did want to:
#my @p_headers = sort keys %{{map { $_ => 1 } map { keys %{$count_of{$_}} } keys %count_of }};
#will give p1 p2 p4 p5. 

print join( "\t", qw ( ID M ), @p_headers ), "\n";
foreach my $ID ( sort keys %count_of ) {
   my $total = 0;
   $total += $_ for values %{ $count_of{$ID} };
   # join only the fields, then append the newline, to avoid a trailing tab
   print join( "\t",
                   $ID,
                   $total,
                   ( map { $count_of{$ID}{$_} // 0 } @p_headers ) ),
         "\n";
}

__DATA__
ID  M   N 
cc1 1   p1
cc1 10  p2
cc1 10  p2
cc2 1   p1
cc2 2   p5
cc3 2   p1
cc3 2   p4
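
If you do want to autodetect the columns, the one-liner in the comment above can be unrolled into something easier to read. This is only a sketch: the helper name `detect_headers` is made up for illustration, and the literal `%count_of` below is assumed to be what the read loop produces for the sample data.

```perl
use strict;
use warnings;

# Hypothetical helper: collect every distinct second-level key
# (the N values) across all IDs, sorted for a stable column order.
# Intended to be called in list context.
sub detect_headers {
    my ($count_of) = @_;
    my %seen;
    for my $id ( keys %$count_of ) {
        $seen{$_} = 1 for keys %{ $count_of->{$id} };
    }
    return sort keys %seen;
}

# Counts as they would look after reading the sample data.
my %count_of = (
    cc1 => { p1 => 1, p2 => 2 },
    cc2 => { p1 => 1, p5 => 1 },
    cc3 => { p1 => 1, p4 => 1 },
);

my @p_headers = detect_headers( \%count_of );
print join( "\t", @p_headers ), "\n";
```

As noted in the code comment, this yields p1 p2 p4 p5 for the sample input, so a column such as p3 that never occurs would be missing from the report unless the header list is hard-coded.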

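I don't understand your last line. Why do p4 and p5 have the values 4 and 1? There is no line in the input with 10_coverage_Contigs_contig_3 and an SSR of p5.

@Borodin Sorry, I have edited it; it was a small typo.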