Perl: multiple values for a single key in a hash

perl select hash

I have two files. One contains a list of unique names, and the other is a redundant list of names with ages.

For example:

File1:      File2:
Gaia        Gaia 3
Matt        Matt 12
Jane        Gaia 89
            Reuben 4

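My goal is to match File1 against File2 and retrieve the highest age for each name. So far I have written the code below. The part that does not work is this: when the same key is found in the hash again, keep and print the larger value.

Any suggestions or comments are welcome.

Thanks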

#!/usr/bin/perl -w
use strict;

open (FILE1, $ARGV[0] )|| die "unable to open arg1\n"; #Opens first file for comparison
open (FILE2, $ARGV[1])|| die "unable to open arg2\n"; #2nd for comparison

my @not_red = <FILE1>;
my @exonslength = <FILE2>;

#2) Produce a hash of File2. If the key is already in the hash, keep the key-value pair with the highest value. Otherwise, next.

my %hash_doc2;
my @split_exons;
my $key;
my $value;

foreach my $line (@exonslength) {

    @split_exons = split "\t", $line;

    @hash_doc2 {$split_exons[0]} = ($split_exons[1]);

 if (exists $hash_doc2{$split_exons[0]}) {

    if ( $hash_doc2{$split_exons[0]} > values %hash_doc2) {

     $hash_doc2{$split_exons[0]} = ($split_exons[1]);

    } else {next;}
       }
   }

#3) grep the non redundant list of gene from the hash with the corresponding value

my @a =  grep (@not_red,%hash_doc2);
print "@a\n";

Do you need to keep all the values? If not, you can keep only the maximum:

@split_exons = split "\t", $line;
# Store the value when the key is new or the new value is larger.
if (!exists $hash_doc2{$split_exons[0]}
    or $hash_doc2{$split_exons[0]} < $split_exons[1]) {
    $hash_doc2{$split_exons[0]} = $split_exons[1];
}
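The update rule above can be tried on the sample rows from the question. The following is only a minimal sketch: the sub name `max_by_key` is hypothetical, and it assumes the rows are tab-separated "name, age" pairs held in memory rather than read from a file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes "name<TAB>age" lines and returns a hash ref
# mapping each name to the largest age seen for it.
sub max_by_key {
    my @lines = @_;                  # copy, so chomp never touches the caller's data
    my %max;
    for my $line (@lines) {
        chomp $line;
        my ($name, $age) = split /\t/, $line;
        next unless defined $age;    # skip blank or malformed lines
        # Store the age if the name is new or the age beats the stored one.
        if (!exists $max{$name} or $max{$name} < $age) {
            $max{$name} = $age;
        }
    }
    return \%max;
}

# Sample rows from File2 in the question, assumed to be tab-separated.
my $max = max_by_key("Gaia\t3\n", "Matt\t12\n", "Gaia\t89\n", "Reuben\t4\n");
print "$_\t$max->{$_}\n" for sort keys %$max;
```

With this input, Gaia ends up with 89 because the second Gaia row is larger than the stored 3.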

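Your numeric comparison against `values` also does not do what you think: in scalar context, `values %hash_doc2` returns the number of values in the hash, not the largest one.

Instead of reading all of File2 into an array (which is a problem if the file is large), you can loop over the data file line by line and process it as you go: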
#!/usr/bin/perl

use strict;
use warnings;
use autodie;
use Data::Dumper;

open( my $nameFh, '<', $ARGV[0]);
open( my $dataFh, '<', $ARGV[1]);

my $dataHash = {};
my $processedHash = {};

while(<$dataFh>){
    chomp;
    my ( $name, $age ) = split /\s+/, $_;
    if(! defined($dataHash->{$name}) or $dataHash->{$name} < $age ){
        $dataHash->{$name} = $age
    }
}

while(<$nameFh>){
    chomp;
    $processedHash->{$_} = $dataHash->{$_} if defined $dataHash->{$_};
}

print Dumper($processedHash);
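Note that the script above skips any name from the first file that has no entry in the second (Jane in the sample data). If those names should show up in the output, one option is sketched below. The sub name `report_lines` and the "NA" placeholder are assumptions made for illustration, not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: given the names from the first file (array ref) and a
# hash ref of name => highest age, return one report line per name, in input
# order. Names without an age get the placeholder "NA" (an assumption).
sub report_lines {
    my ($names, $max_age) = @_;
    my @out;
    for my $name (@$names) {
        my $age = exists $max_age->{$name} ? $max_age->{$name} : 'NA';
        push @out, "$name\t$age";
    }
    return @out;
}

# Highest ages as computed from the sample File2 in the question.
my %max_age = (Gaia => 89, Matt => 12, Reuben => 4);
print "$_\n" for report_lines([qw(Gaia Matt Jane)], \%max_age);
```

Reuben does not appear in the report because the name is not in the first file, which matches the behaviour of the script above.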

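Please post the contents of the two input files, formatted as code.

If you do need to keep every value, store an array reference under the key and keep it sorted numerically:

# "// []" (Perl 5.10+) supplies an empty array the first time a key is seen.
$hash_doc2{$split_exons[0]} = [ sort { $a <=> $b } @{ $hash_doc2{$split_exons[0]} // [] }, $split_exons[1] ];
# max for $x is at $hash_doc2{$x}[-1]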
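An alternative to re-sorting the array on every insert is to push each value onto the array and compute the maximum only when it is needed. This is a rough sketch under the same assumption of tab-separated rows; `collect_ages` is a hypothetical name, while `max` comes from the core module List::Util.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: collect every age per name into an array reference.
sub collect_ages {
    my @lines = @_;
    my %ages;
    for my $line (@lines) {
        chomp $line;
        my ($name, $age) = split /\t/, $line;
        next unless defined $age;        # skip blank or malformed lines
        push @{ $ages{$name} }, $age;    # autovivifies the array on first use
    }
    return \%ages;
}

# Sample rows from File2 in the question, assumed to be tab-separated.
my $ages = collect_ages("Gaia\t3\n", "Matt\t12\n", "Gaia\t89\n", "Reuben\t4\n");
for my $name (sort keys %$ages) {
    my @all = @{ $ages->{$name} };
    printf "%s\tall: %s\tmax: %d\n", $name, join(',', @all), max(@all);
}
```

This keeps the values in input order, which may or may not matter for your data; sort them at read time if you need them ordered.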
