Perl hash format


I have a log file like the following:

ID: COM-1234
Program: Swimming
Name: John Doe
Description: Joined on July 1st
------------------------------ID: COM-2345
Program: Swimming
Name: Brock sen
Description: Joined on July 1st
------------------------------ID: COM-9876
Program: Swimming
Name: johny boy
Description: Joined on July 1st
------------------------------ID: COM-9090
Program: Running
Name: justin kim
Description: Good Record
------------------------------
I want to group the entries by program (Swimming, Running, etc.) and display them like this:

PROGRAM:  Swimming
==>ID  
    COM-1234
    COM-2345
    COM-9876

PROGRAM:  Running
==>ID   
    COM-9090
I am very new to Perl, and I wrote the following (incomplete) script:

#/usr/bin/perl
use Data::Dumper;
$/ = "%%%%";
open(FILE, "D:\\mine\\out.txt");
while (<FILE>)
{
@temp = split(/-{20,}/, $_);
}
close(FILE);
my %hash = @new;
print Dumper(\%hash);
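I read in a Perl tutorial that a hash can map a unique key to multiple values, but I am not sure how to use that here.

I am able to read the file and store it in a hash, but I am not sure how to turn it into the format above. Any help is much appreciated. Thanks.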

Always put use warnings; and use strict; at the top of your program, and always use the three-argument form of open.

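open my $fh, "<", "D:\\mine\\out.txt" or die "Cannot open file: $!";
my %hash;
while (<$fh>) {
    if (/ID/) {
        # The Program line directly follows the ID line
        my $nxt = <$fh>;
        # Strip the separator dashes and the "ID: " label
        s/.*?ID: //g;
        # Key is the program line plus the "==>ID" header; IDs are appended
        $hash{"$nxt==>ID \n"} .= "   $_";
    }
}

print %hash;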
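In your input file, the Program line comes right after the ID line. That is why I used my $nxt = <$fh>; inside the loop: it reads the next line, so the program is now stored in the $nxt variable.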
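The read-ahead idea can be tried in isolation. Below is a minimal sketch, assuming a tiny made-up record held in a string and read through an in-memory filehandle; the names $text and @pairs are illustrative and not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical sample record; the real data would come from the log file.
my $text = "ID: COM-1234\nProgram: Swimming\nName: John Doe\n";

# Open the string as an in-memory file so the sketch is self-contained.
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @pairs;
while (my $line = <$fh>) {
    # Only act on lines that carry an ID
    next unless $line =~ /ID: (\S+)/;
    my $id = $1;

    # A second read inside the loop consumes the following (Program) line
    my $next = <$fh>;
    last unless defined $next;
    chomp $next;

    push @pairs, "$id / $next";
}
close $fh;

print "$_\n" for @pairs;
```

Because the second read advances the filehandle, the outer loop never sees the Program line itself; it resumes at the Name line.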

#!/usr/bin/perl

use strict;
use warnings;
use Data::Dumper;

my %hash = ();
open my $IN, "<", "your file name here" or die "Error: $!\n";
while (<$IN>) {
    if ($_ =~ m/^\s*-*ID:\s*COM/) {
        (my $id) = ($_ =~ m/\s*ID:\s*(.*)/);
        my $prog_name = <$IN>;
        chomp $prog_name;
        $prog_name =~ s/Program/PROGRAM/;
        $hash{$prog_name} = [] unless $hash{$prog_name};
        push @{$hash{$prog_name}}, $id;
    }
}
close $IN;
print Dumper(\%hash);
Let's look at these two lines:

$hash{$prog_name} = [] unless $hash{$prog_name};
push @{$hash{$prog_name}}, $id;
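If the hash entry for that key is not yet defined, the first line initializes it with an empty array reference as its value. The second line pushes the ID onto the end of that array (independently of the first line).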


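Also, the first line is not mandatory. If you write only push @{$hash{$prog_name}}, $id; Perl interprets it as "go to the value for this key, and create it if it does not exist". By dereferencing it with @{...} you are saying that the value is an array, and then you push $id onto that list.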
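This behaviour is usually called autovivification. A minimal sketch of it, using hard-coded keys and IDs taken from the sample log rather than values parsed from a file:

```perl
use strict;
use warnings;

my %hash;

# No explicit "$hash{...} = []" anywhere: dereferencing the missing
# entry as an array inside push creates the array reference on first use.
push @{ $hash{'PROGRAM: Swimming'} }, 'COM-1234';
push @{ $hash{'PROGRAM: Swimming'} }, 'COM-2345';
push @{ $hash{'PROGRAM: Running'} },  'COM-9090';

# Sort the keys so the output order is predictable
for my $prog (sort keys %hash) {
    print "$prog => @{ $hash{$prog} }\n";
}
# prints:
# PROGRAM: Running => COM-9090
# PROGRAM: Swimming => COM-1234 COM-2345
```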

I always like to write programs like this so that they read from STDIN, because that makes them more flexible.

I would do it like this:

#!/usr/bin/perl

use strict;
use warnings;
use 5.010;

# Read one "chunk" of data at a time
local $/ = '------------------------------';

# A hash to store our results.
my %program;

# While there's data on STDIN...
while (<>) {
  # Remove the end of record marker
  chomp;
  # Skip any empty records
  # (i.e. ones with no printable characters)
  next unless /\S/;

  # Extract the information that we want with a regex
  my ($id, $program) = /ID: (.+?)\n.*Program: (.+?)\n/s;
  # Build a hash of arrays containing our data
  push @{$program{$program}}, $id;
}

# Now we have all the data we need, so let's display it.

# Keys in %program are the program names
foreach my $p (keys %program) {
  say "PROGRAM: $p\n==>ID";
  # $program{$p} is a reference to an array of IDs
  say "\t$_" for @{$program{$p}};
  say '';
}
C:/> programs.pl < programs.txt
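One thing to be aware of: keys returns hash keys in no guaranteed order, so the programs may not print in the order they appear in the log. The following is a sketch of one way to keep first-seen order and match the spacing of the requested output. The @records sample data, the @order array and the $report string are assumptions made for illustration; they are not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical (ID, program) pairs, as the regex in the loop above would extract them
my @records = (
    [ 'COM-1234', 'Swimming' ],
    [ 'COM-2345', 'Swimming' ],
    [ 'COM-9876', 'Swimming' ],
    [ 'COM-9090', 'Running'  ],
);

my %program;   # program name => array ref of IDs
my @order;     # program names in the order they were first seen

for my $rec (@records) {
    my ($id, $prog) = @$rec;
    # Remember a program name the first time it appears
    push @order, $prog unless exists $program{$prog};
    push @{ $program{$prog} }, $id;
}

# Build the report in the layout shown in the question
my $report = '';
for my $prog (@order) {
    $report .= "PROGRAM:  $prog\n==>ID\n";
    $report .= "    $_\n" for @{ $program{$prog} };
    $report .= "\n";
}
print $report;
```

If alphabetical order is good enough, iterating over sort keys %program avoids the extra array.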

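Thank you very much, it works great. Thanks again for the detailed explanation.

@Goku, you're welcome. You should read up on this feature of Perl.

Why do the match twice? Surely the first version can become: if (my ($id) = m/\s*ID:\s*(.*)/) { … }

$hash{$prog_name} = () unless $hash{$prog_name} is unnecessary (because of autovivification), and it is also wrong (it should be [], not ()).

@DaveCross I did mention that this line is unnecessary; I just wanted to be as clear as possible since he is new to Perl. The match is split into two statements for the same reason. Of course, you are right that it can be done in one line. As for the parentheses, why are they wrong? It works for me without errors or warnings.

You already match \n explicitly, so what is the s flag for in your pattern? Also, a negated character class instead of the non-greedy quantifier gives the same result in fewer steps.

The /s is not needed for this data. But I was assuming that the ID and the program name are not always on adjacent lines. It may be overkill.

I guess you mean print Dumper(\%hash) or something like that.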
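On the () versus [] question raised in the comments: assigning () to a single hash element stores undef, so that line has no real effect, and the later push only works because it creates the array reference itself. A small sketch with made-up keys (swim and run) that shows the difference:

```perl
use strict;
use warnings;

my %hash;

# Assigning an empty list to a scalar slot stores undef, not an array ref
$hash{swim} = () unless $hash{swim};
my $after_parens = defined $hash{swim} ? 'defined' : 'undef';

# The push still works, because it autovivifies the array ref on its own
push @{ $hash{swim} }, 'COM-1234';

# With [], the slot really holds an (empty) array reference up front
$hash{run} = [] unless $hash{run};
my $after_brackets = ref $hash{run};

print "after (): $after_parens\n";
print "after []: $after_brackets\n";
print "swim IDs: @{ $hash{swim} }\n";
# prints:
# after (): undef
# after []: ARRAY
# swim IDs: COM-1234
```

So the () version runs without errors or warnings, but only because the line does nothing useful.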