Perl hash formatting
I have a log file like the following:
ID: COM-1234
Program: Swimming
Name: John Doe
Description: Joined on July 1st
------------------------------ID: COM-2345
Program: Swimming
Name: Brock sen
Description: Joined on July 1st
------------------------------ID: COM-9876
Program: Swimming
Name: johny boy
Description: Joined on July 1st
------------------------------ID: COM-9090
Program: Running
Name: justin kim
Description: Good Record
------------------------------
I want to group it by program (Swimming, Running, etc.) and display it like this:
PROGRAM: Swimming
==>ID
COM-1234
COM-2345
COM-9876
PROGRAM: Running
==>ID
COM-9090
I am very new to Perl, and I wrote the following (incomplete):
#!/usr/bin/perl
use Data::Dumper;
$/ = "%%%%";
open( FILE, "D:\\mine\\out.txt" );
while (<FILE>)
{
@temp = split( /-{20,}/, $_ );
}
close(FILE);
my %hash = @new;
print Dumper( \%hash );
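I read in a Perl tutorial that a hash can map a unique key to multiple values, but I'm not sure how to make use of that.
I am able to read the file and store it in a hash, but I'm not sure how to process it into the format above. Any help is much appreciated. Thanks.
Always put use warnings; and use strict; at the top of your program, and always use the three-argument form of open.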
open my $fh, "<", "D:\\mine\\out.txt";
my %hash;
while (<$fh>){
if(/ID/)
{
my $nxt = <$fh>;
s/.*?ID: //g;
$hash{"$nxt==>ID \n"}.=" $_";
}
}
print %hash;
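In your input file, the Program line comes right after the ID line, so I used my $nxt = <$fh>; to read it. The program is now stored in the $nxt variable.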
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %hash = ();
open my $IN, "<", "your file name here" or die "Error: $!\n";
while (<$IN>) {
if ($_ =~ m/^\s*-*ID:\s*COM/) {
(my $id) = ($_ =~ m/\s*ID:\s*(.*)/);
my $prog_name = <$IN>;
chomp $prog_name;
$prog_name =~ s/Program/PROGRAM/;
$hash{$prog_name} = [] unless $hash{$prog_name};
push @{$hash{$prog_name}}, $id;
}
}
close $IN;
print Dumper(\%hash);
Let's look at these two lines:
$hash{$prog_name} = [] unless $hash{$prog_name};
push @{$hash{$prog_name}}, $id;
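The first line initializes the value to an empty array reference if the hash has no value for that key yet. The second line pushes the ID onto the end of that array (regardless of what the first line did).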
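A minimal sketch of what those two lines build, using a few hard-coded ID/program pairs instead of the log file. The @pairs list and the final print loop are illustrative assumptions, not part of the answer's code:

```perl
use strict;
use warnings;

# Hypothetical sample pairs (ID, program line) standing in for the parsed log
my @pairs = (
    [ 'COM-1234', 'PROGRAM: Swimming' ],
    [ 'COM-2345', 'PROGRAM: Swimming' ],
    [ 'COM-9090', 'PROGRAM: Running' ],
);

my %hash;
for my $pair (@pairs) {
    my ( $id, $prog_name ) = @$pair;

    # Start an empty array reference the first time a program is seen
    $hash{$prog_name} = [] unless $hash{$prog_name};

    # Append the ID to that program's array
    push @{ $hash{$prog_name} }, $id;
}

# Each key now maps to an array reference holding that program's IDs
for my $prog_name ( sort keys %hash ) {
    print "$prog_name => @{ $hash{$prog_name} }\n";
}
```

The result is a hash of arrays: one key per program, each pointing at the list of IDs collected for it.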
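Also, the first line is not mandatory. Perl knows what you mean if you write only push @{$hash{$prog_name}}, $id; and interprets it as "go to the value of this key", creating it if it does not exist. You then say that this value is an array, and you push $id onto the list.
I always like to write programs like this so that they read from STDIN, as that makes them more flexible.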
I'd do it like this:
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
# Read one "chunk" of data at a time
local $/ = '------------------------------';
# A hash to store our results.
my %program;
# While there's data on STDIN...
while (<>) {
# Remove the end of record marker
chomp;
# Skip any empty records
# (i.e. ones with no printable characters)
next unless /\S/;
# Extract the information that we want with a regex
my ($id, $program) = /ID: (.+?)\n.*Program: (.+?)\n/s;
# Build a hash of arrays containing our data
push @{$program{$program}}, $id;
}
# Now we have all the data we need, so let's display it.
# Keys in %program are the program names
foreach my $p (keys %program) {
say "PROGRAM: $p\n==>ID";
# $program{$p} is a reference to an array of IDs
say "\t$_" for @{$program{$p}};
say '';
}
C:/> programs.pl < programs.txt
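To try the approach above without a separate input file, here is a sketch that reads a few of the question's records from an in-memory filehandle. Three things are my own assumptions rather than part of the answer: the $data string, the regex written with a negated character class (which assumes the Program line directly follows the ID line), and the sort on the keys, added because Perl does not guarantee any hash key order.

```perl
use strict;
use warnings;

# Sample records copied from the question, held in a string for illustration
my $data = <<'END';
ID: COM-1234
Program: Swimming
Name: John Doe
Description: Joined on July 1st
------------------------------ID: COM-2345
Program: Swimming
Name: Brock sen
Description: Joined on July 1st
------------------------------ID: COM-9090
Program: Running
Name: justin kim
Description: Good Record
------------------------------
END

# Open the string as an in-memory filehandle
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my %program;
{
    # Read one record at a time; the change to $/ is limited to this block
    local $/ = '------------------------------';
    while ( my $record = <$fh> ) {
        chomp $record;                   # remove the end-of-record marker
        next unless $record =~ /\S/;     # skip empty records

        # [^\n]+ stops at the end of the line, so /s is not needed here
        my ( $id, $program ) = $record =~ /ID: ([^\n]+)\nProgram: ([^\n]+)/;
        next unless defined $id;         # skip records that do not match

        push @{ $program{$program} }, $id;
    }
}
close $fh;

# Sorted keys give the same output order on every run
for my $p ( sort keys %program ) {
    print "PROGRAM: $p\n==>ID\n";
    print "\t$_\n" for @{ $program{$p} };
}
```

If the ID and Program lines might not be adjacent, the answer's original pattern with .* and /s is the safer choice.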
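Thanks a lot, it works great. Thanks again for the detailed explanation.
@Goku, you're welcome. You should read up on this feature of Perl.
Why do the match twice? Surely the first version could become: if (my ($id) = m/\s*ID:\s*(.*)/) { … }
$hash{$prog_name} = () unless $hash{$prog_name} is unnecessary (because of autovivification) and wrong (it should be [], not ()).
@DaveCross I mentioned that this line is unnecessary; I just wanted to be as clear as possible since he is new to Perl. The match is split into two statements for the same reason. Of course you're right, it can be done in one line. As for the parentheses, why are they wrong? It works for me without errors or warnings.
You already match \n explicitly, so what is the s flag for in your pattern? Also, using a negated character class instead of the non-greedy quantifier gives the same result in fewer steps.
It doesn't need /s for this data. But I was assuming that the ID and program name would not always be on adjacent lines. It's probably overkill.
I guess you mean print Dumper(\%hash) or something similar.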
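On the () versus [] question raised in the comments above, a small sketch may help show why the () version appears to work. Assigning an empty list to a scalar slot stores undef, and the following push then autovivifies an array reference anyway. The key and ID used here are just sample values:

```perl
use strict;
use warnings;

my %hash;
my $prog_name = 'PROGRAM: Swimming';

# An empty list assigned to a scalar slot yields undef, not an empty array
$hash{$prog_name} = () unless $hash{$prog_name};
print defined( $hash{$prog_name} ) ? "defined\n" : "undef\n";

# push autovivifies the undef value into an array reference
push @{ $hash{$prog_name} }, 'COM-1234';
print ref( $hash{$prog_name} ), "\n";
```

So the line with () does no harm, but it does not do the initialization it seems to; the push does all the work.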