Perl: parsing file text into a hash
I want to parse the text of a file and put it into a hash. My file looks like this:
key1 val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,
val,val,val,val
key2 val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,
val,val,val,val
key3 val
key4 val,val
key5 val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,
val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,
val,val,val,val,val,val,val,val,val,val,val,val,val,val,val
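My keys come before the space, and my values are the list of elements after the space, separated by commas. Some lines have no key, because the values continue over several lines.
So I want a hash like this (I am most familiar with Python):
My code: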
`
my %hashNames;
open INFILE, "./file.txt" or die $!;
my @temp = ();
while (my $line = <INFILE>)
{
    my @names = split /[\t,]/, $line;
    my $ID = $names[0];
    if ($line =~ /\t/)
    {
        my @temp = ();
        for (my $i = 1; $i < @names; $i += 1)
        {
            push(@temp, $names[$i]);
        }
    }
    else
    {
        for (my $i = 0; $i < @names; $i += 1)
        {
            push(@temp, $names[$i]);
        }
    }
}`
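Here you go:
my %results;
my $key;
while (my $line = <INFILE>) {
    chomp($line);
    my @items = split /[\s,]+/, $line;   # split on the space after the key and on the commas
    $key = shift @items;                 # the first field is the key
    $results{$key} = \@items;            # the remaining fields are the values
}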
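It works for the simple case, but not for the situation you describe:
I have some lines with no key, because the values continue over several lines
To handle that, though, you have to explain how to detect whether the next line starts with a key or with a value. If you know that, you can put it in an `if` statement and use the previous key to add the new values to the hash: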
my %results;
my $key;
while (my $line = <INFILE>) {
    chomp($line);
    my @items = split /[\s,]+/, $line;
    my $tmpkey = shift @items;
    if (is_real_key($tmpkey)) {          # is_real_key() is a placeholder you have to supply
        $key = $tmpkey;                  # start a new record
        $results{$key} = \@items;
    } else {
        push @{ $results{$key} }, $tmpkey, @items;   # continuation of the previous key
    }
}
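One possible way to tell the two kinds of line apart is sketched below. It assumes, as in the sample data, that a line starting a record has a space between the key and its values while a continuation line contains no space at all. The `parse_lines` helper and this detection rule are illustrative assumptions, not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: builds a hash of array refs from a list of lines.
# Assumption: a record starts with "key value,value,...", and a
# continuation line holds only comma-separated values (no spaces).
sub parse_lines {
    my @lines = @_;
    my (%results, $key);
    for my $line (@lines) {
        chomp $line;
        $line =~ s/^\s+//;                    # tolerate indented continuation lines
        if ($line =~ /^(\S+)\s+(\S.*)$/) {    # key, whitespace, then values
            $key  = $1;
            $line = $2;
            $results{$key} = [];
        }
        next unless defined $key;             # ignore values that precede any key
        push @{ $results{$key} }, grep { length } split /\s*,\s*/, $line;
    }
    return \%results;
}

my $parsed = parse_lines("key1 a,b,\n", "c,d\n", "key2 e\n");
print join(',', @{ $parsed->{key1} }), "\n";   # prints a,b,c,d
```

If continuation lines in the real file can contain spaces after the commas, this rule would mistake them for key lines, so the test would need to be adapted to whatever really distinguishes a key.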
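Your problem is that newlines no longer separate records. One way to handle this is to disable the default input record separator `$/`, which is no longer valid for this data, and emulate a valid one: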
use strict;
use warnings;
use Data::Dumper;
my %hash;
my $file;
{
local $/; # disable input record separator
$file = <DATA>; # entire file here now!
}
for my $line (split /^(?=\S+ )/m, $file) { # records begin this way now
$line =~ s/\n//g; # remove newlines
my ($key, $val) = split ' ', $line, 2; # divide into two fields
$hash{$key} = [ split /,/, $val ]; # store the data
}
print Dumper \%hash;
__DATA__
key1 val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,
val,val,val,val
key2 val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,
val,val,val,val
key3 val
key4 val,val
key5 val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,
val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,
val,val,val,val,val,val,val,val,val,val,val,val,val,val,val
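Explanation:
- Splitting on `/^(?=\S+ )/m` with the `/m` modifier means that `^` now also matches right after each newline inside the string, which emulates the input record separator.
- The way to split the string into two fields is to pass a limit of 2 to `split`.
- We split directly into the hash, using a `split` statement inside an anonymous array `[ ... ]`.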
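The two `split` calls described above can be tried in isolation. This is a small sketch on a made-up three-line string (the `$text` sample is an assumption for illustration, not the real file):

```perl
use strict;
use warnings;

my $text = "key1 a,b,\nc\nkey2 d\n";

# With /m, ^ matches at the start of every line. The lookahead (?=\S+ )
# only accepts lines that begin with non-space characters followed by a
# space, so the split happens just before each key and consumes nothing.
my @records = split /^(?=\S+ )/m, $text;
# @records is ("key1 a,b,\nc\n", "key2 d\n")

# A limit of 2 stops split after the first separator: key, then the rest.
(my $first = $records[0]) =~ s/\n//g;
my ($key, $rest) = split ' ', $first, 2;
# $key is "key1", $rest is "a,b,c"
```

Because the lookahead is zero-width, no text is lost at the split points; the key stays at the front of each record.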
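Unfortunately, the input record separator `$/` cannot be set to a regular expression. That leaves a few comfortable solutions. One is to slurp the file and use `split /(?<!,)\n/` to get the actual records: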
my %hash = do {
local $/; # set to undef, for slurp
map {
my ($key, $vals) = split /\s+/, $_, 2; # split on first whitespace, into two strings
$key => [ split /\s*,\s*/, $vals ]; # return a list of a key and a value array
} split /(?<!,)\n/, <FILE>; # split the file into records
};
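To see what the negative lookbehind does, here is a minimal sketch that applies the same split to an in-memory string instead of a filehandle. The `$text` sample is assumed for illustration.

```perl
use strict;
use warnings;

my $text = "key1 a,b,\nc\nkey2 d\n";

# (?<!,)\n matches a newline only when the character before it is not a
# comma, so a newline inside a wrapped value list does not end the record.
my @records = split /(?<!,)\n/, $text;
# @records is ("key1 a,b,\nc", "key2 d")

my %hash = map {
    my ($key, $vals) = split /\s+/, $_, 2;   # key, then everything else
    ($key => [ split /\s*,\s*/, $vals ]);    # ",\n" is swallowed by ,\s*
} @records;
```

This relies on every wrapped line ending in a comma, as in the sample data. A value list that is wrapped without a trailing comma would be cut into two records.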
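Alternatively, read the file line by line and keep appending the next line for as long as the current record ends in a comma:
my %hash;
while (<FILE>) {
    $_ .= <FILE> while /,\n\z/ && !eof(FILE);   # pull in continuation lines, stop at end of file
    chomp;                                       # drop the record's final newline
    my ($key, $value) = split /\s+/, $_, 2;
    push @{ $hash{$key} }, split /\s*,\s*/, $value; # allow multiple occurrences of one key, simply append values to list.
}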
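The read-ahead loop can be tried without a file on disk by opening a filehandle on a string. The sketch below does that with a lexical handle; the `$text` sample and the variable names are assumptions for illustration.

```perl
use strict;
use warnings;

my $text = "key1 a,b,\nc\nkey2 d\n";
open my $fh, '<', \$text or die "cannot open in-memory file: $!";

my %hash;
while (my $record = <$fh>) {
    # Keep appending lines while the record still ends in ",\n".
    while ($record =~ /,\n\z/) {
        my $next = <$fh>;
        last unless defined $next;    # stop cleanly at end of file
        $record .= $next;
    }
    chomp $record;
    my ($key, $vals) = split /\s+/, $record, 2;
    push @{ $hash{$key} }, split /\s*,\s*/, $vals;
}
close $fh;
```

Compared with slurping, this keeps only one record in memory at a time, which may matter for large files.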
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
use Data::Dumper;
my $res_hash = {};
my ($current_key, $values);
while ( my $line = <DATA> ) {
    chomp $line;
    if ( index($line, ' ') > 0 ) {
        # A line containing a space starts a new record: store the previous one first.
        push @{ $res_hash->{$current_key} }, split(/,/, $values)
            if defined $current_key and defined $values;
        ($current_key, $values) = split /\s/, $line;
    } else {
        # No space: the line continues the current value list.
        $values .= $line;
    }
}
# Store the last record, whether or not it ended with a continuation line.
push @{ $res_hash->{$current_key} }, split(/,/, $values)
    if defined $current_key and defined $values;
say "result:" . Dumper($res_hash);
__DATA__
key1 val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,
val,val,val,val
key2 val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,
val,val,val,val
key3 val
key4 val,val
key5 val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,
val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,
val,val,val,val,val,val,val,val,val,val,val,val,val,val,val
Using the Parse::RecDescent module:
#! /usr/bin/env perl
use strict;
use warnings;
use Parse::RecDescent;
our %hash;
my $p = Parse::RecDescent->new(q!
hash: entry(s?)
entry: key value(s /,/) { $::hash{$item[1]} = [ @{ $item[2] } ] }
key: /\S+/
value: /([^,\n]|\\,])+/
!);
die "$0: failed to create parser" unless defined $p;
my $text = do {{ local $/; <DATA> }};
$p->hash($text) or die "$0: parse failed";
for (sort keys %hash) {
print "$_ => val x ", scalar @{ $hash{$_} }, "\n";
}
__DATA__
key1 val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,
val,val,val,val
key2 val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,
val,val,val,val
key3 val
key4 val,val
key5 val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,
val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,val,
val,val,val,val,val,val,val,val,val,val,val,val,val,val,val
Output:
key1 => val x 22
key2 => val x 22
key3 => val x 1
key4 => val x 2
key5 => val x 52