Perl: put a tab-delimited file into an array and iterate over each element after a specific column number
I'm new to Perl, and I have a text file that I'm trying to format. Here is a sample of the tab-delimited file:
Num Let Re Al Samp1 Samp2 Samp3 Samp4 Samp5
1 dog R R ./. 0/0 ./. 0/0 0/0
2 dog S P 0/0 ./. 0/1 ./. ./.
3 cat P P 0/1 ./. ./. 0/0 0/1
4 horse S S 0/0 0/0 0/0 ./. 0/1
5 cow P R ./. 0/1 ./. ./. ./.
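I want the program to check the whole file. Going line by line, whenever it reaches 0/0, it should replace it with that row's "Re" value. If it finds 0/1, it should replace it with that row's "Al" value. Everything else should simply be replaced with "NA". I also want to join the first two columns (Num:Let). Here is the sample output: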
Num:Let Re Al Samp1 Samp2 Samp3 Samp4 Samp5
1:dog R R NA R NA R R
2:dog S P S NA P NA NA
3:cat P P P NA NA P P
4:horse S S S S S NA S
5:cow P R NA R NA NA NA
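Also, this is just an example; the actual file may have more than five "Samp" columns. I tried splitting the file into an array and looping over each element with a foreach loop. I then used if-else statements to check these conditions and replace the element when one was met, but for some reason it just makes every "Samp" element the same as the letter I'm replacing it with. In short, I'm doing something wrong and I don't know what. Thanks.
Try this for size: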
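use strict;
use warnings;

my %data;
while (<DATA>) {                     # expects the sample rows in a __DATA__ section
    chomp;
    next unless /^\d+/;              # skip the header and any non-data line
    s/\.\/\./NA/g;                   # swap every ./. for NA
    my @col = split;
    s/0\/0/$col[2]/ foreach @col;    # swap 0/0 for the Re value (index 2)
    s/0\/1/$col[3]/ foreach @col;    # swap 0/1 for the Al value (index 3)
    $data{$col[0]} = [ @col[1 .. $#col] ];   # keep every remaining column, however many
}
print join("\t", qw(Num Let Re Al Samp1 Samp2 Samp3 Samp4 Samp5)), "\n";
for (sort { $a <=> $b } keys %data) {        # numeric sort, so 10 comes after 9
    print join("\t", $_, @{ $data{$_} }), "\n";
}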
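For the joined Num:Let column the question asks for, and for files with more than five Samp columns, something along these lines might work. It is only a sketch: `format_row` and `format_header` are names made up here, and it assumes the fields are separated by whitespace and that the first four columns are always Num, Let, Re and Al.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one data line into "Num:Let Re Al calls...".
sub format_row {
    my ($line) = @_;
    # Everything after the first four fields is treated as a sample column.
    my ($num, $let, $re, $al, @samps) = split ' ', $line;
    # Exact comparisons: 0/0 becomes Re, 0/1 becomes Al, anything else NA.
    my @calls = map { $_ eq '0/0' ? $re : $_ eq '0/1' ? $al : 'NA' } @samps;
    return join "\t", "$num:$let", $re, $al, @calls;
}

# Hypothetical helper: join the first two header names with a colon.
sub format_header {
    my ($line) = @_;
    my ($first, $second, @rest) = split ' ', $line;
    return join "\t", "$first:$second", @rest;
}

# Rows taken from the sample in the question.
my @lines = (
    "Num Let Re Al Samp1 Samp2 Samp3 Samp4 Samp5",
    "1 dog R R ./. 0/0 ./. 0/0 0/0",
    "2 dog S P 0/0 ./. 0/1 ./. ./.",
);
print format_header(shift @lines), "\n";
print format_row($_), "\n" for @lines;
```

Because the sample columns are collected into `@samps`, nothing depends on there being exactly five of them, and rows come out in the order they were read.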
Here is one way:
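my $header = <DATA>;
print join("\t", split(' ', $header)), "\n";   # re-emit the header, tab-separated
while ( my $row = <DATA> ) {
    # the first four fields are fixed; everything after them is a sample column
    my ($num, $let, $re, $al, @samps) = split(' ', $row);
    my @reals = map {
        if    ( m{0/0} ) { $re }    # 0/0 becomes this row's Re value
        elsif ( m{0/1} ) { $al }    # 0/1 becomes this row's Al value
        else             { 'NA' }   # anything else, e.g. ./.
    } @samps;
    print join("\t", $num, $let, $re, $al, @reals), "\n";
}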
__DATA__
Num Let Re Al Samp1 Samp2 Samp3 Samp4 Samp5
1 dog R R ./. 0/0 ./. 0/0 0/0
2 dog S P 0/0 ./. 0/1 ./. ./.
3 cat P P 0/1 ./. ./. 0/0 0/1
4 horse S S 0/0 0/0 0/0 ./. 0/1
5 cow P R ./. 0/1 ./. ./. ./.
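If the data lives in a real tab-delimited file rather than a __DATA__ section, the "every element after a given column" part can also be done with an array slice. The following is a rough sketch, not a drop-in solution: the helper name `convert`, its `$first` parameter (0-based index of the first Samp column), and the in-memory filehandles used for the demo are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read tab-delimited rows from $in, write converted rows to $out.
sub convert {
    my ($in, $out, $first) = @_;
    my $header = <$in>;
    print {$out} $header if defined $header;    # header passes through unchanged
    while (my $line = <$in>) {
        chomp $line;
        my @f = split /\t/, $line, -1;          # -1 keeps trailing empty fields
        my ($re, $al) = @f[2, 3];
        # foreach aliases each slice element, so assigning to $cell edits @f itself
        for my $cell (@f[$first .. $#f]) {
            $cell = $cell eq '0/0' ? $re
                  : $cell eq '0/1' ? $al
                  :                  'NA';
        }
        print {$out} join("\t", @f), "\n";
    }
}

# Demo on in-memory filehandles; with a real file, open it by name instead.
my $input = join("\n",
    "Num\tLet\tRe\tAl\tSamp1\tSamp2\tSamp3\tSamp4\tSamp5",
    "3\tcat\tP\tP\t0/1\t./.\t./.\t0/0\t0/1",
    "5\tcow\tP\tR\t./.\t0/1\t./.\t./.\t./.",
) . "\n";
open my $in,  '<', \$input     or die "cannot open input: $!";
open my $out, '>', \my $result or die "cannot open output: $!";
convert($in, $out, 4);
close $out;
print $result;
```

Splitting on `/\t/` instead of `' '` keeps fields that contain spaces intact, which matters only if the real file has such fields.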
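We don't know what you did wrong either, because you didn't post any code.
You should include the non-working code in your question. (While you're at it, fix the formatting of your sample data too.)
When working with strings that contain forward slashes, consider using curly braces {} (or another delimiter) for the regex so the slashes don't need escaping: s{0/0}{$col[2]}g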
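To illustrate the delimiter suggestion, here is a small sketch comparing the escaped form with two alternatives. The `@col` contents are taken from the first sample row; the variable names and the choice of `|` as a third delimiter are just examples.

```perl
use strict;
use warnings;

my @col = qw(1 dog R R);                  # Num, Let, Re, Al from the first sample row
my ($escaped, $braces, $pipes) = ('0/0') x 3;

$escaped =~ s/0\/0/$col[2]/g;    # default delimiter: every / must be escaped
$braces  =~ s{0/0}{$col[2]}g;    # brace delimiters: no escaping needed
$pipes   =~ s|0/0|$col[2]|g;     # another non-slash delimiter works the same way

print "$escaped $braces $pipes\n";        # prints "R R R"
```

All three substitutions do the same thing; the brace form is simply easier to read when the pattern itself contains slashes.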
Thanks. This was very helpful, and it also showed me why my code wasn't working.
The map-based script above prints:
Num Let Re Al Samp1 Samp2 Samp3 Samp4 Samp5
1 dog R R NA R NA R R
2 dog S P S NA P NA NA
3 cat P P P NA NA P P
4 horse S S S S S NA S
5 cow P R NA R NA NA NA