用于计算正值的Perl脚本不´；t报告所有正值_Perl

用于计算正值的Perl脚本不´；t报告所有正值

perl

用于计算正值的Perl脚本不´；t报告所有正值,perl,Perl,我需要一些关于Perl脚本的帮助。此脚本告诉我零件的值数>=0.5。以下是脚本： use strict; use warnings; my $file = "229E_O.csv"; my @filearray = (); my @array_ids = (); my $thres = 0.5; open (F, $file) or die; while(my $l = <F>) { $l =~ s/\n//g; $l =~ s/\r//g

我需要一些关于Perl脚本的帮助。此脚本告诉我零件的值数>=0.5。以下是脚本：

use strict;
use warnings;

my $file = "229E_O.csv";
my @filearray = ();
my @array_ids = ();
my $thres = 0.5;

open (F, $file) or die;
while(my $l = <F>) {
    $l =~ s/\n//g;
    $l =~ s/\r//g;
    my @cols = split(/\s+/, $l); # divide columns for mora than one space
    next unless (scalar (@cols) == 8); ### If there aren´t 8 column, don´t add to array
    push @filearray, $l;
    my $current_id = $cols[0];
    push @array_ids, $current_id;
}

close F;
my @nr_array_ids = uniq(@array_ids);
foreach my $new_id (@nr_array_ids) { ### for each ID not redundant
    my $counter = 0;
    my $total = 0;
    foreach my $new_L (@filearray) { ### for each line in the file
        my @n_cols = split(/\s+/, $new_L);
        my $potential = $n_cols[5];
        my $idd = $n_cols[0];
        if ($new_id eq $idd) {
            ++$total;
        }

        if ( ($new_id eq $idd) and ($potential >= $thres) ) {
            ++$counter;
        }
    }

    print "$new_id\t$counter\t$total\n";
}
sub uniq {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

输入有8列（0-8），第5列有对我很重要的值。当该值大于等于0.5时，表示为正。这个脚本工作得很好，但是，我使用了report-other值（>=0.7）。现在，当更改值时，脚本不会在值>=0.5时报告我以下是输出：

APT69890_1_NA   0   197
AFR79257_1_NA   0   198
AGT21345_1_NA   0   200
QJY77970_1_NA   0   199
QJY77962_1_NA   0   200
QEO75985_1_NA   0   199
ARK08620_1_NA   0   202

例如，如果可以看到每个ID，则APT69890_1_NA是输出的“一部分”。第二列为#正值，第三列为所有值这行代码要求每行数据中只有8列，否则该行将被忽略：

next unless (scalar (@cols) == 8); ### If there aren´t 8 column, don´t add to array

但是在您的数据文件中，您想要计算的那一行（数据的第253行-0.552045）添加了一个额外的列“#正值”，共9列

APT69890_1_NA   netOGlyc-4.0.0.13       CARBOHYD        682     682     0.552045        .       .       #POSITIVE

因此，该行被拒绝。您的197条记录比APT69890_1_NA数据文件中的条目总数少一条。进一步证明这是一个数据错误

删除第9列或更改条件以容忍第9列

如果您想记录您的数据，那么在本例中，您只需删除#正值（第9列）并将其移动到前一行即可。然后将忽略它，因为它只包含一列：

#POSITIVE
APT69890_1_NA   netOGlyc-4.0.0.13       CARBOHYD        682     682     0.552045        .       .

或者，您可以更改条件以允许至少8列。这样做的好处是保持数据不变，但会削弱数据的有效性检查：

next unless (scalar (@cols) >= 8); ### If there are less than 8 columns, don´t add to array

这行代码要求每行数据中只有8列，否则该行将被忽略：

next unless (scalar (@cols) == 8); ### If there aren´t 8 column, don´t add to array

但是在您的数据文件中，您想要计算的那一行（数据的第253行-0.552045）添加了一个额外的列“#正值”，共9列

APT69890_1_NA   netOGlyc-4.0.0.13       CARBOHYD        682     682     0.552045        .       .       #POSITIVE

因此，该行被拒绝。您的197条记录比APT69890_1_NA数据文件中的条目总数少一条。进一步证明这是一个数据错误

删除第9列或更改条件以容忍第9列

如果您想记录您的数据，那么在本例中，您只需删除#正值（第9列）并将其移动到前一行即可。然后将忽略它，因为它只包含一列：

#POSITIVE
APT69890_1_NA   netOGlyc-4.0.0.13       CARBOHYD        682     682     0.552045        .       .

或者，您可以更改条件以允许至少8列。这样做的好处是保持数据不变，但会削弱数据的有效性检查：

next unless (scalar (@cols) >= 8); ### If there are less than 8 columns, don´t add to array

请调查以下脚本是否符合您的任务

注意：脚本假定您在上提供的数据文件是一致的

定义
```
$threshold
```
分配哈希以保存
```
%数据
```


在while（）循环中逐行读取文件

尝试拆分行并获取$id
和$value
如果未定义$id
，请跳到下一行
增加此特定$id
如果未定义正计数器，则将其初始化为0
如果$val
>=$threshold
读取所有数据时，输出结果为循环


注意：作为/script.pl数据文件运行
use strict;
use warnings;
use feature 'say';

my $threshold = 0.5;
my %data;

while( <> ) {
    my($id,$val) = (split)[0,5];
    next unless $id;
    $data{$id}{count}++;
    $data{$id}{pos} //= 0;
    $data{$id}{pos}++ if $val >= $threshold;
}


say join("\t",$_,$data{$_}->@{qw/pos count/}) for sort keys %data;


使用严格；
使用警告；
使用特征“说”；
我的$threshold=0.5；
我的%数据；
而（）{
我的（$id，$val）=（分割）[0,5]；
下一步除非$id；
$data{$id}{count}++；
$data{$id}{pos}/=0；
$data{$id}{pos}++if$val>=$threshold；
}
对排序键%data说join（“\t”，$\u，$data{$\u}->@{qw/pos count/}）；
请调查以下脚本是否符合您的任务
注意：脚本假定您在上提供的数据文件是一致的

定义$threshold
分配哈希以保存%数据

在while（）循环中逐行读取文件

尝试拆分行并获取$id
和$value
如果未定义$id
，请跳到下一行
增加此特定$id
如果未定义正计数器，则将其初始化为0
如果$val
>=$threshold
读取所有数据时，输出结果为
循环

注意：作为/script.pl数据文件运行
use strict;
use warnings;
use feature 'say';

my $threshold = 0.5;
my %data;

while( <> ) {
    my($id,$val) = (split)[0,5];
    next unless $id;
    $data{$id}{count}++;
    $data{$id}{pos} //= 0;
    $data{$id}{pos}++ if $val >= $threshold;
}


say join("\t",$_,$data{$_}->@{qw/pos count/}) for sort keys %data;


使用严格；
使用警告；
使用特征“说”；
我的$threshold=0.5；
我的%数据；
而（）{
我的（$id，$val）=（分割）[0,5]；
下一步除非$id；
$data{$id}{count}++；
$data{$id}{pos}/=0；
$data{$id}{pos}++if$val>=$threshold；
}
对排序键%data说join（“\t”，$\u，$data{$\u}->@{qw/pos count/}）；
谢谢您的回答。知道脚本是如何工作的吗！我删除了所有的#POSIVITE值，现在脚本告诉我何时有值>=0.5。谢谢你的时间！！！我真的很感谢你的帮助和你的多种想法！谢谢你的回答。知道脚本是如何工作的吗！我删除了所有的#POSIVITE值，现在脚本告诉我何时有值>=0.5。谢谢你的时间！！！我真的很感谢你的帮助和你的多种想法！