Perl: uninitialized value in hash lookup of gene symbols

Tags: regex, perl, hash, initialization

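Update (2):

I have changed the code to discard the comments in the header, but I am still getting syntax errors at the hash key/value assignment:

syntax error at ./convertDataToGeneSymbol.pl line 99, near "$geneSymbolToGo{"
syntax error at ./convertDataToGeneSymbol.pl line 101, near "}"

I can't find any errors in the code, so I think that perhaps the array cannot read the value of $go.

Here is the header of input file 3:

!! 10-20 lines of comments

UniProtKB /t A0A021WW37 /t CG17167 /t GO:0016021 /t GO_REF:0000038
(Still learning how to format on this site; /t denotes tab separation.)

By the way, I apologize for the comments. My professor requires extensive commenting in our programs. `use strict` gave me some problems with this program (mostly because of my inexperience), but when I removed it, I got the results I wanted. Thank you for all your help.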

#!/usr/bin/perl
use warnings;
use diagnostics;

# Title: convertDataToGeneSymbol.pl
# Author: Nicholas Bense
# Date: 11/4/15

# Open a filehandle to read file #1
open(INF1,"<",'/scratch/Drosophila/fb_synonym_fb_2014_05.tsv' ) or die $!;

# Open a filehandle to read file #2
open(INF2,"<",'/scratch/Drosophila/FlyRNAi_data_baseline_vs_EGF.txt') or die $!;

# Open a filehandle to read file #3
open(INF3,"<",'/scratch/Drosophila/gene_association.goa_fly') or die $!;

# Open a filehandle to write new file
open(OUTF1,">",'FlyRNAi_data_baseline_vs_EGFSymbol.txt') or die $!;

# Open a filehandle to write new file
open(OUTF2,">",'FlyRNAi_data_baseline_vs_EGF_GO.txt') or die $!;

# Initialize a hash for the gene symbol conversion
my %geneSymbolConversion;

# Read input file 1 line by line
while (<INF1>){

# Get rid of whitespace
        chomp;

# Split the line
        my @inf1Array = split("\t", $_);

# Filter entries starting with FBgn
        if ($inf1Array[0] =~ /(^FBgn\d+)/){

# Assign column 1 to hash key scalar
        my $geneID = $inf1Array[0];

# Assign column 2 to hash value scalar
        my $geneSymbol = $inf1Array[1];

# Assign key and value to hash
        $geneSymbolConversion{$geneID} = $geneSymbol;

}

}

# Discard first line of input file 2
<INF2>;

# Read input file 2 line by line
while (<INF2>){


        # Get rid of whitespace
        chomp;

        # Split the line on tabs
        my ($geneID, $egf_Baseline, $egf_Stimulus) = split("\t", $_);

        # Check if the codon is present in the hash
        if (defined $geneSymbolConversion{$geneID}){

                # Get the value associated with the codon from the hash
                $geneSymbol = $geneSymbolConversion{$geneID};
        }

        # Join data and print to output file
        print OUTF1 join( "\t", $geneSymbol, $egf_Baseline, $egf_Stimulus), "\n";
}

# Initialize hash for GO conversion
my %geneSymbolToGo;

<INF3>;

# Read input file 3 line by line
while (<INF3>){

        # Get rid of whitespace
        chomp;

        # Discard comment lines
        if ($_ !~ /!/){

        # Split the line on tabs
        my @inf3Array = split("\t", $_);

        # Assign column 3 to hash key scalar
        my $geneSymbol = $inf3Array[2];

        # Assign column 4 to hash value scalar
        my $go = $inf3Array[3];

        # Assign key and value to hash
        my $geneSymbolToGo{$geneSymbol} = $go;
        }
}

# Open a filehandle to read file #3
open(INF4,"<",'FLYRNAi_data_baseline_vs_EGFSymbol.txt') or die $!;

# Read input file 4 line by line
while (<INF4>){

        # Remove end of line characters
        chomp;

        # Split the line on tabs
        my ($geneSymbol, $egf_Baseline, $egf_Stimulus), "\n";

        # Check if the gene symbol is present in the hash
        if (defined $geneSymbolToGo{$geneSymbol}){

                # Get the value associated with the codon from the hash
                $go = $geneSymbolToGo{$geneSymbol};

        }

        # Join data and print to output file
        print OUTF2 join( "\t", $go, $egf_Baseline, $egf_Stimulus), "\n";
}
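  • Always `use strict` and `use warnings 'all'` at the start of every Perl program. `use diagnostics` is rarely useful unless you cannot understand the error messages that those two produce

  • If you have many disk operations to perform, then `use autodie` is very useful. It saves you from writing error-handling code such as `or die $!` after every operation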

  • Always use lexical file handles rather than bareword ones such as `INF1`, for example

    open my $inf1_fh, '<', '/scratch/Drosophila/fb_synonym_fb_2014_05.tsv'

  • Your regex
    /(^FBgn\d+)/
    captures the matched string, but the capture is never used, so you should just write
    /^FBgn\d+/

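  • I don't understand what you are doing with the `while` loop

    while ( $inf1Array[0] =~ /(^FBgn\d+)/ ) { ... }

    Because `$inf1Array[0]` (which should be named `$inf1_array[0]`) never changes in the loop body, the loop will never terminate. I guess the `while` should be `if`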

  • Use Perl's defined-or operator `//`. Instead of

    my $geneSymbol = "NA";

    if ( defined $geneSymbolConversion{$geneID} ) {
        $geneSymbol = $geneSymbolConversion{$geneID};
    }

    you should write

    my $gene_symbol = $conversion{$gene_id} // 'NA'
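
To make the last point concrete, here is a minimal sketch of a defined-or lookup. The hash contents are made-up sample data and `lookup_symbol` is a hypothetical helper, neither comes from the question. It assumes Perl 5.10 or later, where `//` is available.

```perl
use strict;
use warnings;

# Made-up sample data; the real hash would be filled from the synonym file
my %conversion = (
    FBgn0000001 => 'geneA',
    FBgn0000002 => '0',       # a defined but false value
);

# Hypothetical helper: return the symbol for an ID, or 'NA' if there is none
sub lookup_symbol {
    my ($gene_id) = @_;

    # `//` falls back only when the left side is undef, so '0' is kept
    return $conversion{$gene_id} // 'NA';
}

print join( "\t", $_, lookup_symbol($_) ), "\n"
    for qw( FBgn0000001 FBgn0000002 FBgn9999999 );
```

Using `||` here instead of `//` would replace the legitimate value '0' with 'NA', because `||` tests truth rather than definedness.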
    
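This is my attempt at writing something more idiomatic and functional. It is far from being a complex program, so I don't think it needs any comments at all. They take up more vertical space than they repay in clarity.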

#!/usr/bin/perl

use strict;
use warnings 'all';
use autodie;

my %conversion;

{
    open my $in_fh,  '<', '/scratch/Drosophila/fb_synonym_fb_2014_05.tsv';

    while ( <$in_fh> ) {
        chomp;

        my ($gene_id, $gene_symbol) = split /\t/;
        $conversion{$gene_id} = $gene_symbol if $gene_id =~ /^FBgn\d+/;
    }
}

{
    open my $in_fh,  '<', '/scratch/Drosophila/FlyRNAi_data_baseline_vs_EGF.txt';
    open my $out_fh, '>', 'FLYRNAi_data_baseline_vs_EGFSymbol.txt';

    while ( <$in_fh> ) {
        chomp;

        my ( $gene_id, $egf_baseline, $egf_stimulus ) = split /\t/;

        my $gene_symbol = $conversion{$gene_id} // 'NA';

        print $out_fh join("\t", $gene_id, $gene_symbol, $egf_baseline, $egf_stimulus), "\n";
    }
}
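
The program above stops after the symbol conversion. One possible continuation for the GO step from the question might look like the sketch below. It assumes, as the question describes, that comment lines start with `!`, that the gene symbol is in column 3, and that the GO term is in column 4. `build_go_map` is a hypothetical helper, and the input is an in-memory string (one made-up comment line plus the row quoted in the question) so that the sketch runs on its own. The hash element is assigned without `my`: writing `my $hash{key} = ...` is a syntax error, because `my` declares variables, not individual hash elements.

```perl
use strict;
use warnings;

# Hypothetical helper: build a gene symbol => GO term map from a filehandle
sub build_go_map {
    my ($fh) = @_;
    my %symbol_to_go;

    while ( my $line = <$fh> ) {
        chomp $line;
        next if $line =~ /^!/;    # skip header comment lines

        # Columns 3 and 4 hold the gene symbol and the GO term
        my ( undef, undef, $gene_symbol, $go ) = split /\t/, $line;
        next unless defined $gene_symbol and defined $go;

        # Plain assignment, no `my` in front of the hash element
        $symbol_to_go{$gene_symbol} = $go;
    }

    return \%symbol_to_go;
}

# Sample input: a made-up comment line and the row shown in the question
my $sample = join "\n",
    '! comment line (made up)',
    join( "\t", 'UniProtKB', 'A0A021WW37', 'CG17167', 'GO:0016021', 'GO_REF:0000038' ),
    '';

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $go_for = build_go_map($fh);
close $fh;

print "$_\t$go_for->{$_}\n" for sort keys %$go_for;
```

As in the question's code, a gene symbol that appears on several rows keeps only the last GO term seen; storing an array reference per symbol would be one way to keep them all.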
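Comments:

Please also include a sample of the input so that we can reproduce the problem.

Also posted to PerlMonks.

Line 34 looks like an infinite loop. Why do you `use diagnostics` but not `use strict`? diagnostics slows the program down considerably and does not help with your problem. Suggestion: fix the indentation, then use …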