macOS: open reading frame program does not print the amino acid sequence


macos perl


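I am working on a program that reads in a gene sequence, gives me the open reading frames (ORFs), and then the protein sequence for each ORF. I have the ORF-finding code working, but no amino acids are printed. I am using Perl on a Mac.

I would like the code to give me the string of amino acids produced from each open reading frame.

Here is my code: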

#!/usr/bin/perl
#ORF_Find.txt -> finds long orfs in a DNA sequence

open(CHROM, "chr03.txt");   #Open file chr03.txt containing yeastchrom. 3
$DNA = "";                          #start with empty DNA sequence
$header = <CHROM>;                  #get header of sequence

#Read line from file, join to end of $DNA, repeat until end of file

while ($current_line = <CHROM>)
{
    chomp($current_line);           #remove newline from end of current_line
    $DNA= $DNA . $current_line;
}

#length of DNA sequence
$DNA_length = length($DNA);

#flag for ORF finder
$inORF=0;

#number of ORFs found
$numORFs = 0;

#minimum length
$minimum_codons =100;

#search each reading frame
for ($frame =0; $frame<3; $frame++)
{
    print "\nFinding ORFs in frame: +" . ($frame + 1) . "\n";

    #search for sequence match and print position of match if found
    for ($i =frame; $i<=($DNA_length-3);$i += 3)
    {
        #get current codon from sequence
        $codon= substr ($DNA, $i, 3);

        #if not in orf search for ATG, else search for stop codon
        if ($inORF == 0)
        {
            #if current codon is ATG, start ORF
            if ($codon eq "ATG")
            {
                $inORF = 1;
                $ORF_length = 1;
                $ORF_start = $i;
            }
        }
        elsif($inORF ==1)
        {
            #if current codon is a stop codon, end ORF
            if ($codon eq "TGA" || $codon eq "TAG" || $codon eq "TAA")
            {
                #if ORF has at least min number of codons,print location
                if ($ORF_length >= $minimum_codons)
                {
                    print "FOUND ORF AT POSITION $ORF_start,";
                    print "length = $ORF_length\n";
                    $numORFs++;
                }

                #reset ORF variables
                $inORF = 0;
                $ORF_length = 0;
            }
            else
            {
                #increase length of ORF by one codon
                $ORF_length++;
            }
        }
    }
}

#change T to U
$DNA =~ s/T/U/g;

#search each ORF
for ($i=$ORF_start; $i<=($ORF_length-3); $i+=3)
{
    #get codon from each ORF
    $aa_codon= substr($DNA, $i, 3);

    #find amino acids
    foreach ($aa_codon eq "ATG")
    {
        print ("M")     #METHIONINE
    }
    foreach ($aa_codon =~/UU[UC]/)
    {
        print ("F")     #PHENYLALANINE
    }
    foreach ($aa_codon =~/UU[AG]/ || $aa_codon=~/CU[UCAG]/)
    {
        print ("L");    #LEUCINE
    }
    foreach ($aa_codon =~/AU[UAC]/)
    {
        print ("I");    #ISOLEUCINE
    }
    foreach ($aa_codon =~/GU[UACG]/)
    {
        print ("V");    #VALINE
    }
    foreach ($aa_codon =~/UC[UCAG]/ || $aa_codon=~/AG[UC]/)
    {
        print ("S");    #SERINE
    }
    foreach ($aa_codon =~/CC[UCAG]/)
    {
        print ("P");    #PROLINE
    }
    foreach ($aa_codon =~/AC[UCAG]/)
    {
        print ("T");    #THREONINE
    }
    foreach ($aa_codon =~/GC[UCAG]/)
    {
        print ("A");    #ALANINE
    }
    foreach ($aa_codon =~/UA[UC]/)
    {
        print ("Y");    #TYROSINE
    }
    foreach ($aa_codon =~/CA[UC]/)
    {
        print ("H");    #HISTIDINE
    }
    foreach ($aa_codon =~/CA[AG]/)
    {
        print ("G");    #GLUTAMINE
    }
    foreach ($aa_codon =~/AA[UC]/)
    {
        print ("N");    #ASPARAGINE
    }
    foreach ($aa_codon =~/AA[AG]/)
    {
        print ("K");    #LYSINE
    }
    foreach ($aa_codon =~/GA[UC]/)
    {
        print ("D");    #ASPARTIC ACID
    }
    foreach ($aa_codon =~/GA[AG]/)
    {
        print ("E");    #GLUTAMIC ACID
    }
    foreach ($aa_codon =~/UG[UC]/)
    {
        print ("C");    #CYSTINE
    }
    foreach ($aa_codon eq "UGG")
    {
        print ("W");    #TRYPTOPHAN
    }
    foreach ($aa_codon =~/AG[AG]/ || $aa_codon =~/CG[UCAG]/)
    {
        print ("R");    #ARGININE
    }
    foreach ($aa_codon =~/GG[UCAG]/)
    {
        print ("G");    #GLYCINE
    }
    foreach ($aa_codon =~/UA[AG]/|| $aa_codon eq "UGA")
    {
        print ("*")     #STOP
    }

}
#if no ORFS found, print message
if ($numORFs ==0)
{
    print ("NO ORFS FOUND\n");
}
else
{
    print ("\n$num_ORFs ORFS WERE FOUND\n");
}

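First, this question is probably better suited to a forum such as SEQanswers or BioStars. That aside, writing your own six-frame translation script is a fairly involved task, especially if you want to account for IUPAC ambiguous nucleotides. Plenty of scripts and tools already do this, so probably the most sensible advice I can offer is to use one of the existing tools. Try mine, for example: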

My script was not public until now. I have opened it up so that you can use it. Just run it to get the usage message.
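
For readers who want to see the core idea rather than reach for a tool, the translation step can be sketched with a codon lookup table. This is a minimal sketch, not the answerer's script: `translate` and `%codon_table` are hypothetical names, only the standard genetic code is covered, and ambiguous IUPAC codons are simply reported as `X`. A table also sidesteps two likely problems in the question's final loop. Its bounds compare a chromosome position (`$ORF_start`) against a codon count (`$ORF_length`, reset to 0 after each stop codon), so the body probably never runs. It also uses `foreach` where an `if` test was intended.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Standard genetic code keyed by DNA codons, so no T->U conversion is needed.
# The amino-acid string follows the conventional TCAG ordering; '*' is a stop.
my @bases = qw(T C A G);
my $aas   = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %codon_table;
my $n = 0;
for my $b1 (@bases) {
    for my $b2 (@bases) {
        for my $b3 (@bases) {
            $codon_table{"$b1$b2$b3"} = substr($aas, $n++, 1);
        }
    }
}

# translate($dna): hypothetical helper (not from the original post).
# Reads codons from the first base and stops after the first stop codon.
# Codons missing from the table (for example, containing N) become 'X'.
sub translate {
    my ($dna) = @_;
    $dna = uc $dna;
    my $protein = '';
    for (my $i = 0; $i + 3 <= length $dna; $i += 3) {
        my $codon = substr($dna, $i, 3);
        my $aa = exists $codon_table{$codon} ? $codon_table{$codon} : 'X';
        $protein .= $aa;
        last if $aa eq '*';
    }
    return $protein;
}

print translate('ATGTTTCAAGGGTGGTAA'), "\n";   # prints MFQGW*
```

To translate an ORF found by the question's scanner, one would pass `substr($DNA, $ORF_start)` to `translate` at the point where the ORF is reported, inside the frame loop, rather than in a separate loop after scanning has finished.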
Beyond that, if you want to get your own version working, the first thing you can do is change your shebang line to:
#!/usr/bin/perl -w

Note the -w. Then add this line at the top of the script:
use strict;

It will help you debug problems such as the missing dollar sign in one of your for loops:
for ($i =frame; $i<=($DNA_length-3);$i += 3)

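It should be:
for ($i =$frame; $i<=($DNA_length-3);$i += 3)

Please see what makes a good question. Specifically, some sample data and an example of the actual and desired output would help. Keep in mind that we are familiar with programming terms, but not necessarily with scientific terms such as "ORF".

Thanks for the input. Hopefully I have made it a bit clearer.

Without the input file chr03.txt and the expected output, it is hard for us to debug this for you. In your troubleshooting, which line of code specifically did something unexpected?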
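
To make the `use strict` advice concrete, here is a minimal sketch of the frame scan written so that it compiles cleanly under `use strict` and `use warnings`. The helper name `find_orfs` and the toy sequence are assumptions made for illustration, not part of the original post. Without strict, a bare `frame` is treated as a string that counts as 0 in numeric context, so all three passes would scan the same frame. With strict, the script refuses to compile until the sigil is added. Strict would also flag `$num_ORFs` in the question's final print, which differs from the `$numORFs` counter used everywhere else.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# find_orfs($dna, $min_codons): hypothetical helper, not from the original post.
# Scans the three forward frames and returns one hash ref per ORF holding its
# frame (1-3), 0-based start offset, and codon count (ATG included, stop excluded).
sub find_orfs {
    my ($dna, $min_codons) = @_;
    my %is_stop = map { $_ => 1 } qw(TAA TAG TGA);
    my @orfs;

    for my $frame (0 .. 2) {
        # State is reset per frame so an ORF left open at the end of one
        # frame cannot leak into the next.
        my ($in_orf, $start, $codons) = (0, 0, 0);

        # Note the sigil on $frame: a bare "frame" does not compile under strict.
        for (my $i = $frame; $i <= length($dna) - 3; $i += 3) {
            my $codon = substr($dna, $i, 3);
            if (!$in_orf) {
                ($in_orf, $start, $codons) = (1, $i, 1) if $codon eq 'ATG';
            }
            elsif ($is_stop{$codon}) {
                push @orfs, { frame => $frame + 1, start => $start, codons => $codons }
                    if $codons >= $min_codons;
                $in_orf = 0;
            }
            else {
                $codons++;
            }
        }
    }
    return @orfs;
}

# Toy input: one ORF in frame +2 (ATG AAA CCC, followed by the stop TAG).
my $dna = 'CATGAAACCCTAGG';
for my $orf (find_orfs($dna, 2)) {
    my $seq = substr($dna, $orf->{start}, 3 * $orf->{codons});
    print "frame +$orf->{frame} start $orf->{start} codons $orf->{codons} seq $seq\n";
}
# prints: frame +2 start 1 codons 3 seq ATGAAACCC
```

Returning the ORFs as a list, instead of printing them inside the loop, keeps each start and length available afterwards, which is what a later translation step needs.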