Python: finding the longest palindromic substring in a piece of DNA

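I have to write a function that prints the longest palindromic substring of a piece of DNA. I have already written a function that checks whether a piece of DNA is itself a palindrome. See the functions below.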

def make_complement_strand(DNA):
    # Build the complementary strand base by base.
    complement = []
    rules_for_complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    for letter in DNA:
        complement.append(rules_for_complement[letter])
    return complement

def is_this_a_palindrome(DNA):
    DNA = list(DNA)
    # A DNA palindrome equals the reverse of its complement.
    if DNA != make_complement_strand(DNA)[::-1]:
        print("false")
        return False
    else:
        print("true")
        return True

is_this_a_palindrome("GGGCCC")

But now: how do I make a function that prints the longest palindromic substring of a DNA string?


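The meaning of palindrome in the context of genetics is slightly different from the definition used for words and sentences. A double helix is formed by two paired strands of nucleotides that run in opposite directions in the 5'-to-3' sense, and the nucleotides always pair in the same way: adenine (A) with thymine (T) in DNA or uracil (U) in RNA, and cytosine (C) with guanine (G). A (single-stranded) nucleotide sequence is therefore said to be a palindrome if it is equal to its reverse complement. For example, the DNA sequence ACCTAGGT is palindromic because its nucleotide-by-nucleotide complement is TGGATCCA, and reversing the order of the nucleotides in the complement gives the original sequence.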
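To make the definition concrete, here is a minimal sketch that reproduces the ACCTAGGT example. The names COMPLEMENT, reverse_complement and is_reverse_palindrome are illustrative only and do not appear in the question; the sketch assumes uppercase input containing only A, C, G and T.

```python
# Translation table for DNA bases (assumes uppercase A, C, G, T only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def is_reverse_palindrome(dna):
    """Return True if the sequence equals its own reverse complement."""
    return dna == reverse_complement(dna)

print("ACCTAGGT".translate(COMPLEMENT))   # complement: TGGATCCA
print(reverse_complement("ACCTAGGT"))     # ACCTAGGT
print(is_reverse_palindrome("ACCTAGGT"))  # True
```

Note that under this definition the empty string also counts as a palindrome, which may or may not be what you want.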

Here you go, this should be a decent starting point for getting the longest palindromic substring.

def make_complement_strand(DNA):
    # Build the complementary strand base by base.
    complement = []
    rules_for_complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    for letter in DNA:
        complement.append(rules_for_complement[letter])
    return complement

def is_this_a_palindrome(DNA):
    # A DNA palindrome equals the reverse of its complement.
    # The debug prints from the question are dropped so the search stays quiet.
    return list(DNA) == make_complement_strand(DNA)[::-1]


def longest_palindrome_ss(org_dna, palindrome_func):
    '''
    Naive implementation.

    Uses two indices: i marks the start of the current substring and
    j runs from i+1 to the end; i advances after each inner loop.

    Uses the palindrome function provided by the caller.

    Possible improvement:
    1. Check the longest substrings first instead of the shortest,
       i.e. start with i=0 and j at the last index and work downwards.
    '''
    longest_palin = ""
    last_i = len(org_dna)
    for i in range(last_i):
        for j in range(i + 1, last_i):
            current_substring = org_dna[i:j + 1]
            # Keep the candidate only if it is strictly longer than the best so far.
            if palindrome_func(current_substring) and len(current_substring) > len(longest_palin):
                longest_palin = current_substring
    print(org_dna, longest_palin)
    return longest_palin


longest_palindrome_ss("GGGCCC", is_this_a_palindrome)
longest_palindrome_ss("GAGCTT", is_this_a_palindrome)
longest_palindrome_ss("GGAATTCGA", is_this_a_palindrome)
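The naive search above tests every substring, and each test walks the whole substring again. As a possible alternative (not part of the original answer), the sketch below grows a candidate outward from each gap between two bases. It relies on the fact that no base is its own complement, so a reverse-complement palindrome always has even length and is centred on a gap. PAIR and longest_palindrome_centered are hypothetical names introduced for this illustration.

```python
# Base-pairing rules; any other character never matches (assumption: uppercase input).
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def longest_palindrome_centered(dna):
    """Longest reverse-complement palindrome, found by expanding around each gap."""
    best = ""
    for center in range(1, len(dna)):
        left, right = center - 1, center
        # Grow outward while the outer bases are complementary to each other.
        while left >= 0 and right < len(dna) and PAIR.get(dna[left]) == dna[right]:
            left -= 1
            right += 1
        candidate = dna[left + 1:right]
        if len(candidate) > len(best):
            best = candidate
    return best

for seq in ("GGGCCC", "GAGCTT", "GGAATTCGA"):
    print(seq, longest_palindrome_centered(seq))
```

For these three inputs it should print the same substrings as the naive version (GGGCCC, AGCT and GAATTC), while doing at most quadratic work in the length of the sequence.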

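Explain your definition of a palindrome more clearly. By the way, you are setting the DNA parameter to an empty list at the start of the is_this_a_palindrome() function.

You only need to check one half of the string against the other half, reversed.

@muyustan, I added a definition! Is it clearer now?

@stark, I see what you mean, but I am a Python newbie and don't know how to do that.

Better. Now anyone can solve this problem without background knowledge of DNA sequences :)
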
mahorir@mahorir-Vostro-3446:~/Desktop$ python3 dna_paln.py 
GGGCCC GGGCCC
GAGCTT AGCT
GGAATTCGA GAATTC
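One of the comments above suggests checking only one half of the string against the other half. Adapted to the reverse-complement definition, that idea might look like the sketch below; PAIR and is_palindrome_half_check are hypothetical names, and the function is meant as a drop-in alternative to is_this_a_palindrome, not as the commenter's exact intent.

```python
# Base-pairing rules; any other character never matches (assumption: uppercase input).
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def is_palindrome_half_check(dna):
    """Compare each base in the first half with its mirror base in the second half."""
    n = len(dna)
    if n % 2:
        # Odd length: the middle base would have to be its own complement.
        return False
    return all(PAIR.get(dna[i]) == dna[n - 1 - i] for i in range(n // 2))

print(is_palindrome_half_check("GAATTC"))  # True
print(is_palindrome_half_check("GAGCTT"))  # False
```

It stops at the first mismatching pair, so it avoids building the full complement list for every candidate substring.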