Python global sequence alignment function


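I am trying to implement a global alignment function that returns the minimum score, but when the two sequences are equal I get 8 instead of the minimum score of 0.

What is wrong with this code?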

alphabet = ["A", "C", "G", "T"] 
score = [[0, 4, 2, 4, 8], \
     [4, 0, 4, 2, 8], \
     [2, 4, 0, 4, 8], \
     [4, 2, 4, 0, 8], \
     [8, 8, 8, 8, 8]]

def globalAlignment(x, y):
#Dynamic version very fast
    D = []
    for i in range(len(x)+1):
        D.append([0]* (len(y)+1))

    for i in range(1, len(x)+1):
        D[i][0] = D[i-1][0] + score[alphabet.index(x[i-1])][-1]
    for i in range(len(y)+1):
        D[0][i] = D[0][i-1]+ score[-1][alphabet.index(y[i-1])]

    for i in range(1, len(x)+1):
        for j in range(1, len(y)+1):
            distHor = D[i][j-1]+ score[-1][alphabet.index(y[j-1])]
            distVer = D[i-1][j]+ score[-1][alphabet.index(x[i-1])]
            if x[i-1] == y[j-1]:
                distDiag = D[i-1][j-1]
            else:
                distDiag = D[i-1][j-1] + score[alphabet.index(x[i-1])][alphabet.index(y[j-1])]

            D[i][j] = min(distHor, distVer, distDiag)

    return D[-1][-1]

x = "ACGTGATGCTAGCAT"
y = "ACGTGATGCTAGCAT"
print(globalAlignment(x, y))
At the very least,

distHor = D[i][j-1]+ score[-1][alphabet.index(y[j-1])]
distVer = D[i-1][j]+ score[-1][alphabet.index(x[i-1])]
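are suspicious, because you do not put the [-1] in the same position as in the initialization, and it is unlikely that both distances index the weights in the same direction...
I think the second one should use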

score[alphabet.index(x[i-1])][-1]

But this may not be the only error...
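
One further candidate, offered from reading the posted code rather than from a confirmed run on the asker's machine: the first-row loop starts at i = 0, so its first pass reads D[0][-1] and y[-1]. Python resolves negative indices to the last element instead of raising an error. The sketch below replays only that loop with the question's score matrix; first_row is a hypothetical helper introduced for illustration.

```python
alphabet = ["A", "C", "G", "T"]
score = [[0, 4, 2, 4, 8],
         [4, 0, 4, 2, 8],
         [2, 4, 0, 4, 8],
         [4, 2, 4, 0, 8],
         [8, 8, 8, 8, 8]]

def first_row(y, start):
    """Fill only row 0 of the DP table, beginning the loop at `start`.

    Hypothetical helper: it isolates the question's second loop.
    """
    row = [0] * (len(y) + 1)
    for i in range(start, len(y) + 1):
        # When i == 0, row[i-1] is row[-1] and y[i-1] is y[-1]:
        # both wrap around to the last element rather than failing.
        row[i] = row[i-1] + score[-1][alphabet.index(y[i-1])]
    return row

print(first_row("ACG", 0))  # loop as posted      -> [8, 16, 24, 32]
print(first_row("ACG", 1))  # loop starting at 1  -> [0, 8, 16, 24]
```

With the loop as posted, the origin cell starts at 8. For identical sequences that value is carried down the diagonal unchanged, which would account for the reported result of 8.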

I solved this problem by replacing the 8s with 0s in the last list of the score matrix:

alphabet = ["A", "C", "G", "T"] 
score = [[0, 4, 2, 4, 8], \
     [4, 0, 4, 2, 8], \
     [2, 4, 0, 4, 8], \
     [4, 2, 4, 0, 8], \
     [0, 0, 0, 0, 0]]

def globalAlignment(x, y):
    #Dynamic version very fast
    D = []
    for i in range(len(x)+1):
        D.append([0]* (len(y)+1))

    for i in range(1, len(x)+1):
        D[i][0] = D[i-1][0] + score[alphabet.index(x[i-1])][-1]
    for i in range(len(y)+1):
        D[0][i] = D[0][i-1]+ score[-1][alphabet.index(y[i-1])]

    for i in range(1, len(x)+1):
        for j in range(1, len(y)+1):
            distHor = D[i][j-1]+ score[-1][alphabet.index(y[j-1])]
            distVer = D[i-1][j]+ score[alphabet.index(x[i-1])][-1]
            if x[i-1] == y[j-1]:
                distDiag = D[i-1][j-1]
            else:
                distDiag = D[i-1][j-1] + score[alphabet.index(x[i-1])][alphabet.index(y[j-1])]

            D[i][j] = min(distHor, distVer, distDiag)

    return D[-1][-1]
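
A quick check suggests this workaround needs care. With the last row zeroed, every horizontal step (a character of y aligned against a gap) costs nothing, so extra characters in y are no longer penalized. The sketch below re-runs the function above with that matrix; the test strings are arbitrary examples chosen for illustration, not taken from the original post.

```python
alphabet = ["A", "C", "G", "T"]
# Score matrix from this answer: the last row (gap in x) is all zeros.
score = [[0, 4, 2, 4, 8],
         [4, 0, 4, 2, 8],
         [2, 4, 0, 4, 8],
         [4, 2, 4, 0, 8],
         [0, 0, 0, 0, 0]]

def globalAlignment(x, y):
    D = []
    for i in range(len(x)+1):
        D.append([0] * (len(y)+1))

    for i in range(1, len(x)+1):
        D[i][0] = D[i-1][0] + score[alphabet.index(x[i-1])][-1]
    # Loop still starts at 0, but the wrapped lookup now adds 0.
    for i in range(len(y)+1):
        D[0][i] = D[0][i-1] + score[-1][alphabet.index(y[i-1])]

    for i in range(1, len(x)+1):
        for j in range(1, len(y)+1):
            distHor = D[i][j-1] + score[-1][alphabet.index(y[j-1])]
            distVer = D[i-1][j] + score[alphabet.index(x[i-1])][-1]
            if x[i-1] == y[j-1]:
                distDiag = D[i-1][j-1]
            else:
                distDiag = D[i-1][j-1] + score[alphabet.index(x[i-1])][alphabet.index(y[j-1])]
            D[i][j] = min(distHor, distVer, distDiag)

    return D[-1][-1]

print(globalAlignment("ACGT", "ACGT"))  # 0, as hoped for equal sequences
print(globalAlignment("A", "ACGT"))     # also 0: three unmatched characters cost nothing
```

If gaps in either sequence are meant to cost 8, as the last column still implies, the second result is probably not what is wanted.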

Just change

for i in range(len(y)+1):

to

for i in range(1, len(y)+1):

and change

distVer = D[i-1][j] + score[-1][alphabet.index(x[i-1])]

to

distVer = D[i - 1][j] + score[alphabet.index(x[i - 1])][-1]
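
Put together, the two changes might look like the sketch below. It assumes the question's original score matrix, in which the last row and last column both hold a gap cost of 8. The list comprehension and the local names xi and yj are editorial simplifications, not part of the original code. Because the diagonal of the matrix is 0, the separate branch for matching characters is folded into a single lookup.

```python
alphabet = ["A", "C", "G", "T"]
# Last row and last column hold the gap penalty (8), as in the question.
score = [[0, 4, 2, 4, 8],
         [4, 0, 4, 2, 8],
         [2, 4, 0, 4, 8],
         [4, 2, 4, 0, 8],
         [8, 8, 8, 8, 8]]

def globalAlignment(x, y):
    """Return the minimum global alignment cost of x and y."""
    D = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]

    # First column: every character of x aligned against a gap.
    for i in range(1, len(x) + 1):
        D[i][0] = D[i-1][0] + score[alphabet.index(x[i-1])][-1]
    # First row: starts at 1 so that D[0][0] stays 0.
    for j in range(1, len(y) + 1):
        D[0][j] = D[0][j-1] + score[-1][alphabet.index(y[j-1])]

    for i in range(1, len(x) + 1):
        for j in range(1, len(y) + 1):
            xi = alphabet.index(x[i-1])
            yj = alphabet.index(y[j-1])
            distHor = D[i][j-1] + score[-1][yj]   # gap in x
            distVer = D[i-1][j] + score[xi][-1]   # gap in y
            distDiag = D[i-1][j-1] + score[xi][yj]  # match costs 0
            D[i][j] = min(distHor, distVer, distDiag)

    return D[-1][-1]

x = "ACGTGATGCTAGCAT"
y = "ACGTGATGCTAGCAT"
print(globalAlignment(x, y))            # 0 for identical sequences
print(globalAlignment("ACGT", "AGGT"))  # 4: one C/G substitution
print(globalAlignment("A", "ACGT"))     # 24: three gaps at 8 each
```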

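Welcome to StackOverflow. Please read and follow the posting guidelines in the help documentation, as suggested when you created this account. Specifically, you have not "made it easy for others to help you." You have some "magic" distance computations that you are still hiding from us. One-letter variable names, a dazzling sequence of subscripts, and no documentation, explanation, or debugging trace. Why do you expect these computations to yield 0? How does it arrive at 8? What are the intermediate steps that fail? See this lovely blog for help.

I got this code from the Algorithms for DNA Sequencing course that Ben Langmead teaches on Coursera. The code runs correctly on his machine, but I cannot get the same result on my own machine.

Hi, I solved this problem by replacing the 8s with 0s in the last list of the score matrix: score = [[0, 4, 2, 4, 8], [4, 0, 4, 2, 8], [2, 4, 0, 4, 8], [4, 2, 4, 0, 8], [0, 0, 0, 0, 0]]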