Python PSET6 DNA: code only counts the longest consecutive repeat for one STR sequence
I wrote code to count the longest run of consecutive repeats of specific DNA substrings (taken from a csv file) within a long DNA string (taken from a text file)... or so I thought. My code correctly counts the longest consecutive run for the last DNA substring in the substring dictionary, but it fails to count the longest run for the substrings before it. I have tried several ways to fix my code, but none worked. How can I count the longest consecutive run for every DNA substring, not just one? Any input is welcome! Here is my code:
import sys
import csv

if len(sys.argv) != 3:
    print("Wrong number of files. Enter correct command-line arguments")
    exit(1)

with open (sys.argv[1],'r') as f:
    database_reader = csv.reader(f)
    strlist = next(database_reader)[1:]
    print(strlist)
    dna = open(sys.argv[2], "r")
    sequence_dna = dna.read()
    print(sequence_dna)
    long_str = {}
    for item in strlist:
        long_str[item] = 0
    for key in long_str:
        i = 0
        run = 0
        long_run = 0
        while i < len(sequence_dna):
            if sequence_dna[i:i+len(key)] == key:
                run += 1
                if run > long_run:
                    long_run = run
                i += len(key)
            elif sequence_dna[i:i+len(key)] != key:
                if run > long_run:
                    long_run = run
                run = 0
                i += 1
    long_str[key] = long_run
    print(long_str)
    for row in database_reader:
        individual = row[0]
        values = [int(value) for value in row[1:]]
        if values == long_str:
            print(individual)
            break
        elif values != long_str:
            print("No match")
            break
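The changed code below should work. As it stands, if the first person checked (or a later one) does not equal the sequence file input, the loop exits instead of checking every row until a match is found (or there is none). The elif branch here is the error.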
found = False
for row in database_reader:
    individual = row[0]
    values = [int(value) for value in row[1:]]
    if values == list(long_str.values()):
        found = True
        print(individual)
        break
if not found:
    print('no match')
When I run my version (not given here), it agrees with the answers given for this exercise:
C:\Old_Data\python\Harvard\dna>python dna2.py databases/small.csv sequences/1.txt
Bob
C:\Old_Data\python\Harvard\dna>python dna2.py databases/small.csv sequences/2.txt
no match
C:\Old_Data\python\Harvard\dna>python dna2.py databases/small.csv sequences/3.txt
no match
C:\Old_Data\python\Harvard\dna>python dna2.py databases/small.csv sequences/4.txt
Alice
C:\Old_Data\python\Harvard\dna>python dna2.py databases/large.csv sequences/5.txt
Lavender
C:\Old_Data\python\Harvard\dna>python dna2.py databases/large.csv sequences/6.txt
Luna
C:\Old_Data\python\Harvard\dna>python dna2.py databases/large.csv sequences/7.txt
Ron
C:\Old_Data\python\Harvard\dna>python dna2.py databases/large.csv sequences/8.txt
Ginny
C:\Old_Data\python\Harvard\dna>python dna2.py databases/large.csv sequences/9.txt
Draco
C:\Old_Data\python\Harvard\dna>python dna2.py databases/large.csv sequences/10.txt
Albus
C:\Old_Data\python\Harvard\dna>python dna2.py databases/large.csv sequences/11.txt
Hermione
C:\Old_Data\python\Harvard\dna>python dna2.py databases/large.csv sequences/12.txt
Lily
C:\Old_Data\python\Harvard\dna>python dna2.py databases/large.csv sequences/13.txt
no match
C:\Old_Data\python\Harvard\dna>python dna2.py databases/large.csv sequences/14.txt
Severus
C:\Old_Data\python\Harvard\dna>python dna2.py databases/large.csv sequences/15.txt
Sirius
C:\Old_Data\python\Harvard\dna>python dna2.py databases/large.csv sequences/16.txt
no match
C:\Old_Data\python\Harvard\dna>python dna2.py databases/large.csv sequences/17.txt
Harry
C:\Old_Data\python\Harvard\dna>python dna2.py databases/large.csv sequences/18.txt
no match
C:\Old_Data\python\Harvard\dna>python dna2.py databases/large.csv sequences/19.txt
Fred
C:\Old_Data\python\Harvard\dna>python dna2.py databases/large.csv sequences/20.txt
no match
C:\Old_Data\python\Harvard\dna>python dna2.py databases/large.csv
Usage: python dna2.py <databases/database> <sequences/seq>
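The answerer's own script is not shown, so the following is only a minimal sketch of the matching step described above, not that script. The helper name find_match and the in-memory sample rows are assumptions made for illustration; they stand in for a real database file. It also shows why comparing the row values directly against the dictionary, or against its values view, never succeeds in Python 3.

```python
import csv
import io


def find_match(database_rows, str_counts):
    """Return the first name whose STR counts equal str_counts, else None."""
    for row in database_rows:
        name = row[0]
        values = [int(value) for value in row[1:]]
        if values == str_counts:
            return name
    # Only reached after every row has been checked.
    return None


# Illustrative in-memory database standing in for a CSV file (assumed data).
sample_csv = io.StringIO(
    "name,AGATC,AATG,TATC\n"
    "Alice,2,8,3\n"
    "Bob,4,1,5\n"
    "Charlie,3,2,5\n"
)
reader = csv.reader(sample_csv)
header = next(reader)[1:]

long_str = {"AGATC": 4, "AATG": 1, "TATC": 5}

# A list never equals a dict, and in Python 3 it does not equal a values view.
assert [4, 1, 5] != long_str
assert [4, 1, 5] != long_str.values()

# Build the counts in header order so they line up with the CSV columns.
counts = [long_str[name] for name in header]
match = find_match(reader, counts)
print(match if match is not None else "no match")
```

Returning from a function once a row matches, and reporting "no match" only after the loop has finished, avoids the early exit caused by the elif branch in the question.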
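Oh, I missed what you wanted to know. Try indenting long_str[key] = long_run so that it is assigned for every key inside the for loop. Right now it is only assigned for the last key and its count.

OMG wow, I can't believe I didn't think indentation could be the problem. Thank you so much. Your suggestion worked!

I may be wrong, but the last for loop seems to have a problem: if values == long_str: checks a list of ints for equality with long_str, which is a dictionary. I don't think that if statement can ever be true because of this. Maybe write it as if values == long_str.values(), but I'm not sure whether that works.

It should be if values == list(long_str.values()) instead.

Thanks. I did that, but now the problem is that when I compare the dictionary values with the data in the csv file, I keep getting no match when I should in fact get one. I printed the long_str dictionary and saw that it is printed three times: once after each iteration per key. I think that is why I am not getting the matching output. Now I need help figuring out a way to get the last iteration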
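As a rough illustration of the indentation fix discussed in these comments, here is a minimal sketch in which each key's count is stored inside the loop and the dictionary is printed only once, after the loop has finished. The function name longest_run and the sample sequence are assumptions made for this example, not code from the thread. Unlike the question's loop, this sketch checks every start position rather than jumping ahead by the key length, which is slower but does not miss runs that begin inside an earlier match.

```python
def longest_run(sequence, motif):
    """Return the largest number of back-to-back copies of motif in sequence."""
    if not motif:
        return 0
    longest = 0
    step = len(motif)
    for start in range(len(sequence)):
        run = 0
        # Count consecutive copies of motif beginning at this start position.
        while sequence[start + run * step:start + (run + 1) * step] == motif:
            run += 1
        longest = max(longest, run)
    return longest


# Assumed sample data: three AGATC, two AATG, and one TATC in a row.
sequence = "A" + "AGATC" * 3 + "TT" + "AATG" * 2 + "TATC"
strlist = ["AGATC", "AATG", "TATC"]

long_str = {}
for key in strlist:
    # The assignment is inside the loop, so every key gets its own count.
    long_str[key] = longest_run(sequence, key)

# Printed once, after all keys have been processed.
print(long_str)  # {'AGATC': 3, 'AATG': 2, 'TATC': 1}
```

With the print statement placed after the loop, only the completed dictionary is shown, and list(long_str.values()) can then be compared against each csv row.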