Python 3.x: How do I find runs of A's in lines of DNA strings and return their length?
I have the following problem and do not know how to solve it. The task is to find the length of the longest substring of repeated A's, and to return that length for each string in this list:

['>KF735813.1 HIV-1 isolate Cameroon1 (ViroSeq) HIV DR 02 from Cameroon pol protein (pol) gene, partial cds',
"CCTCAATCATCTTTGGCAACGCACCCTTAGTACAGTTAGATAGAGAGGGACAGTTATAGAGAGAGAGAGGCCCTATTAGACAGAG",
"ggcagatgatacagattagagagataaatttaccagagagaatgagaaccaaatgatgagagaggattagagttta",
"TCAAGTAAGACAGTAGATAGATAGATAGATAGATAGATAGATTGGAAGAGGCATAGTAGGATAGAGATAGAGATAGAGATAGAGATAGAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATTGTGGAAGAGATAGATAGATAGATAGATAGATAGAT",
"CCTGTCAATAATGGACGAACATGTGACTCAGATTGGTTGTACTTTAATTCCAATTCCAATTCCATATGTCCATTTAGATTGAACTGT",
"GCCAGTAAATAAGCCAGTATGGATGGCCCAAGGTAAACAAAGATGGCCATGACAGAGAGAAAAAAAAAAGCATAA",
"CAGAAATTTGTACAGAATGGAAGGAGGAAATTCAATGGGCCTGAAATCCATATACTCCAGATTT",
"GCCATAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAATATATAGAATGAGAGAAAGAGAGAGAGAGACTTCTG",
"GGAGATCCAATTAGAATACCTCTCCATCCCGGGATAAAGAAACAATCAGATCAGTAACAGTAGATCATGGGATGCAT",
"ATTTTCAGTTCTCTAGATATAGATAGATAGAGATAGATAGATACACTGCATTCATATAGTTATTAAAATGCAACAGGT",
"Attagataccataccaatgtgcttccaaggatggaaaggatcagatttcaggcaagcaagcagatttcaggcaagcaagcaagcaagcaagcaatttcaggcaagcaagcaagcaagcaaaatctt",
"agagccttagagaaaatatccagaatatagagagatgatctacatatatggatgatttagatagatcagattagaga",
"TagggCagagagagagagagagagagagagagagagagagagagagagagagagagagagacaaaaaatagagagagagagagagagagagagagagagagagagagagagagagagagagagagagagagagagagagagagagagagagagagagagagagaga",
"CAGAAAGAACCTCATTCTTGTGGATGACTCCATCCATCCAGCAATGGACAGTCGCCTATACAGTGCCAGA",
"aaaagacagctgactgtcaatatacagaattaggaaactaattgggcaagtcagattatgcaggatta",
"AAGTAAAGCAAGCTGTAGAGCTCCTCAGGGAGCCAAAGCAAGCAAGACTAACAGAGGAGTACCACTAACTGAGGAGAGAGGAGAGATTA",
"gaattggcagataagagggagatctaaagaaagactgtactagagagtatattagaccaaaagctagtagcaga",
"aatacagagagagagagagaggaagagac"]

This is the function I tried, but I know it is the wrong approach:
for c in range(len(fastarec_Lines)):
    if fastarec_Lines[c].count('A') == current:
        count += 1
    else:
        count = 1
        current = fastarec_Lines[c]
    maximum = max(count,maximum)
return maximum
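Can anyone help me?

One approach is to run a regex find-all search (re.findall) for the pattern A+, which matches every run of one or more consecutive A's. Then sort the resulting strings by length and print the last element: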
import re

seq = "AATTGGCCAAAAATTGCA"
matches = re.findall(r'A+', seq)
# Python 3: list.sort() takes a key function; cmp-style comparators were removed
matches.sort(key=len)
print("longest string is " + matches[-1] + " with a length of " + str(len(matches[-1])))
This prints:
longest string is AAAAA with a length of 5
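To get one length per string in a list, as the question asks, the same regex idea can be wrapped in a small function. The following is only a sketch under a few assumptions that the original post does not settle: the helper name longest_a_run, the ignore_case flag (the sample data mixes upper and lower case), the decision to skip FASTA header lines starting with ">", and the short sample list are all illustrative.

```python
import re

def longest_a_run(seq, ignore_case=True):
    """Return the length of the longest run of consecutive A's in seq (0 if none)."""
    # Assumption: lowercase 'a' counts as A unless ignore_case is False.
    flags = re.IGNORECASE if ignore_case else 0
    runs = re.findall(r"A+", seq, flags)
    # default=0 covers strings that contain no A at all
    return max((len(run) for run in runs), default=0)

# Hypothetical, shortened sample data in the same shape as the list in the question
fastarec_lines = [
    ">example header line",
    "CCTCAATCATCTTTGG",
    "ggcagataaattt",
    "TTAAAAGCA",
]

# Assumption: header lines (starting with '>') are not sequences and are skipped
lengths = [longest_a_run(line) for line in fastarec_lines if not line.startswith(">")]
print(lengths)
```

Using max() over the match lengths avoids sorting the matches, and the default argument means a sequence without any A yields 0 instead of raising an error.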
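The counting loop from the question can also be made to work without a regex by scanning one string character by character and resetting the counter whenever the run is broken. This is a minimal sketch, not the asker's original function: the name longest_run, the base parameter, and the choice to upper-case the input are assumptions made for illustration.

```python
def longest_run(seq, base="A"):
    """Length of the longest run of consecutive `base` characters in seq."""
    count = 0    # length of the run currently being scanned
    maximum = 0  # longest run seen so far
    # Assumption: case is ignored, so `base` should be given in upper case
    for ch in seq.upper():
        if ch == base:
            count += 1
            maximum = max(maximum, count)
        else:
            count = 0  # the run is broken, start counting again
    return maximum

print(longest_run("AATTGGCCAAAAATTGCA"))  # 5
```

Applied to a list, a comprehension such as [longest_run(s) for s in some_list] would give one length per string.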