Problem calculating amino acid frequencies in Python
I'm writing a program to determine the percentage of each amino acid in a given sequence. I want it to work for any sequence by printing the percentage of each amino acid and telling me which amino acids in the dictionary are absent. I've run into some trouble here and would appreciate some guidance.

To be more specific, I want the output to show the percentage of every amino acid in the supplied string, including those that do not occur in it.

Here is my code so far:
protein = """MKLFWLLFTIGFCWAQYSSNTQQGRTSIVHLFEWRWVDIALECERYLAPKGFGGVQVSPPNENVAIHNPFRPWWERYQPVSYKLCTRSGNEDEFRNMVTRCNNVGVRIYVDAVINHMCGNAVSAGTSSTCGSYFNPGSRDFPAVPYSGWDFNDGKCKTGSGDIENYNDATQVRDCRLSGLLDLALGKDYVRSKIAEYMNHLIDIGVAGFRIDASKHMWPGDIKAILDKLHNLNSNWFPEGSKPFIYQEVIDLGGEPIKSSDYFGNGRVTEFKYGAKLGTVIRKWNGEKMSYLKNWGEGWGFMPSDRALVFVDNHDNQRGHGAGGASILTFWDARLYKMAVGFMLAHPYGFTRVMSSYRWPRYFENGKDVNDWVGPPNDNGVTKEVTINPDTTCGNDWVCEHRWRQIRNMVNFRNVVDGQPFTNWYDNGSNQVAFGRGNRGFIVFNNDDWTFSLTLQTGLPAGTYCDVISGDKINGNCTGIKIYVSDDGKAHFSISNSAEDPFIAIHAESKL""" #exchange sequence for unique analysis
amino_acid = ['C', 'D', 'S', 'Q', 'K', 'P', 'T', 'F', 'A', 'X', 'G', 'I', 'E', 'L', 'H', 'R', 'W', 'M', 'N', 'Y', 'V']
for a in amino_acid:
    if a in protein:
        print "Percentage of" + amino_acid[a] + "is" + ((protein.count(a)) * 100 / len(protein))
    else:
        print amino_acid[a] + "is not in sequence"
This is what I had before; it works, but it does not show amino acids that never appear (0%).
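The problem you're trying to solve deserves a more detailed discussion, but based on the gist you've given above, I'm fairly sure what you're looking for is a Counter object. Specifically: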
>>> from collections import Counter
>>> test = Counter("""MKLFWLLFTIGFCWAQYSSNTQQGRTSIVHLFEWRWVDIALECERYLAPKGFGGVQVSPPNENVAIHNPFRPWWERYQPVSYKLCTRSGNEDEFRNMVTRCNNVGVRIYVDAVINHMCGNAVSAGTSSTCGSYFNPGSRDFPAVPYSGWDFNDGKCKTGSGDIENYNDATQVRDCRLSGLLDLALGKDYVRSKIAEYMNHLIDIGVAGFRIDASKHMWPGDIKAILDKLHNLNSNWFPEGSKPFIYQEVIDLGGEPIKSSDYFGNGRVTEFKYGAKLGTVIRKWNGEKMSYLKNWGEGWGFMPSDRALVFVDNHDNQRGHGAGGASILTFWDARLYKMAVGFMLAHPYGFTRVMSSYRWPRYFENGKDVNDWVGPPNDNGVTKEVTINPDTTCGNDWVCEHRWRQIRNMVNFRNVVDGQPFTNWYDNGSNQVAFGRGNRGFIVFNNDDWTFSLTLQTGLPAGTYCDVISGDKINGNCTGIKIYVSDDGKAHFSISNSAEDPFIAIHAESKL""")
>>> test
Counter({'G': 52, 'N': 41, 'D': 35, 'V': 35, 'S': 33, 'F': 29, 'I': 28, 'R': 28, 'A': 27, 'L': 27, 'K': 24, 'T': 23, 'P': 22, 'Y': 21, 'E': 20, 'W': 19, 'C': 12, 'H': 12, 'Q': 12, 'M': 11})
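That should be enough to get you going. Let me know if you have any questions. I'm trying not to spoon-feed you the whole answer.

Can you give a more useful description of the problem than "not working"?

You can't use amino_acid[a]; just use a. The for loop iterates over the values of the list, so a will be 'C', 'D', …

Yes, "not working" doesn't tell us much... Give us a sample of the output, including any error messages, along with the output you expect. (Edit your question and add it there; don't just tell us in the comments.)

You should take a look at it; they have a lot of good code for doing these kinds of analyses.

Smells like homework to me...

You can compare the Counter's keys with the amino_acid list (using sets, since order is irrelevant).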
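The set comparison suggested in the last comment might look roughly like the sketch below. The names AMINO_ACIDS and missing_amino_acids are made up for illustration and do not come from the thread, and the short test sequence is invented as well.

```python
from collections import Counter

# Same 21 letters as the amino_acid list in the question (20 residues plus 'X').
AMINO_ACIDS = set("CDSQKPTFAXGIELHRWMNYV")


def missing_amino_acids(sequence, alphabet=AMINO_ACIDS):
    """Return the letters of `alphabet` that never occur in `sequence`."""
    # The keys of a Counter are exactly the characters that occur at least once.
    present = set(Counter(sequence))
    # Set difference: order does not matter, only membership.
    return alphabet - present


# Hypothetical short sequence, used only to show the call.
print(sorted(missing_amino_acids("MKLFWLLFTIGFCWAQ")))
```

Since only membership matters here, set(sequence) would give the same result as set(Counter(sequence)); the Counter is kept because its counts are useful for the percentages anyway.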
protein = """MKLFWLLFTIGFCWAQYSSNTQQGRTSIVHLFEWRWVDIALECERYLAPKGFGGVQVSPPNENVAIHNPFRPWWERYQPVSYKLCTRSGNEDEFRNMVTRCNNVGVRIYVDAVINHMCGNAVSAGTSSTCGSYFNPGSRDFPAVPYSGWDFNDGKCKTGSGDIENYNDATQVRDCRLSGLLDLALGKDYVRSKIAEYMNHLIDIGVAGFRIDASKHMWPGDIKAILDKLHNLNSNWFPEGSKPFIYQEVIDLGGEPIKSSDYFGNGRVTEFKYGAKLGTVIRKWNGEKMSYLKNWGEGWGFMPSDRALVFVDNHDNQRGHGAGGASILTFWDARLYKMAVGFMLAHPYGFTRVMSSYRWPRYFENGKDVNDWVGPPNDNGVTKEVTINPDTTCGNDWVCEHRWRQIRNMVNFRNVVDGQPFTNWYDNGSNQVAFGRGNRGFIVFNNDDWTFSLTLQTGLPAGTYCDVISGDKINGNCTGIKIYVSDDGKAHFSISNSAEDPFIAIHAESKL""" #exchange sequence for unique analysis
amino_acid = ['C', 'D', 'S', 'Q', 'K', 'P', 'T', 'F', 'A', 'X', 'G', 'I', 'E', 'L', 'H', 'R', 'W', 'M', 'N', 'Y', 'V']
counts = {}
for amino in amino_acid: counts[amino] = 0
total = 0
for a in amino_acid:
    if a in protein:
        counts[a] = protein.count(a)
        # float() avoids integer division under Python 2
        fraction = float(counts[a]) / float(len(protein))
        percent = fraction * 100
        print("Percentage of " + a + " is: %.2f%%" % percent)
        total += percent
    else:
        # counts[a] keeps its initial value of 0
        print("Percentage of " + a + " is: 0.0%")
print('Total: ' + str(total) + '%')
print('Amino Acid Counts: ' + str(counts))
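For comparison, the same loop can be combined with the Counter idea from the first answer. This is only a sketch under a few assumptions: the function name amino_acid_percentages and the constant AMINO_ACIDS are invented here, and the short sequence is made up. It relies on the fact that a Counter returns 0 for keys it has never seen, so absent amino acids need no separate branch.

```python
from collections import Counter

# Same letters as the amino_acid list above.
AMINO_ACIDS = "CDSQKPTFAXGIELHRWMNYV"


def amino_acid_percentages(sequence, alphabet=AMINO_ACIDS):
    """Map each letter of `alphabet` to its percentage of `sequence`."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    counts = Counter(sequence)
    # counts[aa] is 0 for letters that never occur, so they come out as 0.0.
    return {aa: 100.0 * counts[aa] / len(sequence) for aa in alphabet}


# Hypothetical short sequence, used only to show the call.
percentages = amino_acid_percentages("MKLFWLLFTI")
for aa in sorted(percentages):
    print("Percentage of %s is: %.2f%%" % (aa, percentages[aa]))
```

Multiplying by 100.0 keeps the division floating-point, which plays the same role as the float() calls in the loop above.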