Constructing a Python dictionary from a file with multiple similar values and keys
I'm new to Python, and to coding in general, and I'm trying to use it to analyze some data at work. I have a file like this:
HWI-ST591_0064:5:1101:1228:2111#0/1 + 7included 11 A>G - -
HWI-ST591_0064:5:1101:1205:2125#0/1 + genomic 17 A>G - -
HWI-ST591_0064:5:1101:1178:2129#0/1 + 7included 6 A>C 8 A>T
HWI-ST591_0064:5:1101:1176:2164#0/1 + 7included 6 A>T 8 A>G
HWI-ST591_0064:5:1101:1199:2234#0/1 + 7included 14 T>C 21 G>A
HWI-ST591_0064:5:1101:1208:2249#0/1 + 7included 32 C>T - -
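It is tab-delimited. I'm trying to build a dictionary in which the first value of each line, a unique identifier, goes into a list of values, keyed by the last four values of that line, like this: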
{'32C>T--': ['HWI-ST591_0064:5:1101:1208:2249#0/1'],
'6A>C8A>C': ['HWI-ST591_0064:5:1101:1318:2090#0/1'],
'36A>G--': ['HWI-ST591_0064:5:1101:1425:2093#0/1'],
'----': ['HWI-ST591_0064:5:1101:1222:2225#0/1'],
'6A>C8A>T': ['HWI-ST591_0064:5:1101:1178:2129#0/1', 'HWI-ST591_0064:5:1101:1176:2164#0/1']}
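That way I can get a list of the unique identifiers for each key and count them, sort them, or do whatever else I need. I can build the dictionary, but when I try to write it out to a file I get an error. I think the problem is that each value is a list. I keep getting this error: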
File "trial.py", line 33, in <module>
    outFile.write("%s\t%s\n" % ('\t' .join(key, mutReadDict[key])))
TypeError: unhashable type: 'list'
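Is there a way to make this work so that I can write the dictionary to a file? I have tried a few things, including writing the dictionary out inside a for loop, but that doesn't seem to work. Thanks. Here is my code: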
inFile = open('path', 'rU')
outFile = open('path', 'w')
from collections import defaultdict
mutReadDict = defaultdict(list)
for line in inFile:
    entry = line.strip('\n').split('\t')
    fastQ_ID = entry[0]
    strand = entry[1]
    chromosome = entry[2]
    mut1pos = entry[3]
    mut1base = entry[4]
    mut2pos = entry[5]
    mut2base = entry[6]
    mutKey = mut1pos + mut1base + mut2pos + mut2base
    if chromosome == '7included':
        mutReadDict[mutKey].append(fastQ_ID)
    else:
        pass
keyList = [mutReadDict.keys()]
keyList.sort()
for key in keyList:
    outFile.write("%s\t%s\n" % ('\t' .join(key, mutReadDict[key])))
outFile.close()
I think you want:
keyList = mutReadDict.keys()
instead of
keyList = [mutReadDict.keys()]
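For context, here is a small sketch of where the TypeError likely comes from. In Python 2, `keys()` returns a list, so wrapping it in square brackets produces a one-element list whose only element is the whole list of keys. The loop then uses that inner list as a dictionary key, and lists cannot be hashed. The names `mut_read_dict` and `wrapped` are illustrative only, and `list(...)` is used here as an assumption to mimic the Python 2 behaviour when running under Python 3.

```python
from collections import defaultdict

# Hypothetical miniature of the dictionary built in the question.
mut_read_dict = defaultdict(list)
mut_read_dict['32C>T--'].append('HWI-ST591_0064:5:1101:1208:2249#0/1')

# Wrapping the keys in [...] gives a list with ONE element, and that
# element is itself the whole list of keys.
wrapped = [list(mut_read_dict.keys())]

error_message = None
for key in wrapped:
    try:
        mut_read_dict[key]  # key is a list here, so the lookup cannot hash it
    except TypeError as exc:
        error_message = str(exc)

# The message reports that a list is unhashable.
print(error_message)
```

Iterating over the keys directly, without the extra brackets, makes each `key` a plain string again.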
You probably also meant:
for key in keyList:
    outFile.write("%s\t%s\n" % (key, '\t'.join(mutReadDict[key])))
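Putting both fixes together, a self-contained sketch might look like the following. It is not the asker's exact script: the helper names `group_reads` and `write_groups` are made up for illustration, the input is three rows taken from the sample above and held in memory with `io.StringIO` instead of a real file, and `sorted(groups)` is used because in Python 3 `keys()` returns a view with no `.sort()` method.

```python
import io
from collections import defaultdict

# Three rows from the question's sample, tab-separated. This in-memory
# text stands in for the real input file (an assumption for the demo).
sample = (
    "HWI-ST591_0064:5:1101:1205:2125#0/1\t+\tgenomic\t17\tA>G\t-\t-\n"
    "HWI-ST591_0064:5:1101:1178:2129#0/1\t+\t7included\t6\tA>C\t8\tA>T\n"
    "HWI-ST591_0064:5:1101:1208:2249#0/1\t+\t7included\t32\tC>T\t-\t-\n"
)

def group_reads(lines, chromosome='7included'):
    """Map each mutation key to the list of read IDs that carry it."""
    groups = defaultdict(list)
    for line in lines:
        entry = line.rstrip('\n').split('\t')
        # Skip short rows and rows for other chromosomes.
        if len(entry) < 7 or entry[2] != chromosome:
            continue
        # Key is the last four columns joined together, as in the question.
        groups[''.join(entry[3:7])].append(entry[0])
    return groups

def write_groups(groups, out):
    """Write one line per key: the key, then its read IDs, tab-separated."""
    for key in sorted(groups):
        out.write("%s\t%s\n" % (key, '\t'.join(groups[key])))

groups = group_reads(io.StringIO(sample))
out = io.StringIO()
write_groups(groups, out)
result = out.getvalue()
print(result)
```

With real files, the two `io.StringIO` objects would be replaced by `open(...)` calls, ideally inside `with` blocks so the files are closed automatically. Note that the `'rU'` mode from the question is no longer accepted by recent Python 3 releases; plain `'r'` should do there.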
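Aha! Thank you very much. I also had to fix the .join call: the key needed to be outside the join function.
Great, no worries. If the answer solved your problem, please accept it.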