Constructing a Python dictionary from a file with multiple similar values and keys


I'm new to Python, and to coding in general, and I'm trying to use it to analyze some data at work. I have a file like this:

    HWI-ST591_0064:5:1101:1228:2111#0/1 +   7included   11  A>G -   -
    HWI-ST591_0064:5:1101:1205:2125#0/1 +   genomic 17  A>G -   -
    HWI-ST591_0064:5:1101:1178:2129#0/1 +   7included   6   A>C 8   A>T
    HWI-ST591_0064:5:1101:1176:2164#0/1 +   7included   6   A>T 8   A>G
    HWI-ST591_0064:5:1101:1199:2234#0/1 +   7included   14  T>C 21  G>A
    HWI-ST591_0064:5:1101:1208:2249#0/1 +   7included   32  C>T -   -
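It is tab-delimited. I'm trying to create a dictionary whose keys are the last four values of each line joined together, and whose values are lists of the first value on the line, which is a unique identifier, like so: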

    {'32C>T--': ['HWI-ST591_0064:5:1101:1208:2249#0/1'],
     '6A>C8A>C': ['HWI-ST591_0064:5:1101:1318:2090#0/1'],
     '36A>G--': ['HWI-ST591_0064:5:1101:1425:2093#0/1'],
     '----': ['HWI-ST591_0064:5:1101:1222:2225#0/1'],
     '6A>C8A>T': ['HWI-ST591_0064:5:1101:1178:2129#0/1', 'HWIST591_0064:5:1101:1176:2164#0/1']}
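That way I can get the list of unique identifiers for each key and count them, sort them, or do whatever else I need. I can build the dictionary, but when I try to write it out to a file I get an error. I think the problem is that the values are lists. I keep getting this error: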

    File "trial.py", line 33, in <module>
        outFile.write("%s\t%s\n" % ('\t'.join(key, mutReadDict[key])))
    TypeError: unhashable type: 'list'

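Is there a way to make this work so that I can write the dictionary to a file? I tried writing it out inside the for loop, but that did not seem to work. Thanks. Here is my code: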

    inFile = open('path', 'rU')
    outFile = open('path', 'w')

    from collections import defaultdict

    mutReadDict = defaultdict(list)

    for line in inFile:
        entry               = line.strip('\n').split('\t')
        fastQ_ID            = entry[0]
        strand              = entry[1]
        chromosome          = entry[2]
        mut1pos             = entry[3]
        mut1base            = entry[4]
        mut2pos             = entry[5]
        mut2base            = entry[6]

        mutKey = mut1pos + mut1base + mut2pos + mut2base

        if chromosome == '7included':
            mutReadDict[mutKey].append(fastQ_ID)
        else:
            pass

    keyList = [mutReadDict.keys()]
    keyList.sort()

    for key in keyList:
        outFile.write("%s\t%s\n" % ('\t' .join(key, mutReadDict[key])))

    outFile.close()

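I think you want:

    keyList = mutReadDict.keys()

instead of

    keyList = [mutReadDict.keys()]

The extra brackets wrap the list of keys in another list, so the loop sees a single item that is itself a list, and a list cannot be used as a dictionary key.
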
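To see where the TypeError comes from, here is a minimal sketch using a hypothetical two-key dictionary with made-up read IDs (`read_1`, `read_2`). It assumes Python 3, where `keys()` returns a view rather than a list, so `list(...)` is used to mimic the Python 2 behaviour of the original script.

```python
from collections import defaultdict

# Hypothetical stand-in for mutReadDict, with made-up read IDs.
mut_read_dict = defaultdict(list)
mut_read_dict['6A>C8A>T'].append('read_1')
mut_read_dict['32C>T--'].append('read_2')

# Python 2's keys() returned a list; list(...) mimics that on Python 3.
wrapped = [list(mut_read_dict.keys())]
print(len(wrapped))  # a single element: the whole list of keys

try:
    # Looping over `wrapped` yields this list, which cannot be hashed.
    mut_read_dict[wrapped[0]]
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Without the extra brackets each item is a plain string key.
flat = sorted(mut_read_dict)
print(flat)
```

Because the lookup fails while hashing the key, the error is raised before the `join` call on the same line is ever evaluated.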
You probably also meant:

    for key in keyList:
        outFile.write("%s\t%s\n" % (key, '\t'.join(mutReadDict[key])))
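Putting both fixes together, the whole task might look like the sketch below. It is one possible arrangement, not the original poster's final script: the function names `group_reads` and `write_groups` are made up for illustration, and `io.StringIO` objects stand in for the real input and output files so the example runs on its own. It targets Python 3, where `keys()` returns a view that has no `sort()` method (hence `sorted()`), and where the `'rU'` open mode is no longer accepted by recent versions.

```python
import io
from collections import defaultdict


def group_reads(lines, chromosome='7included'):
    """Map each mutation key (columns 4-7 joined) to a list of read IDs."""
    groups = defaultdict(list)
    for line in lines:
        entry = line.rstrip('\n').split('\t')
        if len(entry) < 7:
            continue  # skip blank or short lines
        read_id, _strand, chrom = entry[:3]
        if chrom == chromosome:
            groups[''.join(entry[3:7])].append(read_id)
    return groups


def write_groups(groups, out_file):
    """Write one line per key: the key, then its tab-separated read IDs."""
    for key in sorted(groups):
        # The key stays outside join(); join() takes a single iterable.
        out_file.write('%s\t%s\n' % (key, '\t'.join(groups[key])))


# Three rows taken from the sample file in the question.
sample = io.StringIO(
    'HWI-ST591_0064:5:1101:1228:2111#0/1\t+\t7included\t11\tA>G\t-\t-\n'
    'HWI-ST591_0064:5:1101:1205:2125#0/1\t+\tgenomic\t17\tA>G\t-\t-\n'
    'HWI-ST591_0064:5:1101:1208:2249#0/1\t+\t7included\t32\tC>T\t-\t-\n'
)
out = io.StringIO()
groups = group_reads(sample)
write_groups(groups, out)
print(out.getvalue())
```

With real files, the two `StringIO` objects would be replaced by file objects opened in a `with` statement, which also closes them automatically.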

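Aha! Thank you so much. I also had to fix the .join: the key needs to be outside the join function.

Great, no worries. If the answer solved your problem, please accept it.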