How can I build a list of sublists grouped by a specific field? (Python 2.7)


I have a file containing the following lines:

B99990001 1 2 3 4
B99990001 1 3 3 4
B99990002 1 2 3 4
B99990002 1 3 3 4
B99990003 1 2 3 4
B99990003 1 3 3 4

My goal is to create a main list containing three sublists, one for each value of the first column (B99990001, B99990002, B99990003):

Mainlist=[
          ['B99990001 1 2 3 4','B99990001 1 3 3 4'],#sublist1 has B99990001
          ['B99990002 1 2 3 4','B99990002 1 3 3 4'],#sublist2 has B99990002
          ['B99990003 1 2 3 4','B99990003 1 3 3 4'] #sublist3 has B99990003
         ]

I hope my question is clear. If anyone knows how to do this, could you help me?

Thanks in advance.

Here is my real example:

import os
import re
pdbPathAndName = ['/Users/Mahesh/Documents/MAHESH_INTERNSHIP_2014/ENZOWP2/2WC5_090715_170128/E3P/E3P.B99990001.pdb','/Users/Mahesh/Documents/MAHESH_INTERNSHIP_2014/ENZOWP2/2WC5_090715_170128/E3P/E3P.B99990002.pdb']

''' /Users/Mahesh/Documents/MAHESH_INTERNSHIP_2014/ENZOWP2/2WC5_090715_170128/E3P/E3P.B99990001.pdb=[
                    'ATOM    138  SG  CYS    19       4.499   4.286   8.260  1.00 71.96           S',
                    'ATOM    397  SG  CYS    50      14.897   3.238   9.338  1.00 34.60           S',
                    'ATOM    424  SG  CYS    54       5.649   5.914   8.639  1.00 42.68           S',
                    'ATOM    774  SG  CYS    97      12.114  -6.864  23.897  1.00 62.23           S',
                    'ATOM    865  SG  CYS   108      15.200   3.910  11.227  1.00 54.49           S'    ]

/Users/Mahesh/Documents/MAHESH_INTERNSHIP_2014/ENZOWP2/2WC5_090715_170128/E3P/E3P.B99990002.pdb=[
                    'ATOM    929  SG  CYS   117      13.649  -6.894  22.589  1.00106.90           S',
                    'ATOM    138  SG  CYS    19       4.499   4.286   8.260  1.00 71.96           S',
                    'ATOM    397  SG  CYS    50      14.897   3.238   9.338  1.00 34.60           S',
                    'ATOM    424  SG  CYS    54       5.649   5.914   8.639  1.00 42.68           S',
                    'ATOM    774  SG  CYS    97      12.114  -6.864  23.897  1.00 62.23           S',
                    'ATOM    865  SG  CYS   108      15.200   3.910  11.227  1.00 54.49           S',
                    'ATOM    929  SG  CYS   117      13.649  -6.894  22.589  1.00106.90           S'    ] '''


for path in pdbPathAndName:
    with open(path, 'r') as f:
        for line in f:
            line = line.strip()
            if "SG" in line and line.endswith("S"):
                fields = line.split()  # whitespace-separated columns
                # file name, atom serial number, then the x, y, z coordinates
                print(path.split("/")[-1] + "_" + fields[1] + ":" + fields[5] + ":" + fields[6] + ":" + fields[7])

#PRINTED OUTPUT
'''E3P.B99990001.pdb_138:6.923:0.241:6.116
   E3P.B99990001.pdb_397:15.856:3.506:8.144
   E3P.B99990001.pdb_424:8.558:1.315:6.627
   E3P.B99990001.pdb_774:14.204:-5.490:24.812
   E3P.B99990001.pdb_865:15.545:4.258:10.007
   E3P.B99990001.pdb_929:16.146:-6.081:24.770

   E3P.B99990002.pdb_138:4.499:4.286:8.260
   E3P.B99990002.pdb_397:14.897:3.238:9.338
   E3P.B99990002.pdb_424:5.649:5.914:8.639
   E3P.B99990002.pdb_774:12.114:-6.864:23.897
   E3P.B99990002.pdb_865:15.200:3.910:11.227
   E3P.B99990002.pdb_929:13.649:-6.894:22.589'''

#MY EXPECTED OUTPUT
''' Mainlist=[
            ['E3P.B99990001.pdb_138:6.923:0.241:6.116',
             'E3P.B99990001.pdb_397:15.856:3.506:8.144',
             'E3P.B99990001.pdb_424:8.558:1.315:6.627',
             'E3P.B99990001.pdb_774:14.204:-5.490:24.812',
             'E3P.B99990001.pdb_865:15.545:4.258:10.007',
             'E3P.B99990001.pdb_929:16.146:-6.081:24.770'],#sublist1

            ['E3P.B99990002.pdb_138:4.499:4.286:8.260',
             'E3P.B99990002.pdb_397:14.897:3.238:9.338',
             'E3P.B99990002.pdb_424:5.649:5.914:8.639',
             'E3P.B99990002.pdb_774:12.114:-6.864:23.897',
             'E3P.B99990002.pdb_865:15.200:3.910:11.227',
             'E3P.B99990002.pdb_929:13.649:-6.894:22.589']#sublist2
            ]'''
# then use these sublists to make combinations
import itertools
for sublist in mainlist:
    # each sublist is a list of strings, so pair its items directly
    Combinatedlist = list(itertools.combinations(sublist, 2))
# because each sublist is combined on its own, no pair ever mixes sublist1 and sublist2

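However, I am still not getting the right result, so I would appreciate it if you could suggest your approach.

Hi everyone, I found a solution: I append a marker string after each block (one block per file), build the sublists by splitting on that marker, and then make the combinations.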

My code:

import fileinput
import os
import re
import itertools
import math
import sys

pdbPathAndName = ['/Users/Mahesh/Documents/MAHESH_INTERNSHIP_2014/ENZOWP2/2WC5_090715_170128/E3P/E3P.B99990001.pdb','/Users/Mahesh/Documents/MAHESH_INTERNSHIP_2014/ENZOWP2/2WC5_090715_170128/E3P/E3P.B99990002.pdb']

ATOM_COORDINATE = []
for path in pdbPathAndName:
    with open(path, 'r') as f:
        for line in f:
            line = line.strip()
            if "SG" in line and line.endswith("S"):
                fields = line.split()  # whitespace-separated columns
                # file name, atom serial number, then the x, y, z coordinates
                ATOM_COORDINATE.append(path.split("/")[-1] + "_" + fields[1] + ":" + fields[5] + ":" + fields[6] + ":" + fields[7])
    # marker that closes the block of entries belonging to this file
    ATOM_COORDINATE.append("foo")

# Build the main list of sublists by splitting on the "foo" marker
MAINLIST = []
sub = []
for item in ATOM_COORDINATE:
    if item == 'foo':
        MAINLIST.append(sub)
        sub = []
    else:
        sub.append(item)

# Make combinations within each sublist, so entries from different files are never paired
COMBINATION = []
for sublist in MAINLIST:
    # with 6 entries per sublist, range(2, 6, 4) yields only L = 2, i.e. pairs
    for L in range(2, len(sublist), 4):
        for subset in itertools.combinations(sublist, L):
            COMBINATION.append(subset)

OUTPUT:
MainlistWithSublists:
[['E3P.B99990001.pdb_138:6.923:0.241:6.116', 'E3P.B99990001.pdb_397:15.856:3.506:8.144', 'E3P.B99990001.pdb_424:8.558:1.315:6.627', 'E3P.B99990001.pdb_774:14.204:-5.490:24.812', 'E3P.B99990001.pdb_865:15.545:4.258:10.007', 'E3P.B99990001.pdb_929:16.146:-6.081:24.770'], ['E3P.B99990002.pdb_138:4.499:4.286:8.260', 'E3P.B99990002.pdb_397:14.897:3.238:9.338', 'E3P.B99990002.pdb_424:5.649:5.914:8.639', 'E3P.B99990002.pdb_774:12.114:-6.864:23.897', 'E3P.B99990002.pdb_865:15.200:3.910:11.227', 'E3P.B99990002.pdb_929:13.649:-6.894:22.589']]
Combination out of sublists:
[('E3P.B99990001.pdb_138:6.923:0.241:6.116', 'E3P.B99990001.pdb_397:15.856:3.506:8.144'), ('E3P.B99990001.pdb_138:6.923:0.241:6.116', 'E3P.B99990001.pdb_424:8.558:1.315:6.627'), ('E3P.B99990001.pdb_138:6.923:0.241:6.116', 'E3P.B99990001.pdb_774:14.204:-5.490:24.812'), ('E3P.B99990001.pdb_138:6.923:0.241:6.116', 'E3P.B99990001.pdb_865:15.545:4.258:10.007'), ('E3P.B99990001.pdb_138:6.923:0.241:6.116', 'E3P.B99990001.pdb_929:16.146:-6.081:24.770'), ('E3P.B99990001.pdb_397:15.856:3.506:8.144', 'E3P.B99990001.pdb_424:8.558:1.315:6.627'), ('E3P.B99990001.pdb_397:15.856:3.506:8.144', 'E3P.B99990001.pdb_774:14.204:-5.490:24.812'), ('E3P.B99990001.pdb_397:15.856:3.506:8.144', 'E3P.B99990001.pdb_865:15.545:4.258:10.007'), ('E3P.B99990001.pdb_397:15.856:3.506:8.144', 'E3P.B99990001.pdb_929:16.146:-6.081:24.770'), ('E3P.B99990001.pdb_424:8.558:1.315:6.627', 'E3P.B99990001.pdb_774:14.204:-5.490:24.812'), ('E3P.B99990001.pdb_424:8.558:1.315:6.627', 'E3P.B99990001.pdb_865:15.545:4.258:10.007'), ('E3P.B99990001.pdb_424:8.558:1.315:6.627', 'E3P.B99990001.pdb_929:16.146:-6.081:24.770'), ('E3P.B99990001.pdb_774:14.204:-5.490:24.812', 'E3P.B99990001.pdb_865:15.545:4.258:10.007'), ('E3P.B99990001.pdb_774:14.204:-5.490:24.812', 'E3P.B99990001.pdb_929:16.146:-6.081:24.770'), ('E3P.B99990001.pdb_865:15.545:4.258:10.007', 'E3P.B99990001.pdb_929:16.146:-6.081:24.770'), ('E3P.B99990002.pdb_138:4.499:4.286:8.260', 'E3P.B99990002.pdb_397:14.897:3.238:9.338'), ('E3P.B99990002.pdb_138:4.499:4.286:8.260', 'E3P.B99990002.pdb_424:5.649:5.914:8.639'), ('E3P.B99990002.pdb_138:4.499:4.286:8.260', 'E3P.B99990002.pdb_774:12.114:-6.864:23.897'), ('E3P.B99990002.pdb_138:4.499:4.286:8.260', 'E3P.B99990002.pdb_865:15.200:3.910:11.227'), ('E3P.B99990002.pdb_138:4.499:4.286:8.260', 'E3P.B99990002.pdb_929:13.649:-6.894:22.589'), ('E3P.B99990002.pdb_397:14.897:3.238:9.338', 'E3P.B99990002.pdb_424:5.649:5.914:8.639'), ('E3P.B99990002.pdb_397:14.897:3.238:9.338', 'E3P.B99990002.pdb_774:12.114:-6.864:23.897'), 
('E3P.B99990002.pdb_397:14.897:3.238:9.338', 'E3P.B99990002.pdb_865:15.200:3.910:11.227'), ('E3P.B99990002.pdb_397:14.897:3.238:9.338', 'E3P.B99990002.pdb_929:13.649:-6.894:22.589'), ('E3P.B99990002.pdb_424:5.649:5.914:8.639', 'E3P.B99990002.pdb_774:12.114:-6.864:23.897'), ('E3P.B99990002.pdb_424:5.649:5.914:8.639', 'E3P.B99990002.pdb_865:15.200:3.910:11.227'), ('E3P.B99990002.pdb_424:5.649:5.914:8.639', 'E3P.B99990002.pdb_929:13.649:-6.894:22.589'), ('E3P.B99990002.pdb_774:12.114:-6.864:23.897', 'E3P.B99990002.pdb_865:15.200:3.910:11.227'), ('E3P.B99990002.pdb_774:12.114:-6.864:23.897', 'E3P.B99990002.pdb_929:13.649:-6.894:22.589'), ('E3P.B99990002.pdb_865:15.200:3.910:11.227', 'E3P.B99990002.pdb_929:13.649:-6.894:22.589')]
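
The marker step is not strictly required. As a minimal sketch, assuming the same "SG ... S" line filter as above, one sublist can be collected per file while reading and the pairs formed inside each sublist afterwards. The helper names `sg_entries` and `pairs_within` and the in-memory `sample_files` are illustrative only; they stand in for the real PDB files and are not part of the original code.

```python
import itertools

def sg_entries(name, lines):
    """Return 'name_serial:x:y:z' strings for the SG atom lines of one file."""
    entries = []
    for line in lines:
        line = line.strip()
        if "SG" in line and line.endswith("S"):
            fields = line.split()  # whitespace-separated columns
            entries.append("%s_%s:%s:%s:%s" % (name, fields[1], fields[5], fields[6], fields[7]))
    return entries

def pairs_within(mainlist):
    """Pair entries inside each sublist only, never across sublists."""
    pairs = []
    for sublist in mainlist:
        pairs.extend(itertools.combinations(sublist, 2))
    return pairs

# Hypothetical stand-in for two files; the ATOM lines are copied from the example above
sample_files = [
    ("E3P.B99990001.pdb", [
        "ATOM    138  SG  CYS    19       4.499   4.286   8.260  1.00 71.96           S",
        "ATOM    397  SG  CYS    50      14.897   3.238   9.338  1.00 34.60           S",
        "ATOM    424  SG  CYS    54       5.649   5.914   8.639  1.00 42.68           S",
    ]),
    ("E3P.B99990002.pdb", [
        "ATOM    774  SG  CYS    97      12.114  -6.864  23.897  1.00 62.23           S",
        "ATOM    865  SG  CYS   108      15.200   3.910  11.227  1.00 54.49           S",
    ]),
]

# One sublist per file, in the order the files are listed
mainlist = [sg_entries(name, lines) for name, lines in sample_files]
pairs = pairs_within(mainlist)
```

With real files, `lines` would simply be the open file object for each path; no sentinel value has to be appended or searched for.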

Thanks to all.

If you can, just use a dictionary:

from collections import defaultdict

s = """B99990001 1 2 3 4
B99990001 1 3 3 4
B99990002 1 2 3 4
B99990002 1 3 3 4
B99990003 1 2 3 4
B99990003 1 3 3 4"""

d = defaultdict(list)
for line in s.split('\n'):
    index, values = line.split(None, 1)  # positional form; Python 2.7 rejects split(maxsplit=1)
    d[index].append(values)

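The dictionary d now holds (key order is not guaranteed in Python 2.7):

{'B99990001': ['1 2 3 4', '1 3 3 4'],
 'B99990002': ['1 2 3 4', '1 3 3 4'],
 'B99990003': ['1 2 3 4', '1 3 3 4']}

If you really need a list of lists rather than a dict, you can convert it back to a list: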

l = [['%s %s' % (index, value) for value in d[index]] for index in d]

If you prefer a sorted version, you can sort it with sorted(l).
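
As a small sketch of the conversion with a fixed order: iterating over the sorted keys gives the sublists in key order no matter how the dictionary stores them. The shuffled `lines` input is made up for illustration.

```python
from collections import defaultdict

# Illustrative input, deliberately not in key order
lines = [
    "B99990002 1 2 3 4",
    "B99990001 1 2 3 4",
    "B99990002 1 3 3 4",
    "B99990001 1 3 3 4",
]

d = defaultdict(list)
for line in lines:
    index, values = line.split(None, 1)  # key, then the rest of the line
    d[index].append(values)

# Sorting the keys fixes the order of the sublists;
# inside each sublist the items keep their input order
l = [['%s %s' % (index, value) for value in d[index]] for index in sorted(d)]
```

Here sorting the keys and calling sorted(l) on the finished list give the same result, because every sublist starts with its own key.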

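If you want exactly the same output as in the question, with the sublists in file order, see the OrderedDict version in the last code listing; its output is shown right after it.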

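Comments:

Why not use a dict? Do you really need a list?

Thanks... because in the next step I want to make combinations within each sublist and read all the even lines from each sublist. With a list that step would be easy. But if you know how to do it with a dict, feel free to suggest that too.

This would be very useful if I did not have to make combinations in the next step. My next idea is to make combinations within each sublist, but in this case, if I make combinations, the three groups B99990001, B99990002 and B99990003 would be mixed in the same combination.

@user3805057 What kind of combinations do you intend to make? Can you provide a larger input/output example via pastebin.com or another site?

Sure, I will add the real example tomorrow. Thanks for your reply.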
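
On the question of doing the combination step with a dict: a rough sketch, assuming pairs are wanted, is to call itertools.combinations on each key's list separately, so values from different keys never meet. The name `pairs_by_key` and the third value under B99990001 are invented for the illustration.

```python
import itertools
from collections import OrderedDict

# Illustrative grouped data; "1 4 3 4" is a made-up extra row
d = OrderedDict()
d["B99990001"] = ["1 2 3 4", "1 3 3 4", "1 4 3 4"]
d["B99990002"] = ["1 2 3 4", "1 3 3 4"]

# Combine within each key only, so groups are never mixed
pairs_by_key = OrderedDict()
for key, values in d.items():
    pairs_by_key[key] = list(itertools.combinations(values, 2))
```

Each key then maps to its own list of pairs, which plays the same role as one sublist of combinations in the list-of-lists version.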


from collections import OrderedDict

d = OrderedDict()
with open('file.txt') as f:
    for line in f:
        splitted = line.strip().split()
        key = splitted[0]
        if key not in d:
            d[key] = []
        d[key].append(' '.join( splitted[1:] ))

mainList = [ [key + ' ' + item for item in d[key] ] for key in d ]
print(mainList)  # parenthesised form also works in Python 2.7

[['B99990001 1 2 3 4', 'B99990001 1 3 3 4'],
 ['B99990002 1 2 3 4', 'B99990002 1 3 3 4'],
 ['B99990003 1 2 3 4', 'B99990003 1 3 3 4']]
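
A possible alternative, swapping the dictionary for itertools.groupby: when lines with the same first column are already adjacent, as in the sample file, groupby can split them into sublists directly. This is a sketch under that adjacency assumption; if the keys are interleaved, the lines need to be sorted by key first. The helper name `first_column` is illustrative.

```python
from itertools import groupby

lines = [
    "B99990001 1 2 3 4",
    "B99990001 1 3 3 4",
    "B99990002 1 2 3 4",
    "B99990002 1 3 3 4",
    "B99990003 1 2 3 4",
    "B99990003 1 3 3 4",
]

def first_column(line):
    """Return the grouping key: the first whitespace-separated field."""
    return line.split(None, 1)[0]

# groupby starts a new group every time the key changes between consecutive lines
main_list = [list(group) for _, group in groupby(lines, key=first_column)]
```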