Python: processing a set of unique tuples


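I have a set of unique tuples that looks like the following. The first value is the name, the second value is the ID, and the third value is the type.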

('9', '0000022', 'LRA')
('45', '0000016', 'PBM')
('16', '0000048', 'PBL')
('304', '0000042', 'PBL')
('7', '0000014', 'IBL')
('12', '0000051', 'LRA')
('7', '0000014', 'PBL')
('68', '0000002', 'PBM')
('356', '0000049', 'PBL')
('12', '0000051', 'PBL')
('15', '0000015', 'PBL')
('32', '0000046', 'PBL')
('9', '0000022', 'PBL')
('10', '0000007', 'PBM')
('7', '0000014', 'LRA')
('439', '0000005', 'PBL')
('4', '0000029', 'LRA')
('41', '0000064', 'PBL')
('10', '0000007', 'IBL')
('8', '0000006', 'PBL')
('331', '0000040', 'PBL')
('9', '0000022', 'IBL')

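The set contains repeated name/ID combinations that differ only in their type. For example:

('9', '0000022', 'LRA')
('9', '0000022', 'PBL')
('9', '0000022', 'IBL')

What I would like to do is process this set of tuples to create a new list in which each name/ID combination appears only once but carries all of its types. The list should include only name/ID combinations that have more than one type. For example, my output would look like this:

('9', '0000022', 'LRA', 'PBL', 'IBL')
('7', '0000014', 'IBL', 'PBL', 'LRA')

But my output should not include name/ID combinations that have only one type:

('45', '0000016', 'PBM')
('16', '0000048', 'PBL')

Thanks for your help.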

itertools.groupby, with some additional processing of what it outputs, will do the job:

from itertools import groupby

data = {
    ('9', '0000022', 'LRA'),
    ('45', '0000016', 'PBM'),
    ('16', '0000048', 'PBL'),
    # ... (remaining tuples from the question)
}

def group_by_name_and_id(s):
    # groupby only merges adjacent items, so sort first; the key is (name, id)
    grouped = groupby(sorted(s), key=lambda t: (t[0], t[1]))
    for (name, id_), items in grouped:
        types = tuple(type_ for _, _, type_ in items)
        # keep only name/ID combinations that have more than one type
        if len(types) > 1:
            yield (name, id_) + types

print('\n'.join(str(x) for x in group_by_name_and_id(data)))

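Output (for the full set from the question; the types come out sorted because the whole tuples are sorted):

('10', '0000007', 'IBL', 'PBM')
('12', '0000051', 'LRA', 'PBL')
('7', '0000014', 'IBL', 'LRA', 'PBL')
('9', '0000022', 'IBL', 'LRA', 'PBL')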

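P.S. I don't really like this design, though: the types could (and should) be a list held in the third item of the tuple rather than being part of the tuple itself. As it stands, the tuple has a dynamic length, which is ugly; tuples are not meant to be used that way. So it is better to replace

        types = tuple(type_ for _, _, type_ in items)
        yield (name, id_) + types

with a version that collects the types into a list and yields that list as the third item, producing the cleaner-looking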

('10', '0000007', ['IBL', 'PBM'])
('12', '0000051', ['LRA', 'PBL'])
('7', '0000014', ['IBL', 'LRA', 'PBL'])
('9', '0000022', ['IBL', 'LRA', 'PBL'])

which you can then iterate over with, for example,

for name, id_, types in transformed_data:
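
A minimal sketch of that list-based variant, assuming the same (name, id, type) tuples as in the question. The function name group_types_as_list and the small sample set are illustrative and not part of the original answer:

```python
from itertools import groupby
from operator import itemgetter

def group_types_as_list(tuples):
    """Yield (name, id, [types]) for each name/ID pair with more than one type."""
    # groupby only merges adjacent items, so the input is sorted first
    for (name, id_), items in groupby(sorted(tuples), key=itemgetter(0, 1)):
        types = [type_ for _, _, type_ in items]
        if len(types) > 1:
            yield name, id_, types

# Illustrative subset of the question's data
sample = {
    ('9', '0000022', 'LRA'),
    ('9', '0000022', 'PBL'),
    ('9', '0000022', 'IBL'),
    ('45', '0000016', 'PBM'),
    ('10', '0000007', 'PBM'),
    ('10', '0000007', 'IBL'),
}

transformed_data = list(group_types_as_list(sample))
for name, id_, types in transformed_data:
    print(name, id_, types)
```

Every yielded tuple now has exactly three items, so it can always be unpacked the same way regardless of how many types a name/ID pair has.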

Using a defaultdict to accumulate and then filtering is straightforward:

from collections import defaultdict

d = defaultdict(list)
for tup in list_of_tuples:
    d[(tup[0],tup[1])].append(tup[2])

d
Out[15]: defaultdict(<class 'list'>, {('16', '0000048'): ['PBL'], ('9', '0000022'): ['LRA', 'PBL', 'IBL'], ('12', '0000051'): ['LRA', 'PBL'], ('304', '0000042'): ['PBL'], ('331', '0000040'): ['PBL'], ('41', '0000064'): ['PBL'], ('356', '0000049'): ['PBL'], ('15', '0000015'): ['PBL'], ('8', '0000006'): ['PBL'], ('4', '0000029'): ['LRA'], ('7', '0000014'): ['IBL', 'PBL', 'LRA'], ('32', '0000046'): ['PBL'], ('68', '0000002'): ['PBM'], ('439', '0000005'): ['PBL'], ('10', '0000007'): ['PBM', 'IBL'], ('45', '0000016'): ['PBM']})

[(key,val) for key,val in d.items() if len(val) > 1]
Out[29]: 
[(('9', '0000022'), ['LRA', 'PBL', 'IBL']),
 (('12', '0000051'), ['LRA', 'PBL']),
 (('7', '0000014'), ['IBL', 'PBL', 'LRA']),
 (('10', '0000007'), ['PBM', 'IBL'])]

If you really want to get it back into the original format:

from itertools import chain

[tuple(chain.from_iterable(tup)) for tup in d.items() if len(tup[1]) > 1]
Out[27]: 
[('9', '0000022', 'LRA', 'PBL', 'IBL'),
 ('12', '0000051', 'LRA', 'PBL'),
 ('7', '0000014', 'IBL', 'PBL', 'LRA'),
 ('10', '0000007', 'PBM', 'IBL')]

That said, I think it makes the most sense to keep it as a dict with the (name, id) tuples as keys, as produced in the first step.
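
As a rough illustration of why keeping the dict is convenient, the sketch below filters into another dict and then looks up a name/ID pair directly. The short sample list stands in for list_of_tuples, and the name multi_type is an assumption made for this example:

```python
from collections import defaultdict

# Illustrative stand-in for list_of_tuples
sample = [
    ('9', '0000022', 'LRA'),
    ('45', '0000016', 'PBM'),
    ('9', '0000022', 'PBL'),
    ('7', '0000014', 'IBL'),
    ('7', '0000014', 'PBL'),
]

d = defaultdict(list)
for name, id_, type_ in sample:
    d[(name, id_)].append(type_)

# Keep only name/ID pairs with more than one type, still keyed by (name, id)
multi_type = {key: types for key, types in d.items() if len(types) > 1}

print(multi_type[('9', '0000022')])       # direct lookup by (name, id)
print(('45', '0000016') in multi_type)    # single-type pairs were filtered out
```

The types keep the order in which they were appended, so the lists reflect the input order rather than a sorted order.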

A one-liner, for science (the other answers are more readable and probably more correct):

Yields:

[('9', '0000022', 'LRA', 'PBL', 'IBL'),
 ('12', '0000051', 'LRA', 'PBL'), 
 ('10', '0000007', 'PBM', 'IBL'), 
 ('7', '0000014', 'IBL', 'PBL', 'LRA')]

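What have you tried so far? Your question is detailed enough (I think), but it helps to see how far you have got.

Recommended reading: … In many cases you can use a generator expression directly, without converting to a list first (i.e. avoid the []); also, you have one nested comprehension too many.

I know PEP 8; I just wanted to see whether it could be done in this case. It was more of a personal exercise that I thought others might get a laugh out of.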
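
To illustrate the comment about generator expressions: functions that consume an iterable, such as sum or any, accept a generator expression directly, so the square brackets can be dropped. A small sketch with illustrative data (the variable names are assumptions for this example):

```python
rows = [
    ('9', '0000022', 'LRA', 'PBL', 'IBL'),
    ('12', '0000051', 'LRA', 'PBL'),
]

# List comprehension: builds a full intermediate list before summing
total_with_list = sum([len(row) - 2 for row in rows])

# Generator expression: values are produced one at a time, no [] needed
total_with_gen = sum(len(row) - 2 for row in rows)

# Works the same way with any()
has_ibl = any('IBL' in row[2:] for row in rows)

print(total_with_list, total_with_gen, has_ibl)
```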