Python: filter a list of tuples, keeping the longest items in each tuple


Suppose I have this data:

my_list_of_tuples = [
    ('bill', [(4, ['626']), (4, ['253', '30', '626']),
              (4, ['253', '30', '626']), (4, ['626']),
              (4, ['626']), (4, ['626'])]),
    ('sarah', [(2, ['6']), (2, ['2', '6']), (2, ['2', '6']),
               (2, ['6']), (2, ['6']), (2, ['6'])]),
    ('fred', [(1, ['6']), (1, ['2']), (1, ['2'])])
]
For each name, I want to keep all the sub-tuples whose list element is the longest, and remove duplicates, so that I end up with:

my_output_list_of_tuples = [
    ('bill',  [(4, ['253', '30', '626'])]),
    ('sarah',  [(2, ['2', '6'])]),
    ('fred',  [(1, ['6']), (1, ['2'])])]
So far I have tried:

my_output_list_of_tuples = [(x[0], max(x[1], key=lambda tup: len(tup[1]))) for x in my_list_of_tuples] 
But this doesn't work for fred, because max returns only one item. I also made a few attempts with map and lambda, but didn't get far.
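To make the limitation concrete, here is a small sketch using only fred's entry from the data above. `max` with a `key` returns a single element, and when several elements tie for the maximum it returns the first one it encounters, so the other distinct one-element list is lost.

```python
fred = [(1, ['6']), (1, ['2']), (1, ['2'])]

# max() returns exactly one element; on ties it keeps the first maximal one.
longest = max(fred, key=lambda tup: len(tup[1]))
print(longest)  # (1, ['6'])
```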

I can break it up like this:

for my_list_of_tuples_by_person_name in my_list_of_tuples:
    # Do something with my_list_of_tuples_by_person_name[1]
    pass
Any ideas?

Thanks in advance :)

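If you want to keep duplicates like this, you can't just call max; you have to compare each value against the result of max.

The most readable way is probably to build a dict mapping each key to its maximum length, and then compare each tuple against it: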

result = []
for name, sublist in my_list_of_tuples:
    d = {}
    for key, subsub in sublist:
        if len(subsub) > d.get(key, 0):
            d[key] = len(subsub)
    lst =[(key, subsub) for key, subsub in sublist if len(subsub) == d[key]]
    result.append((name, lst))
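You can condense most of this, but that will probably just make things more opaque and less maintainable. Note that the naive way to compress the two passes into a single expression (computing max each time through) turns it into a nested (quadratic) loop, so doing it properly will be more verbose than you might expect.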
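The cost difference described above can be sketched as follows. Both functions are hypothetical helpers (the names `longest_naive` and `longest_hoisted` are not from the original answer), and for simplicity they assume one maximum per sublist rather than one per key.

```python
def longest_naive(sublist):
    # max() is re-evaluated for every element, so this is O(n**2).
    return [(k, v) for k, v in sublist
            if len(v) == max(len(w) for _, w in sublist)]


def longest_hoisted(sublist):
    # Compute the maximum once, then filter in a second pass: O(n).
    # default=0 keeps an empty sublist from raising ValueError.
    longest = max((len(v) for _, v in sublist), default=0)
    return [(k, v) for k, v in sublist if len(v) == longest]


sarah = [(2, ['6']), (2, ['2', '6']), (2, ['2', '6']), (2, ['6'])]
print(longest_hoisted(sarah))  # [(2, ['2', '6']), (2, ['2', '6'])]
```

Both versions return the same list; only the amount of repeated work differs.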


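Since you've completely changed the question, and now apparently want only the longest sublist (presumably picking arbitrarily when there are duplicates, or non-duplicate values of the same length), things are simpler: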

result = []
for name, sublist in my_list_of_tuples:
    keysubsub = max(sublist, key=lambda keysubsub: len(keysubsub[1]))
    result.append((name, keysubsub))
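But that's basically what you already have. You say the problem with it is "... but this doesn't work for fred, because max returns only one item", but I'm not sure what you want instead of one item.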


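If what you're looking for is all of the distinct lists of maximum length, you can use a set or an OrderedSet in the first answer instead of a list. There is no OrderedSet in the stdlib, but an OrderedSet recipe should be fine for our purposes. Still, let's do it manually with a set and a list: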

result = []
for name, sublist in my_list_of_tuples:
    d = {}
    for key, subsub in sublist:
        if len(subsub) > d.get(key, 0):
            d[key] = len(subsub)
    lst, seen = [], set()
    for key, subsub in sublist:
        if len(subsub) == d[key] and tuple(subsub) not in seen:
            seen.add(tuple(subsub))
            lst.append((key, subsub))
    result.append((name, lst))

I think this last one gives exactly the output your updated question asks for, without doing anything hard to understand.
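As an alternative to an OrderedSet, a plain `dict` preserves insertion order on Python 3.7+, so its keys can stand in for an ordered set. The sketch below is an assumption about how one might apply that here, not part of the original answer: `dedupe_longest` is a hypothetical name, it uses a single maximum per sublist, and it converts the inner lists to tuples because lists are not hashable.

```python
def dedupe_longest(sublist):
    longest = max((len(v) for _, v in sublist), default=0)
    # dict keys keep first-seen order and silently drop repeats.
    unique = dict.fromkeys(
        (k, tuple(v)) for k, v in sublist if len(v) == longest
    )
    # Convert the tuples back to lists to match the input shape.
    return [(k, list(v)) for k, v in unique]


data = [
    ('bill', [(4, ['626']), (4, ['253', '30', '626']), (4, ['253', '30', '626'])]),
    ('fred', [(1, ['6']), (1, ['2']), (1, ['2'])]),
]
result = [(name, dedupe_longest(sub)) for name, sub in data]
print(result)
```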


You can use max:

my_list_of_tuples = [('bill', [(4, ['626']), (4, ['253', '30', '626']), (4, ['253', '30', '626']), (4, ['626']), (4, ['626']), (4, ['626'])]), ('sarah', [(2, ['6']), (2, ['2', '6']), (2, ['2', '6']), (2, ['6']), (2, ['6']), (2, ['6'])]), ('fred', [(1, ['6']), (1, ['2']), (1, ['2'])])]
# Keep only the pairs whose list has the maximum length within its group.
final_result = [(a, [(c, d) for c, d in b if len(d) == max(map(len, [h for _, h in b]))]) for a, b in my_list_of_tuples]
# Drop duplicates while preserving the original order.
new_result = [(a, [c for i, c in enumerate(b) if c not in b[:i]]) for a, b in final_result]
Output:

[('bill', [(4, ['253', '30', '626'])]), ('sarah', [(2, ['2', '6'])]), ('fred', [(1, ['6']), (1, ['2'])])]


First define a function:

def f(ls):
    max_length = max(len(y) for (x, y) in ls)

    result = []

    for (x, y) in ls:
        if len(y) == max_length and (x, y) not in result:
            result.append((x, y))

    return result
Now call it like this:

>>> from pprint import pprint
>>> pprint([(name, f(y)) for name, y in my_list_of_tuples])
[('bill', [(4, ['253', '30', '626'])]),
 ('sarah', [(2, ['2', '6'])]),
 ('fred', [(1, ['6']), (1, ['2'])])]


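For more complex problems, the usual approach is to first write the code as nested loops, and then reduce it to a one-line expression. Can you code the logic in multiple lines?

Sorry, I edited my question. I want to keep the items of the same length; while working on this I confused myself into thinking that keeping duplicates had the desired effect.

@Joylove I can't tell from your description what output you want. It seems like you're asking how to turn what you originally had into what you originally had?

Sorry if it's unclear. I'm looking for the unique superset.

@Joylove OK, I've updated the answer again. But the point of these answers is for you to understand how they work, not just to copy and paste them. If this isn't simple enough for you to understand and make minor changes to, please explain which part is too complex.

@Joylove While we're at it, you may also want to look at some of the recipes in the itertools docs, for example unique_everseen, which generalizes (and serves as a good example of) what I'm doing with that seen set. I'm not sure how familiar you are with the iterator-transformation-pipeline style, but a good introduction to it is worth reading if the answer is "not at all". That's the direction I'd simplify this in once I got it working. (Or, rather, I'd write it in that style in the first place.)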
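The unique_everseen idea mentioned in the last comment can be sketched roughly as below. This is a simplified version written for illustration, not the exact text of the recipe in the itertools documentation; the `key` argument is what lets it cope with the unhashable lists in this data.

```python
def unique_everseen(iterable, key=None):
    """Yield elements in order, skipping any whose key was already seen."""
    seen = set()
    for element in iterable:
        k = element if key is None else key(element)
        if k not in seen:
            seen.add(k)
            yield element


fred = [(1, ['6']), (1, ['2']), (1, ['2'])]
# Lists are unhashable, so map each pair to a hashable key.
unique = list(unique_everseen(fred, key=lambda kv: (kv[0], tuple(kv[1]))))
print(unique)  # [(1, ['6']), (1, ['2'])]
```

Because it is a generator, it can be chained with other iterator steps (such as a length filter) without building intermediate lists.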