Python: How to remove duplicates from a list of tuples when order matters


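I have seen some similar answers, but I could not find a specific answer for this case.

I have a list of tuples:

[(5, 0), (3, 1), (3, 2), (5, 3), (6, 4)]

What I want is to remove a tuple from the list only if its first element has already appeared earlier in the list, and the tuple that remains should be the one with the smallest second element.

So the output should look like this:

[(5, 0), (3, 1), (6, 4)]

This will do what you need: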

# I switched (5, 3) and (5, 0) to demonstrate sorting capabilities.
list_a = [(5, 3), (3, 1), (3, 2), (5, 0), (6, 4)]

# Create a list to contain the results
list_b = []

# Create a list to check for duplicates
l = []

# Sort list_a by the second element of each tuple to ensure the smallest numbers
list_a.sort(key=lambda i: i[1])

# Iterate through every tuple in list_a
for i in list_a:

    # Check if the 0th element of the tuple is in the duplicates list; if not:
    if i[0] not in l:

        # Add the tuple the loop is currently on to the results; and
        list_b.append(i)

        # Add the 0th element of the tuple to the duplicates list
        l.append(i[0])

print(list_b)  # [(5, 0), (3, 1), (6, 4)]

Hope this helps.

Another approach uses enumerate(), either with a list comprehension or with a for loop.
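The snippets for the enumerate() answer are not shown here. As a rough illustration only (this is not the answerer's code), an enumerate()-based list comprehension might look like the sketch below. The name `dedupe_keep_min` is hypothetical, and the sketch assumes a tuple is kept at its own position when its second element is the smallest one for its first element. It rescans the list for every tuple, so it is quadratic and only suited to short lists.

```python
def dedupe_keep_min(pairs):
    """Keep, for each first element, the tuple with the smallest second element."""
    return [
        (first, second)
        for i, (first, second) in enumerate(pairs)
        # No earlier tuple with the same first element is as small or smaller ...
        if all(second < s for f, s in pairs[:i] if f == first)
        # ... and no later tuple with the same first element is strictly smaller.
        and all(second <= s for f, s in pairs[i + 1:] if f == first)
    ]


print(dedupe_keep_min([(5, 0), (3, 1), (3, 2), (5, 3), (6, 4)]))
# [(5, 0), (3, 1), (6, 4)]
print(dedupe_keep_min([(5, 3), (3, 1), (3, 2), (5, 0), (6, 4)]))
# [(3, 1), (5, 0), (6, 4)]
```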

Here is a linear-time approach that needs two passes over the original list.

t = [(5, 0), (3, 1), (3, 2), (5, 3), (6, 4)] # test case 1
#t = [(5, 3), (3, 1), (3, 2), (5, 0), (6, 4)] # test case 2
smallest = {}
inf = float('inf')

for first, second in t:
    if smallest.get(first, inf) > second:
        smallest[first] = second

result = []
seen = set()

for first, second in t:
    if first not in seen and second == smallest[first]:
        seen.add(first)
        result.append((first, second))

print(result) # [(5, 0), (3, 1), (6, 4)] for test case 1
              # [(3, 1), (5, 0), (6, 4)] for test case 2

Here is a compact version I came up with using OrderedDict; it skips the replacement if the new value is larger than the old one.

from collections import OrderedDict

a = [(5, 3), (3, 1), (3, 2), (5, 0), (6, 4)]
d = OrderedDict()

for item in a:

    # Get old value in dictionary if exist
    old = d.get(item[0])

    # Skip if new item is larger than old
    if old:
        if item[1] > old[1]:
            continue
        #else:
        #    del d[item[0]]

    # Assign
    d[item[0]] = item

list(d.values())

[(5, 0), (3, 1), (6, 4)]

Or, if you use the else statement (commented out above), the output is:

[(3, 1), (5, 0), (6, 4)]
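As a sketch of that variant, here is the same loop with the else branch enabled, wrapped in a hypothetical helper `keep_smallest` (the name is an assumption for illustration, not the answerer's). Deleting the key before re-assigning it moves the key to the end of the OrderedDict, so each first element ends up at the position of the tuple that was kept rather than at its first appearance.

```python
from collections import OrderedDict


def keep_smallest(pairs):
    d = OrderedDict()
    for item in pairs:
        old = d.get(item[0])
        if old is not None:
            if item[1] > old[1]:
                continue  # the stored tuple is smaller, keep it
            del d[item[0]]  # drop the key so the re-insert goes to the end
        d[item[0]] = item
    return list(d.values())


print(keep_smallest([(5, 3), (3, 1), (3, 2), (5, 0), (6, 4)]))
# [(3, 1), (5, 0), (6, 4)]
```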

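It seems to me that you need to know two things:

1. For each first element, the tuple with the smallest second element.
2. The order in which each first element should appear in the new list.

We can get #1 using itertools.groupby and the min function.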

import itertools
import operator

lst = [(3, 1), (5, 3), (5, 0), (3, 2), (6, 4)]
# I changed this slightly to make it harder to accidentally succeed.
# correct final order should be [(3, 1), (5, 0), (6, 4)]

tmplst = sorted(lst, key=operator.itemgetter(0))
groups = itertools.groupby(tmplst, operator.itemgetter(0))
# group by first element, in this case this looks like:
# [(3, [(3, 1), (3, 2)]), (5, [(5, 3), (5, 0)]), (6, [(6, 4)])]
# note that groupby only works on sorted lists, so we need to sort this first

min_tuples = {min(v, key=operator.itemgetter(1)) for _, v in groups}
# give the best possible result for each first tuple. In this case:
# {(3, 1), (5, 0), (6, 4)}
# (note that this is a set comprehension for faster lookups later.)

Now that we know what the result set looks like, we can walk through lst again to put the tuples in the correct order.

seen = set()
result = []
for el in lst:
    if el not in min_tuples:  # don't add to result
        continue
    elif el not in seen:      # add to result and mark as seen
        result.append(el)
        seen.add(el)

I came up with this idea without having seen @Anton vBR's answer.

import collections

inp = [(5, 0), (3, 1), (3, 2), (5, 3), (6, 4)]

od = collections.OrderedDict()
for i1, i2 in inp:
    if i2 <= od.get(i1, i2):
        od.pop(i1, None)
        od[i1] = i2
outp = list(od.items())
print(outp)
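On Python 3.7 and later, a plain dict also preserves insertion order, so the same pop-and-reinsert idea can be sketched without OrderedDict. The function name `min_by_first` is hypothetical, and the sketch assumes a first element should move to the position of its smallest tuple. Unlike the snippet above, it uses a strict `<`, so on ties the earlier tuple keeps its place.

```python
def min_by_first(pairs):
    best = {}
    for first, second in pairs:
        if first not in best:
            best[first] = second
        elif second < best[first]:
            # pop, then re-insert, so the key moves to the end of the dict
            best.pop(first)
            best[first] = second
    return list(best.items())


print(min_by_first([(5, 3), (3, 1), (3, 2), (5, 0), (6, 4)]))
# [(3, 1), (5, 0), (6, 4)]
```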

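Fails on t = [(5, 3), (3, 1), (3, 2), (5, 0), (6, 4)].

@AdamSmith How so? Shouldn't the result be [(5, 3), (3, 1), (6, 4)]?

According to the question, "the remaining tuple should have the smallest second element", so it should still be [(5, 0), ...].

You could save time by reversing the order of the tests (if first not in seen and second == smallest[first]), but that is the tiniest of optimizations. Looks great.

Good catch. By the way, I assumed that in the second test case the original order should be kept, i.e. the first entry is (3, 1) and (5, 0) does not slide forward.

This fails if the correct order does not correspond to the order of the second elements (i.e. [(3, 1), (5, 3), ...]).

Gives the wrong result according to the question.

The new solution should work, but I am worried about how "insertion order" is defined for OrderedDicts. If d[first] = item is executed when first is already in d, does that count as an insertion? If not, you could get out-of-order results.

@AdamSmith Yes, I am interested in that too. My tests suggest so.

@AdamSmith "does d[first] = item count as an insertion if first in d"

@timgeb OK, I accept that you have a valid point :). I added an else statement for the case where the desired output is [(3, 1), (5, 0), (6, 4)].

Note that this is not a particularly efficient solution, but it works.

For large lists this is slower than building a set seen and checking membership, and it has the same problem as the other answers in that it does not correctly choose which element to keep. This code fails for, e.g., some_list = [(5, 3), (3, 1), (3, 2), (5, 0), (6, 4)].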
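The insertion-order question raised in the comments can be checked directly. Assigning to a key that already exists in an OrderedDict updates the value but keeps the key in its original position; deleting and re-inserting the key, or calling `move_to_end`, moves it to the end. A small demonstration with made-up values:

```python
from collections import OrderedDict

d = OrderedDict([(5, 3), (3, 1)])

d[5] = 0                      # overwrite: key 5 stays in first position
print(list(d.items()))        # [(5, 0), (3, 1)]

d.move_to_end(5)              # explicitly move key 5 to the end
print(list(d.items()))        # [(3, 1), (5, 0)]

del d[3]                      # delete and re-insert: key 3 goes to the end
d[3] = 1
print(list(d.items()))        # [(5, 0), (3, 1)]
```

This is why the OrderedDict answer without the else branch keeps a first element at its first appearance, while the version that deletes the key first moves it to where the smaller tuple was found.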

