Python: How to remove duplicates from a list of tuples when order matters
I have seen some similar answers, but I could not find a specific answer for this case. I have a list of tuples:
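[(5, 0), (3, 1), (3, 2), (5, 3), (6, 4)]
What I want is to remove a tuple from the list only if its first element has already appeared earlier in the list, and the tuple that remains should be the one with the smallest second element.
So the output should look like this:
[(5, 0), (3, 1), (6, 4)]
This will do what you need: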
# I switched (5, 3) and (5, 0) to demonstrate sorting capabilities.
list_a = [(5, 3), (3, 1), (3, 2), (5, 0), (6, 4)]
# Create a list to contain the results
list_b = []
# Create a list to check for duplicates
l = []
# Sort list_a by the second element of each tuple so the smallest values come first
list_a.sort(key=lambda i: i[1])
# Iterate through every tuple in list_a
for i in list_a:
    # Check if the 0th element of the tuple is in the duplicates list; if not:
    if i[0] not in l:
        # Add the tuple the loop is currently on to the results; and
        list_b.append(i)
        # Add the 0th element of the tuple to the duplicates list
        l.append(i[0])
>>> print(list_b)
[(5, 0), (3, 1), (6, 4)]
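Because the code above sorts list_a in place, the result follows the order of the second elements rather than the original list order. A minimal sketch of one way to keep the sorting idea while restoring original positions is shown here. The function name dedupe_keep_min_position is hypothetical, and the sketch assumes every item is a 2-tuple and that each kept tuple should stay where it stood in the input.

```python
def dedupe_keep_min_position(pairs):
    """Hypothetical helper: keep the smallest-second-element tuple per first
    element, and return the survivors in their original list order."""
    # Pair every tuple with its original index, then sort by second element
    # (ties broken by original index so the earliest occurrence wins).
    indexed = sorted(enumerate(pairs), key=lambda ip: (ip[1][1], ip[0]))
    seen = set()   # first elements already kept (set gives O(1) lookups)
    kept = []      # (original_index, tuple) for every survivor
    for idx, pair in indexed:
        if pair[0] not in seen:
            seen.add(pair[0])
            kept.append((idx, pair))
    kept.sort()    # indices are unique, so this restores original positions
    return [pair for _, pair in kept]


print(dedupe_keep_min_position([(5, 3), (3, 1), (3, 2), (5, 0), (6, 4)]))
# [(3, 1), (5, 0), (6, 4)]
```

Using a set for the membership check also avoids the linear scan that a list lookup performs on every iteration.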
Hope this helps.
Using enumerate() and a list comprehension:
Using enumerate() and a for loop:
Test:
Here is a linear-time approach that makes two passes over the original list:
t = [(5, 0), (3, 1), (3, 2), (5, 3), (6, 4)]  # test case 1
#t = [(5, 3), (3, 1), (3, 2), (5, 0), (6, 4)]  # test case 2
smallest = {}
inf = float('inf')
for first, second in t:
    if smallest.get(first, inf) > second:
        smallest[first] = second
result = []
seen = set()
for first, second in t:
    if first not in seen and second == smallest[first]:
        seen.add(first)
        result.append((first, second))
print(result)  # [(5, 0), (3, 1), (6, 4)] for test case 1
               # [(3, 1), (5, 0), (6, 4)] for test case 2
Here is a compact version I came up with using OrderedDict; it skips the replacement if the new value is larger than the old one:
from collections import OrderedDict
a = [(5, 3), (3, 1), (3, 2), (5, 0), (6, 4)]
d = OrderedDict()
for item in a:
    # Get old value in dictionary if it exists
    old = d.get(item[0])
    # Skip if new item is larger than old
    if old:
        if item[1] > old[1]:
            continue
        #else:
        #    del d[item[0]]
    # Assign
    d[item[0]] = item
list(d.values())
Returns:
[(5, 0), (3, 1), (6, 4)]
Or, if you use the else statement (commented out above):
[(3, 1), (5, 0), (6, 4)]
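As a side note, the same "replace only when smaller" idea can be sketched with a plain dict. This assumes Python 3.7 or later, where the built-in dict is guaranteed to preserve insertion order, and it assumes every item is a 2-tuple. The function name dedupe_keep_min is hypothetical. Like the version without the else branch, it keeps each first element at the position where it first appeared.

```python
def dedupe_keep_min(pairs):
    """Hypothetical helper: per first element, keep the smallest second
    element, ordered by where each first element first appeared."""
    best = {}  # first element -> smallest second element seen so far
    for first, second in pairs:
        # Assigning to a key that already exists does not move it,
        # so the first-appearance order is retained.
        if first not in best or second < best[first]:
            best[first] = second
    return list(best.items())


print(dedupe_keep_min([(5, 3), (3, 1), (3, 2), (5, 0), (6, 4)]))
# [(5, 0), (3, 1), (6, 4)]
```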
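It seems to me that you need to know two things: (1) the tuple with the smallest second element for each first element, and (2) the order in which those tuples appear in the original list.
itertools.groupby and the min function get you #1: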
import itertools
import operator
lst = [(3, 1), (5, 3), (5, 0), (3, 2), (6, 4)]
# I changed this slightly to make it harder to accidentally succeed.
# correct final order should be [(3, 1), (5, 0), (6, 4)]
tmplst = sorted(lst, key=operator.itemgetter(0))
groups = itertools.groupby(tmplst, operator.itemgetter(0))
# group by first element, in this case this looks like:
# [(3, [(3, 1), (3, 2)]), (5, [(5, 3), (5, 0)]), (6, [(6, 4)])]
# note that groupby only works on sorted lists, so we need to sort this first
min_tuples = {min(v, key=operator.itemgetter(1)) for _, v in groups}
# gives the best tuple (smallest second element) for each first element. In this case:
# {(3, 1), (5, 0), (6, 4)}
# (note that this is a set comprehension, for faster lookups later)
Now that we know what the result set looks like, we can go back over lst to put the tuples in the correct order:
seen = set()
result = []
for el in lst:
    if el not in min_tuples:  # don't add to result
        continue
    elif el not in seen:  # add to result and mark as seen
        result.append(el)
        seen.add(el)
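The sorting step in this answer matters because itertools.groupby only merges consecutive items that share a key. A small illustration using the same lst (the variable names unsorted_keys and sorted_keys are just for this sketch):

```python
import itertools
import operator

lst = [(3, 1), (5, 3), (5, 0), (3, 2), (6, 4)]
first = operator.itemgetter(0)

# Without sorting, the two runs of key 3 are reported as separate groups.
unsorted_keys = [k for k, _ in itertools.groupby(lst, first)]
print(unsorted_keys)  # [3, 5, 3, 6]

# After sorting by the same key, each first element forms exactly one group.
sorted_keys = [k for k, _ in itertools.groupby(sorted(lst, key=first), first)]
print(sorted_keys)  # [3, 5, 6]
```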
I came up with this idea before seeing @Anton vBR's answer:
import collections
inp = [(5, 0), (3, 1), (3, 2), (5, 3), (6, 4)]
od = collections.OrderedDict()
for i1, i2 in inp:
    if i2 <= od.get(i1, i2):
        od.pop(i1, None)
        od[i1] = i2
outp = list(od.items())
print(outp)
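The pop before the assignment is what moves a key to the end of the ordering. A short sketch of how OrderedDict treats reassignment versus re-insertion (the variable names here are only for illustration):

```python
from collections import OrderedDict

od = OrderedDict([(5, 3), (3, 1)])

# Reassigning an existing key updates the value but keeps its position.
od[5] = 0
order_after_assign = list(od.items())
print(order_after_assign)  # [(5, 0), (3, 1)]

# Removing the key first and then assigning re-inserts it at the end.
od.pop(5, None)
od[5] = 0
order_after_pop = list(od.items())
print(order_after_pop)  # [(3, 1), (5, 0)]

# move_to_end gives the same effect without the pop.
od.move_to_end(3)
order_after_move = list(od.items())
print(order_after_move)  # [(5, 0), (3, 1)]
```

So whether the result is ordered by a key's first appearance or by the position of its smallest tuple depends on whether the key is removed before being reassigned.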
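Comments:
This fails on t = [(5, 3), (3, 1), (3, 2), (5, 0), (6, 4)].
@AdamSmith How so? Shouldn't the result be [(5, 3), (3, 1), (6, 4)]?
Per the question: "the remaining tuple should have the smallest second element", so it should still be [(5, 0), …].
You could save time by reversing the order of the tests (if first not in seen and second == smallest[first]), but that is the tiniest of optimizations. Looks great!
Nice catch. By the way, I am assuming that in the second test case the original order should be preserved, i.e. the first entry is (3, 1) and (5, 0) does not slide forward.
This fails if the correct order does not correspond to the order of the second elements (i.e. [(3, 1), (5, 3), …]).
This gives the wrong result according to the question. The new solution should work, but I am worried about how "insertion order" is defined for OrderedDicts. If d[first] = item is executed when first is already in d, does that count as an insertion? If not, you could get out-of-order results.
@AdamSmith Yes, I am interested in that too. My tests indicate this.
@AdamSmith "If d[first] = item is executed when first is already in d, does that count as an insertion?"
@timgeb OK, I accept that you have a valid point :). I added an else statement in case the desired output is [(3, 1), (5, 0), (6, 4)].
Note that this is not a particularly efficient solution, but it works.
For large lists this is slower than building a set of seen values and checking membership, and it has the same problem as the other answers in that it does not correctly choose which element to keep. This code fails for, e.g., some_list = [(5, 3), (3, 1), (3, 2), (5, 0), (6, 4)].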