Removing duplicates in Python (tags: python, list, duplicates, tuples)

I have the following tuples in an array:
  ('0000233/02', 50.0, None, None, None, None, 'Yes')
  ('0000233/02', 200.0, None, None, None, None, 'Yes')


If I iterate over the list, how can I eliminate duplicates based solely on the first element?

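Put them in a dict, using the first element as the key. If you check whether the key is already present before adding, you keep the first item with that key; otherwise you keep the last.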

Checking first:
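A minimal sketch of the check-before-adding variant, assuming the tuples are held in a list. The names `rows`, `first_by_key` and `deduped` are illustrative and do not come from the original answer:

```python
rows = [
    ('0000233/02', 50.0, None, None, None, None, 'Yes'),
    ('0000233/02', 200.0, None, None, None, None, 'Yes'),
]

first_by_key = {}
for row in rows:
    # Checking first means an existing entry is never overwritten,
    # so the first tuple seen for each key is the one that is kept.
    if row[0] not in first_by_key:
        first_by_key[row[0]] = row

deduped = list(first_by_key.values())
print(deduped)
```

Dropping the `if` check would turn this into the "last one wins" behaviour, since each later tuple would overwrite the earlier entry.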


The quick way: build a dict whose keys are the element you want to compare on.

# This leaves the last tuple found with that first value in the dict:
d = {}
for t in tuples:
    d[t[0]] = t  # plain assignment overwrites any earlier entry

# This leaves the first tuple found instead of the last:
d = {}
for t in tuples:
    d.setdefault(t[0], t)  # setdefault only stores the value if the key is missing
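To get a list back out of the dict, take its values. The following is a sketch rather than part of the original answer; it assumes Python 3.7 or later (where dicts keep insertion order), and the third row is a hypothetical addition to the question's data so that more than one key is present:

```python
tuples = [
    ('0000233/02', 50.0, None, None, None, None, 'Yes'),
    ('0000233/02', 200.0, None, None, None, None, 'Yes'),
    ('0000234/01', 75.0, None, None, None, None, 'No'),  # hypothetical extra row
]

# Last tuple per key wins (same effect as plain assignment in a loop).
last_wins = list({t[0]: t for t in tuples}.values())

# First tuple per key wins.
d = {}
for t in tuples:
    d.setdefault(t[0], t)
first_wins = list(d.values())

print(last_wins)
print(first_wins)
```

In both cases the keys appear in the order in which they were first encountered.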
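An ad hoc solution:

def unique_elem0(iterable):
    # Yield each element whose first item has not been seen before.
    seen = set()
    seen_add = seen.add
    for element in iterable:
        key = element[0]
        if key not in seen:
            seen_add(key)
            yield element

print(list(unique_elem0(lst)))  # lst is the list of tuples from the question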
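A solution that copies the `unique_everseen` code (the same function appears in the recipes section of the `itertools` documentation):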

from itertools import filterfalse  # named ifilterfalse in Python 2
from operator import itemgetter

def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in filterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element

print(list(unique_everseen(lst, key=itemgetter(0))))  # lst is the list of tuples

If the input is sorted (or at least has its duplicates clustered together), a slightly different approach is to use itertools.groupby:

import itertools, operator

def filter_duplicates(items):
    for key, group in itertools.groupby(items, operator.itemgetter(0)):
        yield next(group)

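This picks the first item from each run of duplicates (grouped by the first element). It needs no extra lookup structure, so it uses less memory than the set/dict-based approaches, and it preserves the order of the sequence. However, it relies on the duplicates arriving in batches; if they can appear anywhere in the stream, use one of the other approaches.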
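The adjacency requirement can be illustrated with a small sketch. The sample `rows` are made up for the demonstration, and sorting first is only one possible workaround (it costs O(n log n) and changes the overall order of the output):

```python
import itertools
import operator

first = operator.itemgetter(0)

def filter_duplicates(items):
    # groupby only merges *adjacent* items that share the same key.
    for _key, group in itertools.groupby(items, first):
        yield next(group)

rows = [
    ('A', 1),
    ('B', 2),
    ('A', 3),  # same key as the first row, but not adjacent to it
]

unsorted_result = list(filter_duplicates(rows))
# sorted() is stable, so within each key the original relative order is kept.
sorted_result = list(filter_duplicates(sorted(rows, key=first)))

print(unsorted_result)
print(sorted_result)
```

On the unsorted input both `'A'` rows survive, because they fall into separate groups; after sorting by the first element only one row per key remains.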

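If you don't care about the order of the elements after the first one, this is quick and simple (the set leaves the remaining elements in arbitrary order):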

>>> t1= ('0000233/02', 50.0, None, None, None, None, 'Yes')
>>> t2= ('0000233/02', 200.0, None, None, None, None, 'Yes')
>>> t1=(t1[0],)+tuple(set(t1[1:]))
>>> t2=(t2[0],)+tuple(set(t2[1:]))
>>> t1
('0000233/02', 50.0, None, 'Yes')
>>> t2
('0000233/02', 200.0, 'Yes', None)

If you do care about the order:

>>> t2= ('0000233/02', 200.0, None, None, None, None, 'Yes')
>>> nd=[]
>>> garbage=[nd.append(i) for i in t2 if not nd.count(i)]
>>> t2=tuple(nd)
>>> t2
('0000233/02', 200.0, None, 'Yes')
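As an alternative sketch (not from the original answer), the same order-preserving result can be obtained with `dict.fromkeys`, assuming Python 3.7 or later and that every element of the tuple is hashable. The name `deduped` is illustrative:

```python
t2 = ('0000233/02', 200.0, None, None, None, None, 'Yes')

# dict keys are unique and keep first-insertion order,
# so repeated values collapse to their first occurrence.
deduped = tuple(dict.fromkeys(t2))
print(deduped)
```

This avoids the repeated `count` scans of the list-based version, which matters only for long tuples.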

You might find this interesting: .