Python: removing dicts with duplicate values from a list of dicts
I have a list of dicts as follows:
[{'ppm_error': -5.441115144810845e-07, 'key': 'Y7', 'obs_ion': 1054.5045550349998},
{'ppm_error': 2.3119997582222951e-07, 'key': 'Y9', 'obs_ion': 1047.547178035},
{'ppm_error': 2.3119997582222951e-07, 'key': 'Y9', 'obs_ion': 1381.24928035},
{'ppm_error': -2.5532659838679713e-06, 'key': 'Y4', 'obs_ion': 741.339467035},
{'ppm_error': 1.3036219678359603e-05, 'key': 'Y10', 'obs_ion': 1349.712302035},
{'ppm_error': 3.4259216556970878e-06, 'key': 'Y6', 'obs_ion': 941.424286035},
{'ppm_error': 1.1292770047090912e-06, 'key': 'Y2', 'obs_ion': 261.156025035},
{'ppm_error': 1.1292770047090912e-06, 'key': 'Y2', 'obs_ion': 389.156424565},
{'ppm_error': 9.326980606898406e-06, 'key': 'Y5', 'obs_ion': 667.3107950350001}
]
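I want to remove the dicts with duplicate keys, so that only dicts with a unique 'key' remain. It doesn't matter which of the duplicates ends up in the final list. So the final list should look like this: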
[{'ppm_error': -5.441115144810845e-07, 'key': 'Y7', 'obs_ion': 1054.5045550349998},
{'ppm_error': 2.3119997582222951e-07, 'key': 'Y9', 'obs_ion': 1381.24928035},
{'ppm_error': -2.5532659838679713e-06, 'key': 'Y4', 'obs_ion': 741.339467035},
{'ppm_error': 1.3036219678359603e-05, 'key': 'Y10', 'obs_ion': 1349.712302035},
{'ppm_error': 3.4259216556970878e-06, 'key': 'Y6', 'obs_ion': 941.424286035},
{'ppm_error': 1.1292770047090912e-06, 'key': 'Y2', 'obs_ion': 261.156025035},
{'ppm_error': 9.326980606898406e-06, 'key': 'Y5', 'obs_ion': 667.3107950350001}
]
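Is it possible to do this with the itertools.groupby function, or is there another way to solve this? Any suggestions?
If the order matters, you can use collections.OrderedDict to collect all the items, like this: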
from collections import OrderedDict
# print(...) with list(...) runs on both Python 2 and Python 3
print(list(OrderedDict((d["key"], d) for d in my_list).values()))
If the order doesn't matter, you can use a plain dictionary, like this:
print(list({d["key"]: d for d in my_list}.values()))
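One detail worth spelling out: both versions above keep the last dict seen for each 'key', because later entries overwrite earlier ones. Here is a minimal sketch on made-up data (the sample records and the names `last_wins` and `first_wins` are illustrative, not from the question). It also shows a `setdefault` variant that keeps the first occurrence instead. It assumes Python 3.7+, where a plain dict preserves insertion order.

```python
# Illustrative data only; 'key' mirrors the field name used in the question.
my_list = [
    {"key": "Y7", "obs_ion": 1.0},
    {"key": "Y9", "obs_ion": 2.0},
    {"key": "Y9", "obs_ion": 3.0},
    {"key": "Y2", "obs_ion": 4.0},
]

# Later dicts overwrite earlier ones, so the last 'Y9' entry survives.
last_wins = list({d["key"]: d for d in my_list}.values())

# setdefault stores a value only if the key is absent, so the first entry survives.
first_seen = {}
for d in my_list:
    first_seen.setdefault(d["key"], d)
first_wins = list(first_seen.values())

print([d["obs_ion"] for d in last_wins])   # [1.0, 3.0, 4.0]
print([d["obs_ion"] for d in first_wins])  # [1.0, 2.0, 4.0]
```

Since the question says either duplicate may be kept, both variants satisfy it; the difference only matters if one of the duplicates is preferred.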
I would do it like this:
my_list = [...]  # your list (renamed so it does not shadow the built-in list)
finallist = list(dict(map(lambda x: (x['key'], x), my_list)).values())
This is basically the same solution that @thefourtheye gave in his answer…
Convert it to a numpy array:
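import numpy
# dtype=object keeps the floats as floats instead of coercing every field to a string
a = numpy.array([(d["ppm_error"], d["key"], d["obs_ion"]) for d in my_dicts], dtype=object)
# return_index=True yields the index of the first occurrence of each unique key
mask = numpy.unique(a[:, 1], return_index=True)[1]
uniques = a[mask]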
Then convert back to dicts:
labels = ["ppm_error", "key", "obs_ion"]  # same field order as the tuples above
unique_entries = [dict(zip(labels, row)) for row in uniques]
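For readers without numpy at hand, here is a rough pure-Python sketch of what the index output of `numpy.unique` computes for the key column: the index of the first occurrence of each distinct key, ordered by sorted key. The helper name `first_indices` and the sample keys are made up for illustration.

```python
def first_indices(keys):
    """Index of the first occurrence of each distinct key, ordered by sorted key."""
    first = {}
    for i, k in enumerate(keys):
        first.setdefault(k, i)  # keep only the earliest index per key
    return [first[k] for k in sorted(first)]


keys = ["Y7", "Y9", "Y9", "Y4", "Y10"]
print(first_indices(keys))  # [4, 3, 0, 1]
```

So the numpy route returns the surviving entries sorted by key text ('Y10' before 'Y4'), not in input order.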
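Another solution is to remember the keys that have already been processed and return a different result once a key has been seen. This can be done by keeping a memo of seen keys: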
def get_key_watcher():
    keys_seen = set()
    def key_not_seen(d):
        key = d['key']
        if key in keys_seen:
            return False  # key is not new
        else:
            keys_seen.add(key)
            return True  # first time the key is seen
    return key_not_seen
Then you can use it like this:
>>> filtered_dicts = list(filter(get_key_watcher(), dicts))
>>> filtered_dicts
[{'ppm_error': -5.441115144810845e-07, 'obs_ion': 1054.5045550349998, 'key': 'Y7'},
 {'ppm_error': 2.3119997582222951e-07, 'obs_ion': 1047.547178035, 'key': 'Y9'},
 {'ppm_error': -2.5532659838679713e-06, 'obs_ion': 741.339467035, 'key': 'Y4'},
 {'ppm_error': 1.3036219678359603e-05, 'obs_ion': 1349.712302035, 'key': 'Y10'},
 {'ppm_error': 3.4259216556970878e-06, 'obs_ion': 941.424286035, 'key': 'Y6'},
 {'ppm_error': 1.1292770047090912e-06, 'obs_ion': 261.156025035, 'key': 'Y2'},
 {'ppm_error': 9.326980606898406e-06, 'obs_ion': 667.3107950350001, 'key': 'Y5'}]
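Obviously, it maintains the order of the dictionaries, and for each key it keeps the dictionary that comes first.
When you say key, you mean the 'key' field, right?
@thefourtheye: Yes, I updated the post. Thanks for pointing it out.
Also, does the order of the dictionaries matter in the output list?
No. The order in which they appear doesn't matter.
Cool, I've given both versions in my answer. :)
Can you describe it? As it stands, this answer is incomplete.
Best solution, since dict keys have the magic property of being unique :)
Very elegant. I had been fumbling with search terms like "filter", and this is a nice example of exactly what I was looking for. Another reminder that Python looks a lot like pseudocode; I guess that's the definition of intuitive.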
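As for the itertools.groupby route the question asks about: it can work, but groupby only groups adjacent items, so the list has to be sorted by 'key' first, and the original order is lost. A minimal sketch on made-up data (the sample records and the name `unique_dicts` are illustrative):

```python
from itertools import groupby
from operator import itemgetter

# Illustrative data only; 'key' mirrors the field name used in the question.
my_list = [
    {"key": "Y9", "obs_ion": 2.0},
    {"key": "Y7", "obs_ion": 1.0},
    {"key": "Y9", "obs_ion": 3.0},
]

get_key = itemgetter("key")

# sorted() is stable, so within each group the dicts keep their input order
# and next(group) picks the first occurrence of every key.
unique_dicts = [
    next(group)
    for _, group in groupby(sorted(my_list, key=get_key), key=get_key)
]

print(unique_dicts)
# [{'key': 'Y7', 'obs_ion': 1.0}, {'key': 'Y9', 'obs_ion': 2.0}]
```

The sort makes this O(n log n), whereas the dict-based answers need only a single pass over the list.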