Sort and eliminate from a list in Python
I have a list of dicts. Each dict in the list has a timestamp in string format and a key. A particular key can be repeated several times in the list. For each key, I want to keep only the dict with the latest timestamp and remove all the other dicts from the list. One way I implemented a solution is to use another variable, loop over all the items, and compare each one with the item already stored for that key. Is there a better way to solve this, using a list comprehension, itertools, or any other approach? Below is the sample input data:
data = [
{'key': 'key1', 'timestamp': '2017-08-03T10:24:21.762278'},
{'key': 'key2', 'timestamp': '2017-08-03T10:24:22.762278'},
{'key': 'key1', 'timestamp': '2017-08-03T10:24:23.762278'},
{'key': 'key2', 'timestamp': '2017-08-03T10:24:19.762278'},
{'key': 'key3', 'timestamp': '2017-08-03T10:24:25.762278'},
{'key': 'key2', 'timestamp': '2017-08-03T10:24:11.762278'},
{'key': 'key1', 'timestamp': '2017-08-03T10:24:45.762278'},
{'key': 'key4', 'timestamp': '2017-08-03T10:24:39.762278'}
]
Here is the expected output:
data = [
{'key': 'key3', 'timestamp': '2017-08-03T10:24:25.762278'},
{'key': 'key2', 'timestamp': '2017-08-03T10:24:22.762278'},
{'key': 'key1', 'timestamp': '2017-08-03T10:24:45.762278'},
{'key': 'key4', 'timestamp': '2017-08-03T10:24:39.762278'}
]
My implementation in Python is as follows:
from dateutil.parser import parse

def sort_and_eliminate(data):
    # Map each key to the newest item seen so far.
    processed_data = {}
    for cur_item in data:
        key = cur_item.get('key')
        if key not in processed_data:
            processed_data[key] = cur_item
        else:
            ex_item = processed_data.get(key)
            ex_ts = parse(ex_item.get("timestamp"))
            cur_ts = parse(cur_item.get("timestamp"))
            # Replace the stored item only if the current one is newer.
            if cur_ts > ex_ts:
                processed_data[key] = cur_item
    return processed_data.values()
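A possible simplification of the loop above, offered as a sketch rather than a definitive answer. It assumes every timestamp is an ISO 8601 string in the same fixed-width layout and the same time zone. That holds for the sample data but is an assumption in general. Under that assumption the strings sort chronologically and can be compared directly, without dateutil. The name latest_per_key is made up for this illustration.

```python
def latest_per_key(records):
    """Return, for each key, the record with the greatest timestamp string."""
    latest = {}
    for record in records:
        key = record['key']
        # Assumption: same-layout ISO 8601 strings compare chronologically,
        # so plain string comparison is enough here.
        if key not in latest or record['timestamp'] > latest[key]['timestamp']:
            latest[key] = record
    return list(latest.values())


sample = [
    {'key': 'key1', 'timestamp': '2017-08-03T10:24:21.762278'},
    {'key': 'key2', 'timestamp': '2017-08-03T10:24:22.762278'},
    {'key': 'key1', 'timestamp': '2017-08-03T10:24:23.762278'},
    {'key': 'key2', 'timestamp': '2017-08-03T10:24:19.762278'},
]

for item in latest_per_key(sample):
    print(item)
```

If the timestamps can differ in layout or time zone, parsing them first, as in the original function, is the safer choice.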
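Is there a better way to solve this, using a list comprehension, itertools, or any other method?
Here is one way.
Sort the dictionaries by key and timestamp, in descending order: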
x=sorted(data, key=lambda k: (k['key'],k['timestamp']), reverse=True)
print(x)
[{'key': 'key4', 'timestamp': '2017-08-03T10:24:39.762278'},
{'key': 'key3', 'timestamp': '2017-08-03T10:24:25.762278'},
{'key': 'key2', 'timestamp': '2017-08-03T10:24:22.762278'},
{'key': 'key2', 'timestamp': '2017-08-03T10:24:19.762278'},
{'key': 'key2', 'timestamp': '2017-08-03T10:24:11.762278'},
{'key': 'key1', 'timestamp': '2017-08-03T10:24:45.762278'},
{'key': 'key1', 'timestamp': '2017-08-03T10:24:23.762278'},
{'key': 'key1', 'timestamp': '2017-08-03T10:24:21.762278'}]
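Create a new list and insert only the first occurrence of each key:
new_list = []
temp = None
for values in x:
    # x is sorted by key, so equal keys are adjacent; keep the first of each run.
    if values['key'] != temp:
        new_list.append(values)
        temp = values['key']
print(new_list)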
[{'key': 'key4', 'timestamp': '2017-08-03T10:24:39.762278'},
{'key': 'key3', 'timestamp': '2017-08-03T10:24:25.762278'},
{'key': 'key2', 'timestamp': '2017-08-03T10:24:22.762278'},
{'key': 'key1', 'timestamp': '2017-08-03T10:24:45.762278'}]
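The sort-then-keep-first idea above can also be sketched with a dict comprehension. Sorting in ascending timestamp order means a later, newer entry simply overwrites an earlier one with the same key. This is an illustrative variant, not the answerer's code. The name dedupe_latest is hypothetical, and the sketch assumes the timestamp strings share one layout so that they sort chronologically.

```python
from operator import itemgetter


def dedupe_latest(records):
    """Keep the newest record per key by letting later entries overwrite earlier ones."""
    # Assumption: same-layout ISO 8601 strings sort chronologically.
    oldest_first = sorted(records, key=itemgetter('timestamp'))
    newest_by_key = {record['key']: record for record in oldest_first}
    return list(newest_by_key.values())


sample = [
    {'key': 'key1', 'timestamp': '2017-08-03T10:24:21.762278'},
    {'key': 'key2', 'timestamp': '2017-08-03T10:24:22.762278'},
    {'key': 'key1', 'timestamp': '2017-08-03T10:24:23.762278'},
]

print(dedupe_latest(sample))
```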
Hope this helps.
from datetime import datetime
from operator import itemgetter
from itertools import groupby
from dateutil.parser import parse
expected = [
{'key': 'key3', 'timestamp': '2017-08-03T10:24:25.762278'},
{'key': 'key2', 'timestamp': '2017-08-03T10:24:22.762278'},
{'key': 'key1', 'timestamp': '2017-08-03T10:24:45.762278'},
{'key': 'key4', 'timestamp': '2017-08-03T10:24:39.762278'}
]
data = [
{'key': 'key1', 'timestamp': '2017-08-03T10:24:21.762278'},
{'key': 'key2', 'timestamp': '2017-08-03T10:24:22.762278'},
{'key': 'key1', 'timestamp': '2017-08-03T10:24:23.762278'},
{'key': 'key2', 'timestamp': '2017-08-03T10:24:19.762278'},
{'key': 'key3', 'timestamp': '2017-08-03T10:24:25.762278'},
{'key': 'key2', 'timestamp': '2017-08-03T10:24:11.762278'},
{'key': 'key1', 'timestamp': '2017-08-03T10:24:45.762278'},
{'key': 'key4', 'timestamp': '2017-08-03T10:24:39.762278'}
]
# alt way without dateutil
def dtconv(s):
    return datetime.strptime(s, "%Y-%m-%dT%H:%M:%S.%f")

# Sort by key, then by parsed timestamp, newest first within each key.
ds = sorted(data, key=lambda x: (x['key'], parse(x['timestamp'])), reverse=True)
result = []
for grouper, group in groupby(ds, key=itemgetter('key')):
    # The first item of each group is the newest one for that key.
    result.append(next(group))
print("result:")
for r in result:
    print(r)
print("expected")
for e in expected:
    print(e)
# demonstrate it's equal to expected value
print(sorted(result, key=itemgetter('key')) == sorted(expected, key=itemgetter('key')))
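Try sorting the list by key and timestamp. You can then do a groupby and take the first element of each group, which will be the one you want to keep.
Sort the data by the timestamp string in reverse order. The first appearance of each unique key will then be the one you want to keep.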
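The groupby suggestion can also be sketched without a reverse sort: group the records by key, then take the max of each group by timestamp. This is a hedged variant rather than the answerer's exact method. The name latest_by_group is hypothetical, and the sketch assumes the timestamp strings share one layout so that string comparison matches chronological order.

```python
from itertools import groupby
from operator import itemgetter


def latest_by_group(records):
    """Group records by key and keep the one with the greatest timestamp in each group."""
    # groupby only merges adjacent items, so the input must be sorted by key first.
    by_key = sorted(records, key=itemgetter('key'))
    return [
        max(group, key=itemgetter('timestamp'))
        for _, group in groupby(by_key, key=itemgetter('key'))
    ]


sample = [
    {'key': 'key2', 'timestamp': '2017-08-03T10:24:22.762278'},
    {'key': 'key1', 'timestamp': '2017-08-03T10:24:21.762278'},
    {'key': 'key1', 'timestamp': '2017-08-03T10:24:45.762278'},
    {'key': 'key2', 'timestamp': '2017-08-03T10:24:11.762278'},
]

print(latest_by_group(sample))
```

The result comes back ordered by key, which the asker has said is acceptable because no particular ordering is required.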
from dateutil.parser import parse
data = [
{'key': 'key1', 'timestamp': '2017-08-03T10:24:21.762278'},
{'key': 'key2', 'timestamp': '2017-08-03T10:24:22.762278'},
{'key': 'key1', 'timestamp': '2017-08-03T10:24:23.762278'},
{'key': 'key2', 'timestamp': '2017-08-03T10:24:19.762278'},
{'key': 'key3', 'timestamp': '2017-08-03T10:24:25.762278'},
{'key': 'key2', 'timestamp': '2017-08-03T10:24:11.762278'},
{'key': 'key1', 'timestamp': '2017-08-03T10:24:45.762278'},
{'key': 'key4', 'timestamp': '2017-08-03T10:24:39.762278'}]
all_keys = [k['key'] for k in data]
all_keys_unique = set(all_keys)
new_dict = {}
for k in all_keys_unique:
    # find all values for that key and parse them
    values_of_key = [j['timestamp'] for j in data if k == j['key']]
    parsed_values = [parse(k2) for k2 in values_of_key]
    # use max to find the latest timestamp (max works on datetimes)
    # and add it to the dictionary
    new_dict[k] = max(parsed_values)
print(new_dict)
data = sorted(data, key=lambda x: x["timestamp"], reverse=True)
used_keys, cleaned_data = [], []
for item in data:
    if item['key'] not in used_keys:
        # if a key that we encounter in the list isn't used yet,
        # add its corresponding item to cleaned_data and add it to
        # used_keys so we know not to use it again.
        cleaned_data.append(item)
        used_keys.append(item['key'])
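As a small, optional refinement of the loop above, the keys already seen can be tracked in a set, which makes each membership test constant time on average instead of a scan of a list. This is a sketch under the same assumption as the code above, namely that the timestamp strings sort chronologically. The function name first_seen_per_key is made up for illustration.

```python
def first_seen_per_key(records):
    """Sort newest first, then keep the first record encountered for each key."""
    newest_first = sorted(records, key=lambda r: r['timestamp'], reverse=True)
    seen_keys = set()
    cleaned = []
    for record in newest_first:
        if record['key'] not in seen_keys:
            # First time this key appears in newest-first order: it is the latest one.
            seen_keys.add(record['key'])
            cleaned.append(record)
    return cleaned


sample = [
    {'key': 'key1', 'timestamp': '2017-08-03T10:24:21.762278'},
    {'key': 'key2', 'timestamp': '2017-08-03T10:24:22.762278'},
    {'key': 'key1', 'timestamp': '2017-08-03T10:24:23.762278'},
]

print(first_seen_per_key(sample))
```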
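Just create another dict that uses the key value as its key, compare the timestamps, and insert the latest timestamp as the value.
Just noticed that someone has posted basically this.
Oh, OK, fixed.
The question doesn't mention anything about preserving the original order of the remaining keys, so I assumed that sorting by timestamp is fine.
Yes, sorting by timestamp is fine. No particular ordering is required.
Please explain why and how the code solves the problem rather than posting a code-only answer.
This takes more time than the implementation provided in the question.
@akashdeep Even if that is true, it is clearer and easier to reason about. The OP asked for a better solution, but that doesn't necessarily mean it has to be faster. Hardly a reason to downvote, but that is your prerogative.
The last two loops are for demonstration purposes. I hope you didn't include those in your timing?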