Aggregating and summing values in a list of OrderedDicts in Python
I have a list "d" like this:
[OrderedDict([
    ('id', '1'),
    ('date', '20170101'),
    ('quantity', '10')]),
 OrderedDict([
    ('id', '2'),
    ('date', '20170102'),
    ('quantity', '3')]),
 OrderedDict([
    ('id', '3'),
    ('date', '20170102'),
    ('quantity', '1')])]
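I am trying to group by "date", calculate the sum of the quantities, and display these two columns, "date" and "sum of quantity". How can I do this without using the pandas groupby option?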
Thanks.
Here is a pure-Python approach. It is only an example to give you a hint; you can use it if you want to stay in pure Python.
from collections import OrderedDict
import itertools

data = [OrderedDict([
            ('id', '1'),
            ('date', '20170101'),
            ('quantity', '10')]),
        OrderedDict([
            ('id', '2'),
            ('date', '20170102'),
            ('quantity', '3')]),
        OrderedDict([
            ('id', '3'),
            ('date', '20170102'),
            ('quantity', '1')])]

def get_quantity(ord_dicts):
    """Group the records by 'date' and sum their 'quantity' values."""
    by_date = lambda d: d['date']
    result = []
    # groupby only merges adjacent items, so sort by the grouping key first
    for date, group in itertools.groupby(sorted(ord_dicts, key=by_date), key=by_date):
        # quantities are stored as strings, so convert them before summing
        total = sum(int(d['quantity']) for d in group)
        result.append({'date': date, 'sum_quantity': total})
    return result

print(get_quantity(data))
Output:
[{'date': '20170101', 'sum_quantity': 10}, {'date': '20170102', 'sum_quantity': 4}]
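One caveat with the groupby approach: itertools.groupby only merges consecutive items that share a key, so each date appears once in the result only if the input is ordered by date. The sketch below illustrates this with hypothetical, deliberately shuffled records (not the question's data); the helper names by_date and sum_by_date are made up for the example.

```python
import itertools

# Hypothetical records, deliberately NOT ordered by date
records = [
    {'date': '20170102', 'quantity': '3'},
    {'date': '20170101', 'quantity': '10'},
    {'date': '20170102', 'quantity': '1'},
]

def by_date(record):
    return record['date']

def sum_by_date(rows):
    """Sum 'quantity' for each run of consecutive rows sharing a 'date'."""
    return [(date, sum(int(r['quantity']) for r in group))
            for date, group in itertools.groupby(rows, key=by_date)]

unsorted_result = sum_by_date(records)
sorted_result = sum_by_date(sorted(records, key=by_date))

print(unsorted_result)  # '20170102' shows up twice, once per run
print(sorted_result)    # one entry per date
```

If the data is already ordered by date, as in the question, the sort is redundant but harmless.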
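"I am trying to group by 'date', calculate the sum of the quantities, and display these two columns, 'date' and 'sum of quantity'."
This code uses the date as the key and the sum of the quantities as the value. Until you show an example of the desired output, the output format here is only a guess.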
In[2]: from collections import OrderedDict, defaultdict
  ...:
  ...:
  ...: def solution(data):
  ...:     result = defaultdict(int)
  ...:     for od in data:
  ...:         result[od['date']] += int(od['quantity'])
  ...:     return result
  ...:
In[3]: data = [
  ...:     OrderedDict([
  ...:         ('id', '1'),
  ...:         ('date', '20170101'),
  ...:         ('quantity', '10')]),
  ...:     OrderedDict([
  ...:         ('id', '2'),
  ...:         ('date', '20170102'),
  ...:         ('quantity', '3')]),
  ...:     OrderedDict([
  ...:         ('id', '3'),
  ...:         ('date', '20170102'),
  ...:         ('quantity', '1')])
  ...: ]
In[4]: grouped = solution(data)
In[5]: grouped
Out[5]: defaultdict(int, {'20170101': 10, '20170102': 4})
In[6]: print('{:>8}\tSum Quantity'.format('Date'))
  ...: for k, v in grouped.items():
  ...:     print('{}\t{:>12}'.format(k, v))
  ...:
    Date        Sum Quantity
20170101                  10
20170102                   4
Given
from collections import OrderedDict, defaultdict

lst = [
    OrderedDict([
        ("id", "1"),
        ("date", "20170101"),
        ("quantity", "10")]),
    OrderedDict([
        ("id", "2"),
        ("date", "20170102"),
        ("quantity", "3")]),
    OrderedDict([
        ("id", "3"),
        ("date", "20170102"),
        ("quantity", "1")])
]
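Borrowing a recipe:
Code
map_reduce builds a defaultdict with customizable keys and values. The reducing function is applied to the final list of values for each key.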
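The recipe itself is not reproduced in this answer. As a rough sketch, a function with the same call signature as the one used here could look like the following. This is an assumed implementation written for illustration, not necessarily the recipe's actual code (the more_itertools library provides a map_reduce function with this signature). The rows and totals names are hypothetical.

```python
from collections import defaultdict

def map_reduce(iterable, keyfunc, valuefunc=None, reducefunc=None):
    """Group values by key, then optionally reduce each group to one value."""
    groups = defaultdict(list)
    for item in iterable:
        # Use the item itself as the value when no valuefunc is given
        value = item if valuefunc is None else valuefunc(item)
        groups[keyfunc(item)].append(value)
    if reducefunc is not None:
        # Replace each list of values with its reduced result
        for key, values in groups.items():
            groups[key] = reducefunc(values)
    # Disable the default factory so missing keys raise KeyError afterwards
    groups.default_factory = None
    return groups

# Hypothetical sample rows shaped like the question's data
rows = [
    {"id": "1", "date": "20170101", "quantity": "10"},
    {"id": "2", "date": "20170102", "quantity": "3"},
    {"id": "3", "date": "20170102", "quantity": "1"},
]
totals = map_reduce(
    rows,
    keyfunc=lambda d: d["date"],
    valuefunc=lambda d: int(d["quantity"]),
    reducefunc=sum,
)
print(dict(totals))
```

Setting default_factory to None in this sketch is why the result would print as defaultdict(None, ...).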
kfunc = lambda d: d["date"]
vfunc = lambda d: int(d["quantity"])
rfunc = lambda lst_: sum(lst_)
agg = map_reduce(lst, keyfunc=kfunc, valuefunc=vfunc, reducefunc=rfunc)
agg
# defaultdict(None, {'20170101': 10, '20170102': 4})
We use a list comprehension to build the final result:
[{"date": k, "sum_quantity": v} for k, v in agg.items()]
# [{'date': '20170101', 'sum_quantity': 10}, {'date': '20170102', 'sum_quantity': 4}]
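Why is ('date', '20170101') ignored?
He said he only wants to display the grouped dates and the sums. As he put it: "these two columns, 'date' and 'sum of quantity'".
That is not how I parsed his English.
@StevenRumbalski, please correct me if I am wrong.
Yes, the idea is to aggregate over all the dates in the list.
What does "display these two columns" mean, printing them? Can you show the output you want?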