Working with a list of dictionaries in Python


I have a list of dictionaries, shown below:

raw_list = [
    {"item_name": "orange", "id": 12, "total": 2},
    {"item_name": "apple", "id": 12},
    {"item_name": "apple", "id": 34, "total": 22},
]

The expected output is:

[
    {"item_name": ["orange", "apple"], "id": 12, "total": 2},
    {"item_name": "apple", "id": 34, "total": 22},
]

But what I actually get is:

[
    {"item_name": "orangeapple", "id": 12, "total": 2},
    {"item_name": "apple", "id": 34, "total": 22},
]

Here is my code:

comp_key = "id"
conc_key = "item_name"
res = []
for ele in raw_list:
    temp = False
    for ele1 in res:
        if ele1[comp_key] == ele[comp_key]:
            ele1[conc_key] = ele1[conc_key] + ele[conc_key]
            temp = True
            break
    if not temp:
        res.append(ele)

How can I fix this?

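Something like this. The special sauce is the isinstance check, which makes sure the concatenated value is turned into a list.

Note that this assumes the original list is sorted by comp_key (id); if it is not, the result will be wrong.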

raw_list = [
    {"item_name": "orange", "id": 12, "total": 2},
    {"item_name": "apple", "id": 12},
    {"item_name": "apple", "id": 34, "total": 22},
]

comp_key = "id"
conc_key = "item_name"
grouped_items = []
for item in raw_list:
    last_group = grouped_items[-1] if grouped_items else None
    if not last_group or last_group[comp_key] != item[comp_key]:  # Starting a new group?
        grouped_items.append(item.copy())  # Shallow-copy the item into the result array
    else:
        if not isinstance(last_group[conc_key], list):
            # The concatenation key is not a list yet, make it so
            last_group[conc_key] = [last_group[conc_key]]
        last_group[conc_key].append(item[conc_key])
print(grouped_items)
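
If the input cannot be assumed to be sorted by id, one option is to look groups up by key in a dict instead of comparing only against the last group. This is a minimal sketch, not part of the original answer; the function name group_by_key and the sample unsorted_list are made up for illustration.

```python
def group_by_key(items, comp_key="id", conc_key="item_name"):
    """Merge dicts that share comp_key, collecting conc_key values in a list."""
    groups = {}  # comp_key value -> merged dict (dicts preserve insertion order)
    for item in items:
        key = item[comp_key]
        if key not in groups:
            # First item with this key: shallow-copy it so the input is not mutated
            groups[key] = item.copy()
            continue
        group = groups[key]
        if not isinstance(group[conc_key], list):
            # Promote the single value to a list before appending
            group[conc_key] = [group[conc_key]]
        group[conc_key].append(item[conc_key])
    return list(groups.values())


# Hypothetical input where the two id 12 entries are not adjacent
unsorted_list = [
    {"item_name": "orange", "id": 12, "total": 2},
    {"item_name": "apple", "id": 34, "total": 22},
    {"item_name": "apple", "id": 12},
]
result = group_by_key(unsorted_list)
print(result)
```

As in the loop above, the remaining keys (such as total) are taken from the first item seen for each id.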

You can use pandas to group by id, apply a function to each of the two columns, and then convert the result to a list of dicts:

[{'id': 12, 'item_name': ['orange', 'apple'], 'total': 2.0},
 {'id': 34, 'item_name': ['apple'], 'total': 22.0}]

You can use itertools.groupby to group by id, and collections.defaultdict to combine the values that share a key into lists:

from itertools import groupby
from collections import defaultdict

id_getter = lambda x: x['id']
gp = groupby(sorted(raw_list, key=id_getter), key=id_getter)

out = []
for _,i in gp:
    subdict = defaultdict(list)
    for j in i:
        for k,v in j.items():
            subdict[k].append(v)
    out.append(dict(subdict))

out
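
The loop above collects every field into a list, id included. If a uniform shape is preferred (item_name always a list, id kept as a scalar), a variation along the following lines might work. Summing total and treating a missing total as 0 are assumptions borrowed from the pandas answer, not something the question specifies; the name uniform_out is only illustrative.

```python
from itertools import groupby
from operator import itemgetter

raw_list = [
    {"item_name": "orange", "id": 12, "total": 2},
    {"item_name": "apple", "id": 12},
    {"item_name": "apple", "id": 34, "total": 22},
]

get_id = itemgetter("id")
uniform_out = []
# groupby only merges adjacent items, so sort by id first (sorted is stable)
for item_id, group in groupby(sorted(raw_list, key=get_id), key=get_id):
    rows = list(group)
    uniform_out.append({
        "id": item_id,
        "item_name": [row["item_name"] for row in rows],
        # Assumption: a missing total counts as 0 and totals are summed
        "total": sum(row.get("total", 0) for row in rows),
    })
print(uniform_out)
```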

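When working with complex data types such as nested lists and dictionaries, I'd suggest really making use of the APIs that collections and itertools provide. You will probably have a better time grouping the items by id first and then working within those groups.

Fwiw, it would be better to use a consistent data format: make item_name in the output always a list. The data will be simpler to create and to consume.

@PaulRooney +1. But when I tried this, every value in the result ends up as a list:

[{'item_name': ['orange', 'apple'], 'id': [12, 12], 'total': [2]}, {'item_name': ['apple'], 'id': [34], 'total': [22]}]

@AKX's answer worked better for me.

Please see @paul rooney's comment about a consistent data format. Working with a dict whose values (or keys) come in different formats will overcomplicate what you are trying to achieve. Instead, when you read something from this mapping, just pick out the single value you need at that point. Having one value as a list and another as a string or int is bad practice. This is the reason collections allows dict values to default to a list (or a dict, and so on). In fact, having a defaultdict(list) lets you convert it straight into a DataFrame if you use pandas later. It is also why @galaxyan had to apply two (or more) groupby statements and then concatenate them, rather than simply grouping once.

Thank you, @AKX. That answer worked best; I got the expected output.

Thanks, @galaxyan. I got an invalid syntax error when running the code above, so I revised it: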

import pandas as pd

df = pd.DataFrame(raw_list)
dd = pd.concat(
    [
        df.groupby('id')['item_name'].apply(list),
        df.groupby('id')['total'].apply(sum),
    ],
    axis=1,
).reset_index()
dd.to_dict('records')
