Python: How do I merge a list of dicts that share the same value for a key?


python dictionary merge


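I'm a Python newbie. I've searched Stack Overflow but couldn't find a question quite like this one. I'm trying to merge a list of dicts that share the same value for a given key (in my example, merge the dicts that have the same name).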

This is my current list:

current = [
    {'name' : 'food festival', 'category' : ['Miscellaneous', 'Undefined'], 'venue' : 'venue_1', 'price_1' : 100, 'price_2' : 120, 'start' : '2017-10-04T14:30:00Z'},
    {'name' : 'food festival', 'category' : ['Miscellaneous', 'Undefined'], 'venue' : 'venue_2', 'price_1' : 150, 'price_2' : 200, 'start' : '2017-11-04T14:30:00Z'},
    {'name' : 'music festival', 'category': ['music', 'pop'], 'venue' : 'venue_3', 'price_1' : 300, 'price_2' : 320, 'start' : '2017-12-04T14:30:00Z'}
    ]

This is what I want to achieve:

final = [
  {
    'name': 'food festival',
    'category': ['Miscellaneous', 'Undefined'],
    'shows': [
      {
        'start': '2017-10-04T14:30:00Z',
        'venue': 'venue_1',
        'prices': [
          { 'price_1' : 100 },
          { 'price_2' : 120}
        ]
      },
      {
        'start': '2017-11-04T14:30:00Z',
        'venue': 'venue_2',
        'prices': [
          { 'price_1': 150 },
          { 'price_2' : 200 }
        ]
      }
    ]
  },
  {
    'name': 'music festival',
    'category': ['music', 'pop'],
    'shows': [
      {
        'start': '2017-12-04T14:30:00Z',
        'venue': 'venue_3',
        'prices': [
          { 'price_1' : 300 },
          { 'price_2' : 320}
        ]
      }
   ]
  }
]

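Your data structure is a bit muddled. I'm assuming the input, `current`, has to stay as it is, but I've changed `final` slightly for clarity. I think `final` in this format will be easier to use and work with, but if you really want the other version of `final`, let me know.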

import pprint

current = [
    {'name' : 'food festival', 'category' : ['Miscellaneous', 'Undefined'], 'venue' : 'venue_1', 'price_1' : 100, 'price_2' : 120, 'start' : '2017-10-04T14:30:00Z'},
    {'name' : 'food festival', 'category' : ['Miscellaneous', 'Undefined'], 'venue' : 'venue_2', 'price_1' : 150, 'price_2' : 200, 'start' : '2017-11-04T14:30:00Z'},
    {'name' : 'music festival', 'category': ['music', 'pop'], 'venue' : 'venue_3', 'price_1' : 300, 'price_2' : 320, 'start' : '2017-12-04T14:30:00Z'}
    ]

final = {}

for fest in current:
    name = fest["name"]
    if name not in final:
        final[name] = {"category": fest["category"],
                       "shows": []}

    show = {attr: fest[attr] for attr in ["start", "venue", "price_1", "price_2"]}

    final[name]["shows"].append(show)

pprint.pprint(final)

This produces the following output:

{'food festival': {'category': ['Miscellaneous', 'Undefined'],
                   'shows': [{'price_1': 100,
                              'price_2': 120,
                              'start': '2017-10-04T14:30:00Z',
                              'venue': 'venue_1'},
                             {'price_1': 150,
                              'price_2': 200,
                              'start': '2017-11-04T14:30:00Z',
                              'venue': 'venue_2'}]},
 'music festival': {'category': ['music', 'pop'],
                    'shows': [{'price_1': 300,
                               'price_2': 320,
                               'start': '2017-12-04T14:30:00Z',
                               'venue': 'venue_3'}]}}

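Note: the dict comprehension I used requires Python 2.7 or later (it works on every Python 3 version). On older interpreters it can easily be replaced with: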

    show = dict((attr, fest[attr]) for attr in ["start", "venue", "price_1", "price_2"])

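I haven't changed much. The main difference is `final`, which is now a `dict` in which each festival's name is the key for the dict that represents it. I also kept `price_1` and `price_2` as plain keys: there are only two of them, which in my opinion doesn't really justify a list of dictionaries.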
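If the list layout from the question is needed after all, the name-keyed dict can be converted back. The following is only a minimal sketch: `to_requested_shape` is a hypothetical helper that is not part of the answer's code, and it assumes every show has exactly the two keys `price_1` and `price_2`.

```python
# Result of the loop above, keyed by festival name (abbreviated to one festival).
final = {
    "food festival": {
        "category": ["Miscellaneous", "Undefined"],
        "shows": [
            {"start": "2017-10-04T14:30:00Z", "venue": "venue_1", "price_1": 100, "price_2": 120},
            {"start": "2017-11-04T14:30:00Z", "venue": "venue_2", "price_1": 150, "price_2": 200},
        ],
    },
}


def to_requested_shape(by_name):
    """Hypothetical helper: turn the name-keyed dict into the list layout from the question."""
    result = []
    for name, info in by_name.items():
        shows = []
        for show in info["shows"]:
            shows.append({
                "start": show["start"],
                "venue": show["venue"],
                # Assumption: exactly these two price keys exist on every show.
                "prices": [{key: show[key]} for key in ("price_1", "price_2")],
            })
        result.append({"name": name, "category": info["category"], "shows": shows})
    return result


final_list = to_requested_shape(final)
```

Keeping the dict while merging and converting once at the end preserves the cheap lookup by name during the merge step.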

One more suggestion: you could use Python's `None` object instead of the string "Undefined".
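As a rough illustration of that suggestion, the placeholder string can be swapped for `None` (or dropped) with a list comprehension. The variable names below are made up for the example.

```python
category = ["Miscellaneous", "Undefined"]

# Replace the placeholder string with None so "no value" can be tested with `is None`.
cleaned = [None if item == "Undefined" else item for item in category]

# Alternatively, drop the placeholder entirely.
defined_only = [item for item in category if item != "Undefined"]
```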

from pprint import pprint as pp


current = [
    {'name' : 'food festival', 'category' : ['Miscellaneous', 'Undefined'], 'venue' : 'venue_1', 'price_1' : 100, 'price_2' : 120, 'start' : '2017-10-04T14:30:00Z'},
    {'name' : 'food festival', 'category' : ['Miscellaneous', 'Undefined'], 'venue' : 'venue_2', 'price_1' : 150, 'price_2' : 200, 'start' : '2017-11-04T14:30:00Z'},
    {'name' : 'music festival', 'category': ['music', 'pop'], 'venue' : 'venue_3', 'price_1' : 300, 'price_2' : 320, 'start' : '2017-12-04T14:30:00Z'}
]


SPECIAL_EVENT_KEYS = ("name", "category")
INVALID_INDEX = -1


def convert_event(event, special_event_keys=SPECIAL_EVENT_KEYS):
    ret = dict()
    prices_list = list()
    for key in event:
        if key in special_event_keys:
            continue
        elif key.startswith("price_"):
            prices_list.append({key: event[key]})
        else:
            ret[key] = event[key]
    ret["prices"] = prices_list
    return ret


def merge_events_data(events, special_event_keys=SPECIAL_EVENT_KEYS):
    ret = list()
    for event in events:
        existing_index = INVALID_INDEX
        for idx, obj in enumerate(ret):
            for key in special_event_keys:
                if obj[key] != event[key]:
                    break
            else:
                existing_index = idx
        if existing_index == INVALID_INDEX:
            new_object = dict()
            for key in special_event_keys:
                new_object[key] = event[key]
            new_object["shows"] = [convert_event(event, special_event_keys=special_event_keys)]
            ret.append(new_object)
        else:
            ret[existing_index]["shows"].append(convert_event(event, special_event_keys=special_event_keys))
    return ret


def main():
    merged_events = merge_events_data(current)
    print("\nMerged object:\n")
    pp(merged_events)
    #print("Equal:", merged_events == final) # Commented out to avoid including the contents of 'final' in the answer as it would get too large; add it and uncomment for testing purposes


if __name__ == "__main__":
    main()

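Notes:

- The algorithm relies on the fact that if two (input) events have the same values for the keys `name` and `category`, they are merged together (via the `shows` list); otherwise they remain separate entries in the merged result
- `convert_event`: takes an event as it appears in the initial list and converts it into an event as it appears in the output list:
  - strips the `name` and `category` keys
  - aggregates the `price_*` entries of the dictionary into a list stored under the `prices` key
- `merge_events_data`: iterates over the initial event list and
  - if the event doesn't exist in the output list (no entry with matching `name` and `category` values), creates it
  - if such an event is found, extends its content (`shows`) with the current event's data
- The code is compatible with both Python 3 and Python 2
- It could certainly be improved in terms of style and performance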
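One possible performance improvement, sketched under assumptions: instead of scanning the output list for every event, keep a dict keyed by the grouping values. `merge_events_by_key` is a hypothetical name, not part of the answer above. It assumes the grouping values are hashable or plain lists (lists are converted to tuples for the key).

```python
current = [
    {'name': 'food festival', 'category': ['Miscellaneous', 'Undefined'], 'venue': 'venue_1', 'price_1': 100, 'price_2': 120, 'start': '2017-10-04T14:30:00Z'},
    {'name': 'food festival', 'category': ['Miscellaneous', 'Undefined'], 'venue': 'venue_2', 'price_1': 150, 'price_2': 200, 'start': '2017-11-04T14:30:00Z'},
    {'name': 'music festival', 'category': ['music', 'pop'], 'venue': 'venue_3', 'price_1': 300, 'price_2': 320, 'start': '2017-12-04T14:30:00Z'},
]


def merge_events_by_key(events, group_keys=("name", "category")):
    """Hypothetical variant of merge_events_data using a dict lookup instead of a list scan."""
    groups = {}  # grouping key -> merged entry
    merged = []  # merged entries in first-seen order
    for event in events:
        # Lists are unhashable, so turn them into tuples before using them in a dict key.
        lookup = tuple(
            tuple(event[key]) if isinstance(event[key], list) else event[key]
            for key in group_keys
        )
        if lookup not in groups:
            entry = {key: event[key] for key in group_keys}
            entry["shows"] = []
            groups[lookup] = entry
            merged.append(entry)
        # Everything that is neither a grouping key nor a price goes into the show as-is.
        show = {key: value for key, value in event.items()
                if key not in group_keys and not key.startswith("price_")}
        # Prices follow the dict's own iteration order.
        show["prices"] = [{key: event[key]} for key in event if key.startswith("price_")]
        groups[lookup]["shows"].append(show)
    return merged


merged_events = merge_events_by_key(current)
```

Each event now costs one dict lookup, so the merge is linear in the number of events rather than quadratic in the worst case.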

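Your change to the 'prices' key looks more complicated than the merge by key. Are you sure you want 'prices' to be a list of single-item dicts?
Yes, that would be the ideal result. Maybe I should have been clearer about that in the title and the explanation.
Like @IzaakvanDongen, I wonder: why not use a list in which the index identifies the price, e.g. prices: [300, 320]?
Where is your code?

Output: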

e:\Work\Dev\StackOverflow\q45794604>c:\Install\x64\Python\3.5.3\python.exe a.py

Merged object:

[{'category': ['Miscellaneous', 'Undefined'],
  'name': 'food festival',
  'shows': [{'prices': [{'price_2': 120}, {'price_1': 100}],
             'start': '2017-10-04T14:30:00Z',
             'venue': 'venue_1'},
            {'prices': [{'price_2': 200}, {'price_1': 150}],
             'start': '2017-11-04T14:30:00Z',
             'venue': 'venue_2'}]},
 {'category': ['music', 'pop'],
  'name': 'music festival',
  'shows': [{'prices': [{'price_2': 320}, {'price_1': 300}],
             'start': '2017-12-04T14:30:00Z',
             'venue': 'venue_3'}]}]
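In the output above, `price_2` is listed before `price_1` because plain dicts did not guarantee any iteration order on Python 3.5; insertion order is guaranteed only from Python 3.7 on. If a stable order matters on older interpreters, one option is to visit the keys in sorted order. This is a sketch only: `convert_event_sorted` is a hypothetical variant of `convert_event`, and string sorting is lexicographic, so a key such as `price_10` would sort before `price_2`.

```python
def convert_event_sorted(event, special_event_keys=("name", "category")):
    """Hypothetical variant of convert_event that visits keys in sorted order."""
    ret = {}
    prices_list = []
    # sorted() makes the result independent of the dict implementation's ordering.
    for key in sorted(event):
        if key in special_event_keys:
            continue
        if key.startswith("price_"):
            prices_list.append({key: event[key]})
        else:
            ret[key] = event[key]
    ret["prices"] = prices_list
    return ret


event = {'name': 'music festival', 'category': ['music', 'pop'], 'venue': 'venue_3',
         'price_1': 300, 'price_2': 320, 'start': '2017-12-04T14:30:00Z'}
show = convert_event_sorted(event)
```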