Match a set of dictionaries. Most elegant solution. Python

python dictionary

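Given two lists of dictionaries, a new one and an old one. The dictionaries in both lists represent the same objects.
I need to find the differences and produce a new list of dictionaries that contains only the objects from the new list, with the updated attributes taken from the old list.
For example:

In that example I want to produce a new list containing only the people from the new list, with updated data. Matching is by id. So bob becomes boby, Bill becomes a cool guy, and Vasya becomes the man. Elvis must be absent.

Give me an elegant solution, one that uses fewer loops.

Here is the way I came up with, which is not the best: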

def match_dict(new_list, old_list):
    # Collect the ids present in the new list.
    ids_new = []
    for item in new_list:
        ids_new.append(item['id'])
    result = []
    for item_old in old_list:
        if item_old['id'] in ids_new:
            # Find the matching new item and copy the old data onto it.
            for item_new in new_list:
                if item_new['id'] == item_old['id']:
                    item_new['some_data'] = item_old['some_data']
                    result.append(item_new)
    return result

The reason I doubt it is that there is a loop inside a loop. With lists of 2000 items, the process would take some time.

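For each dictionary in the old list, search the new list for the dictionary with the same id, then do:

old_dict.update(new_dict)

After updating, remove each matched new_dict from the new list, and after the loop append the remaining, unused dicts.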
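The procedure described above can be sketched roughly as follows. This is only a sketch under assumptions: the function name merge_old_with_new and the sample data are hypothetical, and ids are assumed to be unique within each list. An id index replaces the linear search, so each list is traversed once. Note that with old_dict.update(new_dict) the values from the new list win, and unmatched entries from both lists are kept.

```python
def merge_old_with_new(list_old, list_new):
    # Index the new dicts by id (assumes ids are unique within the list).
    remaining = {d['id']: d for d in list_new}
    result = []
    for old_dict in list_old:
        merged = dict(old_dict)  # shallow copy, so the input is not mutated
        new_dict = remaining.pop(old_dict['id'], None)
        if new_dict is not None:
            merged.update(new_dict)  # values from the new dict win
        result.append(merged)
    # Append the new dicts that matched nothing in the old list.
    result.extend(dict(d) for d in remaining.values())
    return result


# Hypothetical sample data, not taken from the question.
list_old = [{'id': 1, 'name': 'boby'}, {'id': 4, 'name': 'Elvis'}]
list_new = [{'id': 1, 'name': 'bob'}, {'id': 5, 'name': 'Ann'}]
merged = merge_old_with_new(list_old, list_new)
print(merged)
```

Because unmatched old entries are kept here, this variant would have to filter them out afterwards to meet the "Elvis must be absent" requirement.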

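You need something like this:

l = []
for d in list_old:
    for e in list_new:
        if e['id'] == d['id']:
            # Merge the two dicts; values from the old dict override the new ones.
            l.append(dict(e, **d))
print(l)

Read up on how to merge dictionaries.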

You would be much better off if your top-level data structure were a dict rather than a list. Then it would be:

dict_new.update(dict_old)

However, for what you actually have, try this:

result_list = []
for item in list_new:
    found_item = [d for d in list_old if d["id"] == item["id"]]
    if found_item:
        result_list.append(dict(item, **found_item[0]))

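Actually, there is still a loop inside a loop (the inner loop is "hidden" in the list comprehension), so it is still O(n**2). On large data sets, converting to a dict, updating, and converting back to a list will no doubt be much faster.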
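To make the complexity claim concrete, here is a rough timing sketch. The function names match_nested and match_indexed and the synthetic data are assumptions made for illustration; actual timings depend on the machine, so none are quoted here.

```python
import timeit


def match_nested(list_new, list_old):
    # Quadratic: scans list_old once for every item of list_new.
    result = []
    for item in list_new:
        found = [d for d in list_old if d['id'] == item['id']]
        if found:
            result.append(dict(item, **found[0]))
    return result


def match_indexed(list_new, list_old):
    # Linear: one pass to build the index, one pass to merge.
    old_by_id = {d['id']: d for d in list_old}
    return [dict(item, **old_by_id[item['id']])
            for item in list_new if item['id'] in old_by_id]


# Synthetic data: 2000 records per list, ids overlapping on 500..1999.
list_new = [{'id': i, 'name': 'new%d' % i} for i in range(500, 2500)]
list_old = [{'id': i, 'name': 'old%d' % i, 'some_data': i} for i in range(2000)]

# Both strategies must agree before their speed is compared.
assert match_nested(list_new, list_old) == match_indexed(list_new, list_old)

for fn in (match_nested, match_indexed):
    seconds = timeit.timeit(lambda: fn(list_new, list_old), number=1)
    print(fn.__name__, round(seconds, 4))
```

The two functions only agree when ids are unique in the old list: with duplicates, the nested version takes the first match while the index keeps the last.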

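You can do it like this:

def match_dict(new_list, old_list):
    new_dict = dict((obj['id'], obj) for obj in new_list)
    old_dict = dict((obj['id'], obj) for obj in old_list)
    # Iterate over a snapshot of the keys so entries can be deleted safely.
    for k in list(new_dict):
        if k in old_dict:
            new_dict[k].update(old_dict[k])
        else:
            del new_dict[k]
    return list(new_dict.values())

If you do this often, I suggest storing your data as a dictionary keyed by id instead of as a list, so that you don't have to convert it every time.

Edit: here is an example showing how to store the data in a dictionary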

list_new = [{'desc': 'cool guy', 'id': 1, 'name': 'bob'}, {'desc': 'bad guy', 'id': 2, 'name': 'Bill'}, {'desc': None, 'id': 3, 'name': 'Vasya'}]
# create a dictionary with the value of 'id' as the key
dict_new = dict((obj['id'], obj) for obj in list_new)
# now you can access entries by their id instead of having to loop through the list
print(dict_new[2])
# {'id': 2, 'name': 'Bill', 'desc': 'bad guy'}

It is not quite a one-liner, but here is a simpler version:

def match_new(new_list, old_list):
    ids = dict((item['id'], item) for item in new_list)
    return [ids[item['id']] for item in old_list if item['id'] in ids]

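Without knowing the constraints on your data, I will assume that id is unique within each list, and that your dicts contain only hashable, immutable values (strings, ints, etc.)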

# first index each list by id
new = {item['id']: item for item in list_new}
old = {item['id']: item for item in list_old}

# now you can see which ids appeared in the new list
created = set(new.keys())-set(old.keys())
# or which ids were deleted
deleted =  set(old.keys())-set(new.keys())
# or which ids exists in the 2 lists
intersect = set(new.keys()).intersection(set(old.keys()))

# using the same 'conversion to set' trick,
# you can see what is different for each item
diff = {id: dict(set(new[id].items())-set(old[id].items())) for id in intersect}

# using your example data set, diff now contains the differences for items which exists in the two lists:
# {1: {'name': 'bob'}, 2: {'desc': 'bad guy'}, 3: {'name': 'Vasya', 'desc': None}}

# you can now add the new ids to this diff
diff.update({id: new[id] for id in created})
# and get your data back into the original format:
list_diff = [dict(data, **{'id': id}) for id,data in diff.items()]

This uses Python 3 syntax, but it should be easy to port to Python 2.

Edit: here is the same code written for Python 2.5:

new = dict((item['id'],item) for item in list_new)
old = dict((item['id'],item) for item in list_old)

created = set(new.keys())-set(old.keys())
deleted =  set(old.keys())-set(new.keys())
intersect = set(new.keys()).intersection(set(old.keys()))

diff = dict((id,dict(set(new[id].items())-set(old[id].items()))) for id in intersect)

diff.update(dict((id, new[id]) for id in created))
list_diff = [dict(data, **{'id': id}) for id,data in diff.items()]

(Note how the code is less readable without dict comprehensions.)
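The set-based diff above relies on every value being hashable. If the records may hold lists or nested dicts, a field-by-field comparison avoids that restriction. The following is a sketch only: the function name diff_by_id and the sample data (including the tags field) are hypothetical, and ids are still assumed to be unique within each list.

```python
def diff_by_id(list_new, list_old):
    new = {item['id']: item for item in list_new}
    old = {item['id']: item for item in list_old}
    diff = {}
    for key, new_item in new.items():
        if key not in old:
            # Created in the new list: keep the whole record.
            diff[key] = dict(new_item)
            continue
        old_item = old[key]
        # Keep only fields that are missing from the old record or changed.
        diff[key] = {field: value for field, value in new_item.items()
                     if field not in old_item or old_item[field] != value}
    # Put the id back so the result has the original list-of-dicts shape.
    return [dict(data, id=key) for key, data in diff.items()]


# Hypothetical sample data with unhashable (list) values.
list_new = [{'id': 1, 'name': 'bob', 'tags': ['a']},
            {'id': 2, 'name': 'Bill', 'tags': []},
            {'id': 5, 'name': 'Ann', 'tags': []}]
list_old = [{'id': 1, 'name': 'boby', 'tags': ['a']},
            {'id': 2, 'name': 'Bill', 'tags': ['x']}]
list_diff = diff_by_id(list_new, list_old)
print(list_diff)
```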

You might like this one:

def match_dict(new_list, old_list):
    # Parallel lists of ids, in the same order as the input lists.
    id_new = [item_new.get("id") for item_new in new_list]
    id_old = [item_old.get("id") for item_old in old_list]

    for idx_old in id_old:
        if idx_old in id_new:
            # Update the matching new item in place with the old item's data.
            new_list[id_new.index(idx_old)].update(old_list[id_old.index(idx_old)])

    return new_list

from pprint import pprint
pprint(match_dict(list_new, list_old))

Output:

[{'desc': 'cool guy', 'id': 1, 'name': 'boby', 'some_data': '12345'},
 {'desc': 'cool guy', 'id': 2, 'name': 'Bill', 'some_data': '12345'},
 {'desc': 'the man', 'id': 3, 'name': 'vasya', 'some_data': '12345'}]

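Steps:

Create a lookup dictionary for the old list, keyed by id
Loop through the dicts in the new list and, for each one that exists in the old lookup, create a merged dict

Code:

Edit: the variable names in the function were incorrect.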

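Are you retrieving this list from somewhere? Could you restructure the list of dictionaries as a dictionary keyed by id?

I tried to use your code to compare your output with mine, but it doesn't work (syntax errors, etc.). Please fix it, thanks.

These are dictionaries from MongoDB. I'm trying to make them editable through the Django admin interface. I have a typical Django formset and don't want it to push each dict separately, which would hit the database many times whenever a page using the formset is saved. So I want to fetch them, match them, and then push them in a single hit.

This one doesn't update with the extra data that comes with the new dict.

It's very nice. There are 5 loops. But x*5 is less than x*x when x is sometimes 300. Thanks.

What do you mean by a dictionary? Can you give me some documentation links? Or some examples to look at?

I like this solution. Beautiful. But it was already provided by koblas. Thanks.

The only problem here is that it won't keep new objects. That can happen too, but I didn't mention it.

FYI, this function doesn't return results matching the original match_dict() function, because the lists are reversed.

@koblas do you mean the typo? new_list/list_new, old_list/list_old. That was a mistake. Thanks for pointing it out.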

[{'desc': 'cool guy', 'id': 1, 'name': 'boby', 'some_data': '12345'},
 {'desc': 'cool guy', 'id': 2, 'name': 'Bill', 'some_data': '12345'},
 {'desc': 'the man', 'id': 3, 'name': 'vasya', 'some_data': '12345'}]

[od for od in list_old if od['id'] in {nd['id'] for nd in list_new}]

def match_dict(new_list, old_list): 
    old = dict((v['id'], v) for v in old_list)
    return [dict(d, **old[d['id']]) for d in new_list if d['id'] in old]
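A usage sketch for the last function, to show how it behaves. list_new repeats the data shown earlier in the thread; list_old is hypothetical, chosen only to be consistent with the outputs shown above, and the 'singer' description is made up.

```python
def match_dict(new_list, old_list):
    # Lookup table for the old list, keyed by id.
    old = dict((v['id'], v) for v in old_list)
    # Merge each new dict with its old counterpart; old values override.
    return [dict(d, **old[d['id']]) for d in new_list if d['id'] in old]


list_new = [
    {'id': 1, 'name': 'bob', 'desc': 'cool guy'},
    {'id': 2, 'name': 'Bill', 'desc': 'bad guy'},
    {'id': 3, 'name': 'Vasya', 'desc': None},
]
# Assumed old list (not part of the original thread).
list_old = [
    {'id': 1, 'name': 'boby', 'desc': 'cool guy', 'some_data': '12345'},
    {'id': 2, 'name': 'Bill', 'desc': 'cool guy', 'some_data': '12345'},
    {'id': 3, 'name': 'vasya', 'desc': 'the man', 'some_data': '12345'},
    {'id': 4, 'name': 'Elvis', 'desc': 'singer', 'some_data': '12345'},
]

result = match_dict(list_new, list_old)
for row in result:
    print(row)
```

The merged dicts are new objects, so neither input list is modified, and ids that appear only in the old list (Elvis here) never reach the result.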