Python: updating a list of dictionaries with another list of dictionaries. Is there a faster way?


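I have a list of dictionaries that I need to update with information from another list of dictionaries. My current solution (below) takes each dictionary from the first list and compares it with every dictionary in the second list. It works, but is there a faster, more elegant way to get the same result?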

a = [ { "id": 1, "score":200 }, { "id": 2, "score":300 }, { "id":3, "score":400 } ]
b = [ { "id": 1, "newscore":500 }, { "id": 2, "newscore":600 } ]
# update a with data from b
for item in a:
    for replacement in b:
        if item["id"]==replacement["id"]:
            item.update({"score": replacement["newscore"]})

Build a dictionary indexed by id from the first list. Then loop over the second list and look each entry up by id:

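# index the dictionaries in a by id (a single pass over a)
lookup = {x['id']: x for x in a}
for replacement in b:
    # .get returns None when no dictionary in a has this id
    v = lookup.get(replacement['id'], None)
    if v is not None:
        v['score'] = replacement['newscore']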

This turns an O(n^2) problem into an O(n) problem.
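To make the complexity claim concrete, here is a rough sketch that counts the work done by each approach. The helper names update_nested and update_with_lookup and the generated data are illustrative assumptions, not part of the original answer.

```python
import copy


def update_nested(a, b):
    """Question's approach: compare every item in a with every item in b."""
    comparisons = 0
    for item in a:
        for replacement in b:
            comparisons += 1
            if item["id"] == replacement["id"]:
                item["score"] = replacement["newscore"]
    return comparisons


def update_with_lookup(a, b):
    """Lookup approach: index a by id once, then one dict lookup per item in b."""
    lookup = {item["id"]: item for item in a}
    lookups = 0
    for replacement in b:
        lookups += 1
        target = lookup.get(replacement["id"])
        if target is not None:
            target["score"] = replacement["newscore"]
    return lookups


# Hypothetical data: 1000 records, every second one receives a new score.
a = [{"id": i, "score": 0} for i in range(1000)]
b = [{"id": i, "newscore": i * 10} for i in range(0, 1000, 2)]

a_nested = copy.deepcopy(a)
a_lookup = copy.deepcopy(a)
nested_steps = update_nested(a_nested, b)
lookup_steps = update_with_lookup(a_lookup, b)
print(nested_steps, lookup_steps)  # 500000 500
```

Both versions leave the data in the same state. Building the lookup costs one extra pass over a, so the total work is roughly len(a) + len(b) rather than len(a) * len(b).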

Rather than running len(a) * len(b) loop iterations, preprocess b into something easier to work with:

In [48]: replace = {d["id"]: {"score": d["newscore"]} for d in b}

In [49]: new_a = [{**d, **replace.get(d['id'], {})} for d in a]

In [50]: new_a
Out[50]: [{'id': 1, 'score': 500}, {'id': 2, 'score': 600}, {'id': 3, 'score': 400}]

Note that the {**somedict} syntax requires a modern version of Python (>= 3.5).
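If the unpacking syntax is not available, the same non-mutating merge can be sketched with a copy followed by update. The helper name merged_copy is a hypothetical one introduced here for illustration.

```python
a = [{"id": 1, "score": 200}, {"id": 2, "score": 300}, {"id": 3, "score": 400}]
b = [{"id": 1, "newscore": 500}, {"id": 2, "newscore": 600}]

# Map each id in b to the key/value pairs that should override the entry in a.
replace = {d["id"]: {"score": d["newscore"]} for d in b}


def merged_copy(d, extra):
    """Return a new dict: the keys of d, overridden by the keys of extra."""
    result = dict(d)  # shallow copy, so the original dict is left untouched
    result.update(extra)
    return result


new_a = [merged_copy(d, replace.get(d["id"], {})) for d in a]
print(new_a)
# [{'id': 1, 'score': 500}, {'id': 2, 'score': 600}, {'id': 3, 'score': 400}]
```

Like the {**d, **extra} version, this builds a new list and leaves a unchanged. On Python 3.9 and later, d | extra is another way to spell the same merge.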

If you are willing to use pandas and a and b are pandas DataFrames, then here is a one-liner:

a.loc[a.id.isin(b.id), 'score'] = b.loc[b.id.isin(a.id), 'newscore']

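Note that the assignment aligns rows on the DataFrame index rather than on id, so it assumes matching rows sit at the same index in both frames.

Converting a and b to DataFrames is simple: just use pd.DataFrame.from_records.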

If you can rename "newscore" to "score", another approach is available.

Here are the timeit results:

In [10]: %timeit c = a.copy(); c.update(b)
702 µs ± 37.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

List comprehension:

[i.update({"score": x["newscore"]}) for x in b for i in a if i['id']==x['id']]
print(a)

Output:

[{'id': 1, 'score': 500}, {'id': 2, 'score': 600}, {'id': 3, 'score': 400}]

Timing:

%timeit [i.update({"score": x["newscore"]}) for x in b for i in a if i['id']==x['id']]

100000 loops, best of 3: 3.9 µs per loop
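A timing taken on three dictionaries says little about how the approaches scale. The sketch below uses the standard timeit module to compare a nested loop with a dictionary lookup at two sizes. The function names, the sizes, and the generated data are assumptions chosen for illustration, and the absolute numbers will vary by machine.

```python
import timeit


def nested_update(a, b):
    """Compare every pair, as the nested loop and the list comprehension do."""
    for item in a:
        for replacement in b:
            if item["id"] == replacement["id"]:
                item["score"] = replacement["newscore"]


def lookup_update(a, b):
    """Index the new scores by id first, then make one pass over a."""
    new_scores = {d["id"]: d["newscore"] for d in b}
    for item in a:
        if item["id"] in new_scores:
            item["score"] = new_scores[item["id"]]


def make_data(n):
    """Build n records and n matching updates (hypothetical test data)."""
    a = [{"id": i, "score": 0} for i in range(n)]
    b = [{"id": i, "newscore": i + 1} for i in range(n)]
    return a, b


for n in (3, 300):
    a, b = make_data(n)
    t_nested = timeit.timeit(lambda: nested_update(a, b), number=20)
    t_lookup = timeit.timeit(lambda: lookup_update(a, b), number=20)
    print(f"n={n}: nested {t_nested:.6f}s, lookup {t_lookup:.6f}s")
```

At n=3 the two are usually close. The gap should widen as n grows, because the nested version does n * n comparisons per call.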

First, build a record of the scores to update, keyed by id:

>>> new_d={d['id']:d for d in b}
>>> new_d
{1: {'id': 1, 'newscore': 500}, 2: {'id': 2, 'newscore': 600}}

Then iterate over a and update by id:

for d in a:
    if d['id'] in new_d:
        d['score']=new_d[d['id']]['newscore']

>>> a
[{'id': 1, 'score': 500}, {'id': 2, 'score': 600}, {'id': 3, 'score': 400}]

Simpler still:

new_d={d['id']:d['newscore'] for d in b}
for d in a:
    if d['id'] in new_d:
        d['score']=new_d[d['id']]
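The same idea can be wrapped in a small reusable function. This is only a sketch: the name apply_new_scores is hypothetical, and it assumes every record in the first list already has a "score" key.

```python
def apply_new_scores(records, updates):
    """Update records in place with scores from updates, matched on "id"."""
    new_scores = {u["id"]: u["newscore"] for u in updates}
    for record in records:
        # Keep the current score when there is no update for this id.
        record["score"] = new_scores.get(record["id"], record["score"])
    return records


a = [{"id": 1, "score": 200}, {"id": 2, "score": 300}, {"id": 3, "score": 400}]
# The entry with id 9 has no counterpart in a and is simply ignored.
b = [{"id": 1, "newscore": 500}, {"id": 2, "newscore": 600}, {"id": 9, "newscore": 999}]

apply_new_scores(a, b)
print(a)
# [{'id': 1, 'score': 500}, {'id': 2, 'score': 600}, {'id': 3, 'score': 400}]
```

Using dict.get with the current score as the default removes the explicit membership test. If b contains the same id more than once, the last entry wins, which matches the behaviour of the nested loop in the question.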

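Are you open to using a library like pandas?

Does an element of the first list always have the same index as the matching element in the second list?

If you have a very large dataset and need to do this, consider moving to a faster language such as Swift, Rust, or C. Python strives to make the obvious thing fast. It is better to keep Python super readable than to use complicated code that is a little faster, in an interpreter that is many, many times slower than C.

1) I think your syntax is wrong: it looks like you want to create a dictionary, but you are actually trying to create a set, which will fail. 2) It is not O(n log n); dictionaries are hash maps with amortized O(1) lookup.

You probably want something like {x['id']: x for x in a} to build a dict rather than a set of tuples. You may also have meant to swap a and b, and v = lookup['id'] will raise a KeyError when the key is missing, so the "if v is not None" check is never reached. This needs some adjusting before it is ready to use.

@DSM: Yes, I have fixed that now. Thanks.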