How to combine values composed of lists with common items in a dictionary using Python?


python list dictionary merge


I have a dictionary like the following:

dict1 = {'key1':['1','2','3'],'key2':['3','4','5'],'key3':['6','7','8']}

I want to merge all keys whose values have at least one element in common. For example, the resulting dictionary should look like this:

dict1 = {'key1':['1','2','3','4','5'],'key3':['6','7','8']}

Note how key2 was removed. It does not matter whether key1 or key2 is the one eliminated.

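I only know how to identify the duplicates, not how to merge them iteratively. Thanks.

Would this work? Note that because the order of elements in a dictionary is arbitrary (in Python versions before 3.7), there is no guarantee which keys end up in the output dictionary.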

dict_out = {}
processed = set()
for k1, v1 in dict_in.items():
    if k1 not in processed:
        processed.add(k1)
        vo = v1
        for k2, v2 in dict_in.items():
            if k2 not in processed and set(v1) & set(v2):
                vo = sorted(list(set(vo + v2)))
                processed.add(k2)
        dict_out[k1] = vo

For:

dict_in = {'key1': ['1', '2', '3'], 'key2': ['3', '4', '5'], 'key3': ['6', '7', '8']}

this gives:

{'key1': ['1', '2', '3', '4', '5'], 'key3': ['6', '7', '8']}

And for:

dict_in = {'key1': ['1', '2', '3'], 'key2': ['3', '4', '5'],
           'key3': ['6', '7', '8'], 'key4': ['7', '9']}

it gives:

{'key1': ['1', '2', '3', '4', '5'], 'key3': ['6', '7', '8', '9']}

Finally, for:

dict_in = {'key1': ['1', '2', '3'], 'key2': ['3', '4', '5'],
           'key3': ['6', '7', '8'], 'key4': ['5', '6', '7']}

it gives:

{'key1': ['1', '2', '3', '4', '5'], 'key3': ['5', '6', '7', '8']}
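
In that last result '5' still appears under both key1 and key3, because every other key is compared against the original value of the key being processed rather than against the growing union. The following is a minimal sketch that wraps the pass above in a function so the leftover overlap can be inspected. The name merge_once is hypothetical and not part of the answer, and the key names in the lookup assume insertion-ordered dictionaries (Python 3.7+).

```python
def merge_once(dict_in):
    """One merging pass, mirroring the loop above (hypothetical helper name)."""
    dict_out = {}
    processed = set()
    for k1, v1 in dict_in.items():
        if k1 in processed:
            continue
        processed.add(k1)
        vo = v1
        for k2, v2 in dict_in.items():
            # Compared against the original v1, not the growing union vo
            if k2 not in processed and set(v1) & set(v2):
                vo = sorted(set(vo + v2))
                processed.add(k2)
        dict_out[k1] = vo
    return dict_out


dict_in = {'key1': ['1', '2', '3'], 'key2': ['3', '4', '5'],
           'key3': ['6', '7', '8'], 'key4': ['5', '6', '7']}
result = merge_once(dict_in)
# Items that still appear under more than one key after a single pass
leftover = set(result['key1']) & set(result['key3'])
print(result)    # {'key1': ['1', '2', '3', '4', '5'], 'key3': ['5', '6', '7', '8']}
print(leftover)  # {'5'}
```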

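Edit

The OP asked that merged results should also be merged with each other. To achieve this, we can wrap the code above in a loop like this:

d = dict_in
processed = set([None])  # non-empty so the loop body runs at least once
while processed:
    dict_out = {}
    processed = set()
    for k1, v1 in d.items():
        if k1 not in processed:
            vo = v1
            for k2, v2 in d.items():
                # Compare against the growing union vo; != instead of an identity check
                if k1 != k2 and set(vo) & set(v2):
                    vo = sorted(list(set(vo + v2)))
                    processed.add(k2)
            dict_out[k1] = vo
    d = dict_out  # repeat until a pass merges nothing

Then, for: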

dict_in = {'key1': ['1', '2', '3'], 'key2': ['3', '4', '5'],
           'key3': ['6', '7', '8'], 'key4': ['5', '6', '7']}

we get (the surviving key depends on the dictionary's iteration order; on Python 3.7+ it is 'key1'):

{'key4': ['1', '2', '3', '4', '5', '6', '7', '8']}

And for:

dict_in = {'key1': ['1', '2', '3'], 'key2': ['3', '4', '5'],
           'key3': ['4', '6', '7'], 'key4': ['8', '9']}

we get:

{'key1': ['1', '2', '3', '4', '5', '6', '7'], 'key4': ['8', '9']}
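
Repeating the pass until nothing changes amounts to finding groups of keys that are linked, directly or indirectly, by shared items. As an alternative that is not taken from any of the answers here, the same grouping can be sketched with a small union-find structure. The function name merge_overlapping and the choice to keep the first key of each group are assumptions made for illustration; list items are assumed to be hashable and sortable.

```python
def merge_overlapping(dict_in):
    """Merge values linked, directly or indirectly, by shared items (sketch)."""
    parent = {k: k for k in dict_in}

    def find(k):
        # Follow parent links to the group's representative, halving the path
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    owner = {}  # item -> first key seen that contains it
    for key, values in dict_in.items():
        for item in values:
            if item in owner:
                # Shared item: join this key's group with the owner's group
                root_a, root_b = find(owner[item]), find(key)
                if root_a != root_b:
                    parent[root_b] = root_a
            else:
                owner[item] = key

    groups = {}  # representative -> (first key of the group, merged items)
    for key, values in dict_in.items():
        first_key, items = groups.setdefault(find(key), (key, set()))
        items.update(values)
    return {first_key: sorted(items) for first_key, items in groups.values()}


dict_in = {'key1': ['1', '2', '3'], 'key2': ['3', '4', '5'],
           'key3': ['6', '7', '8'], 'key4': ['5', '6', '7']}
print(merge_overlapping(dict_in))
# {'key1': ['1', '2', '3', '4', '5', '6', '7', '8']}
```

Unlike the loop above, this visits each list item once instead of re-running whole passes, and it returns a new dictionary rather than rebinding d.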

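If you want to change the original dict, you need to iterate over a copy:

vals = {k: set(val) for k, val in dict1.items()}
# Iterate over copies so that dict1 and vals can be modified inside the loops
for key, val in dict1.copy().items():
    for k, v in vals.copy().items():
        if k == key:
            continue
        if v.intersection(val):
            union = list(v.union(val))
            dict1[key] = union
            del vals[k]
            del dict1[k]

If you want to union everything: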

vals = {k: set(val) for k, val in dict1.items()}
unioned = set()
srt = sorted(dict1.keys())
srt2 = srt[:]
for key in srt:
    # Iterate over a copy because srt2 is modified inside the loop
    for k in srt2[:]:
        if k == key:
            continue
        if vals[k].intersection(dict1[key]) and key not in unioned:
            unioned.add(k)
            dict1[key] = list(vals[k].union(dict1[key]))
            srt2.remove(k)
# Drop every key that was absorbed into another one
for k in unioned:
    del dict1[k]

I have a more concise approach.
I think it is easier to read and understand. You can refer to the following:

dict1 = {'key1':['1','2','3'],'key2':['3','4','5'],'key3':['6','7','8']}
# Snapshot the keys in sorted order so the loops are unaffected by deletions
keys = sorted(dict1)
# nested loop over every pair of keys
for i, i_key in enumerate(keys):
    if i_key not in dict1:  # already merged into an earlier key
        continue
    for j_key in keys[i + 1:]:
        if j_key not in dict1:
            continue
        i_value, j_value = set(dict1[i_key]), set(dict1[j_key])
        # if the values have a common element, store their union under i_key
        if i_value & j_value:
            dict1[i_key] = sorted(i_value | j_value)
            del dict1[j_key]
print(dict1)
# {'key1': ['1', '2', '3', '4', '5'], 'key3': ['6', '7', '8']}

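What if key2 and key3 share, say, the value 4?

Why key1 and not key2? Dicts have no order, so which key comes first is not guaranteed.

@Padraic, because their values have an item in common ('3'). All the items in key3 are unique to key3, so it stays separate.

But why does key1 remain while you removed key2?

Either key1 or key2 should be removed. I don't care which one.

An input of {'key1': ['1', '2', '3', '4', '5'], 'key3': ['6', '7', '8'], 'key4': ['9', '10']} gives: {'key1': ['1', '2', '3', '4', '5'], 'key3': ['6', '7', '8'], 'key4': ['9', '10']}. Isn't that correct?

This works great for merging once, but key/value pairs that were already merged cannot be merged with other key/value pairs. For example: dict_in = {'key1': ['1', '2', '3'], 'key2': ['3', '4', '5'], 'key3': ['4', '6', '7'], 'key4': ['8', '9']} gives: {'key3': set(['8', '3', '5', '4', '7']), 'key1': ['1', '2', '3'], 'key4', '9', '10']), which is incorrect because both key1 and key3 contain 3. So close!

That behavior is intentional :) I think this is what you really need.

I like this answer, it works very well. Andrei was first... Thanks for your effort and help.

@Vincem no worries, I wasn't entirely sure what should happen after the values are updated, so I added another way to handle it.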
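
Several of the comments above come down to whether two keys in a result still share an item. A small check along those lines might look like the sketch below. The name shared_items is hypothetical, and the sample dictionary uses only the key1 and key3 entries quoted in the comment about the incorrect result.

```python
from itertools import combinations


def shared_items(d):
    """Return {(key_a, key_b): common items} for each pair of keys that overlap."""
    overlaps = {}
    for (ka, va), (kb, vb) in combinations(d.items(), 2):
        common = set(va) & set(vb)
        if common:
            overlaps[(ka, kb)] = common
    return overlaps


# key1 and key3 as quoted in the comment: both still contain '3'
partial = {'key1': ['1', '2', '3'], 'key3': ['3', '4', '5', '7', '8']}
print(shared_items(partial))  # {('key1', 'key3'): {'3'}}

# A fully merged result has no overlapping pairs left
merged = {'key1': ['1', '2', '3', '4', '5', '6', '7'], 'key4': ['8', '9']}
print(shared_items(merged))   # {}
```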