Python: How do I compare multiple dicts and remove duplicate values?

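As you can see below, I have a "main" dict in which each value is itself built from dicts (a list of dicts per location). I now want to compare the "name" values across the entries of the main dict (there can be more than 2), for example "DE, Stuttgart" against "DE, Dresden" and X, so that only the unique "name" values are left.

I know about checks of the form `if x['key'] ...` (for example against `None`), but as far as I know I can only use that to filter a single dict.

Input: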

"DE, Stuttgart": [
    {
        "url": "http://twitter.com/search?q=%23ISIS", 
        "query": "%23ISIS", 
        "tweet_volume": 21646, 
        "name": "#ISIS", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%22Hans+Rosling%22", 
        "query": "%22Hans+Rosling%22", 
        "tweet_volume": 44855, 
        "name": "Hans Rosling", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%22Betsy+DeVos%22", 
        "query": "%22Betsy+DeVos%22", 
        "tweet_volume": 664741, 
        "name": "Betsy DeVos", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=Nioh", 
        "query": "Nioh", 
        "tweet_volume": 24160, 
        "name": "Nioh", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%23FCBWOB", 
        "query": "%23FCBWOB", 
        "tweet_volume": 14216, 
        "name": "#FCBWOB", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%23sid2017", 
        "query": "%23sid2017", 
        "tweet_volume": 28277, 
        "name": "#sid2017", 
        "promoted_content": null
    }
], 
"DE, Dresden": [
    {
        "url": "http://twitter.com/search?q=%22Hans+Rosling%22", 
        "query": "%22Hans+Rosling%22", 
        "tweet_volume": 44855, 
        "name": "Hans Rosling", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%22Betsy+DeVos%22", 
        "query": "%22Betsy+DeVos%22", 
        "tweet_volume": 664741, 
        "name": "Betsy DeVos", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=Nioh", 
        "query": "Nioh", 
        "tweet_volume": 24160, 
        "name": "Nioh", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%23FCBWOB", 
        "query": "%23FCBWOB", 
        "tweet_volume": 14216, 
        "name": "#FCBWOB", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%23sid2017", 
        "query": "%23sid2017", 
        "tweet_volume": 28277, 
        "name": "#sid2017", 
        "promoted_content": null
    }
],

Output:

"DE, Stuttgart": [
    {
        "url": "http://twitter.com/search?q=%23ISIS", 
        "query": "%23ISIS", 
        "tweet_volume": 21646, 
        "name": "#ISIS", 
        "promoted_content": null
    }
], 
"DE, Dresden": [
],

Suppose d1 and d2 are your two dicts. You can get the list of keys that are in d1 but not in d2 like this:

[k for k in d1 if k not in d2]
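As a small illustration of the comprehension above, here is a sketch with two made-up dicts (the contents of `d1` and `d2` are assumptions for the example, not data from the question). Note that it compares top-level keys only, so on its own it does not look at nested "name" values.

```python
# Hypothetical sample data, not taken from the question.
d1 = {"a": 1, "b": 2, "c": 3}
d2 = {"b": 20, "d": 40}

# Keys of d1 that are missing from d2, in d1's insertion order.
only_in_d1 = [k for k in d1 if k not in d2]

# The same comparison as a set difference on the key views
# (the result is a set, so its order is not guaranteed).
only_in_d1_set = d1.keys() - d2.keys()

print(only_in_d1)  # ['a', 'c']
print(only_in_d1_set)
```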

This will output the desired dict for any location. Note that @niemmi's solution is more efficient:

main_dict = {"DE, Stuttgart": [
    {
        "url": "http://twitter.com/search?q=%23ISIS", 
        "query": "%23ISIS", 
        "tweet_volume": 21646, 
        "name": "#ISIS", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%22Hans+Rosling%22", 
        "query": "%22Hans+Rosling%22", 
        "tweet_volume": 44855, 
        "name": "Hans Rosling", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%22Betsy+DeVos%22", 
        "query": "%22Betsy+DeVos%22", 
        "tweet_volume": 664741, 
        "name": "Betsy DeVos", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=Nioh", 
        "query": "Nioh", 
        "tweet_volume": 24160, 
        "name": "Nioh", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%23FCBWOB", 
        "query": "%23FCBWOB", 
        "tweet_volume": 14216, 
        "name": "#FCBWOB", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%23sid2017", 
        "query": "%23sid2017", 
        "tweet_volume": 28277, 
        "name": "#sid2017", 
        "promoted_content": None
    }
], 
"DE, Dresden": [
    {
        "url": "http://twitter.com/search?q=%22Hans+Rosling%22", 
        "query": "%22Hans+Rosling%22", 
        "tweet_volume": 44855, 
        "name": "Hans Rosling", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%22Betsy+DeVos%22", 
        "query": "%22Betsy+DeVos%22", 
        "tweet_volume": 664741, 
        "name": "Betsy DeVos", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=Nioh", 
        "query": "Nioh", 
        "tweet_volume": 24160, 
        "name": "Nioh", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%23FCBWOB", 
        "query": "%23FCBWOB", 
        "tweet_volume": 14216, 
        "name": "#FCBWOB", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%23sid2017", 
        "query": "%23sid2017", 
        "tweet_volume": 28277, 
        "name": "#sid2017", 
        "promoted_content": None
    }
]
}

def get_names(main_dict, location):
    # Set of all "name" values listed for one location.
    return {small_dict["name"] for small_dict in main_dict[location]}

def get_names_from_other_locations(main_dict, location):
    # Set of all "name" values listed for every other location.
    other_locations = [other_loc for other_loc in main_dict if other_loc != location]
    return {small_dict["name"] for other_location in other_locations for small_dict in main_dict[other_location]}

def get_uniq_names(main_dict, location):
    # Set difference: names that appear only in this location.
    return get_names(main_dict, location) - get_names_from_other_locations(main_dict, location)

def get_dict(main_dict, location, name):
    # First entry of the location with the given name, or None if there is none.
    for small_dict in main_dict[location]:
        if small_dict["name"] == name:
            return small_dict
    return None

# print() is called as a function so that this also runs on Python 3.
print({location: [get_dict(main_dict, location, uniq_name) for uniq_name in get_uniq_names(main_dict, location)] for location in main_dict})
# {'DE, Stuttgart': [{'url': 'http://twitter.com/search?q=%23ISIS', 'query': '%23ISIS', 'tweet_volume': 21646, 'name': '#ISIS', 'promoted_content': None}], 'DE, Dresden': []}
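To show that the set-subtraction idea is not limited to two locations, here is a condensed sketch with three. The third location "DE, Berlin", the shortened entries, and the helper name `unique_entries` are all assumptions made for this example.

```python
# Hypothetical, shortened data: only the "name" field matters here.
main_dict = {
    "DE, Stuttgart": [{"name": "#ISIS"}, {"name": "Nioh"}],
    "DE, Dresden": [{"name": "Nioh"}, {"name": "#sid2017"}],
    "DE, Berlin": [{"name": "#sid2017"}, {"name": "Hans Rosling"}],
}

def unique_entries(main_dict):
    """Keep, per location, only entries whose name occurs in no other location."""
    result = {}
    for location, entries in main_dict.items():
        # Names seen in every location except the current one.
        other_names = {
            entry["name"]
            for other, other_entries in main_dict.items()
            if other != location
            for entry in other_entries
        }
        result[location] = [e for e in entries if e["name"] not in other_names]
    return result

unique = unique_entries(main_dict)
print(unique)
```

Because the set of other names is rebuilt for each location, the work grows with the number of locations times the total number of entries, which is the reason a single counting pass can be cheaper.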

You can collect the names and then rebuild the original dict, keeping only the sub-dicts that have a unique name:

main = {
    "DE, Stuttgart": [
        {
            "url": "http://twitter.com/search?q=%23ISIS",
            "query": "%23ISIS",
            "tweet_volume": 21646,
            "name": "#ISIS",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%22Hans+Rosling%22",
            "query": "%22Hans+Rosling%22",
            "tweet_volume": 44855,
            "name": "Hans Rosling",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%22Betsy+DeVos%22",
            "query": "%22Betsy+DeVos%22",
            "tweet_volume": 664741,
            "name": "Betsy DeVos",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=Nioh",
            "query": "Nioh",
            "tweet_volume": 24160,
            "name": "Nioh",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%23FCBWOB",
            "query": "%23FCBWOB",
            "tweet_volume": 14216,
            "name": "#FCBWOB",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%23sid2017",
            "query": "%23sid2017",
            "tweet_volume": 28277,
            "name": "#sid2017",
            "promoted_content": None
        }
    ],
    "DE, Dresden": [
        {
            "url": "http://twitter.com/search?q=%22Hans+Rosling%22",
            "query": "%22Hans+Rosling%22",
            "tweet_volume": 44855,
            "name": "Hans Rosling",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%22Betsy+DeVos%22",
            "query": "%22Betsy+DeVos%22",
            "tweet_volume": 664741,
            "name": "Betsy DeVos",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=Nioh",
            "query": "Nioh",
            "tweet_volume": 24160,
            "name": "Nioh",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%23FCBWOB",
            "query": "%23FCBWOB",
            "tweet_volume": 14216,
            "name": "#FCBWOB",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%23sid2017",
            "query": "%23sid2017",
            "tweet_volume": 28277,
            "name": "#sid2017",
            "promoted_content": None
        }
    ]
}
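from collections import Counter
import pprint

# Count how often each name occurs across all locations.
names = Counter(d['name'] for entries in main.values() for d in entries)
# Keep only the sub-dicts whose name occurs exactly once overall.
result = {k: [d for d in v if names[d['name']] == 1] for k, v in main.items()}

pprint.pprint(result)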

Output:

{'DE, Dresden': [],
 'DE, Stuttgart': [{'name': '#ISIS',
                    'promoted_content': None,
                    'query': '%23ISIS',
                    'tweet_volume': 21646,
                    'url': 'http://twitter.com/search?q=%23ISIS'}]}
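One thing to be aware of: the Counter above counts every occurrence, so a name listed twice within a single location would also be dropped. If that is not what you want, a possible variant counts each name at most once per location. The sketch below uses made-up data and the assumed name `location_counts`.

```python
from collections import Counter

# Hypothetical data: "Nioh" appears twice within the same location.
main = {
    "DE, Stuttgart": [{"name": "Nioh"}, {"name": "Nioh"}, {"name": "#ISIS"}],
    "DE, Dresden": [{"name": "#ISIS"}],
}

# Count in how many locations each name appears
# (the inner set collapses repeats within one location).
location_counts = Counter(
    name
    for entries in main.values()
    for name in {entry["name"] for entry in entries}
)

# Keep entries whose name belongs to exactly one location.
result = {
    location: [e for e in entries if location_counts[e["name"]] == 1]
    for location, entries in main.items()
}
print(result)
```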

What syntax is that? null is not Python code. Is this JSON text?

@Jean-FrançoisFabre Yes, I'm pretty sure it can be loaded, just not loaded properly.

Yes, sorry, I should have added that it is just the output of "print json.dumps()".

I know what the function does, but will it still work once I have 3 dicts?

Updated the answer.

Hey Eric, thanks for your help! As you wrote in your edit, I went with @niemmi's solution, but thanks a lot anyway. I didn't know you could directly "subtract" dicts in Python.

Note that I subtract sets, not dicts.

Works very well. I now have 15 dicts in total in the main dict and it runs very fast, as expected.
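For readers who run into the same null question: if the data arrives as JSON text, the standard json module converts null to None on loading and back to null on dumping. A minimal sketch, assuming a shortened sample in the shape shown in the question:

```python
import json

# Shortened JSON text in the shape shown in the question (sample data).
text = '''
{
    "DE, Stuttgart": [
        {"name": "#ISIS", "tweet_volume": 21646, "promoted_content": null}
    ],
    "DE, Dresden": []
}
'''

main = json.loads(text)  # JSON null becomes Python None
print(main["DE, Stuttgart"][0]["promoted_content"])  # None

# json.dumps turns None back into null when serialising.
dumped = json.dumps({"promoted_content": None})
print(dumped)  # {"promoted_content": null}
```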