Python: How do I compare multiple dicts and remove duplicate values?

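As you can see here, I have a "main" dictionary in which each value is itself a list of dicts. Now I want to compare the "name" values across the entries of the main dict (there can be more than 2), for example "DE, Stuttgart" against "DE, Dresden" and X, so that only the unique "name" values remain.

I know about testing x['key'] against None, for example, but as far as I know I can only use that to filter a single dictionary.

Input: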

"DE, Stuttgart": [
    {
        "url": "http://twitter.com/search?q=%23ISIS", 
        "query": "%23ISIS", 
        "tweet_volume": 21646, 
        "name": "#ISIS", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%22Hans+Rosling%22", 
        "query": "%22Hans+Rosling%22", 
        "tweet_volume": 44855, 
        "name": "Hans Rosling", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%22Betsy+DeVos%22", 
        "query": "%22Betsy+DeVos%22", 
        "tweet_volume": 664741, 
        "name": "Betsy DeVos", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=Nioh", 
        "query": "Nioh", 
        "tweet_volume": 24160, 
        "name": "Nioh", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%23FCBWOB", 
        "query": "%23FCBWOB", 
        "tweet_volume": 14216, 
        "name": "#FCBWOB", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%23sid2017", 
        "query": "%23sid2017", 
        "tweet_volume": 28277, 
        "name": "#sid2017", 
        "promoted_content": null
    }
], 
"DE, Dresden": [
    {
        "url": "http://twitter.com/search?q=%22Hans+Rosling%22", 
        "query": "%22Hans+Rosling%22", 
        "tweet_volume": 44855, 
        "name": "Hans Rosling", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%22Betsy+DeVos%22", 
        "query": "%22Betsy+DeVos%22", 
        "tweet_volume": 664741, 
        "name": "Betsy DeVos", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=Nioh", 
        "query": "Nioh", 
        "tweet_volume": 24160, 
        "name": "Nioh", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%23FCBWOB", 
        "query": "%23FCBWOB", 
        "tweet_volume": 14216, 
        "name": "#FCBWOB", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%23sid2017", 
        "query": "%23sid2017", 
        "tweet_volume": 28277, 
        "name": "#sid2017", 
        "promoted_content": null
    }
], 
Output:

"DE, Stuttgart": [
    {
        "url": "http://twitter.com/search?q=%23ISIS", 
        "query": "%23ISIS", 
        "tweet_volume": 21646, 
        "name": "#ISIS", 
        "promoted_content": null
    }
], 
"DE, Dresden": [
], 

Suppose d1 and d2 are your two dictionaries. You can get the list of keys in d1 that are not in d2 with:

[k for k in d1 if k not in d2]
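As a minimal sketch of the key-difference idea above, assuming Python 3: the two small dictionaries below are made up for illustration and are not the question's data. The comprehension keeps the order of d1, while the set-like key views give the same keys as a set.

```python
# Hypothetical sample data, chosen only to illustrate the idea.
d1 = {"a": 1, "b": 2, "c": 3}
d2 = {"b": 20, "d": 40}

# Keys of d1 that do not appear in d2, in d1's insertion order.
only_in_d1 = [k for k in d1 if k not in d2]

# In Python 3, dict.keys() is a set-like view, so subtraction works directly.
only_in_d1_set = d1.keys() - d2.keys()

print(only_in_d1)                       # ['a', 'c']
print(only_in_d1_set == {"a", "c"})     # True
```

Note that this compares the keys of two dicts. The question needs the comparison on the "name" values inside the lists, so the same idea has to be applied to sets of names rather than to the top-level keys.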

This outputs the desired dict for any number of locations. Note that @niemmi's solution is more efficient:

main_dict = {"DE, Stuttgart": [
    {
        "url": "http://twitter.com/search?q=%23ISIS", 
        "query": "%23ISIS", 
        "tweet_volume": 21646, 
        "name": "#ISIS", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%22Hans+Rosling%22", 
        "query": "%22Hans+Rosling%22", 
        "tweet_volume": 44855, 
        "name": "Hans Rosling", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%22Betsy+DeVos%22", 
        "query": "%22Betsy+DeVos%22", 
        "tweet_volume": 664741, 
        "name": "Betsy DeVos", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=Nioh", 
        "query": "Nioh", 
        "tweet_volume": 24160, 
        "name": "Nioh", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%23FCBWOB", 
        "query": "%23FCBWOB", 
        "tweet_volume": 14216, 
        "name": "#FCBWOB", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%23sid2017", 
        "query": "%23sid2017", 
        "tweet_volume": 28277, 
        "name": "#sid2017", 
        "promoted_content": None
    }
], 
"DE, Dresden": [
    {
        "url": "http://twitter.com/search?q=%22Hans+Rosling%22", 
        "query": "%22Hans+Rosling%22", 
        "tweet_volume": 44855, 
        "name": "Hans Rosling", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%22Betsy+DeVos%22", 
        "query": "%22Betsy+DeVos%22", 
        "tweet_volume": 664741, 
        "name": "Betsy DeVos", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=Nioh", 
        "query": "Nioh", 
        "tweet_volume": 24160, 
        "name": "Nioh", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%23FCBWOB", 
        "query": "%23FCBWOB", 
        "tweet_volume": 14216, 
        "name": "#FCBWOB", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%23sid2017", 
        "query": "%23sid2017", 
        "tweet_volume": 28277, 
        "name": "#sid2017", 
        "promoted_content": None
    }
]
}

def get_names(main_dict, location):
    # Set of all "name" values listed for one location.
    return {small_dict["name"] for small_dict in main_dict[location]}

def get_names_from_other_locations(main_dict, location):
    # Set of all "name" values listed for every other location.
    other_locations = [other_loc for other_loc in main_dict if other_loc != location]
    return {small_dict["name"] for other_location in other_locations for small_dict in main_dict[other_location]}

def get_uniq_names(main_dict, location):
    # Set difference: names that appear only in this location.
    return get_names(main_dict, location) - get_names_from_other_locations(main_dict, location)

def get_dict(main_dict, location, name):
    # First entry of the location with the given name, or None if there is none.
    for small_dict in main_dict[location]:
        if small_dict["name"] == name:
            return small_dict
    return None

# print() is called as a function so the snippet also runs on Python 3.
print({location: [get_dict(main_dict, location, uniq_name) for uniq_name in get_uniq_names(main_dict, location)] for location in main_dict})
# {'DE, Stuttgart': [{'url': 'http://twitter.com/search?q=%23ISIS', 'query': '%23ISIS', 'tweet_volume': 21646, 'name': '#ISIS', 'promoted_content': None}], 'DE, Dresden': []}
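To check the follow-up question of whether this still works with three locations, here is a small self-contained sketch of the same set-based idea. The third location ("DE, Berlin"), the name "#Berlinale", the trimmed-down entries, and the helper name unique_entries are all invented for illustration. Unlike the functions above, this variant filters the entries directly instead of looking each one up by name, so it keeps the original order within each list.

```python
# Hypothetical, trimmed-down data: three locations, only the "name" field kept.
sample = {
    "DE, Stuttgart": [{"name": "#ISIS"}, {"name": "Nioh"}],
    "DE, Dresden": [{"name": "Nioh"}, {"name": "#sid2017"}],
    "DE, Berlin": [{"name": "#sid2017"}, {"name": "#Berlinale"}],
}

def unique_entries(data):
    """Keep, per location, the entries whose name occurs in no other location."""
    result = {}
    for location, entries in data.items():
        # Names seen in every location except the current one.
        other_names = {
            entry["name"]
            for other, other_entries in data.items()
            if other != location
            for entry in other_entries
        }
        result[location] = [e for e in entries if e["name"] not in other_names]
    return result

unique = unique_entries(sample)
print(unique)
# {'DE, Stuttgart': [{'name': '#ISIS'}], 'DE, Dresden': [], 'DE, Berlin': [{'name': '#Berlinale'}]}
```

The set of other names is rebuilt for every location, so the work grows with the number of locations times the total number of entries. That is fine for a handful of locations, and it is why a single counting pass is the more efficient choice for larger inputs.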

You can collect the names and then rebuild the original dict while keeping only the sub-dicts with unique names:

main = {
    "DE, Stuttgart": [
        {
            "url": "http://twitter.com/search?q=%23ISIS",
            "query": "%23ISIS",
            "tweet_volume": 21646,
            "name": "#ISIS",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%22Hans+Rosling%22",
            "query": "%22Hans+Rosling%22",
            "tweet_volume": 44855,
            "name": "Hans Rosling",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%22Betsy+DeVos%22",
            "query": "%22Betsy+DeVos%22",
            "tweet_volume": 664741,
            "name": "Betsy DeVos",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=Nioh",
            "query": "Nioh",
            "tweet_volume": 24160,
            "name": "Nioh",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%23FCBWOB",
            "query": "%23FCBWOB",
            "tweet_volume": 14216,
            "name": "#FCBWOB",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%23sid2017",
            "query": "%23sid2017",
            "tweet_volume": 28277,
            "name": "#sid2017",
            "promoted_content": None
        }
    ],
    "DE, Dresden": [
        {
            "url": "http://twitter.com/search?q=%22Hans+Rosling%22",
            "query": "%22Hans+Rosling%22",
            "tweet_volume": 44855,
            "name": "Hans Rosling",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%22Betsy+DeVos%22",
            "query": "%22Betsy+DeVos%22",
            "tweet_volume": 664741,
            "name": "Betsy DeVos",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=Nioh",
            "query": "Nioh",
            "tweet_volume": 24160,
            "name": "Nioh",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%23FCBWOB",
            "query": "%23FCBWOB",
            "tweet_volume": 14216,
            "name": "#FCBWOB",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%23sid2017",
            "query": "%23sid2017",
            "tweet_volume": 28277,
            "name": "#sid2017",
            "promoted_content": None
        }
    ]
}
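from collections import Counter
import pprint

# Count how many times each "name" occurs across all locations.
names = Counter(d['name'] for l in main.values() for d in l)
# Rebuild the dict, keeping only the entries whose name occurs exactly once.
result = {k: [d for d in v if names[d['name']] == 1] for k, v in main.items()}

pprint.pprint(result)

Output: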

{'DE, Dresden': [],
 'DE, Stuttgart': [{'name': '#ISIS',
                    'promoted_content': None,
                    'query': '%23ISIS',
                    'tweet_volume': 21646,
                    'url': 'http://twitter.com/search?q=%23ISIS'}]}
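One detail of the counting approach is worth spelling out: a name that is repeated inside a single location's list gets a count above 1 and is dropped, even if no other location has it. If that is not wanted, a possible variant counts each name at most once per location. The sketch below assumes that behaviour is what you want; the function name keep_unique_names and the sample data are hypothetical.

```python
from collections import Counter

def keep_unique_names(main):
    """Return a new dict keeping only entries whose name appears in one location."""
    # Count each name once per location, so a repeat inside one list
    # does not make the name look shared between locations.
    names = Counter(
        name
        for entries in main.values()
        for name in {entry["name"] for entry in entries}
    )
    return {
        location: [entry for entry in entries if names[entry["name"]] == 1]
        for location, entries in main.items()
    }

# Hypothetical data: "Nioh" is listed twice in Stuttgart but nowhere else.
sample = {
    "DE, Stuttgart": [{"name": "Nioh"}, {"name": "Nioh"}, {"name": "#FCBWOB"}],
    "DE, Dresden": [{"name": "#FCBWOB"}],
}
filtered = keep_unique_names(sample)
print(filtered)
# {'DE, Stuttgart': [{'name': 'Nioh'}, {'name': 'Nioh'}], 'DE, Dresden': []}
```

Either way the data is traversed twice, once to count and once to filter, regardless of how many locations there are.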

What syntax is that? null is not Python code. Is this JSON text?
@Jean-Françoisfare Yes, I'm pretty sure it can be loaded, just not properly.
Yes, sorry, I should have added that it is just the output of "print json.dumps()".
I see what the functions do, but would it still work once I have 3 dicts?
Updated the answer.
Hey Eric, thanks for your help! As you wrote in your edit, I went with @niemmi's solution, but many thanks anyway. I didn't know you could directly "subtract" dicts in Python.
Note that I subtract sets, not dicts.
Works great. I now have 15 dicts in total in the main dict, and it runs very fast, as expected.
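On the null question raised in the comments: if the data is available as JSON text, the standard json module converts null to None when loading, so the answers' dict literals with None match what you get after parsing. A minimal sketch, using shortened JSON text invented for illustration:

```python
import json

# Hypothetical JSON text in the same shape as the question's input (shortened).
raw = '''
{
    "DE, Stuttgart": [
        {"name": "#ISIS", "promoted_content": null}
    ],
    "DE, Dresden": []
}
'''

# json.loads maps JSON null to Python None.
main = json.loads(raw)
print(main["DE, Stuttgart"][0]["promoted_content"] is None)  # True
```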