Python: How to compare multiple dicts and remove duplicate values?
As you can see below, I have a "main" dictionary in which each value is itself a list of dicts. I want to compare the "name" values across the entries of the main dict (there can be more than 2), for example "DE, Stuttgart" against "DE, Dresden" and X, so that only the unique "name" values remain. I know the if x['key'] ... construct (for example a comparison against None), but as far as I know I can only use it to filter a single dictionary.
Input:
"DE, Stuttgart": [
{
"url": "http://twitter.com/search?q=%23ISIS",
"query": "%23ISIS",
"tweet_volume": 21646,
"name": "#ISIS",
"promoted_content": null
},
{
"url": "http://twitter.com/search?q=%22Hans+Rosling%22",
"query": "%22Hans+Rosling%22",
"tweet_volume": 44855,
"name": "Hans Rosling",
"promoted_content": null
},
{
"url": "http://twitter.com/search?q=%22Betsy+DeVos%22",
"query": "%22Betsy+DeVos%22",
"tweet_volume": 664741,
"name": "Betsy DeVos",
"promoted_content": null
},
{
"url": "http://twitter.com/search?q=Nioh",
"query": "Nioh",
"tweet_volume": 24160,
"name": "Nioh",
"promoted_content": null
},
{
"url": "http://twitter.com/search?q=%23FCBWOB",
"query": "%23FCBWOB",
"tweet_volume": 14216,
"name": "#FCBWOB",
"promoted_content": null
},
{
"url": "http://twitter.com/search?q=%23sid2017",
"query": "%23sid2017",
"tweet_volume": 28277,
"name": "#sid2017",
"promoted_content": null
}
],
"DE, Dresden": [
{
"url": "http://twitter.com/search?q=%22Hans+Rosling%22",
"query": "%22Hans+Rosling%22",
"tweet_volume": 44855,
"name": "Hans Rosling",
"promoted_content": null
},
{
"url": "http://twitter.com/search?q=%22Betsy+DeVos%22",
"query": "%22Betsy+DeVos%22",
"tweet_volume": 664741,
"name": "Betsy DeVos",
"promoted_content": null
},
{
"url": "http://twitter.com/search?q=Nioh",
"query": "Nioh",
"tweet_volume": 24160,
"name": "Nioh",
"promoted_content": null
},
{
"url": "http://twitter.com/search?q=%23FCBWOB",
"query": "%23FCBWOB",
"tweet_volume": 14216,
"name": "#FCBWOB",
"promoted_content": null
},
{
"url": "http://twitter.com/search?q=%23sid2017",
"query": "%23sid2017",
"tweet_volume": 28277,
"name": "#sid2017",
"promoted_content": null
}
],
Output:
"DE, Stuttgart": [
{
"url": "http://twitter.com/search?q=%23ISIS",
"query": "%23ISIS",
"tweet_volume": 21646,
"name": "#ISIS",
"promoted_content": null
}
],
"DE, Dresden": [
],
Assume d1 and d2 are your two dictionaries. You can get the list of keys in d1 that are not in d2 with:
[k for k in d1 if k not in d2]
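As a small illustration of the comprehension above, and of how the same idea might carry over to the "name" values rather than the top-level keys, here is a sketch. The sample dicts `d1` and `d2` and the shortened `stuttgart` and `dresden` lists are made up for this example and are not taken from the question.

```python
# Hypothetical sample data, invented for this illustration.
d1 = {"a": 1, "b": 2, "c": 3}
d2 = {"b": 20, "c": 30, "d": 40}

# Keys of d1 that do not appear in d2.
only_in_d1 = [k for k in d1 if k not in d2]
print(only_in_d1)  # ['a']

# The same idea applied to the "name" field of two lists of dicts.
stuttgart = [{"name": "#ISIS"}, {"name": "Nioh"}]
dresden = [{"name": "Nioh"}]

# Collect the names once in a set so each membership test is cheap.
dresden_names = {entry["name"] for entry in dresden}
only_in_stuttgart = [entry for entry in stuttgart if entry["name"] not in dresden_names]
print(only_in_stuttgart)  # [{'name': '#ISIS'}]
```

Note that this only compares one side against the other; to get the unique entries of both lists you would have to run the comparison in each direction.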
This outputs the desired dict for any number of locations. Note that @niemmi's solution is more efficient:
main_dict = {"DE, Stuttgart": [
{
"url": "http://twitter.com/search?q=%23ISIS",
"query": "%23ISIS",
"tweet_volume": 21646,
"name": "#ISIS",
"promoted_content": None
},
{
"url": "http://twitter.com/search?q=%22Hans+Rosling%22",
"query": "%22Hans+Rosling%22",
"tweet_volume": 44855,
"name": "Hans Rosling",
"promoted_content": None
},
{
"url": "http://twitter.com/search?q=%22Betsy+DeVos%22",
"query": "%22Betsy+DeVos%22",
"tweet_volume": 664741,
"name": "Betsy DeVos",
"promoted_content": None
},
{
"url": "http://twitter.com/search?q=Nioh",
"query": "Nioh",
"tweet_volume": 24160,
"name": "Nioh",
"promoted_content": None
},
{
"url": "http://twitter.com/search?q=%23FCBWOB",
"query": "%23FCBWOB",
"tweet_volume": 14216,
"name": "#FCBWOB",
"promoted_content": None
},
{
"url": "http://twitter.com/search?q=%23sid2017",
"query": "%23sid2017",
"tweet_volume": 28277,
"name": "#sid2017",
"promoted_content": None
}
],
"DE, Dresden": [
{
"url": "http://twitter.com/search?q=%22Hans+Rosling%22",
"query": "%22Hans+Rosling%22",
"tweet_volume": 44855,
"name": "Hans Rosling",
"promoted_content": None
},
{
"url": "http://twitter.com/search?q=%22Betsy+DeVos%22",
"query": "%22Betsy+DeVos%22",
"tweet_volume": 664741,
"name": "Betsy DeVos",
"promoted_content": None
},
{
"url": "http://twitter.com/search?q=Nioh",
"query": "Nioh",
"tweet_volume": 24160,
"name": "Nioh",
"promoted_content": None
},
{
"url": "http://twitter.com/search?q=%23FCBWOB",
"query": "%23FCBWOB",
"tweet_volume": 14216,
"name": "#FCBWOB",
"promoted_content": None
},
{
"url": "http://twitter.com/search?q=%23sid2017",
"query": "%23sid2017",
"tweet_volume": 28277,
"name": "#sid2017",
"promoted_content": None
}
]
}
def get_names(main_dict, location):
    # Set of all "name" values at the given location.
    return {small_dict["name"] for small_dict in main_dict[location]}

def get_names_from_other_locations(main_dict, location):
    # Set of all "name" values at every other location.
    other_locations = [other_loc for other_loc in main_dict if other_loc != location]
    return {small_dict["name"] for other_location in other_locations for small_dict in main_dict[other_location]}

def get_uniq_names(main_dict, location):
    # Set difference: names that occur only at this location.
    return get_names(main_dict, location) - get_names_from_other_locations(main_dict, location)

def get_dict(main_dict, location, name):
    # First sub-dict at the location with the given name, or None if absent.
    for small_dict in main_dict[location]:
        if small_dict["name"] == name:
            return small_dict
    return None

print({location: [get_dict(main_dict, location, uniq_name) for uniq_name in get_uniq_names(main_dict, location)] for location in main_dict})
# {'DE, Stuttgart': [{'url': 'http://twitter.com/search?q=%23ISIS', 'query': '%23ISIS', 'tweet_volume': 21646, 'name': '#ISIS', 'promoted_content': None}], 'DE, Dresden': []}
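The core of this approach is set difference, and it is not limited to two locations. A minimal sketch with three locations follows; the data is shortened and partly made up ("DE, Berlin" and "#Berlinale" do not appear in the question), and `trends` and `unique_names` are names invented here.

```python
# Hypothetical, shortened data: only the "name" values per location.
trends = {
    "DE, Stuttgart": {"#ISIS", "Nioh", "Hans Rosling"},
    "DE, Dresden": {"Nioh", "Hans Rosling"},
    "DE, Berlin": {"Nioh", "#Berlinale"},
}

def unique_names(trends, location):
    """Return the names that occur at `location` and at no other location."""
    others = set()
    for other_location, names in trends.items():
        if other_location != location:
            others |= names  # union of the names from all other locations
    # The "-" operator here is set difference; plain dicts do not support it.
    return trends[location] - others

for location in trends:
    print(location, sorted(unique_names(trends, location)))
```

Because the union of the other locations is rebuilt for every location, the work grows roughly with the square of the number of locations, which is probably why the counting approach is described as more efficient.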
You can collect the names and then rebuild the original dict, keeping only the sub-dicts whose name is unique:
main = {
"DE, Stuttgart": [
{
"url": "http://twitter.com/search?q=%23ISIS",
"query": "%23ISIS",
"tweet_volume": 21646,
"name": "#ISIS",
"promoted_content": None
},
{
"url": "http://twitter.com/search?q=%22Hans+Rosling%22",
"query": "%22Hans+Rosling%22",
"tweet_volume": 44855,
"name": "Hans Rosling",
"promoted_content": None
},
{
"url": "http://twitter.com/search?q=%22Betsy+DeVos%22",
"query": "%22Betsy+DeVos%22",
"tweet_volume": 664741,
"name": "Betsy DeVos",
"promoted_content": None
},
{
"url": "http://twitter.com/search?q=Nioh",
"query": "Nioh",
"tweet_volume": 24160,
"name": "Nioh",
"promoted_content": None
},
{
"url": "http://twitter.com/search?q=%23FCBWOB",
"query": "%23FCBWOB",
"tweet_volume": 14216,
"name": "#FCBWOB",
"promoted_content": None
},
{
"url": "http://twitter.com/search?q=%23sid2017",
"query": "%23sid2017",
"tweet_volume": 28277,
"name": "#sid2017",
"promoted_content": None
}
],
"DE, Dresden": [
{
"url": "http://twitter.com/search?q=%22Hans+Rosling%22",
"query": "%22Hans+Rosling%22",
"tweet_volume": 44855,
"name": "Hans Rosling",
"promoted_content": None
},
{
"url": "http://twitter.com/search?q=%22Betsy+DeVos%22",
"query": "%22Betsy+DeVos%22",
"tweet_volume": 664741,
"name": "Betsy DeVos",
"promoted_content": None
},
{
"url": "http://twitter.com/search?q=Nioh",
"query": "Nioh",
"tweet_volume": 24160,
"name": "Nioh",
"promoted_content": None
},
{
"url": "http://twitter.com/search?q=%23FCBWOB",
"query": "%23FCBWOB",
"tweet_volume": 14216,
"name": "#FCBWOB",
"promoted_content": None
},
{
"url": "http://twitter.com/search?q=%23sid2017",
"query": "%23sid2017",
"tweet_volume": 28277,
"name": "#sid2017",
"promoted_content": None
}
]
}
from collections import Counter
import pprint
names = Counter(d['name'] for l in main.values() for d in l)
result = {k: [d for d in v if names[d['name']] == 1] for k, v in main.items()}
pprint.pprint(result)
Output:
{'DE, Dresden': [],
'DE, Stuttgart': [{'name': '#ISIS',
'promoted_content': None,
'query': '%23ISIS',
'tweet_volume': 21646,
'url': 'http://twitter.com/search?q=%23ISIS'}]}
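To show that the counting approach does not depend on there being exactly two locations, here is a sketch with three. The data is shortened and partly invented ("DE, Berlin" and "#Berlinale" are assumptions for the example), and the variable names differ slightly from the code above.

```python
from collections import Counter

# Hypothetical, shortened data with a third location added.
main = {
    "DE, Stuttgart": [{"name": "#ISIS"}, {"name": "Nioh"}],
    "DE, Dresden": [{"name": "Nioh"}],
    "DE, Berlin": [{"name": "Nioh"}, {"name": "#Berlinale"}],
}

# First pass: count how often each name occurs across all locations.
name_counts = Counter(entry["name"] for entries in main.values() for entry in entries)

# Second pass: keep only the entries whose name was seen exactly once.
result = {
    location: [entry for entry in entries if name_counts[entry["name"]] == 1]
    for location, entries in main.items()
}
print(result)
```

One consequence of counting globally: a name that appears twice within the same location is also dropped, since its count is greater than 1. If that is not wanted, the names could be deduplicated per location before counting.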
What syntax is that? null is not Python code. Is this JSON text?
@Jean-Françoisfare Yes, I'm fairly sure it can be loaded, but it does not load properly.
Yes, sorry, I should have added that it is just the output of "print json.dumps()".
I understand what the function does, but does it still work now that I have 3 dicts?
Updated the answer.
Hey Eric, thanks for your help! As you wrote in your edit, I went with @niemmi's solution, but thank you very much. I didn't know you could directly "subtract" dicts in Python.
Note that I subtract sets, not dicts.
Works great. I now have 15 dicts in total in the main dict and it runs very fast, as expected.
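On the null question raised in the comments: the input shown in the question is JSON text, and Python's standard json module maps JSON null to None when loading and back to null when dumping. A minimal sketch, using a shortened JSON string made up for this example:

```python
import json

# Shortened JSON text in the shape of the question's input (invented sample).
text = '{"DE, Dresden": [{"name": "Nioh", "promoted_content": null}]}'

# json.loads turns JSON null into Python None.
data = json.loads(text)
print(data["DE, Dresden"][0]["promoted_content"])  # None

# json.dumps turns None back into null.
print(json.dumps(data))
```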