Python: set of dictionaries in a list of dictionaries

I'm trying to get the set of unique dictionaries from a list.

Suppose I have the following list of dictionaries:

rm_dict = [{'name':'rick','subject':'adventure time mortttty buugh','body':['wubba lubba dub dubbb motha f*&^%!', 'morty get over here!']},
 {'name':'rick','subject':'adventure time mortttty buugh','body':['wubba lubba dub dubbb motha f*&^%!', 'morty get over here!']},
 {'name':'morty','subject':'re:adventure time mortttty buugh','body':['youre drunk rick!', "I'm going to get mom", 'you always do this']}]
Calling set on it raises an error (TypeError: unhashable type: 'dict'):

set(rm_dict)
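I pull out the body of each message/email, since that is what I use to define uniqueness, and build a list of all the email bodies. Then I pass a generator of tuple(row) values to set():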

list_of_body = [x['body'] for x in rm_dict]
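>>[['wubba lubba dub dubbb motha f*&^%!', 'morty get over here!'],
  ['wubba lubba dub dubbb motha f*&^%!', 'morty get over here!'],
  ['youre drunk rick!', "I'm going to get mom", 'you always do this']]

[list(item) for item in set(tuple(row) for row in list_of_body)]
>>[['wubba lubba dub dubbb motha f*&^%!', 'morty get over here!'], ['youre drunk rick!', "I'm going to get mom", 'you always do this']]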

This successfully gets the unique bodies out of the list of bodies, but I want the whole dictionaries from the original list.

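Your error message tells you something important: neither dictionaries nor lists are hashable, so they cannot be members of a set. One way around this is to key on a str, here the 0th element of each email body in your data.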
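To make the hashability point concrete, here is a small sketch. The helper name is_hashable and the shortened sample email are illustrative assumptions, not part of the original question; the check simply tries hash() and reports whether it succeeds.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


# Hypothetical sample, shaped like one entry of rm_dict.
email = {'name': 'rick', 'body': ['wubba lubba dub dubbb', 'morty get over here!']}

print(is_hashable(email))                 # dict: not hashable
print(is_hashable(email['body']))         # list: not hashable
print(is_hashable(email['body'][0]))      # str: hashable
print(is_hashable(tuple(email['body'])))  # tuple of str: hashable
```

Anything that passes this check can serve as the set member that stands in for the whole dictionary.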

You can "uniqify" the list based on one of its keys:

>>> seen = set()
>>> [i for i in rm_dict if i['body'][0] not in seen and not seen.add(i['body'][0])]
[{'name': 'rick',
  'subject': 'adventure time mortttty buugh',
  'body': ['wubba lubba dub dubbb motha f*&^%!', 'morty get over here!']},
 {'name': 'morty',
  'subject': 're:adventure time mortttty buugh',
  'body': ['youre drunk rick!', "I'm going to get mom", 'you always do this']}]
Here is another form, without a comprehension:

>>> seen = set()
>>> emails = []
>>> for i in rm_dict:
...     body = i['body'][0]  # the first string of the body is the uniqueness key
...     if body not in seen:
...         emails.append(i)
...         seen.add(body)
...
>>> emails
[{'name': 'rick',
  'subject': 'adventure time mortttty buugh',
  'body': ['wubba lubba dub dubbb motha f*&^%!', 'morty get over here!']},
 {'name': 'morty',
  'subject': 're:adventure time mortttty buugh',
  'body': ['youre drunk rick!', "I'm going to get mom", 'you always do this']}]
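The same seen-set idea can be wrapped in a reusable function. This is a minimal sketch, assuming you want the first dictionary for each key and the original order preserved; the name unique_by and its key parameter are made up for illustration. Keying on tuple(d['body']) compares the whole body rather than only its first string.

```python
def unique_by(items, key):
    """Return items in their original order, keeping the first item per key(item)."""
    seen = set()
    result = []
    for item in items:
        k = key(item)  # must be hashable, e.g. a str or a tuple of str
        if k not in seen:
            seen.add(k)
            result.append(item)
    return result


rm_dict = [
    {'name': 'rick', 'subject': 'adventure time mortttty buugh',
     'body': ['wubba lubba dub dubbb motha f*&^%!', 'morty get over here!']},
    {'name': 'rick', 'subject': 'adventure time mortttty buugh',
     'body': ['wubba lubba dub dubbb motha f*&^%!', 'morty get over here!']},
    {'name': 'morty', 'subject': 're:adventure time mortttty buugh',
     'body': ['youre drunk rick!', "I'm going to get mom", 'you always do this']},
]

# A list is unhashable, so convert the whole body to a tuple before using it as the key.
emails = unique_by(rm_dict, key=lambda d: tuple(d['body']))
print([d['name'] for d in emails])
```

Passing the key as a function keeps the deduplication loop independent of which field defines uniqueness.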

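Set items must be hashable, and dicts are not. You can use pickle to serialize all the dicts, then use set to get the unique items, and finally deserialize them back into dicts: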

import pickle
print(list(map(pickle.loads, set(map(pickle.dumps, rm_dict)))))
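This outputs (a set is unordered, so the order of the dicts may vary):

[{'name': 'morty', 'subject': 're:adventure time mortttty buugh', 'body': ['youre drunk rick!', "I'm going to get mom", 'you always do this']}, {'name': 'rick', 'subject': 'adventure time mortttty buugh', 'body': ['wubba lubba dub dubbb motha f*&^%!', 'morty get over here!']}]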
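One caveat with the pickle approach: two dicts that compare equal but were built with their keys in a different order serialize to different bytes, so they would not be merged. As an alternative sketch (not the answerer's method), the hypothetical helper freeze below turns each dict into a frozenset of key/value pairs and each list into a tuple, which gives a hashable key that ignores key order. It assumes the leaf values are themselves hashable.

```python
import pickle


def freeze(obj):
    """Recursively convert dicts and lists into hashable equivalents."""
    if isinstance(obj, dict):
        # frozenset ignores the order in which the keys were inserted
        return frozenset((k, freeze(v)) for k, v in obj.items())
    if isinstance(obj, (list, tuple)):
        return tuple(freeze(v) for v in obj)
    return obj  # assumed to be hashable already (str, int, ...)


a = {'x': 1, 'y': 2}
b = {'y': 2, 'x': 1}
print(a == b)                              # equal as dicts
print(pickle.dumps(a) == pickle.dumps(b))  # but pickled differently
print(freeze(a) == freeze(b))              # frozen forms match

rm_dict = [
    {'name': 'rick', 'body': ['wubba lubba dub dubbb', 'morty get over here!']},
    {'body': ['wubba lubba dub dubbb', 'morty get over here!'], 'name': 'rick'},
    {'name': 'morty', 'body': ['youre drunk rick!']},
]

# Keep the first dict seen for each frozen key, preserving the original order.
unique = {}
for d in rm_dict:
    unique.setdefault(freeze(d), d)
result = list(unique.values())
print([d['name'] for d in result])
```

Unlike the set-of-pickles version, this keeps the first occurrence of each dict in its original position.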

My example was slightly off, because my body is a list of several strings rather than just one. Converting it to a tuple and using your code fixed it. Thanks a lot.