Python list/record data manipulation: removing duplicates

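After some web scraping and merging the results, I am left with a list of dictionaries. One of the keys (titles) holds a list of lists:

thelist = [{"name":"a name", "titles":[["foo","bar", ... ],["foo","baz",["..."], ... ]]},
{"name":"another name", "titles":[["foo","bar", ... ],["foo","baz",["..."], ... ]]}, ... ]

The goal is to eliminate titles that appear in more than one of the lists within each dictionary's titles, and to replace the list of lists with a single list of titles (no duplicates).

The code I have written so far correctly visits every item in the list of lists, but I am having trouble eliminating the duplicates: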

match = ""
for dicts in thelist:
    for listoftitles in dicts['titles']:
        for title in listoftitles:
            title = match
        for title in listoftitles:
            if match == title:
                print title
                #del title

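It seems that match is never equal to the value in title. I have tried changing how the loops are nested, but nothing has worked so far. I am lost somewhere and don't know what else to try. Any advice would be greatly appreciated.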

The idiomatic way to get a list without duplicates is
list(set(some_iterable))

Combine that with a list comprehension and we get

thelist = [{'name': 'a name', 'titles': [['foo','bar'],['foo','baz']]}]

print([
    {
        'name': d['name'],
        'titles': list(set(title for lst in d['titles'] for title in lst))
    }
    for d in thelist
])

which prints (the order of the titles may vary, because sets are unordered)

[{'name': 'a name', 'titles': ['baz', 'foo', 'bar']}]
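
As for why the loop in the question never matches: `title = match` assigns the empty string in `match` to the loop variable, not the other way around, so `match` stays `""` throughout. If you would rather keep explicit loops, a minimal sketch could look like the following. The function name `dedupe_titles` and the `seen`/`unique` variables are made up for illustration, and the sketch assumes every title is a hashable value such as a string.

```python
def dedupe_titles(records):
    """Return new records whose 'titles' is one flat list without duplicates."""
    result = []
    for record in records:
        seen = set()   # titles already emitted for this record
        unique = []    # flat list of titles in first-seen order
        for list_of_titles in record["titles"]:
            for title in list_of_titles:
                if title not in seen:
                    seen.add(title)
                    unique.append(title)
        result.append({"name": record["name"], "titles": unique})
    return result


thelist = [{"name": "a name", "titles": [["foo", "bar"], ["foo", "baz"]]}]
print(dedupe_titles(thelist))
```

Unlike the set-based version, this keeps the titles in the order they were first seen and leaves the original list untouched.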

Dicts are mutable, so you can flatten the lists with
itertools.chain
and update each dict in the original list in place:

l = [{'name': 'a name', 'titles': [['foo','bar'],['foo','baz']]}]

from itertools import chain
for d in l:
    d["titles"] = list(set(chain.from_iterable(d["titles"])))

print(l)
Output (the order of the titles may vary, because sets are unordered):

[{'titles': ['bar', 'baz', 'foo'], 'name': 'a name'}]
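If you want to keep the order in which the titles first appear, you can use
OrderedDict
to remove the duplicates:

from itertools import chain
from collections import OrderedDict

# start again from the original nested data; the loop above already flattened l
l = [{'name': 'a name', 'titles': [['foo','bar'],['foo','baz']]}]

for d in l:
    d["titles"] = list(OrderedDict.fromkeys(chain.from_iterable(d["titles"])))

print(l)
Output:

[{'name': 'a name', 'titles': ['foo', 'bar', 'baz']}]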
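
One caveat: the sample data in the question shows lists nested inside the title lists (`["foo","baz",["..."], ... ]`). Both `set` and `OrderedDict.fromkeys` raise a `TypeError` on such data, because lists are unhashable. A possible workaround is to flatten recursively first. This is only a sketch: the helper names `flatten` and `unique_titles` are made up, the sample data is invented, and it relies on plain dicts preserving insertion order (Python 3.7+).

```python
def flatten(items):
    """Yield the non-list values found in an arbitrarily nested list, left to right."""
    for item in items:
        if isinstance(item, list):
            # Recurse into sub-lists instead of yielding them (lists are unhashable)
            yield from flatten(item)
        else:
            yield item


def unique_titles(nested):
    # dict.fromkeys keeps the first occurrence of each key, in order (Python 3.7+)
    return list(dict.fromkeys(flatten(nested)))


# Invented sample data with one extra level of nesting
l = [{"name": "a name", "titles": [["foo", "bar"], ["foo", "baz", ["bar", "qux"]]]}]
for d in l:
    d["titles"] = unique_titles(d["titles"])

print(l)
```

On older Python versions, `OrderedDict.fromkeys` can stand in for `dict.fromkeys` in `unique_titles`.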

Wow, that's beautiful, nicely done. I'm just getting started with Python and didn't realize these were options. Thanks so much, Paul!