How to get the duplicate strings of a list together with their indices in Python
I realize this has already been addressed here (for example), and in more places... However, I hope this question is different. I need to write a program that checks whether a list has duplicates and, if it does, returns the duplicated elements together with their indices.
Sample list
sample = """An article is any member of a class of dedicated words that are used with noun phrases to
mark the identifiability of the referents of the noun phrases. The category of articles constitutes a
part of speech. In English, both "the" and "a" are articles, which combine with a noun to form a noun
phrase."""
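sample_list = sample.split()
# Assumption: my_list, used below, is the lowercased sample_list (this matches the printed output).
my_list = [w.lower() for w in sample_list]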
The usual way to get the unique items is to use a set, which removes the duplicates
unique_list = list(set(my_list))
len(unique_list)
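As a small illustration of the set approach above, comparing the length of the set with the length of the list tells you whether duplicates exist at all. The list `words` is a made-up example, not the question's data. A set keeps neither order nor positions, so it cannot report the indices on its own.

```python
# Hypothetical short list, used only for illustration.
words = ["a", "noun", "a", "phrase"]

# A set keeps one copy of each distinct item.
unique_words = set(words)

# Fewer distinct items than items means at least one duplicate exists.
has_duplicates = len(unique_words) < len(words)
print(has_duplicates)  # True
```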
This is what I have tried, but honestly I don't know what to do next
from functools import partial

def list_duplicates_of(seq, item):
    # Collect every index at which item occurs in seq.
    start_at = -1
    locs = []
    while True:
        try:
            # Search for the next occurrence after the previous hit.
            loc = seq.index(item, start_at + 1)
        except ValueError:
            break
        else:
            locs.append(loc)
            start_at = loc
    return locs

dups_in_source = partial(list_duplicates_of, my_list)
for i in my_list:
    print(i, dups_in_source(i))
This returns every element with its indices, duplicates or not
an [0]
article [1]
.
.
.
form [51]
a [6, 33, 48, 52]
noun [15, 26, 49, 53]
phrase. [54]
What I want is to return only the duplicated elements and their indices, like this
of [5, 8, 21, 24, 30, 35]
a [6, 33, 48, 52]
are [12, 43]
with [14, 47]
.
.
.
noun [15, 26, 49, 53]
You could do something along these lines:
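from collections import defaultdict

# Map each word to the list of positions where it occurs.
indices = defaultdict(list)
for i, w in enumerate(my_list):
    indices[w].append(i)

# Print only the words that occur more than once.
for k, v in indices.items():
    if len(v) > 1:
        print(k, v)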
of [5, 8, 21, 24, 30, 35]
a [6, 33, 48, 52]
are [12, 43]
with [14, 47]
noun [15, 26, 49, 53]
to [17, 50]
the [19, 22, 25, 28]
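This uses collections.defaultdict and enumerate to collect the indices of each word efficiently. Weeding out the non-duplicates is then a simple conditional comprehension, or a loop with an if statement.
Nah, sample_list and my_list differ only in case.
@SayandipDutta my bad, I have now updated the desired output.
You may first want to remove anything from sample that is not a letter or a whitespace character.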
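A minimal sketch combining the suggestions above: strip everything that is not a letter or whitespace, lowercase the text, collect indices with defaultdict and enumerate, and filter with a conditional dict comprehension. The function name `duplicate_indices` and the short test sentence are assumptions for illustration. Note that stripping punctuation changes which tokens match (a quoted "a" now counts as a), so the indices can differ from the output shown earlier.

```python
import re
from collections import defaultdict

def duplicate_indices(text):
    # Hypothetical helper: keep only ASCII letters and whitespace,
    # then lowercase so that "The" and "the" count as the same word.
    cleaned = re.sub(r"[^A-Za-z\s]", "", text).lower()

    # Record every position at which each word occurs.
    positions = defaultdict(list)
    for i, word in enumerate(cleaned.split()):
        positions[word].append(i)

    # Conditional comprehension: keep only words seen more than once.
    return {word: idxs for word, idxs in positions.items() if len(idxs) > 1}

result = duplicate_indices('The cat saw the "cat". A dog saw a bird.')
print(result)
# {'the': [0, 3], 'cat': [1, 4], 'saw': [2, 7], 'a': [5, 8]}
```

Returning a dict instead of printing keeps the result reusable; looping over `result.items()` reproduces the printed format used in the answer.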