How to get the duplicated strings of a list together with their indices in Python

python string list indexing

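I realize this has already been addressed in other questions here, among others. Still, I hope this question is different.

I need to write a program that checks whether a list contains duplicates and, if it does, returns the duplicated elements together with their indices.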

Sample list:

sample = """An article is any member of a class of dedicated words that are used with noun phrases to
mark the identifiability of the referents of the noun phrases. The category of articles constitutes a
part of speech. In English, both "the" and "a" are articles, which combine with a noun to form a noun
phrase."""

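sample_list = sample.split()
# my_list is the lowercased copy used below (the outputs shown are all lowercase)
my_list = [word.lower() for word in sample_list]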

A common way to get the unique items is to use a set, which removes the duplicates:

unique_list = list(set(my_list))
len(unique_list)
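
The same length comparison also answers the first half of the task, whether the list has any duplicates at all. A minimal sketch, assuming the elements are hashable; `has_duplicates` and the short word list are made up for illustration:

```python
def has_duplicates(items):
    """Return True if any element occurs more than once."""
    # A set keeps one copy of each element, so it is shorter
    # than the list exactly when something is repeated.
    return len(set(items)) != len(items)


words = ["a", "noun", "a", "phrase"]
print(has_duplicates(words))          # True
print(has_duplicates(["a", "noun"]))  # False
```

Note that a set discards positions, so on its own it cannot tell you where the duplicates are.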

This is what I have tried, but honestly I don't know what to do next:

from functools import partial

def list_duplicates_of(seq,item):
    start_at = -1
    locs = []
    while True:
        try:
            loc = seq.index(item,start_at+1)
        except ValueError:
            break
        else:
            locs.append(loc)
            start_at = loc
    return locs

dups_in_source = partial(list_duplicates_of, my_list)

for i in my_list:
    print(i, dups_in_source(i))

This returns every element with its indices, including the indices of any repeats:

an [0]
article [1]
.
.
.
form [51]
a [6, 33, 48, 52]
noun [15, 26, 49, 53]
phrase. [54]

What I want is to return only the duplicated elements and their indices, like this:

of [5, 8, 21, 24, 30, 35]
a [6, 33, 48, 52]
are [12, 43]
with [14, 47]
.
.
.
noun [15, 26, 49, 53]

You can do something along these lines:

from collections import defaultdict

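# Map each word to the list of positions where it occurs
indices = defaultdict(list)

for i, w in enumerate(my_list):
    indices[w].append(i)

# Print only the words that occur more than once
for k, v in indices.items():
    if len(v) > 1:
        print(k, v)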

of [5, 8, 21, 24, 30, 35]
a [6, 33, 48, 52]
are [12, 43]
with [14, 47]
noun [15, 26, 49, 53]
to [17, 50]
the [19, 22, 25, 28]

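This uses collections.defaultdict and enumerate to collect the indices of each word efficiently. Filtering out the non-duplicates is then just a simple conditional comprehension or a loop with an if statement.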
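
The conditional comprehension mentioned above could look roughly like this. It is only a sketch: `duplicate_indices` is a hypothetical helper name and the short word list is made up for illustration, not taken from the question's sample text.

```python
from collections import defaultdict


def duplicate_indices(words):
    """Map each repeated word to the list of positions where it occurs."""
    positions = defaultdict(list)
    for i, w in enumerate(words):
        positions[w].append(i)
    # Conditional comprehension: keep only words seen more than once.
    return {w: idx for w, idx in positions.items() if len(idx) > 1}


words = "a noun and a verb form a noun phrase".split()
print(duplicate_indices(words))
# {'a': [0, 3, 6], 'noun': [1, 7]}
```

Returning a plain dict instead of printing makes the result easy to reuse, and the keys keep the order in which each word first appeared.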

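Nah, sample_list and my_list differ only in letter case.

@SayandipDutta my bad, I have now updated the desired output.

You may first want to remove anything from sample that is not a letter or a whitespace character.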
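
One way to do that cleanup is with a regular expression. This is a sketch under assumptions: `clean_words` is a hypothetical helper, and the pattern only keeps ASCII letters, so accented or non-Latin characters would be dropped as well.

```python
import re


def clean_words(text):
    """Lowercase the text, drop non-letter/non-whitespace characters, split into words."""
    # [^a-z\s] matches anything that is not an ASCII letter or whitespace
    letters_only = re.sub(r"[^a-z\s]", "", text.lower())
    return letters_only.split()


text = 'In English, both "the" and "a" are articles.'
print(clean_words(text))
# ['in', 'english', 'both', 'the', 'and', 'a', 'are', 'articles']
```

With the punctuation gone, tokens such as "a" in quotes and phrase. with its trailing period are counted together with their plain forms.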
