How do I prevent WordNet synonyms from returning duplicate results? (Python, NLTK)

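I have the code below, which splits a string into words and then returns synonyms for each word. Why does it return so many duplicate values? How can I prevent this so that only unique values appear, while keeping the result in the same format?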

from nltk.stem import WordNetLemmatizer
from nltk.corpus import wordnet

string = 'Crime Count'

syns = {w : [] for w in string.split(" ")}
for k, v in syns.items():
    for synset in wordnet.synsets(k):
        for lemma in synset.lemmas():
            if lemma.name() not in syns:
                v.append(lemma.name())

syns
Result:

{'Crime': ['crime',
  'offense',
  'criminal_offense',
  'criminal_offence',
  'offence',
  'law-breaking',
  'crime'],
 'Count': ['count',
  'count',
  'counting',
  'numeration',
  'enumeration',
  'reckoning',
  'tally',
  'count',
  'count',
  'number',
  'enumerate',
  'numerate',
  'count',
  'matter',
  'weigh',
  'consider',
  'count',
  'weigh',
  'count',
  'count',
  'number',
  'count',
  'count',
  'count',
  'bet',
  'depend',
  'look',
  'calculate',
  'reckon',
  'reckon',
  'count']}

In your example, you wrote:

if lemma.name() not in syns:
This checks whether the synonym exists in syns as a key, not whether it is already among the values. You can change it to:

if lemma.name() not in v:
That gives the result you want.
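
To make the difference concrete, here is a minimal sketch using a hand-written dictionary (the sample data is made up for illustration and does not come from WordNet). It shows that `in` applied to a dict tests the keys only, which is why the original check never filtered anything out:

```python
# Hypothetical sample data shaped like the question's dictionary
syns = {'Crime': ['crime'], 'Count': []}
name = 'crime'

# Membership on the dict itself looks only at the keys ('Crime', 'Count')
in_keys = name in syns

# Membership on the list stored under a key looks at the collected values
in_values = name in syns['Crime']

print(in_keys, in_values)  # False True
```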

Alternatively, you can use a set to prevent duplicates from being added:

syns = {w : set() for w in string.split(" ")}
for k, v in syns.items():
    for synset in wordnet.synsets(k):
        for lemma in synset.lemmas():
            v.add(lemma.name())
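
Note that a set does not preserve insertion order, and the values become sets rather than lists. If the list format and the original order matter, one possible variation (not part of the original answer) is to deduplicate through dictionary keys, which are unique and keep insertion order in Python 3.7+. The helper name `unique_in_order` is hypothetical, and the sample list is copied from the 'Crime' output in the question so that the sketch runs without the WordNet corpus:

```python
def unique_in_order(names):
    """Return the names with duplicates removed, keeping first-seen order."""
    # dict.fromkeys keeps one entry per distinct name, in insertion order
    return list(dict.fromkeys(names))


# Sample taken from the 'Crime' entry of the question's output
crime = ['crime', 'offense', 'criminal_offense', 'criminal_offence',
         'offence', 'law-breaking', 'crime']

print(unique_in_order(crime))
# ['crime', 'offense', 'criminal_offense', 'criminal_offence', 'offence', 'law-breaking']
```

With WordNet, the same helper could presumably be applied per word, for example by passing it a generator of lemma.name() values over wordnet.synsets(k) and storing the returned list under syns[k].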
