NLP: "list index out of range" error when calling the tag_sents() method of NLTK's SennaTagger

IndexError: list index out of range when calling the tag_sents method of NLTK's SennaTagger

A list of sentences is passed as input to the tag_sents method.

Running the tagger requires a senna executable. An installation guide for the SENNA toolkit can be found here.

Code:

    tagged = pos_tagger.tag_sents(["All the banks are closed", "Today is Sunday"])

Output:

Traceback (most recent call last):

  File "<ipython-input-90-886051c3d91d>", line 1, in <module>
    tagged = pos_tagger.tag_sents(["All the banks are closed", "Today is Sunday"])

  File "F:\Programs\Anaconda3\lib\site-packages\nltk\tag\senna.py", line 55, in tag_sents
    tagged_sents = super(SennaTagger, self).tag_sents(sentences)

  File "F:\Programs\Anaconda3\lib\site-packages\nltk\classify\senna.py", line 161, in tag_sents
    result[tag] = tags[map_[tag]].strip()

IndexError: list index out of range

The input to senna.tag_sents must be a list of tokenized sentences, that is, a list of lists of strings rather than a list of raw strings. You can build it with [word_tokenize(sent) for sent in sents]:

>>> from nltk import word_tokenize
>>> from nltk.tag import SennaTagger
>>> senna = SennaTagger('/home/alvas/senna/')
>>> sents = ["All the banks are closed", "Today is Sunday"]

>>> tokenized_sents = [word_tokenize(sent) for sent in sents]
>>> senna.tag_sents(tokenized_sents)
[[('All', u'PDT'), ('the', u'DT'), ('banks', u'NNS'), ('are', u'VBP'), ('closed', u'VBN')], [('Today', u'NN'), ('is', u'VBZ'), ('Sunday', u'NNP')]]

If you don't want to materialize the tokenized sentences before tagging, use map:

>>> tokenized_sents = map(word_tokenize, sents)
>>> senna.tag_sents(tokenized_sents)
[[('All', u'PDT'), ('the', u'DT'), ('banks', u'NNS'), ('are', u'VBP'), ('closed', u'VBN')], [('Today', u'NN'), ('is', u'VBZ'), ('Sunday', u'NNP')]]

The u'...' prefixes show that this session ran under Python 2, where map returns a list. Under Python 3, map returns a lazy iterator, so wrapping it as list(map(word_tokenize, sents)) is the safer choice if the tagger needs to index into the sentences.
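
To see why raw strings trip the tagger, it can help to look at what a sentence becomes when it is treated as a sequence of tokens. The sketch below does not call SENNA or NLTK. It assumes, as a simplification, that the wrapper joins each sentence's tokens with spaces before handing them to the executable. The helper name `serialize` is hypothetical and not part of NLTK, and `str.split` is used only as a stand-in for `word_tokenize`.

```python
def serialize(sentences):
    """Join each sentence's tokens with spaces, one sentence per line.

    Hypothetical helper for illustration only: it assumes the tagger
    treats every sentence as an iterable of tokens.
    """
    return "\n".join(" ".join(tokens) for tokens in sentences)


raw_sents = ["Today is Sunday"]
# Stand-in for word_tokenize: split on whitespace.
tokenized_sents = [sent.split() for sent in raw_sents]

# A raw string is iterated character by character, so every character
# (including each space) ends up as a separate "token".
print(repr(serialize(raw_sents)))

# A list of tokens serializes to the sentence that was intended.
print(repr(serialize(tokenized_sents)))
```

With raw strings, the text handed to the tagger no longer lines up with the words of the original sentence, which is plausibly why parsing the tagger output fails with an IndexError. Tokenizing first avoids the mismatch.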