Python 3.x: NLTK conll2002_ned_IIS.pickle not found
I am trying to use NLTK with the conll2002 corpus, using the code below. I ran the following command in the directory where I unpacked NLTK Trainer:
python train_chunker.py conll2002 --fileids ned.train --classifier NaiveBayes --filename ~/nltk_data/chunkers/conll2002_ned_NaiveBayes.pickle
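I found the pickle file (conll2002_ned_NaiveBayes.pickle) and copied it to the chunkers directory (C:\Users\Administrator\AppData\Roaming\nltk_data\chunkers). This is where nltk.download() puts the packages it downloads.
I then tried to execute the following code: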
import nltk
from nltk.corpus import conll2002

# Load the Dutch sentence tokenizer and the two models trained with NLTK Trainer.
tokenizer = nltk.data.load('tokenizers/punkt/dutch.pickle')
tagger = nltk.data.load('taggers/conll2002_ned_IIS.pickle')
chunker = nltk.data.load('chunkers/conll2002_ned_NaiveBayes.pickle')

# Evaluate the tagger on the first 1000 tagged test sentences.
test_sents = conll2002.tagged_sents(fileids="ned.testb")[0:1000]
print("tagger accuracy on test-set: " + str(tagger.evaluate(test_sents)))

# Evaluate the chunker on the first 1000 chunked test sentences.
test_sents = conll2002.chunked_sents(fileids="ned.testb")[0:1000]
print(chunker.evaluate(test_sents))
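But after running this code, I get the following error:
LookupError:
Resource u'taggers/conll2002_ned_IIS.pickle' not found. Please
I tried loading all the packages and models with the nltk.download() GUI, but I still get the same error.
Does anyone know how to solve this? Many thanks.
Eric

You have to train both the tagger and the chunker: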
python train_chunker.py conll2002 --fileids ned.train --classifier NaiveBayes --filename ~/nltk_data/chunkers/conll2002_ned_NaiveBayes.pickle
This gives:
loading conll2002
using chunked sentences from ned.train
15806 chunks, training on 15806
training ClassifierChunker with ['NaiveBayes'] classifier
Constructing training corpus for classifier.
Training classifier (202644 instances)
training NaiveBayes classifier
evaluating ClassifierChunker
ChunkParse score:
IOB Accuracy: 95.4%
Precision: 66.9%
Recall: 71.9%
F-Measure: 69.3%
dumping ClassifierChunker to /home/hugo/nltk_data/chunkers/conll2002_ned_NaiveBayes.pickle
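As a quick sanity check on the scores above, the reported F-measure matches the harmonic mean of the reported precision and recall. The snippet below only illustrates that arithmetic: the function name f_measure is hypothetical, and the inputs are the rounded percentages from the log, not values read from NLTK.

```python
def f_measure(precision, recall):
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as printed in the training log above.
score = f_measure(0.669, 0.719)
print("F-Measure: {:.1%}".format(score))  # prints "F-Measure: 69.3%"
```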
Now train the tagger:
python train_tagger.py conll2002 --fileids ned.train --classifier IIS --filename ~/nltk_data/chunkers/conll2002_ned_IIS.pickle
which gives:
loading conll2002
using tagged sentences from ned.train
15806 tagged sents, training on 15806
training AffixTagger with affix -3 and backoff <DefaultTagger: tag=-None->
training <class 'nltk.tag.sequential.UnigramTagger'> tagger with backoff <AffixTagger: size=3988>
training <class 'nltk.tag.sequential.BigramTagger'> tagger with backoff <UnigramTagger: size=7799>
training <class 'nltk.tag.sequential.TrigramTagger'> tagger with backoff <BigramTagger: size=1451>
training ['IIS'] ClassifierBasedPOSTagger
Constructing training corpus for classifier.
Training classifier (202644 instances)
training IIS classifier
==> Training (10 iterations)
evaluating ClassifierBasedPOSTagger
accuracy: 0.980666
dumping ClassifierBasedPOSTagger to /home/hugo/nltk_data/chunkers/conll2002_ned_IIS.pickle
This takes a while...
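One thing worth checking at this point: the train_tagger.py command above writes the tagger to chunkers/conll2002_ned_IIS.pickle, while the code in the question loads taggers/conll2002_ned_IIS.pickle. The sketch below uses only the standard library to list where a given pickle actually sits under a set of nltk_data directories. The helper name locate_resource and the example directory are hypothetical; with NLTK installed, the directories NLTK really searches are listed in nltk.data.path.

```python
import os


def locate_resource(filename, search_dirs):
    """Return every path under search_dirs whose basename equals filename."""
    hits = []
    for root in search_dirs:
        root = os.path.expanduser(root)
        if not os.path.isdir(root):
            continue  # skip directories that do not exist on this machine
        for dirpath, _dirnames, filenames in os.walk(root):
            if filename in filenames:
                hits.append(os.path.join(dirpath, filename))
    return sorted(hits)


if __name__ == "__main__":
    # Assumption: ~/nltk_data is only an example location.
    for path in locate_resource("conll2002_ned_IIS.pickle", ["~/nltk_data"]):
        print(path)
```

If the only hit is under chunkers/, either move the file to taggers/ or load it as 'chunkers/conll2002_ned_IIS.pickle'.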
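Now you should be ready to go...

Did you run the "python train_chunker.py conll2002…" command specified in the linked question?

Sebastian, I ran the following command in the directory where I unpacked NLTK Trainer: python train_chunker.py conll2002 --fileids ned.train --classifier NaiveBayes --filename ~/nltk_data/chunkers/conll2002_ned_NaiveBayes.pickle. But I still get an error: Resource u'taggers/dutch.pickle' not found. Please use the NLTK Downloader to obtain the resource: >>> nltk.download() Searched in:

If the command creates pickle files, make sure to copy them into the corresponding subdirectories of the nltk_data directory.

Sebastian, yes, I did! See the update in the question. But I still get the same message.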
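The advice in the comments, copying each pickle into the matching subdirectory of nltk_data, can be sketched as follows. The function name install_pickle and its parameters are hypothetical and not part of NLTK or NLTK Trainer. The sketch assumes that a resource name such as 'taggers/x.pickle' is resolved relative to one of the nltk_data directories.

```python
import os
import shutil


def install_pickle(src, nltk_data_dir, category):
    """Copy src into <nltk_data_dir>/<category>/ and return the resource name."""
    src = os.path.expanduser(src)
    target_dir = os.path.join(os.path.expanduser(nltk_data_dir), category)
    os.makedirs(target_dir, exist_ok=True)  # create taggers/ or chunkers/ if missing
    shutil.copy2(src, target_dir)
    # Resource names passed to nltk.data.load() use forward slashes.
    return category + "/" + os.path.basename(src)
```

For example, install_pickle("~/nltk_data/chunkers/conll2002_ned_IIS.pickle", "~/nltk_data", "taggers") would copy the tagger trained above into taggers/ and return 'taggers/conll2002_ned_IIS.pickle', the name the question's code passes to nltk.data.load().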