Python: error when saving an NLTK HMM tagger
I am trying to save NLTK's HMM tagger with pickle, as shown below, but it fails with the following error. Please suggest a solution.
>>> import nltk
>>> import pickle
>>> brown_a = nltk.corpus.brown.tagged_sents()[:300]
>>> hmm_tagger=nltk.HiddenMarkovModelTagger.train(brown_a)
>>> sent = nltk.corpus.brown.sents()[400]
>>> hmm_tagger.tag(sent)
[(u'He', u'PPS'), (u'is', u'BEZ'), (u'not', u'*'), (u'interested', u'VBN'), (u'in', u'IN'), (u'being', u'NN'), (u'named', u'IN'), (u'a', u'AT'), (u'full-time', u'JJ'), (u'director', u'NN'), (u'.', u'.')]
>>> f = open('my_tagger.pickle', 'wb')
>>> pickle.dump(hmm_tagger, f)
Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    pickle.dump(hmm_tagger, f)
  File "C:\Python27\lib\pickle.py", line 1376, in dump
    Pickler(file, protocol).dump(obj)
  File "C:\Python27\lib\pickle.py", line 224, in dump
    self.save(obj)
  File "C:\Python27\lib\pickle.py", line 331, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Python27\lib\pickle.py", line 425, in save_reduce
    save(state)
  File "C:\Python27\lib\pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python27\lib\pickle.py", line 655, in save_dict
    self._batch_setitems(obj.iteritems())
  File "C:\Python27\lib\pickle.py", line 669, in _batch_setitems
    save(v)
  File "C:\Python27\lib\pickle.py", line 331, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Python27\lib\pickle.py", line 425, in save_reduce
    save(state)
  File "C:\Python27\lib\pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python27\lib\pickle.py", line 655, in save_dict
    self._batch_setitems(obj.iteritems())
  File "C:\Python27\lib\pickle.py", line 669, in _batch_setitems
    save(v)
  File "C:\Python27\lib\pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python27\lib\pickle.py", line 754, in save_global
    (obj, module, name))
PicklingError: Can't pickle <function estimator at 0x0575F6F0>: it's not found as nltk.tag.hmm.estimator
>>>
I am using Python 2.7.11 with NLTK 3.1 on MS Windows 10.
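Thanks in advance.

Why pickle the model at all? Training on the Brown corpus is very fast. If you want a better part-of-speech tagger, consider a dedicated library that is easy to use from Python, has good pickling support, and produces state-of-the-art results. Frankly, the NLTK taggers are quite poor at the moment.

In any case, this is an NLTK bug: pickle cannot find the function it needs under the name nltk.tag.hmm.estimator. Of the three options for getting around it, the one shown here is to serialize with dill instead of pickle: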
import nltk
import dill
brown_a = nltk.corpus.brown.tagged_sents()[:300]
hmm_tagger=nltk.HiddenMarkovModelTagger.train(brown_a)
sent = nltk.corpus.brown.sents()[400]
hmm_tagger.tag(sent)
# [(u'He', u'PPS'), (u'is', u'BEZ'), (u'not', u'*'), (u'interested', u'VBN'), (u'in', u'IN'), (u'being', u'NN'), (u'named', u'IN'), (u'a', u'AT'), (u'full-time', u'JJ'), (u'director', u'NN'), (u'.', u'.')]
with open('my_tagger.dill', 'wb') as f:
    dill.dump(hmm_tagger, f)
Now you can load the tagger:
import dill
with open('my_tagger.dill', 'rb') as f:
    hmm_tagger = dill.load(f)
hmm_tagger.tag(sent)
# [(u'He', u'PPS'), (u'is', u'BEZ'), (u'not', u'*'), (u'interested', u'VBN'), (u'in', u'IN'), (u'being', u'NN'), (u'named', u'IN'), (u'a', u'AT'), (u'full-time', u'JJ'), (u'director', u'NN'), (u'.', u'.')]
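
To see why the original pickle.dump call fails, here is a minimal standard-library sketch of the same class of error. It is not NLTK code: train_toy_tagger and can_pickle are hypothetical names made up for illustration, and the toy "tagger" is just a dict. The assumption is that the trained tagger keeps a reference to a function that does not exist as a module-level name, which is what the "it's not found as nltk.tag.hmm.estimator" message suggests. pickle stores functions by name only, so such a function cannot be saved, whereas dill can generally serialize functions by value.

```python
import operator
import pickle


def train_toy_tagger():
    """Hypothetical stand-in for a trainer that defines a helper locally."""

    def estimator(count, total):
        # Local helper: it only exists inside train_toy_tagger, so pickle
        # cannot look it up by name when saving or loading.
        return count / float(total)

    # The returned "model" keeps a reference to the local function.
    return {"name": "toy", "estimator": estimator}


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        # The exact exception type for local functions differs
        # between Python versions, so both are caught here.
        return False


broken = train_toy_tagger()

# Swap the local helper for a function that is importable by name.
# operator.truediv is used purely as an example of such a function.
fixed = dict(broken, estimator=operator.truediv)

print(can_pickle(broken))  # the local function blocks pickling
print(can_pickle(fixed))   # a function reachable by name pickles fine

restored = pickle.loads(pickle.dumps(fixed))
print(restored["estimator"](1, 4))
```

The design point is that pickle records only where a function lives, not what it does, so anything it cannot re-import by name stops the whole dump.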