Error when saving an NLTK HMM in Python

I am trying to save NLTK's HMM tagger with pickle, as shown below, but it fails with the following error. Please suggest a solution.

>>> import nltk
>>> import pickle
>>> brown_a = nltk.corpus.brown.tagged_sents()[:300]
>>> hmm_tagger=nltk.HiddenMarkovModelTagger.train(brown_a)
>>> sent = nltk.corpus.brown.sents()[400]
>>> hmm_tagger.tag(sent)
[(u'He', u'PPS'), (u'is', u'BEZ'), (u'not', u'*'), (u'interested', u'VBN'), (u'in', u'IN'), (u'being', u'NN'), (u'named', u'IN'), (u'a', u'AT'), (u'full-time', u'JJ'), (u'director', u'NN'), (u'.', u'.')]
>>> f = open('my_tagger.pickle', 'wb')
>>> pickle.dump(hmm_tagger, f)

Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    pickle.dump(hmm_tagger, f)
  File "C:\Python27\lib\pickle.py", line 1376, in dump
    Pickler(file, protocol).dump(obj)
  File "C:\Python27\lib\pickle.py", line 224, in dump
    self.save(obj)
  File "C:\Python27\lib\pickle.py", line 331, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Python27\lib\pickle.py", line 425, in save_reduce
    save(state)
  File "C:\Python27\lib\pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python27\lib\pickle.py", line 655, in save_dict
    self._batch_setitems(obj.iteritems())
  File "C:\Python27\lib\pickle.py", line 669, in _batch_setitems
    save(v)
  File "C:\Python27\lib\pickle.py", line 331, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Python27\lib\pickle.py", line 425, in save_reduce
    save(state)
  File "C:\Python27\lib\pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python27\lib\pickle.py", line 655, in save_dict
    self._batch_setitems(obj.iteritems())
  File "C:\Python27\lib\pickle.py", line 669, in _batch_setitems
    save(v)
  File "C:\Python27\lib\pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python27\lib\pickle.py", line 754, in save_global
    (obj, module, name))
PicklingError: Can't pickle <function estimator at 0x0575F6F0>: it's not found as nltk.tag.hmm.estimator
>>> 
I am using Python 2.7.11 with NLTK 3.1 on MS Windows 10.


Thanks in advance.

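Why do you want to pickle the model? Training on the Brown corpus is very fast. If you want a better part-of-speech tagger, consider a library that is easy to use from Python, has good pickling support, and produces state-of-the-art results. In fact, this kind of tagger is really poor by today's standards.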

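Either way, this is an NLTK bug. Three options:

  • Report the bug to NLTK and/or fix it by moving the estimator function out of the _train function and up to module level, so that pickle can find it as nltk.tag.hmm.estimator
  • Provide your own estimator function, so that pickle finds it in your own module
  • Use a pickle alternative such as dill or cloudpickle: they may be able to handle this estimator function

Here is how to dump the tagger with dill: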

    import nltk
    import dill
    
    brown_a = nltk.corpus.brown.tagged_sents()[:300]
    hmm_tagger = nltk.HiddenMarkovModelTagger.train(brown_a)
    sent = nltk.corpus.brown.sents()[400]
    hmm_tagger.tag(sent)
    # [(u'He', u'PPS'), (u'is', u'BEZ'), (u'not', u'*'), (u'interested', u'VBN'), (u'in', u'IN'), (u'being', u'NN'), (u'named', u'IN'), (u'a', u'AT'), (u'full-time', u'JJ'), (u'director', u'NN'), (u'.', u'.')]
    
    with open('my_tagger.dill', 'wb') as f:
        dill.dump(hmm_tagger, f)
    
Now you can load the tagger:

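    import dill
    import nltk
    
    with open('my_tagger.dill', 'rb') as f:
        hmm_tagger = dill.load(f)
    
    # Same test sentence as before; needed if this runs in a fresh session.
    sent = nltk.corpus.brown.sents()[400]
    hmm_tagger.tag(sent)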
    # [(u'He', u'PPS'), (u'is', u'BEZ'), (u'not', u'*'), (u'interested', u'VBN'), (u'in', u'IN'), (u'being', u'NN'), (u'named', u'IN'), (u'a', u'AT'), (u'full-time', u'JJ'), (u'director', u'NN'), (u'.', u'.')]
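
To see why the first two options help, here is a minimal sketch that uses only the standard library and no NLTK at all. pickle stores functions by reference, as a module name plus a function name, so a function defined inside another function (as the traceback suggests `estimator` is) cannot be looked up again, while a module-level function can. All names below (`module_level_estimator`, `make_nested_estimator`, `can_pickle`) are hypothetical, and the dictionaries are only stand-ins for a model that keeps a reference to its estimator.

```python
import pickle


def module_level_estimator(counts, bins):
    """Hypothetical estimator defined at module level (add-0.1 smoothing)."""
    total = sum(counts.values())
    return {key: (n + 0.1) / (total + 0.1 * bins) for key, n in counts.items()}


def make_nested_estimator():
    """Return a function defined inside another function."""
    def estimator(counts, bins):
        return module_level_estimator(counts, bins)
    return estimator


def can_pickle(obj):
    """Return True if pickle.dumps succeeds for obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        # Python 2 raises PicklingError here; Python 3 may raise AttributeError.
        return False


# Stand-ins for a trained model that holds on to its estimator function.
model_with_nested = {"estimator": make_nested_estimator()}
model_with_global = {"estimator": module_level_estimator}

nested_ok = can_pickle(model_with_nested)
global_ok = can_pickle(model_with_global)
print(nested_ok, global_ok)

# The module-level function survives a round trip as the same object.
restored = pickle.loads(pickle.dumps(model_with_global))
probs = restored["estimator"]({"NN": 3, "VB": 1}, 2)
```

With NLTK, the second option would presumably mean passing such a module-level function through the `estimator` keyword when training. Treat that as an assumption and check it against your NLTK version. Also note that a function pickled from an interactive session is recorded under `__main__`, so it has to be defined there again, or imported from a real module, before loading.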