Python: How do I tag a text file with hunpos in NLTK? (Python, NLTK, corpus, POS tagger)


Can someone help me with the syntax for tagging a corpus in NLTK?

What do I need to import?

How do I tag the corpus? Please see the code below.

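I think the problem is that you are not tokenizing the words, although there may be other reasons the code does not work (the class is HunposTagger, not HunPosTagger). I made this simplified example based on your question. If you have any further questions, please leave a comment.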

I got everything from here:

python hunpos.py

[('so', 'RB'), ('how', 'WRB'), ('do', 'VBP'), ('i', 'FW'), ('hunpos', 'NN'), ('tag', 'NN'), ('my', 'PRP$'), ('ntuen', 'NN'), ('i', 'FW'), ('ca', 'MD'), ('get', 'VB'), ('the', 'DT'), ('following', 'JJ'), ('code', 'NN'), ('to', 'TO'), ('work', 'VB'), ...]


I managed to get it working in NLTK. I had to put each sentence on its own line. Thanks. Then I ran this command: ht.tag(file.readline().split())
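The approach in the comment above (one sentence per line, split on whitespace, pass the tokens to the tagger) can be sketched as a loop over the whole file rather than a single readline() call. This is only a minimal sketch: tag_lines and EchoTagger are hypothetical names, and EchoTagger is a stand-in so the example runs without hunpos installed. Any object with a tag(tokens) method, such as HunposTagger('en_wsj.model'), should be usable in its place.

```python
import io


def tag_lines(lines, tagger):
    """Yield one tagged sentence for each non-empty line.

    `lines` is any iterable of strings (an open file works);
    `tagger` is any object with a tag(list_of_tokens) method.
    """
    for line in lines:
        tokens = line.split()  # whitespace tokenization, as in the comment
        if tokens:             # skip blank lines
            yield tagger.tag(tokens)


class EchoTagger:
    """Hypothetical stand-in tagger: labels every token 'X'."""

    def tag(self, tokens):
        return [(tok, 'X') for tok in tokens]


if __name__ == "__main__":
    # An in-memory file stands in for the real corpus file here.
    sample = io.StringIO("so how do i tag this\n\nit works now\n")
    for tagged in tag_lines(sample, EchoTagger()):
        print(tagged)
```

With a real file, the call would look something like tag_lines(open('corpus.txt'), ht), where corpus.txt is an assumed file name and ht is the HunposTagger instance.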

import nltk 
from nltk.corpus import PlaintextCorpusReader  
from nltk.corpus.util import LazyCorpusLoader  

corpus_root = './'  
reader = PlaintextCorpusReader (corpus_root, '.*')  

ntuen = LazyCorpusLoader ('ntumultien', PlaintextCorpusReader, reader)  
ntuen.fileids()  
isinstance (ntuen, PlaintextCorpusReader)  


# So how do I hunpos tag `ntuen`? I can't get the following code to work.
# please help me to correct my python syntax errors, I'm new to python 
# but i really need this to work. sorry
##from nltk.tag import hunpos.HunPosTagger
ht = HunPosTagger('english.model')
for sentence in ntu.sent() ##looping through the no. of sentence
     ht.tag(ntusent()[i])

from nltk.tag.hunpos import HunposTagger  # note the spelling: HunposTagger, not HunPosTagger
from nltk.tokenize import word_tokenize

corpus = "so how do i hunpos tag my ntuen ? i can't get the following code to work."

# Load the English model. The path must point to a local hunpos model file,
# and the hunpos-tag binary must be installed where NLTK can find it.
ht = HunposTagger('en_wsj.model')
# tag() expects a list of tokens, so tokenize the raw string first.
print(ht.tag(word_tokenize(corpus)))
ht.close()  # shut down the hunpos-tag subprocess
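To apply the same idea to the corpus reader from the question, one option is to loop over the reader's fileids() and sents() and tag each sentence. The following is a rough sketch, not a tested hunpos setup: tag_corpus, TinyReader, and EchoTagger are hypothetical names, and the two stub classes only imitate the fileids()/sents() and tag() methods so the example runs on its own. With NLTK, a PlaintextCorpusReader and a HunposTagger would presumably be passed instead.

```python
def tag_corpus(reader, tagger):
    """Return {fileid: [tagged sentence, ...]} for every file in the reader.

    `reader` needs fileids() and sents(fileid), as NLTK corpus readers have;
    `tagger` needs tag(list_of_tokens).
    """
    tagged = {}
    for fileid in reader.fileids():
        tagged[fileid] = [tagger.tag(list(sent)) for sent in reader.sents(fileid)]
    return tagged


class TinyReader:
    """Hypothetical stand-in for a corpus reader holding pre-split sentences."""

    def __init__(self, files):
        self._files = files  # {fileid: [[token, ...], ...]}

    def fileids(self):
        return sorted(self._files)

    def sents(self, fileid):
        return self._files[fileid]


class EchoTagger:
    """Hypothetical stand-in tagger: labels every token 'X'."""

    def tag(self, tokens):
        return [(tok, 'X') for tok in tokens]


if __name__ == "__main__":
    reader = TinyReader({'a.txt': [['hello', 'world'], ['bye']]})
    print(tag_corpus(reader, EchoTagger()))
```

Because the sentences coming out of sents() are already lists of tokens, no extra tokenization step is needed before calling tag().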