Python: How do I tag a text file with hunpos in nltk?

Can anyone help me with the syntax for tagging a corpus in nltk?

  • What do I need to import?

  • How do I tag the corpus? See the code below.


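  • I think the problem is that you are not tokenizing the words, although there may be other reasons the code does not work (for one, the class is HunposTagger, not HunPosTagger). I made this simplified example based on your question. If you have any further questions, please leave a comment.

    I got everything from here:

    python hunpos.py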

    [('so', 'RB'), ('how', 'WRB'), ('do', 'VBP'), ('i', 'FW'), ('hunpos', 'NN'), ('tag', 'NN'), ('my', 'PRP$'), ('ntuen', 'NN'), ('i', 'FW'), ('ca', 'MD'), ('get', 'VB'), ('the', 'DT'), ('following', 'JJ'), ('code', 'NN'), ('to', 'TO'), ('work', 'VB'), ...]


    I managed to code it in nltk. I had to put each sentence on its own line. Thanks. Then I ran this command: ht.tag(file.readline().split())

    Code from the question:

    import nltk 
    from nltk.corpus import PlaintextCorpusReader  
    from nltk.corpus.util import LazyCorpusLoader  
    
    corpus_root = './'  
    reader = PlaintextCorpusReader (corpus_root, '.*')  
    
    ntuen = LazyCorpusLoader ('ntumultien', PlaintextCorpusReader, reader)  
    ntuen.fileids()  
    isinstance (ntuen, PlaintextCorpusReader)  
    
    
    # So how do I hunpos tag `ntuen`? I can't get the following code to work.
    # please help me to correct my python syntax errors, I'm new to python 
    # but i really need this to work. sorry
    ##from nltk.tag import hunpos.HunPosTagger
    ht = HunPosTagger('english.model')
    for sentence in ntu.sent() ##looping through the no. of sentence
         ht.tag(ntusent()[i])
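
The loop at the end of the question's code could be written roughly as follows. This is only a sketch: `tag_sentences` is a hypothetical helper name, and it assumes `tagger` is any object with a `tag(tokens)` method (such as nltk's `HunposTagger`) and that each sentence is already a list of token strings, which is what a corpus reader's `sents()` yields.

```python
def tag_sentences(sentences, tagger):
    """Tag every sentence; each sentence is an iterable of token strings.

    `tagger` is assumed to expose .tag(list_of_tokens) and to return
    a list of (token, tag) pairs, as nltk taggers do.
    """
    return [tagger.tag(list(sentence)) for sentence in sentences]


# Hypothetical usage, assuming the hunpos binary and a model file are installed
# and nltk's sentence tokenizer data is available:
#   from nltk.corpus import PlaintextCorpusReader
#   from nltk.tag.hunpos import HunposTagger
#   reader = PlaintextCorpusReader('./', '.*')
#   ht = HunposTagger('en_wsj.model')
#   tagged = tag_sentences(reader.sents(), ht)
#   ht.close()
```

Iterating directly over the sentences avoids the index variable `i`, which the question's loop uses without ever defining.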
    
    Code from the answer (hunpos.py):

    import nltk
    from nltk.tag.hunpos import HunposTagger
    from nltk.tokenize import word_tokenize

    corpus = "so how do i hunpos tag my ntuen ? i can't get the following code to work."

    # Load the hunpos model; the class name is HunposTagger, not HunPosTagger.
    ht = HunposTagger('en_wsj.model')
    # Tokenize the text first, then tag the resulting list of tokens.
    print(ht.tag(word_tokenize(corpus)))
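
The comment in this thread describes putting each sentence on its own line and tagging one line at a time with ht.tag(file.readline().split()). A minimal sketch of that approach for a whole file might look like this. The function name `tag_file_by_line`, the `encoding` parameter, and the skipping of blank lines are assumptions added for illustration; `tagger` stands for any object with a `tag(tokens)` method, such as the `ht` instance above.

```python
def tag_file_by_line(path, tagger, encoding="utf-8"):
    """Tag a text file that holds one sentence per line.

    Each line is split on whitespace (as in the comment's
    file.readline().split()) and passed to tagger.tag().
    Returns a list with one tagged sentence per non-blank line.
    """
    tagged_sentences = []
    with open(path, encoding=encoding) as handle:
        for line in handle:
            tokens = line.split()
            if not tokens:
                continue  # assumption: blank lines carry no sentence
            tagged_sentences.append(tagger.tag(tokens))
    return tagged_sentences
```

Whitespace splitting leaves punctuation attached to words, so passing each line through word_tokenize instead, as the answer does, may give cleaner tags.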