DKPro Core with Groovy: NP recognition not working

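I am new to Groovy and am trying to use DKPro Core to implement some NLP functionality. At this point I am trying to identify noun phrases in a text. Tokens, sentences, and named entities are recognized correctly, but for some reason no NP annotations are found. My code is shown below; please point out the error.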

#!/usr/bin/env groovy
@Grab(group='de.tudarmstadt.ukp.dkpro.core',
    module='de.tudarmstadt.ukp.dkpro.core.stanfordnlp-gpl',
    version='1.5.0')
@Grab(group='de.tudarmstadt.ukp.dkpro.core',
    module='de.tudarmstadt.ukp.dkpro.core.io.text-asl',
    version='1.5.0')
@Grab(group='de.tudarmstadt.ukp.dkpro.core',
    module='de.tudarmstadt.ukp.dkpro.core.opennlp-asl',
    version='1.5.0')

import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.fit.component.JCasConsumer_ImplBase;
import org.apache.uima.fit.util.JCasUtil;
import org.apache.uima.jcas.JCas;
import de.tudarmstadt.ukp.dkpro.core.api.ner.type.NamedEntity;
import de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Sentence;
import de.tudarmstadt.ukp.dkpro.core.api.syntax.type.constituent.NP;    
import de.tudarmstadt.ukp.dkpro.core.stanfordnlp.*;

import static org.apache.uima.fit.pipeline.SimplePipeline.*;
import static org.apache.uima.fit.factory.JCasFactory.*;
import static org.apache.uima.fit.factory.AnalysisEngineFactory.*;
import static org.apache.uima.fit.util.JCasUtil.*;

import de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.*;
import de.tudarmstadt.ukp.dkpro.core.api.ner.type.*;


def doc = createJCas();
doc.documentText = """It is unfortunate that many Nigerians, especially the younger ones, 
express surprise at the mention of elephants and lions being found within the borders of the country. 
Admittedly, the number of these animals has diminished greatly over the years due to the activities of poachers thus pushing 
some of these animals to the verge of extinction. For example, it was discovered last year 
that there are not more than 34 lions in the wild. However there should be cause for 
optimism as a rundown of just a few animals across these parks show. The Yankari 
Game Reserve in Bauchi is Nigeria's most famous and arguably the best park for observing 
wildlife. Buffaloes, waterbucks, bushbucks, hyenas, leopards, baboons, elephants and lions 
are some of the animals that can be found here. 
"The animals are best seen during the dry season, 
especially from January to April," a 
tour guide told this reporter during a safari at Yankari. """
doc.documentLanguage = "en";

runPipeline(doc,
  createEngineDescription(StanfordSegmenter),
  createEngineDescription(StanfordPosTagger),
  createEngineDescription(StanfordNamedEntityRecognizer));

// for (Token token : select(doc, Token)) {  
    // println token.coveredText + "\n\n\n"
    // }
// for (Sentence sentence : select(doc, Sentence)) {  
    // println sentence.coveredText + "\n\n\n"
    // }
for (Sentence sentence : JCasUtil.select(doc, Sentence.class)) {
    println sentence.getCoveredText() + "\n\n"
    for (NP nounphrase : JCasUtil.selectCovered(doc, NP.class, sentence)) {
        println "||" + nounphrase.getCoveredText() + "||\n\n"
    }
}
// for (Token token : select(doc, Token)) { 
    // def entity=selectCovering(NamedEntity,token).value
    // if(entity.toString().length()>2)
    // println token.coveredText +"\n\n" + entity.toString() + "\n\n\n"
    // }

In my output, the sentences are printed correctly, but nothing is printed for the noun phrases.

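NP is part of the constituency structure. Your script does not include a constituency parser. Once you add a parser to the pipeline (for example, the Stanford parser), you can also access the NPs: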

runPipeline(doc,
  createEngineDescription(StanfordSegmenter),
  createEngineDescription(StanfordPosTagger),
  createEngineDescription(StanfordParser),
  createEngineDescription(StanfordNamedEntityRecognizer));

Disclosure: I am a developer on the DKPro Core project.

The provided code does not work. Some properties files are missing.
File not found at [classpath:/de/tudarmstadt/ukp/dkpro/core/stanfordnlp/lib/tagger-en-wsj-0-18-bidirectional-distsim.properties].
@Opal The code runs for me. I may have grabbed it incorrectly, sorry :(