Stanford NLP: separating runtime from configuration (model loading) in CoreNLP


I am using Stanford's CoreNLP pipeline to perform some basic tasks. Below is a copy of the sample code from the tutorial:

import java.io.IOException;
import java.util.List;
import java.util.Properties;
import edu.stanford.nlp.ling.CoreAnnotations.*;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;

public static void testcoreNLP(String inputText) throws IOException {

  // Configuration: choose the annotators and the model files
  Properties props = new Properties();
  props.put("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
  props.put("coref.md.type", "rule");
  props.put("coref.mode", "statistical");
  props.put("coref.doClustering", "true");
  props.put("ner.model", "edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz,edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz,edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz");

  // Building the pipeline loads all model files; this is the slow step
  StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

  // Runtime: annotate the input text
  Annotation document = new Annotation(inputText);
  pipeline.annotate(document);
  List<CoreMap> sentences = document.get(SentencesAnnotation.class);

  // Print the word, part-of-speech tag and named-entity tag of every token
  for (CoreMap sentence : sentences) {
    for (CoreLabel token : sentence.get(TokensAnnotation.class)) {
      String word = token.get(TextAnnotation.class);
      String pos = token.get(PartOfSpeechAnnotation.class);
      String ne = token.get(NamedEntityTagAnnotation.class);
      System.out.println("word: " + word + " pos: " + pos + " ne:" + ne);
    }
  }
}


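My method testcoreNLP (which takes a String inputText) is called inside a for loop by another method, textPreprocessor(), which preprocesses the text. As far as I can tell, every call to testcoreNLP loads all the model files again (domain-specific trained models), which takes about 3-5 seconds per call.

How can I separate the model loading from the runtime processing?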
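Your Java application only needs to build the pipeline once. Move the pipeline-loading code (the Properties setup and the new StanfordCoreNLP(props) call) out of the method and run it a single time; the method then only creates the Annotation and calls annotate on the shared pipeline.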
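A minimal sketch of that restructuring follows. So that it compiles without CoreNLP on the classpath, it uses a hypothetical stand-in class, FakePipeline, in place of StanfordCoreNLP; the build counter and the lower-casing "annotation" are illustration only and not part of any library. In the real application the static field would hold the StanfordCoreNLP instance built from props, and testcoreNLP would call pipeline.annotate(document) on it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PipelineReuseDemo {

    /** Hypothetical stand-in for StanfordCoreNLP: construction is the "slow" part. */
    static final class FakePipeline {
        static int buildCount; // how many times the "models" were loaded

        FakePipeline() {
            buildCount++; // in the real pipeline, model files are read here
        }

        String annotate(String text) {
            return text.trim().toLowerCase(); // placeholder for real annotation
        }
    }

    // Built once, when the class is initialized, and shared by every call.
    // Real code (assumption): new StanfordCoreNLP(props) with props set up beforehand.
    private static final FakePipeline PIPELINE = new FakePipeline();

    /** Runtime part only: no configuration or model loading happens here. */
    static String testcoreNLP(String inputText) {
        return PIPELINE.annotate(inputText);
    }

    /** The caller's loop now reuses the same pipeline for every input. */
    static List<String> textPreprocessor(List<String> inputs) {
        List<String> results = new ArrayList<>();
        for (String input : inputs) {
            results.add(testcoreNLP(input));
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(textPreprocessor(Arrays.asList(" Hello ", "World")));
        System.out.println("pipeline builds: " + FakePipeline.buildCount);
    }
}
```

However many inputs the loop processes, the build counter stays at 1, because the field is initialized a single time. An alternative with the same effect is to construct the pipeline in the caller and pass it to testcoreNLP as a parameter, which avoids static state.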