Stanford NLP: How to use the quote annotator


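Running

./corenlp.sh -annotators quote -outputFormat xml -file input.txt

on the modified input file

"Stanford University" is located in California. It is a great university, founded in 1891.

produces the following output: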

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="CoreNLP-to-HTML.xsl" type="text/xsl"?>
<root>
  <document>
    <sentences/>
  </document>
</root>

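Maybe I misunderstand the intended use of this annotator, but I expected it to mark the part of the sentence between the opening " and the closing ".

When I run the script with the "usual" annotators tokenize, ssplit, pos, lemma, ner, they all work fine, but adding quote does not change the output. I am using stanford-corenlp-full-2015-12-09.

How do I use the quote annotator, and what does it do?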

If you build a StanfordCoreNLP object in Java code and run it with the quote annotator, the resulting Annotation object will contain the quotes:

import java.io.*;
import java.util.*;
import edu.stanford.nlp.io.*;
import edu.stanford.nlp.ling.*;
import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.trees.*;
import edu.stanford.nlp.trees.TreeCoreAnnotations.*;
import edu.stanford.nlp.semgraph.*;
import edu.stanford.nlp.ling.CoreAnnotations.*;
import edu.stanford.nlp.util.*;

public class PipelineExample {

    public static void main (String[] args) throws IOException {
        // build pipeline
        Properties props = new Properties();
        props.setProperty("annotators","tokenize, ssplit, quote");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        String text = "\"Stanford University\" is located in California. It is a great university, founded in 1891.";
        Annotation annotation = new Annotation(text);
        pipeline.annotate(annotation);
        System.out.println(annotation.get(CoreAnnotations.QuotationsAnnotation.class));
    }
}

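At the moment none of the outputters (json, xml, text, etc.) write out the quotes. I will make a note that we should add this to the output in a future version.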

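Update: a later commit enabled JSON and text output of quotes in JSONOutputter and TextOutputter. XML output is still not implemented: the current XML structure is sentence-based, and quotes can span multiple sentences, so implementing it well is non-trivial.

It does not seem to be included in 3.7.0. Could you make a note of it again? Or should we file an issue?

Can someone provide information on how to integrate this pipeline in CoreNLP and how to run it to get the desired output? For me it gives this error:

Exception in thread "main" java.lang.IllegalArgumentException: annotator "quote" requires annotation "CorefChainAnnotation". The usual requirements for this annotator are: tokenize,ssplit,pos,lemma,ner

I am using version 3.9.1.

By default the quote annotator also performs quote attribution, and attribution needs coref, so you have to run coref as well if you want attribution. To run it without coref (and without attribution), set the property quote.attributeQuotes=false.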
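As a minimal sketch of that last suggestion, the properties could be built as below. The class name QuoteWithoutCoref and the helper buildProps are hypothetical names used only for illustration, and the annotator list is simply the one from the answer's example; whether that list is sufficient on 3.9.1 is an assumption. The block uses only java.util.Properties, so it runs without CoreNLP on the classpath; the comment marks where the pipeline from the answer would be constructed.

```java
import java.util.Properties;

final class QuoteWithoutCoref {

    // Builds pipeline properties that request the quote annotator with
    // attribution switched off, so that coref should not be required.
    static Properties buildProps() {
        Properties props = new Properties();
        // Annotator list taken from the answer's example (an assumption for 3.9.1).
        props.setProperty("annotators", "tokenize, ssplit, quote");
        // Property named in the reply above: disables quote attribution.
        props.setProperty("quote.attributeQuotes", "false");
        return props;
    }

    public static void main(String[] args) {
        Properties props = buildProps();
        // With CoreNLP on the classpath, props would be passed on as in the answer:
        //   StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        props.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

On the command line, the same setting would presumably be passed as an extra -quote.attributeQuotes false argument, since CoreNLP generally accepts properties as flags, but that form has not been checked against 3.9.1.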