Java: How do I get the original text of a sentence after CoreNLP has run ssplit? [java, stanford-nlp]

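CoreNLP's tokenization alters the sentence text. Stitching the tokens back together with whitespace is not a true reconstruction, and things get complicated when the sentence contains parentheses and other punctuation. See the code block below.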

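import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

import edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;

// Run only the tokenizer and the sentence splitter
Properties props = new Properties();
props.setProperty("annotators", "tokenize, ssplit");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

// paragraph is the input String
Annotation document = new Annotation(paragraph);
pipeline.annotate(document);

List<CoreMap> sentences = document.get(SentencesAnnotation.class);

List<String> sentenceList = new ArrayList<>();
for (CoreMap sentence : sentences)
{
    // How to get the original text of sentence?
}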

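Answering my own question. It turns out to be easy: insert the following line into the question's code block in place of the comment.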

// Sentence is edu.stanford.nlp.ling.Sentence (named SentenceUtils in newer CoreNLP releases)
String sentenceString = Sentence.listToOriginalTextString(sentence.get(TokensAnnotation.class));
// Alternative: read the text directly from the sentence's TextAnnotation
for (CoreMap sentence : sentences)
{
    String sentenceStr = sentence.get(CoreAnnotations.TextAnnotation.class);
}
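
To illustrate why joining tokens with whitespace does not give back the original sentence, while keeping character offsets does, here is a minimal library-free sketch. It does not use CoreNLP: the `Token` class, the naive `tokenize` method, and the `joinWithSpaces` and `originalText` helpers are all hypothetical names made up for this illustration, and the tokenizer is far simpler than a real one.

```java
import java.util.ArrayList;
import java.util.List;

class OriginalTextDemo {

    /** Hypothetical token: its text plus character offsets into the source string. */
    static final class Token {
        final String word;
        final int begin; // inclusive offset in the source
        final int end;   // exclusive offset in the source

        Token(String word, int begin, int end) {
            this.word = word;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Naive tokenizer (an assumption for this demo): runs of letters/digits, or one punctuation char. */
    static List<Token> tokenize(String text) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
                continue;
            }
            int start = i;
            if (Character.isLetterOrDigit(c)) {
                while (i < text.length() && Character.isLetterOrDigit(text.charAt(i))) {
                    i++;
                }
            } else {
                i++; // a single punctuation character becomes its own token
            }
            tokens.add(new Token(text.substring(start, i), start, i));
        }
        return tokens;
    }

    /** Lossy reconstruction: puts a space between every pair of tokens. */
    static String joinWithSpaces(List<Token> tokens) {
        StringBuilder sb = new StringBuilder();
        for (Token t : tokens) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(t.word);
        }
        return sb.toString();
    }

    /** Exact reconstruction: slices the source from the first token's start to the last token's end. */
    static String originalText(String source, List<Token> tokens) {
        if (tokens.isEmpty()) {
            return "";
        }
        return source.substring(tokens.get(0).begin, tokens.get(tokens.size() - 1).end);
    }

    public static void main(String[] args) {
        String sentence = "CoreNLP (a Java toolkit) splits text, right?";
        List<Token> tokens = tokenize(sentence);
        System.out.println(joinWithSpaces(tokens));         // spacing around punctuation is wrong
        System.out.println(originalText(sentence, tokens)); // identical to the input sentence
    }
}
```

The whitespace-joined version puts spaces around the parentheses and before the comma and question mark, whereas the offset-based slice returns the sentence unchanged. The point is only that offsets or the stored original text preserve what tokenization discards.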