Java: How do I use Stanford TokensRegex?
I am trying to use Stanford TokensRegex. However, I get an error on the line that creates the matcher (see the comment); it says (). Please help me if you can. Here is my code:
String file = "A store has many branches. A manager may manage at most 2 branches.";
Properties props = new Properties();
props.put("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
Annotation document = new Annotation(file);
pipeline.annotate(document);
List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);
for (CoreMap sentence : sentences) {
    TokenSequencePattern pattern = TokenSequencePattern.compile("[]");
    TokenSequenceMatcher matcher = pattern.getMatcher(sentence); // ERROR HERE!
    while (matcher.find()) {
        JOptionPane.showMessageDialog(rootPane, "It has been found");
    }
}
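The error comes from pattern.getMatcher(sentence): getMatcher(*) only accepts a List (of tokens, e.g. List<CoreLabel>) as its input argument, not a single CoreMap sentence. I did the following: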
for (CoreMap sentence : sentences) {
    // Using TokensRegex: pass the sentence's token list, not the sentence itself.
    // The list is taken per sentence so tokens from earlier sentences are not matched again.
    List<CoreLabel> tokens = sentence.get(CoreAnnotations.TokensAnnotation.class);
    TokenSequencePattern p1 = TokenSequencePattern.compile("A store has");
    TokenSequenceMatcher matcher = p1.getMatcher(tokens);
    while (matcher.find())
        System.out.println("found");
    // Looking up the POS tags
    for (CoreLabel token : tokens) {
        String word = token.get(CoreAnnotations.TextAnnotation.class);
        // this is the POS tag of the token
        String pos = token.get(CoreAnnotations.PartOfSpeechAnnotation.class);
        System.out.println("word is " + word + ", pos is " + pos);
    }
}
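The code above is not optimized. Please adjust it as needed.

Comments:
What is the error? You used an empty character class?
This is the correct answer. I would also like to know how to use POS as part of the pattern.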
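On using POS as part of the pattern: TokensRegex lets a token be matched by its attributes, e.g. [{tag:"DT"}] for an exact POS tag or [{tag:/NN.*/}] for a regex over the tag. Below is a minimal sketch of that, not a definitive implementation. The class name PosPatternSketch and the helpers tagged and findMatches are made up for illustration, and the tokens are tagged by hand so the sketch runs without loading any models; in the code above, the pos annotator would fill in the tags instead.

```java
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.ling.tokensregex.TokenSequenceMatcher;
import edu.stanford.nlp.ling.tokensregex.TokenSequencePattern;
import edu.stanford.nlp.util.CoreMap;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
public class PosPatternSketch {

    // Builds hand-tagged tokens from "word/TAG" strings (an assumption made
    // to avoid running the full pipeline; normally the pos annotator does this).
    static List<CoreLabel> tagged(String... wordSlashTag) {
        List<CoreLabel> tokens = new ArrayList<>();
        for (String item : wordSlashTag) {
            int slash = item.lastIndexOf('/');
            CoreLabel token = new CoreLabel();
            token.set(CoreAnnotations.TextAnnotation.class, item.substring(0, slash));
            token.set(CoreAnnotations.PartOfSpeechAnnotation.class, item.substring(slash + 1));
            tokens.add(token);
        }
        return tokens;
    }

    // Runs a TokensRegex pattern over the tokens and returns the text of each match.
    static List<String> findMatches(String patternText, List<CoreLabel> tokens) {
        TokenSequencePattern pattern = TokenSequencePattern.compile(patternText);
        TokenSequenceMatcher matcher = pattern.getMatcher(tokens);
        List<String> matches = new ArrayList<>();
        while (matcher.find()) {
            // Join the words of the matched tokens with single spaces.
            StringBuilder text = new StringBuilder();
            for (CoreMap node : matcher.groupNodes()) {
                if (text.length() > 0) {
                    text.append(' ');
                }
                text.append(node.get(CoreAnnotations.TextAnnotation.class));
            }
            matches.add(text.toString());
        }
        return matches;
    }

    public static void main(String[] args) {
        List<CoreLabel> tokens =
                tagged("A/DT", "store/NN", "has/VBZ", "many/JJ", "branches/NNS");
        // Any noun tag (NN, NNS, NNP, ...).
        System.out.println(findMatches("[{tag:/NN.*/}]", tokens));
        // A determiner immediately followed by a noun.
        System.out.println(findMatches("[{tag:\"DT\"}] [{tag:/NN.*/}]", tokens));
    }
}
```

With a real pipeline, the same pattern strings can be passed to TokenSequencePattern.compile and matched against sentence.get(CoreAnnotations.TokensAnnotation.class), provided the annotators include pos.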