Stanford NLP annotation ranking or score

nlp stanford-nlp

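I am using the Stanford CoreNLP pipeline, and I get TreeAnnotation and BasicDependenciesAnnotation from SentencesAnnotation.

I am looking for a way to tell how certain the parser is about the POS tags and the dependency structure.

I remember that earlier, when I was tinkering with the Stanford NLP library, I saw somewhere that multiple trees with different rankings were returned for the same sentence. I cannot find any information on how to get this from the parser or the pipeline.

The DependencyScoring class seems to operate on TypedDependency, which, as far as I understand, is not something the pipeline generates as part of the annotation process.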

Edit: code details:

Annotation document = new Annotation("This is my sentence");
pipeline.annotate(document); 
List<CoreMap> sentences = document.get(SentencesAnnotation.class);
...
Tree tree = sentence1.get(TreeAnnotation.class);
SemanticGraph dependencies = sentence1.get(CollapsedCCProcessedDependenciesAnnotation.class);

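If you are using the default CoreNLP pipeline (i.e., with the parse annotator rather than depparse), the dependency parses you see come from a deterministic conversion of the sentence's constituency parse. The best "score" you can get here is the score of the candidate constituency parse that eventually produced the dependency parse (after conversion).

However, you need to break out of the CoreNLP pipeline to do this particular job. If you have a LexicalizedParser instance, you can get the k-best parses (with attached scores) like so: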

// Package locations as in recent CoreNLP releases; older releases keep
// ParserQuery in edu.stanford.nlp.parser.lexparser.
import java.util.List;

import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.parser.common.ParserQuery;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.trees.GrammaticalStructure;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.TypedDependency;
import edu.stanford.nlp.util.ScoredObject;

List<CoreLabel> mySentence = ...

LexicalizedParser parser = LexicalizedParser.loadModel("edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz");
ParserQuery pq = parser.parserQuery();
if (pq.parse(mySentence)) {
  // Get best parse and associated score
  Tree parse = pq.getBestPCFGParse();
  double score = pq.getPCFGScore();

  // Print parse
  parse.pennPrint();

  // ----
  // Get collection of best parses (the parses tied for the top score);
  // pq.getKBestPCFGParses(k) returns the k highest-scoring parses instead
  List<ScoredObject<Tree>> bestParses = pq.getBestPCFGParses();

  // ----
  // Convert a constituency parse to dependency representation
  GrammaticalStructure gs = parser.treebankLanguagePack()
      .grammaticalStructureFactory().newGrammaticalStructure(parse);
  List<TypedDependency> dependencies = gs.typedDependenciesCCprocessed();
  System.out.println(dependencies);
}

(Note: untested code, but this should work.)

How are you producing your dependency parses? Are you getting them from the parse annotator? If so, the dependencies are actually produced by a deterministic conversion, and your only measure of probability would come from the PCFG parse that the conversion starts from. I can give more details if this is indeed the case.

Basically, I do "Annotation document = new Annotation("This is my sentence"); pipeline.annotate(document); List<CoreMap> sentences = document.get(SentencesAnnotation.class);" and then get the tree annotation and the dependency graph. Yes, please elaborate on the PCFG approach.

@JonGauthier, is it possible to see the likelihood/probability of word pairs in a dependency relation? For example, how likely is it to encounter an "MD" -> "JJ" or a "will" -> "able" relation? I can post this as a separate question if you like.
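A possible follow-up to the PCFG scores in the answer, not part of the original thread: raw parse scores are hard to read as "certainty", so one option is to compare the scores of the k-best candidates against each other. The sketch below assumes the scores are log probabilities (which is how I understand the PCFG scores to be reported) and turns them into weights that sum to 1 across the candidates. The class name KBestScoreNormalizer, the method normalize, and the sample scores are made up for illustration. It deliberately uses only the Java standard library, so in practice you would first copy the score of each ScoredObject<Tree> into the array.

```java
import java.util.Arrays;

public class KBestScoreNormalizer {

    /**
     * Converts log scores into relative weights that sum to 1 over the
     * given candidates (a softmax over the log scores).
     * Assumption: each score is a log probability.
     */
    public static double[] normalize(double[] logScores) {
        if (logScores.length == 0) {
            return new double[0];
        }
        // Subtract the maximum before exponentiating to avoid underflow,
        // since log probabilities of whole parses can be very negative.
        double max = Arrays.stream(logScores).max().getAsDouble();
        double[] weights = new double[logScores.length];
        double sum = 0.0;
        for (int i = 0; i < logScores.length; i++) {
            weights[i] = Math.exp(logScores[i] - max);
            sum += weights[i];
        }
        for (int i = 0; i < weights.length; i++) {
            weights[i] /= sum;
        }
        return weights;
    }

    public static void main(String[] args) {
        // Hypothetical log scores for three candidate parses, best first.
        double[] logScores = {-42.1, -43.0, -45.7};
        System.out.println(Arrays.toString(normalize(logScores)));
    }
}
```

The weights only describe how the probability mass is split among the candidates you passed in, not the parser's true posterior over all parses. A top weight close to 1 suggests the best parse clearly beats the alternatives, while several similar weights suggest the parser is torn between analyses.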