Dependency tree results from the Stanford parser in Python NLTK do not match the Stanford parser


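I am trying to compare the results of the Stanford parser as called from NLTK with those of the Stanford parser itself, but I don't know why I get different results through NLTK. I checked related questions, but they did not help me much.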

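from nltk.parse.stanford import StanfordDependencyParser

stan_dep_parser = StanfordDependencyParser()  # Stanford parser wrapper from NLTK
dependency_parser = stan_dep_parser.raw_parse("Four men died in an accident")
dep = next(dependency_parser)  # first parse returned for the sentence
for triple in dep.triples():
    # triple is ((head_word, head_tag), relation, (dependent_word, dependent_tag))
    print("%s ( %s ,  %s )" % (triple[1], triple[0][0], triple[2][0]))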
Current output:

nsubj ( died ,  men )
nummod ( men ,  Four )
nmod ( died ,  accident )
case ( accident ,  in )
det ( accident ,  an )
Expected output according to:

NLTK version: 3.2.4
Stanford parser: Stanford-Parser-3.8.0-models

I solved this problem myself.

I found the "root" or "head" of the sentence:

from nltk.parse.stanford import StanfordDependencyParser

final_dependency = []
sentence = "Four men died in an accident"
dependency_tree = StanfordDependencyParser()
dependency_parser = dependency_tree.raw_parse(sentence)
parsetree = list(dependency_parser)[0]
for k in parsetree.nodes.values():
    # The word whose head is address 0 is the root of the sentence.
    if k["head"] == 0:
        final_dependency.append(
            "%s(Root-%s,%s-%s)" % (k["rel"], k["head"], k["word"], k["address"]))

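Then I used simple string manipulation to attach a number to each word, as in the expected output; each number is the index of that word in the sentence.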
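The string manipulation itself is not shown. As a minimal sketch, assuming every node is a dict with the `word`, `head`, `rel` and `address` keys used in the code above, the numbered form could be built roughly like this. The `format_dependencies` helper and the hand-built `sample_nodes` dict are hypothetical: `sample_nodes` stands in for `parsetree.nodes` so the example runs without the Stanford jars. Its relations are copied from the output in the question, and the `root` label on "died" is an assumption.

```python
def format_dependencies(nodes):
    """Render every word as rel(headword-headindex, word-index)."""
    lines = []
    for address in sorted(nodes):
        node = nodes[address]
        if node.get("word") is None:
            # Skip the artificial root node stored at address 0.
            continue
        head = node["head"]
        # A head of 0 means the word hangs off the artificial root.
        head_word = "Root" if head == 0 else nodes[head]["word"]
        lines.append("%s(%s-%d, %s-%d)"
                     % (node["rel"], head_word, head, node["word"], address))
    return lines


# Hypothetical stand-in for parsetree.nodes (address -> node dict).
sample_nodes = {
    0: {"address": 0, "word": None, "head": None, "rel": None},
    1: {"address": 1, "word": "Four", "head": 2, "rel": "nummod"},
    2: {"address": 2, "word": "men", "head": 3, "rel": "nsubj"},
    3: {"address": 3, "word": "died", "head": 0, "rel": "root"},  # assumed label
    4: {"address": 4, "word": "in", "head": 6, "rel": "case"},
    5: {"address": 5, "word": "an", "head": 6, "rel": "det"},
    6: {"address": 6, "word": "accident", "head": 3, "rel": "nmod"},
}

for line in format_dependencies(sample_nodes):
    print(line)
```

With a real parse, passing `parsetree.nodes` instead of `sample_nodes` should work the same way, provided the node dicts carry those four keys.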

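Which version of the Stanford parser did you download? Also, which version of NLTK are you using? Which model are you using, is it englishPCFG.ser.gz?

NLTK version: 3.2.4, Stanford parser version: Stanford-Parser-3.8.0-models. It gives the same result for the constituency tree, but I don't know why the dependency parser's result is different.

Apart from the missing root (which can be inferred), how do the parses differ?

@aab Yes, the root is missing, and the numbers attached to the words are missing too. Is it possible to get the numbers?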