Python: How do I reformat MaltParser output in NLTK?

So I finally figured out how to use the Malt wrapper that NLTK provides, and I can successfully chunk my sentences, but they come back in a format I am not familiar with.

For example, parsing "This is a test sentence" returns:

>>> import nltk
>>> parser = nltk.parse.malt.MaltParser(working_dir="/path/to/dir",mco="engmalt.linear-1.7",additional_java_args=['-Xmx512m'])
>>> txt = "This is a test sentence"
>>> graph = parser.raw_parse(txt)
>>> graph.tree().pprint()
(This (sentence is a test))

Parsing a more complex sentence returns:

>>> import nltk
>>> parser = nltk.parse.malt.MaltParser(working_dir="/path/to/dir",mco="engmalt.linear-1.7",additional_java_args=['-Xmx512m'])
>>> txt = "A ceasefire for east Ukraine has been agreed during talks in Minsk."
>>> graph = parser.raw_parse(txt)
>>> graph.tree().pprint()
(agreed
   (ceasefire A (for (Ukraine east)))
   has
   been
   (during (talks (in Minsk)))
   .)

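Could someone please explain what this output format is, or how I can process it so that it follows the word order of the original sentence, like this:

(This (is a test sentence))
A (ceasefire (for (east Ukraine))) has been (agreed (during (talks (in Minsk))).)

If it helps, graph is an NLTK dependency graph and graph.tree() is an NLTK tree.

Thanks in advance.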

MaltParser is a data-driven system for dependency parsing. It can be used to induce a parsing model from treebank data and to parse new data with the induced model.

The files engmalt.poly-1.7.mco and engmalt.linear-1.7.mco each contain a single MaltParser configuration for parsing English text.

The two models differ in that engmalt.poly-1.7.mco uses a support vector machine with a polynomial kernel for classification, whereas engmalt.linear-1.7.mco uses a linear SVM. The linear model is faster, the polynomial model needs less memory, and the parsing accuracy of the two is similar. The way the parsed text is written out also differs.

With engmalt.poly-1.7.mco, the parsed text is represented as a dependency annotation (a dependency graph), whereas engmalt.linear-1.7.mco represents it in a linear fashion.

See the outputs below. Hope this helps.

Using mco="engmalt.linear-1.7":

>>> import nltk
>>> parser = nltk.parse.malt.MaltParser(working_dir="/path/to/dir",mco="engmalt.linear-1.7",additional_java_args=['-Xmx512m'])
>>> txt = "This is a test sentence"
>>> graph = parser.raw_parse(txt)
>>> graph.tree().pprint()
(This (sentence is a test))

Using mco="engmalt.poly-1.7":

>>> import nltk
>>> parser = nltk.parse.malt.MaltParser(working_dir="/path/to/dir",mco="engmalt.poly-1.7",additional_java_args=['-Xmx512m'])
>>> txt = "This is a test sentence"
>>> graph = parser.raw_parse(txt)
>>> graph.tree().pprint()
(is This (a (sentence test)))

For the new, more complex sentence, using mco="engmalt.linear-1.7":

>>> import nltk
>>> parser = nltk.parse.malt.MaltParser(working_dir="/path/to/dir",mco="engmalt.linear-1.7",additional_java_args=['-Xmx512m'])
>>> txt = "A ceasefire for east Ukraine has been agreed during talks in Minsk."
>>> graph = parser.raw_parse(txt)
>>> graph.tree().pprint()
(A
  (agreed
    (been ceasefire for east Ukraine has)
    (during (Minsk talks in)))
  .)
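
For what it's worth, the bracketed output in the question looks like a head-first rendering of a dependency tree: each head word is printed first, followed by its dependents, which would explain why the words seem out of order. Below is a minimal pure-Python sketch of that idea, not NLTK's own code. The ROWS table is a hand-written, hypothetical parse of the test sentence (position, word, head position), and head_first and surface_order are illustrative helper names that are not part of NLTK or MaltParser.

```python
from collections import defaultdict

# Hypothetical, hand-written parse of "This is a test sentence".
# Each row is (position, word, head position); head 0 is an artificial root.
ROWS = [
    (1, "This", 2),
    (2, "is", 0),
    (3, "a", 5),
    (4, "test", 5),
    (5, "sentence", 2),
]


def children_by_head(rows):
    """Map each head position to the positions of its dependents."""
    deps = defaultdict(list)
    for pos, _word, head in rows:
        deps[head].append(pos)
    return deps


def head_first(rows):
    """Bracket the parse head-first: the head word, then its dependents."""
    words = {pos: word for pos, word, _head in rows}
    deps = children_by_head(rows)

    def render(pos):
        kids = sorted(deps.get(pos, []))
        if not kids:
            return words[pos]
        return "(" + " ".join([words[pos]] + [render(k) for k in kids]) + ")"

    (root,) = deps[0]  # assumes exactly one word attaches to the root
    return render(root)


def surface_order(rows):
    """Recover the original word order by sorting on position."""
    return " ".join(word for _pos, word, _head in sorted(rows))


if __name__ == "__main__":
    print(head_first(ROWS))
    print(surface_order(ROWS))
```

The first print shows the head-first bracketing and the second shows the words back in sentence order. With a real parse, the same idea would be to sort the words by their position in the sentence rather than reading them off the tree, assuming the graph object keeps those positions.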