Python NLTK chunking


python nlp


How can I get all the chunks in a sentence that match a given pattern? For example, if I parse the tagged sentence, I get

(S (NP money/NN market/NN) fund/NN)

I would also like to get the other alternative, which is

(S money/NN (NP market/NN fund/NN))
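
The behaviour described in the question can be reproduced with a minimal sketch like the one below. The pattern NP: {<NN><NN>} is an assumption, since the pattern itself is not shown here; the tagged tokens are taken from the trees above. It is meant only to illustrate that a regular-expression chunker returns a single chunking.

```python
import nltk

# Assumed pattern (not shown in the question text): an NP is two consecutive NN tags.
chunker = nltk.RegexpParser("NP: {<NN><NN>}")

# Tagged tokens, as they appear in the trees in the question.
tagged = [("money", "NN"), ("market", "NN"), ("fund", "NN")]

# The chunker scans left to right and uses each token at most once,
# so it produces one tree rather than every possible chunking.
tree = chunker.parse(tagged)
print(tree)
```

Under these assumptions the result should be the first tree above, with money and market grouped into the NP and fund left outside it.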

I think your question is about getting the most likely parses of a sentence. Am I right? If so, see the nbest_parse documentation.

@mbatchkarov is right about the nbest_parse documentation. For a code example, see:

import nltk

# Define the CFG grammar.
# NLTK 3 uses nltk.CFG.fromstring; nltk.parse_cfg was the NLTK 2 name.
grammar = nltk.CFG.fromstring("""
S -> NP
S -> NN NP
S -> NP NN
NP -> NN NN
NN -> 'market'
NN -> 'money'
NN -> 'fund'
""")

# Make your string into a list of tokens.
sentence = "money market fund".split(" ")

# Load the grammar into the ChartParser.
cp = nltk.ChartParser(grammar)

# Generate and print every parse the grammar allows for the sentence tokens.
# In NLTK 3, parse() yields all parses; it replaces the NLTK 2 nbest_parse().
for tree in cp.parse(sentence):
    print(tree)

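This is not chunking; it is called parsing. Even if I am looking for all possible chunks, doesn't parsing cost more computation than chunking?

Chunking is also known as shallow parsing. Shallow parsing is for when you care about large NPs and ignore the order and position of the elements inside them; in that case the normal regular-expression chunker might work. Your question, however, depends on the internal ordering of complex NPs (i.e. deep parsing), so you need a parser.

Even when I parse with iter_parse, it seems to give the same answer as RegexpParser.

First, create the CFG grammar you need. The terminal nodes (i.e. your vocabulary/words) must also be in the grammar. Then call ChartParser to load the grammar you defined. Finally, given the list of sentence tokens passed to nbest_parse, it tries to get the best parses.

I have been thinking about using only regular expressions, without a grammar. In the regexp case, if you have a string like nnn and are looking for the expression nn, the regexp gives you the list of index spans that match the pattern, in this example (0, 2) and (1, 3).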
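
The regular-expression idea in the last comment can be sketched with the standard re module alone. Everything named here (TAG_CODES, overlapping_spans, bracket) is hypothetical and introduced only for illustration; the sketch assumes each tag is encoded as a single character so that string offsets line up with token indices.

```python
import re

# Tagged sentence from the question.
tagged = [("money", "NN"), ("market", "NN"), ("fund", "NN")]

# Hypothetical encoding: one character per tag, so offsets equal token indices.
TAG_CODES = {"NN": "n"}
tag_string = "".join(TAG_CODES[tag] for _, tag in tagged)  # "nnn"


def overlapping_spans(pattern, text):
    """Return (start, end) for each start position where pattern matches,
    including matches that overlap one another."""
    # A zero-width lookahead consumes no characters, so the scan advances one
    # position at a time and the match at index 1 is not skipped.
    lookahead = re.compile("(?=(" + pattern + "))")
    return [(m.start(1), m.end(1)) for m in lookahead.finditer(text)]


def bracket(tagged, start, end, label="NP"):
    """Render one alternative chunking in the bracketed style of the question."""
    words = ["%s/%s" % pair for pair in tagged]
    chunk = "(%s %s)" % (label, " ".join(words[start:end]))
    return "(S " + " ".join(words[:start] + [chunk] + words[end:]) + ")"


spans = overlapping_spans("nn", tag_string)
chunkings = [bracket(tagged, s, e) for s, e in spans]
for line in chunkings:
    print(line)
```

For nnn and the pattern nn this gives the spans (0, 2) and (1, 3), one per alternative chunking. Note that the lookahead reports only one match per start position, so a pattern with quantifiers would need extra handling to enumerate every possible length.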
