Python: How can I print only the string result of NLTK chunking?
I am using NLTK and regular expressions to analyze my text. The chunker correctly identifies the chunks I defined, but the printed result shows every tagged word along with "My_Chunk". How can I print only the chunked parts of the text (the "My_Chunk" subtrees)? Here is my code example:
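import re
import nltk

text = ['The absolutely kind professor asked students out whom he met in class']

for item in text:
    # Split the sentence into tokens and tag each one with its part of speech.
    tokenized = nltk.word_tokenize(item)
    tagged = nltk.pos_tag(tokenized)

    # Chunk grammar: optional adverbs, optional nouns, then a past-tense verb.
    chunk = r"""My_Chunk: {<RB.?>*<NN.?>*<VBD.?>}"""
    chunkParser = nltk.RegexpParser(chunk)
    chunked = chunkParser.parse(tagged)

    # Prints the whole tree: chunked and unchunked tokens alike.
    print(chunked)
    chunked.draw()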
This should do it:
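for a in chunked:
    # Unchunked tokens are plain (word, tag) tuples; chunks are Tree objects.
    if isinstance(a, nltk.tree.Tree):
        if a.label() == "My_Chunk":
            print(a)
            # Each leaf is a (word, tag) pair; keep only the word.
            print(" ".join([lf[0] for lf in a.leaves()]))
            print()

#(My_Chunk absolutely/RB kind/NN professor/NN asked/VBD)
#absolutely kind professor asked
#(My_Chunk met/VBD)
#met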
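If you want the chunk strings as a list rather than printed output, the same idea can be wrapped in a small helper. The following is only a sketch: `chunk_strings` is a hypothetical name, not part of NLTK, and the part-of-speech tags are written by hand here (an assumption, chosen to agree with the output shown above) so that the example runs without downloading tokenizer or tagger models. It uses `Tree.subtrees` with a filter instead of the explicit `isinstance` check.

```python
import nltk

# Hand-assigned tags (an assumption for illustration); in the question's code
# they come from nltk.pos_tag(nltk.word_tokenize(item)).
tagged = [
    ("The", "DT"), ("absolutely", "RB"), ("kind", "NN"), ("professor", "NN"),
    ("asked", "VBD"), ("students", "NNS"), ("out", "RP"), ("whom", "WP"),
    ("he", "PRP"), ("met", "VBD"), ("in", "IN"), ("class", "NN"),
]

# Same chunk grammar as in the question.
grammar = r"""My_Chunk: {<RB.?>*<NN.?>*<VBD.?>}"""
chunked = nltk.RegexpParser(grammar).parse(tagged)


def chunk_strings(tree, label):
    """Return the words of every subtree carrying `label`, joined by spaces.

    Hypothetical helper, not an NLTK function.
    """
    return [
        " ".join(word for word, _tag in subtree.leaves())
        for subtree in tree.subtrees(filter=lambda t: t.label() == label)
    ]


for phrase in chunk_strings(chunked, "My_Chunk"):
    print(phrase)
```

Because `subtrees` walks the whole tree, this variant should also pick up chunks nested below the top level, which the plain loop over `chunked` does not visit.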