Python: adjectives used with named entities

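I used the Python code below to extract the named entities from a text. Now I need to get the adjectives from the sentences that contain a named entity, i.e. the adjectives used with named entities. Can I modify the code to check whether a tree that has an 'NE' also has a 'JJ', or is there another way?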

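import nltk

def tokenize(text):
    # Split into sentences, then words, then POS-tag each sentence.
    sentences = nltk.sent_tokenize(text)
    sentences = [nltk.word_tokenize(sent) for sent in sentences]
    sentences = [nltk.pos_tag(sent) for sent in sentences]
    return sentences

with open("file.txt", "r") as f:
    text = f.read()
sentences = tokenize(text)
# batch_ne_chunk is the NLTK 2 name (renamed ne_chunk_sents in NLTK 3).
chunk_sent = nltk.batch_ne_chunk(sentences, binary=True)
print chunk_sent[1]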
Output:

Tree('S', [('“', 'POS'), ('completed', 'NNP'), ('in', 'IN'), ('speech', 'NN'), (',', ','), Tree('NE', [('Gautam', 'NNP')]), (',', ','), ('that', 'DT'), ('big', 'JJ'), ('assembly', 'NN'), ('of', 'IN'), ('provisive', 'JJ'), ('sages', 'NNP'), ('a', 'DT'), ('full', ... ('and', 'CC'), ... ('answer', 'NN'), ('in', 'IN'), ('words', 'NNS'), ... ('mode', 'NN'), ('of', 'IN'), ('life', 'NN'), ('mode', 'NN'), ...

This sentence has no JJ before the NE, but how can I get the JJs that are used together with an NE?

def ne(tree):
    # Collect the text of every 'NE' subtree (NLTK 2 API: tree.node).
    names = []
    if hasattr(tree, 'node') and tree.node:
        if tree.node == 'NE':
            names.append(' '.join([child[0] for child in tree]))
        else:
            # Recurse into non-NE subtrees such as the sentence root 'S'.
            for child in tree:
                names.extend(ne(child))
    return names

names = []
for item in chunk_sent:
    names.extend(ne(item))
print names

What language is that?
@Rob It looks like Python.
Yes, it is Python code.
NLTK is a Python library, but the indentation in the OP's post is a mess...

>>> from nltk.corpus import brown
>>> from nltk import batch_ne_chunk as bnc
>>> from nltk.tree import Tree
>>> sentences = brown.tagged_sents()[0:5]
>>> chunk_sents = bnc(sentences)
>>> 
>>> for sent in chunk_sents:
...     for i,j in zip(sent[:-1], sent[1:]):
...             if type(j) is Tree and i[1].startswith("JJ"):
...                     print i,j
... 
('Grand', 'JJ-TL') (PERSON Jury/NN-TL)
('Executive', 'JJ-TL') (ORGANIZATION Committee/NN-TL)
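
The session above only pairs an adjective with an entity when the JJ sits immediately before the chunk. If the goal is every adjective in a sentence that contains a named entity, something along these lines might work. This is a minimal sketch in Python 3, not the answerer's code: `Chunk` is a hypothetical stand-in that mimics the `label()` interface of `nltk.Tree` in NLTK 3 so the example runs without any NLTK models, and `is_chunk` and `adjectives_with_entities` are names made up for illustration. It assumes the flat trees that the NE chunker produces, where entity chunks are direct children of the sentence root.

```python
class Chunk(list):
    """Hypothetical stand-in for nltk.Tree: a list of children plus a label."""

    def __init__(self, label, children):
        super().__init__(children)
        self._label = label

    def label(self):
        return self._label


def is_chunk(node):
    # Leaves are plain (word, tag) tuples; subtrees expose label().
    return hasattr(node, "label")


def adjectives_with_entities(sentence):
    """Return (entities, adjectives) for one chunked sentence.

    Both lists are empty when the sentence contains no entity chunk.
    """
    entities = [" ".join(word for word, _ in node)
                for node in sentence if is_chunk(node)]
    if not entities:
        return [], []
    # Any tag starting with JJ counts (JJ, JJR, JJS, JJ-TL, ...).
    adjectives = [node[0] for node in sentence
                  if not is_chunk(node) and node[1].startswith("JJ")]
    return entities, adjectives


# Made-up sentence for illustration, shaped like binary=True chunker output.
sent = Chunk("S", [
    ("that", "DT"), ("big", "JJ"), ("assembly", "NN"), ("of", "IN"),
    Chunk("NE", [("Gautam", "NNP")]), ("spoke", "VBD"),
    ("wise", "JJ"), ("words", "NNS"),
])
print(adjectives_with_entities(sent))
```

Since `nltk.Tree` in NLTK 3 is also a list subclass with a `label()` method, the same function should accept real chunker output, though that has not been checked here against a particular NLTK release. Collecting at sentence level is a loose notion of "used with"; restricting to a window of tokens around the chunk would be a stricter variant.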