Python: Error when getting trigrams using gensim's Phrases
I want to extract all bigrams and trigrams from the given sentences.
from gensim.models import Phrases
documents = ["the mayor of new york was there", "Human Computer Interaction is a great and new subject", "machine learning can be useful sometimes","new york mayor was present", "I love machine learning because it is a new subject area", "human computer interaction helps people to get user friendly applications"]
sentence_stream = [doc.split(" ") for doc in documents]
bigram = Phrases(sentence_stream, min_count=1, threshold=2, delimiter=b' ')
trigram = Phrases(bigram(sentence_stream, min_count=1, threshold=2, delimiter=b' '))  # this call raises the TypeError quoted below
for sent in sentence_stream:
    #print(sent)
    bigrams_ = bigram[sent]
    trigrams_ = trigram[bigrams_]
    print(bigrams_)
    print(trigrams_)
The code works for bigrams and captures "new york" and "machine learning" as bigrams.
However, when I try to get trigrams, I get the following error:
TypeError: 'Phrases' object is not callable
Please let me know how to correct my code.
I am following gensim's example documentation.
According to the docs, you can do the following:
from gensim.models import Phrases
from gensim.models.phrases import Phraser
phrases = Phrases(sentence_stream)
bigram = Phraser(phrases)
trigram = Phrases(bigram[sentence_stream])
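bigram, being a Phrases object, cannot be called the way you are doing it; index it with square brackets instead.

Then I get an error saying: TypeError: 'int' object is not iterable :(
@Volka I had a look at the docs and found something that I think is useful.
@Volka That isn't needed... the docs I linked to contain all the information you need. I think you need to use the Phraser class documented there.
@Volka Also, I think bigram = Phrases(sentence_stream) is enough... I suspect the other parameters may be causing the trouble.
@Volka Interesting... I tried running the code from the docs, but it doesn't work for me. Maybe you should ask a new question...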
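For anyone puzzled by the original message, the "object is not callable" error can be reproduced without gensim at all. The sketch below uses a hypothetical `Joiner` class (a stand-in invented for illustration, not part of gensim) that supports square-bracket indexing through `__getitem__` but defines no `__call__`, which is presumably the same situation as with a `Phrases` instance.

```python
class Joiner:
    """Hypothetical stand-in for a trained phrase model; not a gensim class."""

    def __init__(self, pairs):
        self.pairs = set(pairs)

    def __getitem__(self, tokens):
        # Merge every adjacent pair of tokens that the model knows about.
        out, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in self.pairs:
                out.append(tokens[i] + "_" + tokens[i + 1])
                i += 2
            else:
                out.append(tokens[i])
                i += 1
        return out


model = Joiner([("new", "york")])

# Square brackets go through __getitem__, so this works.
merged = model["new york mayor was present".split()]

# Parentheses need __call__, which Joiner does not define, so this fails.
try:
    model(["new", "york"])
    call_error = None
except TypeError as exc:
    call_error = str(exc)

print(merged)
print(call_error)
```

So the message only says that parentheses were used on an object that supports indexing but not calling; it says nothing about the data or the parameters.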