Confusion over id2word/token2id usage in Python Gensim

For clarity, I would like your feedback on whether the following code / gensim usage is correct.

Thank you in advance for your valuable time.

import gensim

train = ["John likes to watch movies Mary likes movies too",
         "John also likes to watch football games"]

test = ["Football is my dream"]

# Lowercase each document and split it into tokens on whitespace.
train_texts = [document.lower().split() for document in train]
test_texts = [document.lower().split() for document in test]

# Build the token <-> id mapping from the training documents only.
dictionary = gensim.corpora.Dictionary(train_texts)

# Convert documents to bag-of-words vectors; tokens that are not in the
# dictionary (for example "dream" in the test document) are ignored.
train_corpus = [dictionary.doc2bow(text) for text in train_texts]
test_corpus = [dictionary.doc2bow(text) for text in test_texts]

ldaModel = gensim.models.LdaModel(corpus=train_corpus,
                                  id2word=dictionary, num_topics=2)
# Estimate the variational bound of the held-out test documents.
bound_perplex = ldaModel.bound(test_corpus)

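The code usage is absolutely correct, but for larger document collections it is better to stream the corpus rather than hold it all in memory.

You can find more information about data streaming here -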
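As a rough illustration of what corpus streaming can look like, here is a minimal sketch. It assumes the documents are stored one per line in a text file; the names `StreamingCorpus`, `TinyDictionary` and `demo` are hypothetical and not part of gensim. `TinyDictionary` is only a stand-in so the sketch runs without gensim installed, and its ids need not match the ones gensim would assign.

```python
import os
import tempfile
from collections import Counter


class StreamingCorpus:
    """Yield one bag-of-words vector per line of a text file, lazily.

    `dictionary` is any object with a doc2bow(tokens) method, such as a
    gensim.corpora.Dictionary. The class name and the one-document-per-line
    file layout are assumptions made for this sketch.
    """

    def __init__(self, path, dictionary):
        self.path = path
        self.dictionary = dictionary

    def __iter__(self):
        # Re-open the file on every pass so the corpus can be iterated
        # more than once; only one line is held in memory at a time.
        with open(self.path, encoding="utf-8") as handle:
            for line in handle:
                yield self.dictionary.doc2bow(line.lower().split())


class TinyDictionary:
    """Hypothetical stand-in for gensim.corpora.Dictionary (not gensim code)."""

    def __init__(self, texts):
        self.token2id = {}
        for text in texts:
            for token in text:
                # Assign ids in order of first appearance.
                self.token2id.setdefault(token, len(self.token2id))

    def doc2bow(self, tokens):
        # Tokens that are not in the dictionary are skipped.
        counts = Counter(self.token2id[t] for t in tokens if t in self.token2id)
        return sorted(counts.items())


def demo():
    docs = ["John likes to watch movies Mary likes movies too",
            "John also likes to watch football games"]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "train.txt")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("\n".join(docs) + "\n")
        dictionary = TinyDictionary(d.lower().split() for d in docs)
        corpus = StreamingCorpus(path, dictionary)
        # Iterate twice to show that the stream can be restarted.
        return list(corpus), list(corpus)
```

With gensim installed, a real `gensim.corpora.Dictionary` should be usable in place of `TinyDictionary`, and the `StreamingCorpus` object can then be passed as `corpus=` to `gensim.models.LdaModel` in place of the in-memory list.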


  • I checked this with others as well. It is as it should be. Thank you very much.