Extracting the topic-word probability matrix from a Python gensim LdaModel


I have an LDA model and the document-topic probabilities:

from gensim.models import LdaModel

# build the model on the corpus
ldam = LdaModel(corpus=corpus, num_topics=20, id2word=dictionary)
# get the document-topic weights (gamma); normalize each row to obtain probabilities
theta, _ = ldam.inference(corpus)

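I also need the word distribution for every topic, i.e. the topic-word probability matrix. Is there a way to extract this information?

Thanks.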

The topic-term matrix (lambda) can be accessed with:

topics_terms = ldam.state.get_lambda()

If you want a probability distribution, simply normalize each row:

import numpy as np

# divide each topic row by its sum so that it sums to 1
topics_terms_proba = np.apply_along_axis(lambda x: x / x.sum(), 1, topics_terms)
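
As a minimal sketch of that normalization step, the snippet below uses a small made-up array in place of the real ldam.state.get_lambda() output (the values are illustrative, not model results). It shows that dividing by the row sums with NumPy broadcasting should give the same matrix as the apply_along_axis call:

```python
import numpy as np

# Hypothetical stand-in for ldam.state.get_lambda(): 2 topics x 3 terms
topics_terms = np.array([[1.0, 1.0, 2.0],
                         [3.0, 0.5, 0.5]])

# Divide each row by its sum so every topic becomes a probability distribution
topics_terms_proba = topics_terms / topics_terms.sum(axis=1, keepdims=True)

# Equivalent result via apply_along_axis, as in the answer
alt = np.apply_along_axis(lambda x: x / x.sum(), 1, topics_terms)
```

Recent gensim releases also provide ldam.get_topics(), which appears to return this row-normalized matrix directly; it is worth checking against the installed version before relying on it.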

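When I call ldam.state.get_lambda() I get a NumPy matrix, but it has no column names. How do I identify the words?

To find out which word corresponds to a given index, use ldam.id2word. For example, ldam.id2word[0] is the word corresponding to the first column of the matrix.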
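
To illustrate how the column indices map back to words, here is a small sketch that lists the most probable words per topic. The id2word dictionary and the probability matrix are hypothetical stand-ins for ldam.id2word and the normalized topic-term matrix, and top_words is a helper name introduced only for this example:

```python
import numpy as np

# Hypothetical stand-ins: in practice use ldam.id2word and the normalized matrix
id2word = {0: "cat", 1: "dog", 2: "fish"}
topics_terms_proba = np.array([[0.2, 0.3, 0.5],
                               [0.7, 0.1, 0.2]])

def top_words(topic_row, id2word, n=2):
    """Return the n most probable (word, probability) pairs for one topic row."""
    top_ids = np.argsort(topic_row)[::-1][:n]  # column indices, highest first
    return [(id2word[int(i)], float(topic_row[i])) for i in top_ids]

# One list of (word, probability) pairs per topic
top_per_topic = [top_words(row, id2word) for row in topics_terms_proba]
```

Because column i of the matrix corresponds to id2word[i], the same lookup can be used to label the columns, for instance when loading the matrix into a table.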