How to apply an LDA model to a new set of documents
I am using the LDA model below to get the topics of 1000 documents:
`from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Create the CountVectorizer that produces the input for LDA
vectorizer = CountVectorizer(analyzer='word',
                             min_df=7,                         # ignore words found in fewer than 7 documents
                             max_df=80,                        # ignore words found in more than 80 documents
                             stop_words='english',             # remove stop words
                             lowercase=True,                   # convert all words to lowercase
                             token_pattern='[a-zA-Z0-9]{3,}',  # tokens of at least 3 characters
                             # max_features=50000,             # max number of unique words
                             )
data_vectorized = vectorizer.fit_transform(data_lemmatized)
# Materialize the sparse data (optional; LDA accepts the sparse matrix directly)
data_dense = data_vectorized.todense()
# Build LDA Model
lda_model = LatentDirichletAllocation(n_components=14,           # number of topics
                                      max_iter=10,               # max learning iterations
                                      learning_method='online',
                                      random_state=100,          # random state
                                      batch_size=128,            # number of docs in each learning iteration
                                      evaluate_every=-1,         # do not compute perplexity during training
                                      n_jobs=-1,                 # use all available CPUs
                                      )
# fit_transform already returns the document-topic matrix for the training
# documents, so a second transform call on the same data is not needed
lda_output = lda_model.fit_transform(data_vectorized)`
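Now I want to use the same model to get the topics of 100 new documents, that is, to assign each new document a topic from the model's 14 topics. How can I apply this model to the new dataset? Can anyone help?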
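One possible approach, as a minimal sketch: reuse the already fitted `vectorizer` and `lda_model` and call only `transform` on the new documents, so that neither the vocabulary nor the topics are re-learned. The toy corpora `train_docs` and `new_docs`, the reduced `n_components=2`, and the simplified vectorizer settings are assumptions made only so the example runs on its own; with the code from the question you would keep your own settings and pass the 100 new lemmatized documents instead.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical toy data standing in for data_lemmatized and the new documents.
train_docs = [
    "cat dog pet animal vet",
    "dog pet food animal cat",
    "stock market trade price bank",
    "bank price stock invest market",
    "cat animal vet pet dog",
    "market trade invest bank stock",
]
new_docs = ["dog cat pet unseenword", "stock bank price market"]

# Fit the vectorizer and the model once, on the training documents only.
vectorizer = CountVectorizer(token_pattern="[a-zA-Z0-9]{3,}")
lda_model = LatentDirichletAllocation(n_components=2, random_state=100)
lda_model.fit(vectorizer.fit_transform(train_docs))

# New documents: transform() only, never fit_transform(), so the vocabulary
# and the learned topics stay as trained. Words not seen in training are ignored.
new_vectorized = vectorizer.transform(new_docs)
new_topic_dist = lda_model.transform(new_vectorized)  # shape: (n_new_docs, n_topics)

# Dominant topic of each new document: the column with the highest probability.
dominant_topic = np.argmax(new_topic_dist, axis=1)
print(new_topic_dist)
print(dominant_topic)
```

Each row of `new_topic_dist` is the topic distribution of one new document over the topics of the trained model, so with the model from the question it would have 14 columns. If the new documents are processed in a later session, both fitted objects have to be kept together (for example saved and reloaded with `joblib`), since the model only makes sense with the vocabulary it was trained on.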