Python: how to select the cluster with the highest frequency in k-means

I created a k-means clustering from Gensim word2vec vectors, with k = 3. Now I want to retrieve the cluster with the highest frequency and its values.

import gensim
from gensim.models import Word2Vec
import nltk
from nltk.tokenize import sent_tokenize
from sklearn.cluster import KMeans
import numpy as np

# sent_tokenize requires the NLTK sentence tokenizer data to be downloaded
text = "Thank you for keeping me updated on this issue. I'm happy to hear that the issue got resolved after all and you can now use the app in its full functionality again. Also many thanks for your suggestions. We hope to improve this feature in the future. In case you experience any further problems with the app, please don't hesitate to contact me again."
sentences = sent_tokenize(text)
# One list of whitespace-separated tokens per sentence
word_text = [sentence.split() for sentence in sentences]
model = Word2Vec(word_text, min_count=1)
# One row per vocabulary word; model[model.wv.vocab] no longer works in Gensim 4
x = model.wv.vectors
n_clusters = 3
kmeans = KMeans(n_clusters=n_clusters)
kmeans = kmeans.fit(x)

You can find the label of each data point:

labels = kmeans.labels_

Now you can find the number of samples in each cluster using:

np.unique(labels, return_counts=True)
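
To pick out the largest cluster from those counts, one option is to take the argmax over the counts. The following is a minimal sketch on a hypothetical label array; the values in `labels` are made up for illustration and would come from `kmeans.labels_` in practice, and the names `cluster_ids`, `largest_cluster` and `member_idx` are only illustrative.

```python
import numpy as np

# Hypothetical labels standing in for kmeans.labels_ (k = 3).
labels = np.array([0, 2, 1, 2, 2, 0, 2, 1])

# Unique cluster ids and the number of points assigned to each.
cluster_ids, counts = np.unique(labels, return_counts=True)

# Id of the cluster with the most members (ties resolve to the lowest id).
largest_cluster = cluster_ids[np.argmax(counts)]

# Indices of the data points that belong to that cluster.
member_idx = np.where(labels == largest_cluster)[0]

print(largest_cluster, counts.max(), member_idx)
```

If the rows of x follow the model's vocabulary index order (as with `model.wv.vectors`), the indices in `member_idx` can be mapped back to words through `model.wv.index_to_key` in Gensim 4.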

You can get the cluster centers using
kmeans.cluster_centers_
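
As a rough sketch of how the center of the most populated cluster could be looked up, the example below fits KMeans on synthetic 2-D points. The data, the `random_state` and `n_init` settings, and the variable names are assumptions for illustration, not part of the original question.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic 2-D points in three well-separated groups of sizes 5, 3 and 2
# (made-up data standing in for the word vectors).
x = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [0.05, 0.05],
    [10.0, 10.0], [10.1, 10.0], [10.0, 10.1],
    [-10.0, 10.0], [-10.1, 10.0],
])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(x)

# Count the members of each cluster and pick the most populated one.
cluster_ids, counts = np.unique(kmeans.labels_, return_counts=True)
largest_cluster = cluster_ids[np.argmax(counts)]

# Row of cluster_centers_ that belongs to the most populated cluster.
largest_center = kmeans.cluster_centers_[largest_cluster]

print(largest_cluster, counts.max(), largest_center)
```

Cluster ids are assigned arbitrarily by KMeans, so the id of the largest cluster can differ between runs even when its members stay the same.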

What do you mean by "highest frequency"?

I mean which cluster has the largest number of elements.