Python: getting centroid row indices from k-means clustering with sklearn


I have a pandas DataFrame. I want to cluster all of its rows and get the row index of each cluster's centroid. I'm using sklearn, and this is what I have so far:

import pandas as pd
import numpy as np
from sklearn.cluster import KMeans


X = pd.DataFrame(np.random.rand(10, 5))
kmeans = KMeans(n_clusters=3)
# as_matrix() was removed in pandas 1.0; to_numpy() is the replacement
Y = pd.DataFrame(kmeans.fit_predict(X.to_numpy()), columns=['cluster ID'])
# Look up the centroid coordinates of each row's assigned cluster
Z = pd.DataFrame(kmeans.cluster_centers_[Y['cluster ID']])
result = pd.concat([X, Y, Z], axis=1)
result

Is there a way to get the index of the row closest to each centroid?

Thanks. This code works:

import pandas as pd
import numpy as np
from sklearn.cluster import KMeans
from scipy.spatial.distance import cdist

X = pd.DataFrame(np.random.rand(10, 5))
model = KMeans(n_clusters=3)
# as_matrix() was removed in pandas 1.0; to_numpy() is the replacement
clusassign = model.fit_predict(X.to_numpy())
# Distance from each row to its nearest centroid, i.e. the centroid of its own cluster
min_dist = np.min(cdist(X.to_numpy(), model.cluster_centers_, 'euclidean'), axis=1)
Y = pd.DataFrame(min_dist, index=X.index, columns=['Center_euclidean_dist'])
Z = pd.DataFrame(clusassign, index=X.index, columns=['cluster_ID'])
PAP = pd.concat([Y, Z], axis=1)
grouped = PAP.groupby(['cluster_ID'])
# For each cluster, the index label of the row with the smallest distance
grouped.idxmin()

You can use kmeans.cluster_centers_ to get the cluster centers. Then compute the distance from each center to every element of its cluster and pick the element with the smallest distance.

You can use this:
np.argmin(np.linalg.norm(X_std - kmp.cluster_centers_[0], axis=1))
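
The idea described in the answers (take each center from cluster_centers_, measure the distance to the members of that cluster, keep the closest member) might look roughly like the sketch below. The function name closest_rows_to_centroids, the fixed random_state, and the n_init value are assumptions made for illustration; they are not from the original thread. The sketch also assumes every cluster ends up with at least one member.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans


def closest_rows_to_centroids(df, n_clusters=3, random_state=0):
    """Map each cluster ID to the index label of its member row nearest the centroid.

    Hypothetical helper for illustration; assumes no cluster is empty.
    """
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = model.fit_predict(df.to_numpy())
    closest = {}
    for k, center in enumerate(model.cluster_centers_):
        # Restrict the search to rows assigned to cluster k
        members = df[labels == k]
        # Euclidean distance from each member row to this cluster's centroid
        dists = np.linalg.norm(members.to_numpy() - center, axis=1)
        # Translate the positional argmin back into the DataFrame's index label
        closest[k] = members.index[np.argmin(dists)]
    return pd.Series(closest, name="closest_row")


rng = np.random.default_rng(0)
X = pd.DataFrame(rng.random((10, 5)))
print(closest_rows_to_centroids(X))
```

Returning index labels rather than positions means the result still works when the DataFrame does not use the default 0..n-1 index, which matches the groupby/idxmin answer above.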