Python 2.7: percentage of closeness to a cluster (in k-means)

I am trying to do clustering in Python.

Here is my script:

import numpy as np
from scipy.cluster.vq import kmeans2

def dist(a, b):
    # Euclidean distance between two points
    return np.linalg.norm(a - b)

# 100 random 2-D samples to cluster
X = np.random.randn(100, 2)

k = int(raw_input("Give value of k: "))

# 12 query points to compare against the centroids
pts = 100 * np.random.random((12, 2))
centroids, assigned_clusters = kmeans2(X, k)
for pt in pts:
    # Distance from this point to each centroid, keyed by 1-based cluster number
    distperxy = dict()
    for i, cent in enumerate(centroids, 1):
        distperxy[i] = float(dist(pt, cent))
    # Key of the nearest centroid
    mnmm = min(distperxy, key=distperxy.get)
    tot = sum(distperxy.values())
    # Distance to the nearest centroid divided by the sum of all distances
    perc = distperxy[mnmm] / tot
    print mnmm, perc

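Basically, for each point I am trying to determine the nearest cluster and the percentage of closeness to that cluster. For example, suppose point A is close to centroid 1 in a k-means with k=3. Is the percentage computed correctly in the code above, i.e., by dividing the distance to the nearest cluster by the sum of the distances to all clusters?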

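You should define "percentage of closeness" precisely. It seems to me that you want some kind of certainty estimate that tells you how sure you can be that a point belongs to cluster A rather than cluster B. A point lying exactly on the boundary between clusters A and B should have the same certainty for both. If you want a way to calculate this, you should either consult a statistics textbook or invent your own measure. Be careful to define any measure you invent for whoever you present it to, because "percentage of closeness" is not a standard measure.

A suggestion: for each point a and each cluster c in the set of clusters, compute 1/r or 1/r^2, where r is the distance between point a and the centroid of cluster c. Put these values in an array, then divide by the sum of the array. This gives you an array of "closeness percentages". Alternatively, you can use any decreasing function of r, but again, you should define your terms for anyone who uses the results.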
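The suggestion above could be sketched roughly as follows. This is only an illustration, not a standard measure: the function name `closeness_percentages`, the `power` parameter (1 for 1/r, 2 for 1/r^2), and the `eps` guard against a zero distance are all assumptions introduced here, and the centroids are hard-coded rather than taken from `kmeans2` so that the example is reproducible.

```python
import numpy as np

def closeness_percentages(points, centroids, power=1, eps=1e-12):
    """Normalised inverse-distance weights (hypothetical helper).

    Returns an array of shape (n_points, n_centroids) whose rows sum to 1.
    """
    points = np.asarray(points, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    # r[i, j] = Euclidean distance from point i to centroid j
    r = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    # Inverse-distance weights; eps avoids division by zero when a point
    # coincides with a centroid (an assumption, not part of the answer)
    w = 1.0 / np.maximum(r, eps) ** power
    # Divide each row by its sum so the weights read as percentages
    return w / w.sum(axis=1, keepdims=True)

# Illustrative centroids and query points (made up for this sketch)
centroids = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]])
points = np.array([[1.0, 0.0], [2.0, 0.0]])

pct = closeness_percentages(points, centroids)
nearest = pct.argmax(axis=1)  # 0-based index of the closest centroid
```

Here the second point is equidistant from the first two centroids, so it gets the same share for both, which matches the boundary behaviour described above. By contrast, the ratio in the question (distance to the nearest centroid divided by the sum of distances) gets smaller as a point gets closer to its centroid, so it behaves more like a relative distance than a closeness.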