Python sklearn k-means: distance from each point to its cluster center


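I am using sklearn k-means clustering and would like to know how to compute and store the distance from each point in my data to its nearest cluster center, for later use. My code: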

import numpy as np
import matplotlib.pyplot as plt
import scipy.sparse as sp
from sklearn.metrics.pairwise import euclidean_distances
from datetime import datetime
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs  # samples_generator was removed from sklearn.datasets

def learn(records):
    # getDataFromTransaction is defined elsewhere (not shown here)
    data = [getDataFromTransaction(t) for t in records]
    batch_size = 45
    X = np.array(data)
    centers = [[1, 1, 1], [-1, -1, -1], [1, -1, 1]]
    n_clusters = len(centers)
    # X, labels_true = make_blobs(n_samples=20, centers=centers, cluster_std=0.7)

    ##########################################################################
    # Compute clustering with KMeans
    k_means = KMeans(init='k-means++', n_clusters=3, n_init=10)
    k_means.fit(X)
    k_means_labels = k_means.labels_
    k_means_cluster_centers = k_means.cluster_centers_
    k_means_labels_unique = np.unique(k_means_labels)

    # Plot the first two features of each cluster and its center
    colors = ['#4EACC5', '#FF9C34', '#4E9A06']
    plt.figure()  # plt.hold(True) dropped: removed in Matplotlib 3.0, axes always hold
    for k, col in zip(range(n_clusters), colors):
        my_members = k_means_labels == k
        cluster_center = k_means_cluster_centers[k]
        plt.plot(X[my_members, 0], X[my_members, 1], 'w',
                 markerfacecolor=col, marker='.')
        plt.plot(cluster_center[0], cluster_center[1], 'o', markerfacecolor=col,
                 markeredgecolor='k', markersize=6)

    plt.title('KMeans')
    plt.grid(True)
    plt.savefig('./clusteringk_.png')
    plt.show()

Sorry for the poor formatting, and thanks for any help.

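In k-means, points are assigned to clusters so that the sum of squared deviations from the cluster centers is minimized. So all you have to do is take the Euclidean norm of the difference between each point and the center of the cluster that k-means assigned it to.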
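Written out, this might look as follows. The symbols are assumptions introduced here for illustration: $x_i$ is the $i$-th data point, $c(i)$ the index of the cluster it was assigned to, and $\mu_k$ the center of cluster $k$.

```latex
% k-means objective: sum of squared deviations from the assigned centers
J = \sum_{i=1}^{n} \lVert x_i - \mu_{c(i)} \rVert_2^2

% distance of point i to the center of its own cluster
d_i = \lVert x_i - \mu_{c(i)} \rVert_2
```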

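Here it is as a Python sketch, using the variable names from the question (X, n_clusters, k_means_labels, k_means_cluster_centers):

distances = np.empty(X.shape[0])
for i in range(n_clusters):
    # rows of X that k-means assigned to cluster i
    members = k_means_labels == i
    # Euclidean norm of each member's offset from the center of cluster i
    distances[members] = np.linalg.norm(X[members] - k_means_cluster_centers[i], axis=1)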

You can then add the distances to your data as an additional column.
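For reference, the same idea can be sketched without an explicit loop. This is a minimal example under stated assumptions: the toy data below stands in for the question's X (which comes from a helper that is not shown), and the names `assigned_centers`, `nearest`, and `X_with_distance` are made up for illustration. It indexes `cluster_centers_` by the labels to get each point's own center, and also shows `KMeans.transform`, which returns the distance from every point to every center.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy data standing in for the question's X (assumption: 60 points, 3 features)
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=center, scale=0.3, size=(20, 3))
    for center in ([1, 1, 1], [-1, -1, -1], [1, -1, 1])
])

k_means = KMeans(init="k-means++", n_clusters=3, n_init=10, random_state=0)
labels = k_means.fit_predict(X)

# Center of the cluster each point was assigned to, one row per point
assigned_centers = k_means.cluster_centers_[labels]

# Euclidean distance from each point to its own cluster center
distances = np.linalg.norm(X - assigned_centers, axis=1)

# Alternative route: transform() gives distances to all centers;
# the row-wise minimum is the distance to the nearest one
nearest = k_means.transform(X).min(axis=1)

# Store the distance as an extra column next to the data
X_with_distance = np.column_stack([X, distances])
```

Both routes should agree on a fitted model, since each point is labelled with its nearest center. The indexing version is handy when only the assigned cluster matters, while `transform` is useful if the distances to the other centers are also of interest.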
