Equal-size clustering algorithm in Python

python cluster-computing


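I have 300 collection points that I need to cluster by their geographic coordinates. Each cluster must contain at most 8 points and at least 5. How can I implement this in Python?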
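Before picking an algorithm, it may help to check which cluster counts are feasible at all. With n points and every cluster holding between min_size and max_size points, the number of clusters k has to satisfy ceil(n / max_size) <= k <= floor(n / min_size). A minimal sketch of that arithmetic (the function name `feasible_cluster_counts` is illustrative and not part of any library):

```python
import math

def feasible_cluster_counts(n_points, min_size, max_size):
    """Return the (smallest, largest) number of clusters k for which
    n_points can be split into k groups of min_size..max_size points."""
    k_min = math.ceil(n_points / max_size)   # fewer clusters would overflow max_size
    k_max = n_points // min_size             # more clusters would starve min_size
    if k_min > k_max:
        raise ValueError("no cluster count satisfies these size bounds")
    return k_min, k_max

print(feasible_cluster_counts(300, 5, 8))  # (38, 60)
```

For 300 points with sizes between 5 and 8, any k from 38 to 60 can work; k = 50 gives exactly 6 points per cluster.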

To answer your question: replace `position` with your geographic coordinate data, and use latitude and longitude in place of `x`, `y`.

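import pandas as pd
from sklearn.cluster import KMeans

# `position` is your own list of (latitude, longitude) pairs
dfcluster = pd.DataFrame(position, columns=['x', 'y'])
kmeans = KMeans(n_clusters=4).fit(dfcluster)
centroids = kmeans.cluster_centers_
# for plotting (requires `import matplotlib.pyplot as plt`)
# plt.scatter(dfcluster['x'], dfcluster['y'], c=kmeans.labels_.astype(float), s=50, alpha=0.5)
# plt.scatter(centroids[:, 0], centroids[:, 1], c='red', s=50)
# plt.show()
dfcluster['cluster'] = kmeans.labels_
dfcluster = dfcluster.drop_duplicates(['x', 'y'], keep='last')
dfcluster = dfcluster.sort_values(['cluster', 'x', 'y'], ascending=True)

# head/tail only pick the first 8 and last 5 rows of the sorted frame;
# plain KMeans does not itself bound the size of each cluster
n = 8
dfcluster1 = dfcluster.head(n)
n = 5
dfcluster2 = dfcluster.tail(n)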
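Plain KMeans does not limit how many points land in each cluster, so it is worth checking the label counts against the required bounds. A small helper, assuming `labels` is any sequence of cluster ids such as `kmeans.labels_` (the helper name is made up for illustration):

```python
from collections import Counter

def clusters_outside_bounds(labels, min_size, max_size):
    """Map each cluster id whose size violates the bounds to its size."""
    sizes = Counter(labels)
    return {c: s for c, s in sizes.items() if not min_size <= s <= max_size}

labels = [0] * 6 + [1] * 9 + [2] * 3   # toy labels with sizes 6, 9 and 3
print(clusters_outside_bounds(labels, 5, 8))  # {1: 9, 2: 3}
```

An empty dictionary means every cluster already respects the limits.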

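Also, for equal-size groups, use size_constrained_clustering. Start by installing it with

pip install size-constrained-clustering

or

pip install git+https://github.com/jingw2/size_constrained_clustering.git

You can use either the min-cost flow variant or the heuristic variant: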

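import numpy as np
from size_constrained_clustering import equal

n_samples = 2000
n_clusters = 3
X = np.random.rand(n_samples, 2)  # replace with your (latitude, longitude) array

# equal-size KMeans solved as a min-cost flow problem
model = equal.SameSizeKMeansMinCostFlow(n_clusters)
# alternative: heuristic equal-size KMeans
# model = equal.SameSizeKMeansHeuristics(n_clusters)
model.fit(X)
centers = model.cluster_centers_
labels = model.labels_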
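To illustrate the idea behind equal-size assignment without any extra dependency, here is a rough greedy sketch. It is not the algorithm used by size_constrained_clustering, and the function name `balanced_assign` is hypothetical. It assumes the centers are already known (for example KMeans centroids) and fills each cluster up to a fixed capacity, closest pairs first, so cluster sizes differ by at most one.

```python
import math
import random
from collections import Counter

def balanced_assign(points, centers):
    """Greedily assign each point to a nearby center so that cluster
    sizes differ by at most one (a heuristic, not an optimal solution)."""
    n, k = len(points), len(centers)
    # Capacities sum to exactly n: the first n % k clusters get one extra slot.
    capacity = [n // k + (1 if c < n % k else 0) for c in range(k)]
    # Every (distance, point index, center index) triple, closest first.
    pairs = sorted(
        (math.dist(p, ctr), i, c)
        for i, p in enumerate(points)
        for c, ctr in enumerate(centers)
    )
    labels = [None] * n
    for _, i, c in pairs:
        if labels[i] is None and capacity[c] > 0:
            labels[i] = c
            capacity[c] -= 1
    return labels

random.seed(0)
points = [(random.random(), random.random()) for _ in range(300)]
centers = points[:50]  # placeholder centers; KMeans centroids would be a better choice
sizes = Counter(balanced_assign(points, centers))
print(min(sizes.values()), max(sizes.values()))  # 6 6
```

With 300 points and 50 centers every cluster ends up with exactly 6 points, which sits inside the 5 to 8 range asked for. Distances here are plain Euclidean distances on the raw coordinates, which is only an approximation for latitude and longitude.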


Please share the desired output and explain what you want.

I want output like this:

Latitude    Longitude   Route code
18.2521536  76.4982399  Cluster--U 01
18.2526484  76.4976308  Cluster--U 01
18.2526006  76.4972857  Cluster--U 01
18.2533365  76.4975484  Cluster--U 01
18.2535941  76.4987773  Cluster--U 01
18.2535462  76.4986933  Cluster--U 01
18.2503783  76.5116291  Cluster--U 02
18.2512383  76.5085317  Cluster--U 02
18.2506268  76.5082113  Cluster--U 02
18.2516204  76.5064285  Cluster--U 02

I have 300 such coordinates, and they must be clustered with a maximum cluster size of 8 and a minimum of 6.
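One possible way to produce rows in that shape from cluster labels is sketched below. The mapping from 0-based labels to 1-based, zero-padded route codes is an assumption, and `route_code` is a hypothetical helper, not something from the thread.

```python
def route_code(label, prefix="Cluster--U"):
    """Turn a 0-based cluster label into a 1-based, zero-padded route code."""
    return f"{prefix} {label + 1:02d}"

# (latitude, longitude, cluster label) triples; the labels here are illustrative
rows = [(18.2521536, 76.4982399, 0), (18.2503783, 76.5116291, 1)]
for lat, lon, label in rows:
    print(f"{lat:.7f} {lon:.7f} {route_code(label)}")
# 18.2521536 76.4982399 Cluster--U 01
# 18.2503783 76.5116291 Cluster--U 02
```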