Python: how to filter clusters generated by DBSCAN by size?
I have applied DBSCAN to cluster a dataset consisting of the X, Y and Z coordinates of each point in a point cloud. I only want to plot the clusters that have fewer than 100 points. This is what I have done so far:
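import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import DBSCAN

clustering = DBSCAN(eps=0.1, min_samples=20, metric='euclidean').fit(only_xy)
plt.scatter(only_xy[:, 0], only_xy[:, 1],
            c=clustering.labels_, cmap='rainbow')
# Note: components_ holds the core samples, not one entry per cluster
clusters = clustering.components_
# Store the labels (-1 marks noise)
labels = clustering.labels_
# Then get the frequency count of the non-negative labels
counts = np.bincount(labels[labels >= 0])
print(counts)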
Output:
[1278 564 208 47 36 30 191 54 24 18 40 915 26 20
24 527 56 677 63 57 61 1544 512 21 45 187 39 132
48 55 160 46 28 18 55 48 35 92 29 88 53 55
24 52 114 49 34 34 38 52 38 53 69]
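So I have found the number of points in each cluster, but I am not sure how to select only the clusters with fewer than 100 points.

You can find the indices of the points whose cluster has fewer than 100 members: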
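# Count how many points carry each label (the noise label -1 included)
ls, cs = np.unique(labels, return_counts=True)
dic = dict(zip(ls, cs))
# Indices of the points in clusters with fewer than 100 points, skipping noise
idx = [i for i, label in enumerate(labels) if dic[label] < 100 and label >= 0]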
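The same selection can probably be done without a Python loop. The following is only a sketch: the helper name small_cluster_mask and the toy label array are made up for illustration, and it assumes labels look like what DBSCAN returns (non-negative integers for clusters, -1 for noise).

```python
import numpy as np


def small_cluster_mask(labels, max_size=100):
    """Boolean mask of points in clusters with fewer than max_size points.

    Noise points (label -1) are never selected.
    """
    labels = np.asarray(labels)
    mask = np.zeros(labels.shape, dtype=bool)
    valid = labels >= 0                      # drop noise before counting
    if not valid.any():
        return mask
    counts = np.bincount(labels[valid])      # counts[k] = size of cluster k
    mask[valid] = counts[labels[valid]] < max_size
    return mask


# Toy labels (hypothetical): cluster 0 has 3 points, cluster 1 has 1, plus one noise point
toy_labels = np.array([0, 0, 0, 1, -1])
toy_mask = small_cluster_mask(toy_labels, max_size=3)
```

With the variables from the question this would be used as mask = small_cluster_mask(clustering.labels_), followed by plt.scatter(only_xy[mask, 0], only_xy[mask, 1], c=clustering.labels_[mask], cmap='rainbow') to draw only the small clusters.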

I think that if you run this code you get the labels, as well as the core samples (components_), of the clusters with more than 100 points (use count < 100 instead to keep the small clusters the question asks about):
from collections import Counter
# Labels of clusters with more than 100 points (noise label -1 excluded)
labels_with_morethan100 = [label for (label, count) in Counter(clustering.labels_).items() if count > 100 and label >= 0]
# components_ holds only the core samples, so look up their labels via core_sample_indices_
core_labels = clustering.labels_[clustering.core_sample_indices_]
clusters_biggerthan100 = clustering.components_[np.isin(core_labels, labels_with_morethan100)]
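One detail worth checking when indexing components_: it contains only the core samples, so any mask applied to it needs one entry per core sample, not one per clustered point. Below is a minimal sketch on synthetic data. The data, eps and min_samples are made-up assumptions rather than the asker's values. It pairs each row of components_ with its label through core_sample_indices_.

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
# Synthetic stand-in for only_xy (assumption): a blob of 300 points and one of 40
X = np.vstack([rng.normal(0.0, 0.2, size=(300, 2)),
               rng.normal(5.0, 0.2, size=(40, 2))])

clustering = DBSCAN(eps=0.5, min_samples=5).fit(X)

# components_ is X restricted to the core samples, in the same order
core_idx = clustering.core_sample_indices_
core_labels = clustering.labels_[core_idx]

# Cluster sizes counted over all clustered points (core and border), noise excluded
ids, sizes = np.unique(clustering.labels_[clustering.labels_ >= 0],
                       return_counts=True)
big_ids = ids[sizes > 100]

# Core samples that belong to clusters with more than 100 points
core_of_big = clustering.components_[np.isin(core_labels, big_ids)]
```

If all points of a cluster are wanted rather than just its core samples, the mask can be built over labels_ and applied to the original data instead of components_.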

Do you still want to continue? Posting here takes longer than solving this problem.