Python: "too many indices for array" when using a DataFrame
I am writing a Python program that computes the Dunn index to evaluate clustering quality, based on code I found on a few websites. It needs to calculate the minimum distance between clusters and the maximum distance within a cluster:
import pandas as pd
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances
...
def delta_fast(ck,cl,distances):
    values = distances[np.where(ck)][:,np.where(cl)]
    print(values)
def dunn_fast(points,labels):
    distances = euclidean_distances(points)
    print("distances")
    print(distances)
    print(distances.shape[0])
    print(distances.shape[1])
    ks = np.sort(np.unique(labels))
    print("ks")
    print(ks)
    deltas = np.ones([len(ks),len(ks)]) * 1000000
    big_deltas = np.zeros([len(ks),1])
    l_range = list(range(0,len(ks)))
    for k in l_range:
        for l in (l_range[0:k] + l_range[k+1:]):
            deltas[k,l] = delta_fast((labels == ks[k]),(labels == ks[l]),distances)
distances is a DataFrame (1406*1406).
However, the code raises the following error:
Traceback (most recent call last):
  File "F:/MyDocument/F/My Document/Training/Python/PyCharmProject/FaceBookCrawl/FB_group_user_dunnIndex.py", line 100, in <module>
    get_group_members_cluster_info(cluster_method,cluster_number)
  File "F:/MyDocument/F/My Document/Training/Python/PyCharmProject/FaceBookCrawl/FB_group_user_dunnIndex.py", line 89, in get_group_members_cluster_info
    dunn_fast(cal_cluster_data_df,cluster_data_label_df)
  File "F:/MyDocument/F/My Document/Training/Python/PyCharmProject/FaceBookCrawl/FB_group_user_dunnIndex.py", line 48, in dunn_fast
    deltas[k,l] = delta_fast((labels == ks[k]),(labels == ks[l]),distances)
  File "F:/MyDocument/F/My Document/Training/Python/PyCharmProject/FaceBookCrawl/FB_group_user_dunnIndex.py", line 12, in delta_fast
    values = distances[np.where(ck)][:,np.where(cl)]
IndexError: too many indices for array
This line seems to be the problem:
    values = distances[np.where(ck)][:,np.where(cl)]
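Can you tell me the cause and how to fix it?

I guess your np.where(ck) returns an array that cannot be used to index the array. Could you also post a minimal working example that reproduces your error?

In distances (a DataFrame) there is no labels column. Would this statement be appropriate: values = distances[np.where(labels == ks[k])][:,np.where(labels == ks[l])]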
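
One plausible cause, sketched below on hypothetical toy data: the traceback passes `cluster_data_label_df` as `labels`, which suggests a single-column DataFrame. In that case `labels == ks[k]` is a 2-D mask, `np.where` returns two index arrays, and the first indexing step already collapses `distances` to 1-D, so the following `[:, ...]` has no second axis to index. The data, the names `labels_df`, `ck_2d`, and `error_message`, and the choice to return the minimum from `delta_fast` are assumptions for illustration (the last one based on the stated goal of finding the minimum distance between clusters), not the asker's actual code.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 4 points in 2 clusters (not the asker's data).
points = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
labels_df = pd.DataFrame({"label": [0, 0, 1, 1]})

# Pairwise Euclidean distances as a plain ndarray of shape (4, 4).
diff = points[:, None, :] - points[None, :, :]
distances = np.sqrt((diff ** 2).sum(axis=-1))

# A single-column DataFrame mask is 2-D, so np.where returns two index arrays.
ck_2d = labels_df == 0
print(len(np.where(ck_2d)))  # 2 index arrays, one per axis

# Indexing with both arrays picks individual elements and yields a 1-D result,
# so the following [:, ...] asks for a second axis that no longer exists.
try:
    distances[np.where(ck_2d)][:, np.where(labels_df == 1)]
    error_message = None
except IndexError as exc:
    error_message = str(exc)
print(error_message)

# One possible fix: flatten the labels to 1-D and select the block with np.ix_.
labels = labels_df.to_numpy().ravel()

def delta_fast(ck, cl, distances):
    # Distances between every point in cluster k and every point in cluster l.
    values = distances[np.ix_(ck, cl)]
    # Assumption: the inter-cluster delta is the smallest of these distances.
    return values.min()

print(delta_fast(labels == 0, labels == 1, distances))
```

`np.ix_` accepts 1-D boolean masks and builds an open mesh, so `distances[np.ix_(ck, cl)]` selects the rows of one cluster and the columns of the other in a single step. If `distances` really were a DataFrame rather than the ndarray that `euclidean_distances` returns, it would presumably need converting with `.to_numpy()` first.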