TypeError: float() argument must be a string or a number, array = np.array(array, dtype=dtype, order=order, copy=copy)
I am applying K-means clustering to data frames loaded from CSV and Excel files, following a reference example.

I tried to run the code with data from a CSV file, but I get the following error:

Traceback (most recent call last):
  File "", line 1, in <module>
    runfile('/Users/nadiastraton/Documents/workspacePython/02450Toolbox_Python/Thesis/Scripts/Clustering/cluster3.py', wdir='/Users/nadiastraton/Documents/workspacePython/02450Toolbox_Python/Thesis/Scripts/Clustering')
  File "/Applications/anaconda2/lib/python2.7/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 699, in runfile
    execfile(filename, namespace)
  File "/Applications/anaconda2/lib/python2.7/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 81, in execfile
    execfile(filename, *where)
  File "/Users/cluster3.py", line 46, in <module>
    est.fit(x.as_matrix)
  File "/Applications/anaconda2/lib/python2.7/site-packages/sklearn/cluster/k_means_.py", line 812, in fit
    X = self._check_fit_data(X)
  File "/Applications/anaconda2/lib/python2.7/site-packages/sklearn/cluster/k_means_.py", line 786, in _check_fit_data
    X = check_array(X, accept_sparse='csr', dtype=np.float64)
  File "/Applications/anaconda2/lib/python2.7/site-packages/sklearn/utils/validation.py", line 373, in check_array
    array = np.array(array, dtype=dtype, order=order, copy=copy)
TypeError: float() argument must be a string or a number

What I have tried so far to fix the error:

(est.fit(x.as_matrix) instead of est.fit(x))

and

(c=labels.astype(np.int) instead of c=labels.astype(np.float)) - (all values in my file are int.)

However, changing from np.float to np.int does not solve the problem.

The code:

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import pandas as pd
from sklearn.cluster import KMeans
np.random.seed(5)
centers = [[1, 1], [-1, -1], [1, -1]]
data=pd.read_csv('/DataVisualisationSample.csv')
print(data.head())
x = pd.DataFrame(data,columns = ['Post_Share_Count','Post_Like_Count','Comment_Count'])
y = pd.DataFrame(data,columns = ['Comment_Like_Count'])
print(x.info())
estimators = {'k_means_data_3': KMeans(n_clusters=3),
              'k_means_data_8': KMeans(n_clusters=12),
              'k_means_data_bad_init': KMeans(n_clusters=3, n_init=1,
                                              init='random')}
fignum = 1
for name, est in estimators.items():
    fig = plt.figure(fignum, figsize=(4, 3))
    plt.clf()
    ax = Axes3D(fig, rect=[0, 0, .95, 1], elev=48, azim=134)
    plt.cla()
    est.fit(x.as_matrix)
    labels = est.labels_
    ax.scatter(x[:, 2], x[:, 0], x[:, 1], c=labels.astype(np.int))
    ax.w_xaxis.set_ticklabels([])
    ax.w_yaxis.set_ticklabels([])
    ax.w_zaxis.set_ticklabels([])
    ax.set_xlabel('Post_Share_Count')
    ax.set_ylabel('Post_Like_Count')
    ax.set_zlabel('Comment_Count')
    fignum = fignum + 1
# Plot the ground truth
fig = plt.figure(fignum, figsize=(4, 3))
plt.clf()
ax = Axes3D(fig, rect=[0, 0, .95, 1], elev=48, azim=134)
plt.cla()
for name, label in [('Popular', 0),
                    ('Not Popular', 1),
                    ('Least Popular', 2)]:
    ax.text3D(x[y == label, 2].mean(),
              x[y == label, 0].mean() + 1.5,
              x[y == label, 1].mean(), name,
              horizontalalignment='center',
              bbox=dict(alpha=.5, edgecolor='w', facecolor='w'))
# Reorder the labels to have colors matching the cluster results
y = np.choose(y, [1, 2, 0]).astype(np.int)
ax.scatter(x[:, 2], x[:, 0], x[:, 1], c=y).astype(np.int)
ax.w_xaxis.set_ticklabels([])
ax.w_yaxis.set_ticklabels([])
ax.w_zaxis.set_ticklabels([])
ax.set_xlabel('Post_Share_Count')
ax.set_ylabel('Post_Like_Count')
ax.set_zlabel('Comment_Count')
plt.show()
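
A likely cause, judging from the traceback: est.fit(x.as_matrix) passes the bound method itself rather than the array it returns, so np.array(...) receives a function object and cannot convert it to float64. Switching np.float to np.int in the scatter call cannot help, because the script fails before that line is reached. The sketch below reproduces the same kind of failure and shows the conversion done with a call. It is only an illustration under assumptions: the small DataFrame is a made-up stand-in for DataVisualisationSample.csv, and it uses to_numpy(), which needs pandas 0.24 or later (as_matrix() is gone from recent pandas releases; on older versions x.values or x.as_matrix() with parentheses should behave the same way).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for DataVisualisationSample.csv (values are made up).
data = pd.DataFrame({
    'Post_Share_Count': [0, 3, 12, 1],
    'Post_Like_Count': [5, 40, 230, 9],
    'Comment_Count': [1, 7, 55, 2],
    'Comment_Like_Count': [0, 1, 2, 0],
})

x = data[['Post_Share_Count', 'Post_Like_Count', 'Comment_Count']]

# Without parentheses, NumPy is handed the method object, not the data,
# and converting a method to float64 raises TypeError.
try:
    np.array(x.to_numpy, dtype=np.float64)
    method_rejected = False
except TypeError as exc:
    method_rejected = True
    print('TypeError:', exc)

# Calling the method returns the actual 2-D array of feature values.
features = x.to_numpy(dtype=np.float64)

# A 1-D label array, so that boolean masks line up with the rows of features.
labels_true = data['Comment_Like_Count'].to_numpy()

print(features.shape)                          # rows x 3 feature columns
print(features[:, 2])                          # positional slicing works on the array
print(features[labels_true == 0, 2].mean())    # mean Comment_Count where label == 0
```

With an array like features in hand, est.fit(features) gets numeric input, and the plotting code can index features[:, 2] or features[labels_true == label, 2] directly. The same positional indexing on the DataFrame x (x[:, 2]) would fail, so the array should be used in the scatter and text3D calls as well.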