Python sklearn fit_predict does not accept a 2-D numpy array
I am trying to run a cluster analysis with three different clustering algorithms. I load the data from stdin as follows:
import sys
import numpy
import sklearn.cluster as cluster

# Read two whitespace-separated floats per line from stdin
X = []
for line in sys.stdin:
    x1, x2 = line.strip().split()
    X.append([float(x1), float(x2)])
X = numpy.array(X)
I then store the clustering types and their parameters in a list:
clustering_configs = [
    ### K-Means
    ['KMeans', {'n_clusters' : 5}],
    ### Ward
    ['AgglomerativeClustering', {
        'n_clusters' : 5,
        'linkage' : 'ward'
    }],
    ### DBSCAN
    ['DBSCAN', {'eps' : 0.15}]
]
and try to call them in a for loop:
for alg_name, alg_params in clustering_configs:
    class_ = getattr(cluster, alg_name)
    instance_ = class_(alg_params)
    instance_.fit_predict(X)
Everything works except the instance_.fit_predict(X) call, which raises this error:
Traceback (most recent call last):
File "meta_cluster.py", line 47, in <module>
instance_.fit_predict(X)
File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.17.1-py2.7-linux-x86_64.egg/sklearn/cluster/k_means_.py", line 830, in fit_predict
return self.fit(X).labels_
File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.17.1-py2.7-linux-x86_64.egg/sklearn/cluster/k_means_.py", line 812, in fit
X = self._check_fit_data(X)
File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.17.1-py2.7-linux-x86_64.egg/sklearn/cluster/k_means_.py", line 789, in _check_fit_data
X.shape[0], self.n_clusters))
TypeError: %d format: a number is required, not dict
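Does anyone know where I might be going wrong? I read the sklearn documentation, which says all you need is an array-like or sparse matrix of shape=(n_samples, n_features), and I believe that is what I have.
Any suggestions? Thanks!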
class sklearn.cluster.KMeans(n_clusters=8, init='k-means++', n_init=10, max_iter=300, tol=0.0001, precompute_distances='auto', verbose=0, random_state=None, copy_x=True, n_jobs=1, algorithm='auto')
The way to call the KMeans class is
KMeans(n_clusters=5)
whereas your current code calls
KMeans({'n_clusters': 5})
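This passes alg_params as a single dict rather than as keyword arguments. The dict is bound to the first positional parameter, n_clusters, which is why the %d format in the error message receives a dict instead of a number. The same applies to the other algorithms.

Is there an easy way to pull those values out of the dictionary and into the required form?

@wKavey: unpack the dictionary with **:
KMeans(**{'n_clusters': 5})
So here, instance_ = class_(**alg_params) should work (as long as alg_params contains no keys that are not keyword arguments of the function or class).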
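
The difference between the two calls can be seen without scikit-learn installed. The following is a minimal sketch, assuming Python 3: StandInKMeans is a hypothetical stand-in that only mimics the first parameters of the KMeans signature, and build is a hypothetical helper (not part of sklearn) that drops keys the constructor does not accept.

```python
import inspect


class StandInKMeans:
    """Hypothetical stand-in mimicking the start of KMeans' signature."""

    def __init__(self, n_clusters=8, max_iter=300):
        self.n_clusters = n_clusters
        self.max_iter = max_iter


def build(class_, params):
    """Hypothetical helper: instantiate class_ with only the keys it accepts."""
    accepted = inspect.signature(class_).parameters
    usable = {key: value for key, value in params.items() if key in accepted}
    return class_(**usable)


alg_params = {'n_clusters': 5}

# Positional call: the whole dict is bound to the first parameter, n_clusters.
wrong = StandInKMeans(alg_params)
print(type(wrong.n_clusters).__name__)  # dict

# Unpacked call: each key becomes a keyword argument.
right = StandInKMeans(**alg_params)
print(right.n_clusters)  # 5

# Keys the constructor does not know (here 'eps') are filtered out first.
filtered = build(StandInKMeans, {'n_clusters': 5, 'eps': 0.15})
print(filtered.n_clusters, filtered.max_iter)  # 5 300
```

Silently dropping unknown keys is a design choice for this sketch only. With the configs from the question, each dict already matches its own class, so plain class_(**alg_params) is enough, and a TypeError on an unexpected key is usually the more helpful behaviour.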