Plotting a Scipy sparse matrix generated by Sklearn in Python
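I wrote code to cluster 60 documents using sklearn's KMeans algorithm:
Snippet 1: getting the tokens (probably not that important):
Snippet 2: vectorizing and clustering the documents: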
tfidf = TfidfVectorizer(tokenizer=tokenize, ngram_range = (1, 5))
tfs = tfidf.fit_transform(token_dict.values())
X = tfs
print("n_samples: %d, n_features: %d" % X.shape)
km = KMeans(n_clusters=3, init='k-means++', max_iter=100, n_init=10, tol = 1e-8, verbose=True)
print("Clustering sparse data with %s" % km)
t0 = time()
km.fit(X)
print("done in %0.3fs" % (time() - t0))
labels = km.labels_
centroids = km.cluster_centers_
figure = pl.figure(1)
ax = Axes3D(figure)
ax.scatter(X[:, 0], X[:, 1], X[:, 2])
pl.show()
X is a Scipy sparse matrix that looks like this:
(0, 4558) 0.076421768112
(0, 5427) 0.015537938012
(0, 12380) 0.00517931267068
(0, 12554) 0.00517931267068
(0, 522) 0.116643751329
(0, 14100) 0.0120665949651
(0, 6851) 0.0723995697903
(0, 13100) 0.144799139581
(0, 14642) 0.0241331899
...
The error I get is:
Traceback (most recent call last):
File "features.py", line 185, in <module>
ob = keywords(['#happy', '#sad', '#feelingsick'])
File "features.py", line 106, in keywords
ax.scatter(X[:, 0], X[:, 1], X[:, 2])
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/mpl_toolkits/mplot3d/axes3d.py", line 2180, in scatter
patches = Axes.scatter(self, xs, ys, s=s, c=c, *args, **kwargs)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/matplotlib/axes.py", line 6337, in scatter
self.add_collection(collection)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/matplotlib/axes.py", line 1481, in add_collection
self.update_datalim(collection.get_datalim(self.transData))
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/matplotlib/collections.py", line 185, in get_datalim
offsets = np.asanyarray(offsets, np.float_)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/core/numeric.py", line 512, in asanyarray
return array(a, dtype, copy=False, order=order, subok=True)
ValueError: setting an array element with a sequence.
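This is almost the same error as one reported previously, but I am not sure how to solve it. The goal is to plot the clusters (and not just the centroids).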
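Thanks in advance.

Matplotlib cannot convert sparse slices such as X[:, 0] into a float array, which is what raises the ValueError, so convert the data to a dense form before plotting. X.todense() converts the sparse matrix to a regular dense matrix (numpy.matrix). X.A (or X.toarray()) produces a regular numpy array.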
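
As a rough illustration of the conversion described above, the sketch below densifies only the columns being plotted and hands them to a 3D scatter. The helper names `dense_columns` and `plot_clusters` are hypothetical, not part of the question's code, and `toarray()` is used instead of the `.A` shorthand because the shorthand may not be available in newer SciPy releases.

```python
import numpy as np
from scipy import sparse


def dense_columns(X, cols=(0, 1, 2)):
    """Return the selected columns of a sparse matrix as 1-D numpy arrays.

    Hypothetical helper: slices while the matrix is still sparse, so only
    the requested columns are converted to a dense array.
    """
    sub = X.tocsc()[:, list(cols)].toarray()
    return [sub[:, i] for i in range(sub.shape[1])]


def plot_clusters(X, labels):
    """Scatter the first three feature columns in 3-D, coloured by label.

    Hypothetical helper: assumes `labels` has one entry per row of X.
    """
    # Imported here so dense_columns can be used without matplotlib installed.
    import matplotlib.pyplot as plt

    xs, ys, zs = dense_columns(X)
    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    ax.scatter(xs, ys, zs, c=labels)
    plt.show()
```

With the variables from the question, this would be called as `plot_clusters(X, km.labels_)`. Note that it plots the same three tf-idf columns as the original `ax.scatter` call, so it only removes the error; whether those three features separate the clusters visually depends on the data.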